The issue with two different jobs makes sense to me. I'm wondering if we could leverage some row-locking with SELECT FOR UPDATE
🤔 Any job that is updating any amount of seats could have exclusive rights to read and write updates to the seats. With that, we would have solved the problem.
@lwanko That's a good thought, SELECT FOR UPDATE could be relevant here. It may also not help though, because rows could be added to the members while we're reading the data...
This problem might also highlight the importance of batching the updates in small batches by user_id. That might go a long way to solve any potential problem.
I just queried some data about the number of users per namespace from GitLab.com. [...]
That's a cool query.
I think it may have some issues with it too though.
I can't recall exactly right now, but I don't think member_namespace_id is the root namespace id. Here's a thread that's kind of relevant to that: https://gitlab.com/groups/gitlab-org/-/work_items/18535#note_2652399973
I think it's more like just the namespace_id, but it could point to a project or group namespace.
So I think 25,550 is the most members in a single group, not the most members in an entire hierarchy on gitlab.com.
For example, if it would only take 5-10 seconds, I think it could be okay to block all membership updates. During this time, we could block the customers from changing memberships by disabling buttons and informing them of what is happening. I wonder if those preparations would cost less time than our current approach.
Generally speaking, I think how we would do all of this is the crux of the matter. It's worth thinking about for sure, but it could end up being just as much trouble as the data syncing approach.
One nice thing about the data syncing is we can start displaying seats to users in a read-only manner.
@anton I finished giving this another review. I think it is looking really good. Hope we can finish it up and get it merged soon.
Yeah, that's perfect, thanks!
Sounds like having more than 50 would be quite the edge case indeed. I like that you put the defensive measure here just in case.
@anton / @hptrs Thank you both, this makes sense. Sounds like we don't have an airtight guarantee anywhere, but they're probably always integers.
I think we can just go with it as is for now, and if we found out we get something we need to URL encode, we can fix it. I think the worst that would happen at the moment is requests we make to Jira would fail.
Sorry, I think I must not have read this very carefully when I made that comment. You're right, these come from the database, I think this is fine.
However the upstream
vulnerability_idsprobably should be? I didn't see anything in theProjects::Integrations::Jira::IssuesControllerto only allow integers. But that could be a follow up item.
Makes sense. Yeah, that might be worth looking into as a follow up.
We're not actually altering ee/lib/api/gitlab_subscriptions/subscriptions.rb, and we don't save the value of contract_overages_allowed for a POST request to that endpoint.
We don't pass the param in this test, and I think the test passes because the default for contract_overages_allowed is true.
So basically I think we need to either fix up this test coverage and/or fix the endpoint to actually save the value. If we need it right now. I'm not sure what the difference is between this and the provision endpoint.
@lwanko Thanks for your responses! I had some more comments and questions. Back to you!
I see you wrapped things in its own specific describe block for just contract_overages_allowed.
I still think the next person who comes along and wants to add a new test will have to refactor this some to do it... but maybe not too much.
This looks pretty good. I like the changes you made. Thanks for doing that.
nit: I think you can delete this:
Makes sense.
@lwanko Thanks, I have a follow up question here. What is the difference between the ee/lib/gitlab_subscriptions/api/internal/namespaces/provision.rb endpoint and the ee/lib/api/gitlab_subscriptions/subscriptions.rb endpoint?
They look kind of similar. What's the point of each? When do we use them?
@m_frankiewicz I don't think I see any problem with adding it. Maybe it could cause some confusion as people won't know what it's for and it has an interesting name?
I just wonder if we want to expose it if we don't have a reason to. It seems like it's for internal use at this point.
If you don't see any issue with exposing it from a product perspective, then I think it's fine.
We're actually passing the project's ID not the project namespace's ID
😅 .The variable is named
project_namespacebut it's created with theprojectfactory, soproject_namespace.idreturns theProjectrecord's ID.
@karichards If I change the test so it looks like this...
context 'when namespace is a project namespace' do
it 'returns a 404 error' do
project = create(:project, namespace: namespace)
project_namespace = project.project_namespace
puts project.class
puts project.id
puts project_namespace.class
puts project_namespace.id
put subscription_path(project_namespace.id), headers: internal_api_headers, params: { seats: 150 }
expect(response).to have_gitlab_http_status(:not_found)
expect(json_response).to eq('message' => '404 Namespace Not Found')
end
end
diff --git a/ee/spec/requests/api/gitlab_subscriptions/subscriptions_spec.rb b/ee/spec/requests/api/gitlab_subscriptions/subscriptions_spec.rb
index f52a755769e4..feff76df3f10 100644
--- a/ee/spec/requests/api/gitlab_subscriptions/subscriptions_spec.rb
+++ b/ee/spec/requests/api/gitlab_subscriptions/subscriptions_spec.rb
@@ -214,7 +214,13 @@ def subscription_path(namespace_id, user = nil, options = {})
context 'when namespace is a project namespace' do
it 'returns a 404 error' do
- project_namespace = create(:project, namespace: namespace).project_namespace
+ project = create(:project, namespace: namespace)
+ project_namespace = project.project_namespace
+
+ puts project.class
+ puts project.id
+ puts project_namespace.class
+ puts project_namespace.id
put subscription_path(project_namespace.id), headers: internal_api_headers, params: { seats: 150 }
... and run it, I get output like this:
Test environment set up in 1.6838129999996454 seconds
Project
8
Namespaces::ProjectNamespace
29
.
Finished in 9.7 seconds (files took 4.84 seconds to load)
1 example, 0 failures
Randomized with seed 42496
Similarly if I do this:
context 'when namespace is a project namespace' do
it 'returns a 404 error' do
project_namespace = create(:project, namespace: namespace).project_namespace
puts project_namespace.class
puts project_namespace.id
put subscription_path(project_namespace.id), headers: internal_api_headers, params: { seats: 150 }
expect(response).to have_gitlab_http_status(:not_found)
expect(json_response).to eq('message' => '404 Namespace Not Found')
end
end
Test environment set up in 1.571896999997989 seconds
Namespaces::ProjectNamespace
31
.
Finished in 9.77 seconds (files took 4.69 seconds to load)
1 example, 0 failures
Randomized with seed 20328
So I think project_namespace is a ProjectNamespace, not a Project.
@lwanko I think this is the line the ProjectNamespace specs are really testing: https://gitlab.com/gitlab-org/gitlab/-/blob/27f744dcbe01438e5c494c9ae8b682bd7928d6f6/lib/api/helpers.rb#L294
That's in the API's find_namespace function.
If I change it like this:
diff --git a/lib/api/helpers.rb b/lib/api/helpers.rb
index 289cb321f294..03cc4f0cdc13 100644
--- a/lib/api/helpers.rb
+++ b/lib/api/helpers.rb
@@ -283,7 +283,7 @@ def find_namespace(id, allow_project_namespaces: false)
# See https://gitlab.com/gitlab-org/customers-gitlab-com/-/issues/9808
::Namespace.sticking.find_caught_up_replica(:namespace, id)
- scope = allow_project_namespaces ? Namespace : Namespace.without_project_namespaces
+ scope = allow_project_namespaces ? Namespace : Namespace
scope.find_by(id: id)
else
find_namespace_by_path(id, allow_project_namespaces: allow_project_namespaces)
And run ee/spec/requests/api/gitlab_subscriptions/subscriptions_spec.rb, then all the spec failures are the ProjectNamespace related ones:
Failed examples:
rspec ./ee/spec/requests/api/gitlab_subscriptions/subscriptions_spec.rb:358 # API::GitlabSubscriptions::Subscriptions PUT :id/gitlab_subscription when authenticating with a personal access token when authenticated as an admin when namespace is a project namespace returns a 404 error
rspec ./ee/spec/requests/api/gitlab_subscriptions/subscriptions_spec.rb:216 # API::GitlabSubscriptions::Subscriptions PUT :id/gitlab_subscription when authenticated as the subscription portal when namespace is a project namespace returns a 404 error
rspec ./ee/spec/requests/api/gitlab_subscriptions/subscriptions_spec.rb:40 # API::GitlabSubscriptions::Subscriptions POST :id/gitlab_subscription when authenticated as the subscription portal when creating subscription for project namespace returns a 404
rspec ./ee/spec/requests/api/gitlab_subscriptions/subscriptions_spec.rb:170 # API::GitlabSubscriptions::Subscriptions POST :id/gitlab_subscription when authenticating with a personal access token when authenticated as an admin when creating subscription for project namespace returns a 404
Randomized with seed 41156
The POST test failure is particularly egregious, as it allows us to create a subscription on a ProjectNamespace:
4) API::GitlabSubscriptions::Subscriptions POST :id/gitlab_subscription when authenticating with a personal access token when authenticated as an admin when creating subscription for project namespace returns a 404
Got 2 failures:
4.1) Failure/Error: expect(response).to have_gitlab_http_status(:not_found)
expected the response to have status code :not_found but it was 201. The response was: {"plan":{"code":null,"name":null,"trial":false,"auto_renew":null,"upgradable":false,"exclude_guests":false},"usage":{"seats_in_subscription":10,"seats_in_use":1,"max_seats_used":0,"seats_owed":0},"billing":{"subscription_start_date":"2018-01-01","subscription_end_date":"2019-01-01","trial_ends_on":null}}
# ./ee/spec/requests/api/gitlab_subscriptions/subscriptions_spec.rb:175:in `block (6 levels) in <main>'
# ./spec/spec_helper.rb:595:in `block (2 levels) in <main>'
# ./spec/spec_helper.rb:487:in `block (3 levels) in <main>'
# ./lib/gitlab/sidekiq_sharding/validator.rb:42:in `enabled'
# ./spec/spec_helper.rb:486:in `block (2 levels) in <main>'
# ./spec/spec_helper.rb:481:in `block (3 levels) in <main>'
# ./spec/support/sidekiq_middleware.rb:9:in `with_sidekiq_server_middleware'
# ./spec/spec_helper.rb:471:in `block (2 levels) in <main>'
# ./spec/spec_helper.rb:467:in `block (3 levels) in <main>'
# ./lib/gitlab/application_context.rb:109:in `with_raw_context'
# ./spec/spec_helper.rb:467:in `block (2 levels) in <main>'
# ./spec/spec_helper.rb:438:in `block (3 levels) in <main>'
# ./lib/gitlab/ci/config/feature_flags.rb:38:in `ensure_correct_usage'
# ./spec/spec_helper.rb:437:in `block (2 levels) in <main>'
# ./spec/support/system_exit_detected.rb:7:in `block (2 levels) in <main>'
# ./spec/support/database/prevent_cross_joins.rb:107:in `block (3 levels) in <main>'
# ./spec/support/database/prevent_cross_joins.rb:61:in `with_cross_joins_prevented'
# ./spec/support/database/prevent_cross_joins.rb:107:in `block (2 levels) in <main>'
4.2) Failure/Error: expect(json_response).to eq('message' => '404 Namespace Not Found')
expected: {"message"=>"404 Namespace Not Found"}
got: {"billing"=>{"subscription_end_date"=>"2019-01-01", "subscription_start_date"=>"2018-01-01", "trial_e...e}, "usage"=>{"max_seats_used"=>0, "seats_in_subscription"=>10, "seats_in_use"=>1, "seats_owed"=>0}}
(compared using ==)
Diff:
@@ -1,3 +1,5 @@
-"message" => "404 Namespace Not Found",
+"billing" => {"subscription_end_date"=>"2019-01-01", "subscription_start_date"=>"2018-01-01", "trial_ends_on"=>nil},
+"plan" => {"auto_renew"=>nil, "code"=>nil, "exclude_guests"=>false, "name"=>nil, "trial"=>false, "upgradable"=>false},
+"usage" => {"max_seats_used"=>0, "seats_in_subscription"=>10, "seats_in_use"=>1, "seats_owed"=>0},
# ./ee/spec/requests/api/gitlab_subscriptions/subscriptions_spec.rb:176:in `block (6 levels) in <main>'
We aren't deleting that one here, but it helps make my point - a ProjectNamespace is a particular, separate entity in GitLab. It is a class all its own, and we want to make sure that the API handles it appropriately.
I think these tests make a lot of sense to have and we shouldn't delete them.
Let's either figure out why they're flaky and fix them (probably best done in a separate MR) or leave them be for now and live with the flakiness and run the pipeline until they pass.
@lwanko This is looking really good!
Thanks for making some adjustments to it based on our conversation.
Sorry, but I caught one more small thing that I think we may want to adjust: !226299 (comment 3172369650)
I am also curious about this, but it's not at all blocking to this MR: !226299 (comment 3172369643)
Do we even really want this to default to true?
Because if the point is for this table to record the data from gitlab_subscriptions, then we could get true here when we create a record, even if we somehow didn't copy the data from gitlab_subscriptions, but then we'd think that we must have copied true from gitlab_subscriptions. So maybe this should not have any default (default NULL).
None of the other columns in this table have defaults (except for the ids), not even the ones that have defaults in gitlab_subscriptions.
I think it makes the most sense to remove this:
add_column :gitlab_subscription_histories, :contract_overages_allowed, :boolean, if_not_exists: trueWhat do you think?
cc @bmarjanovic just for awareness / thoughts.
We need to deploy the CustomersDot change after the adaptations to GitLab, otherwise we would actually receive a
{"error":"Invalid JSON format"}
Really? I'm surprised. I would have thought the GitLab endpoint would silently ignore extra json it receives.
I tried modifying a couple tests for the endpoints like this and they still pass:
diff --git a/ee/spec/requests/api/gitlab_subscriptions/subscriptions_spec.rb b/ee/spec/requests/api/gitlab_subscriptions/subscriptions_spec.rb
index f52a755769e4..8cda73b30bc7 100644
--- a/ee/spec/requests/api/gitlab_subscriptions/subscriptions_spec.rb
+++ b/ee/spec/requests/api/gitlab_subscriptions/subscriptions_spec.rb
@@ -57,7 +57,7 @@ def subscription_path(namespace_id, user = nil, options = {})
context 'when the params are valid' do
it 'creates a subscription for the namespace' do
- post subscription_path(namespace.id), headers: internal_api_headers, params: params
+ post subscription_path(namespace.id), headers: internal_api_headers, params: params.merge(contract_stuff: "cool")
expect(response).to have_gitlab_http_status(:created)
expect(namespace.gitlab_subscription).to be_present
diff --git a/ee/spec/requests/gitlab_subscriptions/api/internal/subscriptions_spec.rb b/ee/spec/requests/gitlab_subscriptions/api/internal/subscriptions_spec.rb
index 932536160bda..5d2fec5383af 100644
--- a/ee/spec/requests/gitlab_subscriptions/api/internal/subscriptions_spec.rb
+++ b/ee/spec/requests/gitlab_subscriptions/api/internal/subscriptions_spec.rb
@@ -153,7 +153,7 @@ def subscription_path(namespace_id)
context 'when the params are valid' do
it 'creates a subscription for the namespace' do
- post subscription_path(namespace.id), headers: internal_api_headers, params: params
+ post subscription_path(namespace.id), headers: internal_api_headers, params: params.merge(contract: "thing")
expect(response).to have_gitlab_http_status(:created)
expect(namespace.gitlab_subscription).to be_present
Am I missing something here?
How did you determine that the GitLab endpoints will return {"error":"Invalid JSON format"} if we pass them contract_overages_allowed from CDot before we merge this and Add contract_overages_allowed to gitlab_subscri... (!226507)?
It really can't hurt to make sure we merge this MR and Add contract_overages_allowed to gitlab_subscri... (!226507) first, before we merge https://gitlab.com/gitlab-org/customers-gitlab-com/-/merge_requests/15026+, but I thought we didn't need to worry about it.