We can easily make this atomic by using the distributed semaphore abstraction.
You mean using the `in_lock` helper? Good idea; when I wrote this I had never used `in_lock` before. I'll give it a try. Thank you!
Right. I tried making the find/create/update operation atomic, but failed in doing so.
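For reference, a minimal sketch of the pattern being discussed: serializing the find/create/update under a lock so two concurrent callers cannot both create a record. The `in_lock` here is a process-local stand-in built on `Monitor` (the real GitLab helper is distributed and takes a key plus ttl/retry options); the key name, `upsert_tracked_context`, and the `STORE` hash are all illustrative.

```ruby
require 'monitor'

# Stand-in for the distributed lock helper: a process-local lock per key.
# (The real helper coordinates across processes via a lease, not a Monitor.)
LOCKS = Hash.new { |hash, key| hash[key] = Monitor.new }

def in_lock(key)
  LOCKS[key].synchronize { yield }
end

STORE = {} # pretend table: uuid => record hash

# The find/create/update runs entirely inside the lock, so the sequence
# is atomic with respect to other callers using the same key.
def upsert_tracked_context(uuid, attrs)
  in_lock("tracked_context:#{uuid}") do
    record = STORE[uuid] ||= { uuid: uuid }
    record.merge!(attrs)
    record
  end
end
```

The important part is that the existence check and the write happen under the same lock; checking first and locking only around the write would reintroduce the race.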
Yes, this is the reason why I created a single worker to handle all tracked context events, with concurrency handling. Are we seeing any issues?
Is `vulnerability_reads.uuid` always equal to `vulnerabilities_findings.uuid`?
I believe so. The uuid for the related vulnerability, vulnerability_read, vulnerability_occurrence, and security_finding are all the same. It is set during ingestion and is calculated using this service. There are some discussions around how it will change with multiple branch tracking, hence the comment from @schmil.monderer here.
For now we want to keep it similar to the old preloaders.
vulnerability_reads will always be associated with a vulnerability_occurrence. This is ensured during ingestion, and the table is also backfilled.
https://console.postgres.ai/gitlab/gitlab-production-sec/sessions/49885/commands/148338
@dgruzd Can you do the Advanced Search Framework review on this MR please? We are adding separate preloaders for the new vulnerability_reads index. The preloaders are mostly the same, with some minor changes to the query. These will be used in the new reference class being introduced here. The old preloaders will be cleaned up as part of #592421
@bala.kumar I've addressed your comments, can you please take another look?
Rushik Subba (e2ef59ed) at 16 Mar 13:08
Use reference class index method in type class
suggestion: I think we can have a common CustomAttributePolicy parent class and inherit the rules from it in other classes? wdyt?
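A rough sketch of what that could look like. This is plain Ruby for illustration, not the DeclarativePolicy DSL GitLab policies actually use, and all class and rule names besides `CustomAttributePolicy` are hypothetical:

```ruby
# Shared rules live on a common parent; subclasses inherit them as-is
# or extend them for their own resource type.
class CustomAttributePolicy
  def self.rules
    [:read_custom_attribute, :update_custom_attribute]
  end
end

# Inherits the parent's rules unchanged.
class GroupCustomAttributePolicy < CustomAttributePolicy
end

# Extends the inherited rules with a subclass-specific one.
class ProjectCustomAttributePolicy < CustomAttributePolicy
  def self.rules
    super + [:admin_custom_attribute]
  end
end
```

In the real policy framework the same idea would be expressed by inheriting from the parent policy class so its `rule` blocks apply to all subclasses.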
@bala.kumar I created new preloaders, which should isolate any risk to the old index.
@schmil.monderer Would appreciate another look from you
Rushik Subba (40578357) at 16 Mar 09:53
Add ES preloaders for vulnerability_reads index
@bala.kumar I thought of this too, but if you look closely at this:
```ruby
@records_map = {
  vulnerability_occurrence_id: @records.map(&:vulnerability_occurrence_id),
  project_id: @records.map(&:project_id).uniq,
  vulnerability_id: @records.map(&:vulnerability_id).uniq
}
```
We would be running `map` three times on every preloader class initialization, which isn't required. So I kept it so that each class has its own mapping, done exactly once per class.
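A plain-Ruby sketch of the trade-off being described (the `Record` struct and `ProjectPreloader` class are illustrative, not the actual preloader classes): instead of a shared `@records_map` that eagerly maps over the records once per key, each preloader extracts only the ids it needs, exactly once, in its own initializer.

```ruby
# Minimal stand-in for the indexed records.
Record = Struct.new(:vulnerability_occurrence_id, :project_id, :vulnerability_id)

# Each preloader computes its own id list in its initializer, so the
# records are traversed once per preloader, not once per map key.
class ProjectPreloader
  attr_reader :project_ids

  def initialize(records)
    @project_ids = records.map(&:project_id).uniq
  end
end

records  = [Record.new(1, 10, 100), Record.new(2, 10, 101)]
preloader = ProjectPreloader.new(records)
```

With a shared map, every preloader initialization would pay for all three `map` passes even when it only needs one of the id lists.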
@bala.kumar Then I think creating new preloaders for the new index and not touching the old index would be simpler. wdyt?
@bwill I ran a query on postgres.ai to test whether we have reads associated with multiple tracked contexts for a given project -
```sql
select id,
       project_id,
       exists (
         select 1
         from vulnerability_reads vr
         where vr.project_id = sptc.project_id
           and vr.security_project_tracked_context_id = sptc.id
       ) as has_vulnerability_reads
from security_project_tracked_contexts sptc
where project_id in (
  select project_id
  from security_project_tracked_contexts
  group by project_id
  having count(*) > 1
  limit 100
);
```
Now I'm wondering whether we should delete the data associated with duplicate contexts instead of migrating it.
Note: the sbom_occurrences_vulnerabilities table is not backfilled with vulnerability_occurrence_id yet. So we first check whether occurrences exist by occurrence id and, if not, fall back to occurrences by vulnerability id.
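The fallback described above, sketched in plain Ruby. Hash lookups stand in for the two ActiveRecord query paths on sbom_occurrences_vulnerabilities, and the method and constant names are illustrative:

```ruby
# Stand-ins for the two lookup paths.
OCCURRENCES_BY_OCC_ID  = { 1 => [:occ_a] }           # rows already backfilled
OCCURRENCES_BY_VULN_ID = { 100 => [:occ_a, :occ_b] } # legacy path

# vulnerability_occurrence_id is not backfilled everywhere yet, so try
# the occurrence-id lookup first and only fall back to vulnerability id
# when it returns nothing.
def sbom_occurrences_for(occurrence_id, vulnerability_id)
  by_occ = OCCURRENCES_BY_OCC_ID[occurrence_id]
  return by_occ unless by_occ.nil? || by_occ.empty?

  OCCURRENCES_BY_VULN_ID.fetch(vulnerability_id, [])
end
```

Once the backfill completes, the fallback branch becomes dead code and the vulnerability-id path can be removed.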
Rushik Subba (6f0b1122) at 13 Mar 08:12
Add index on vuln_occ_id on sbom_occurrence_vulnerabilities
@Quintasan
@julie_huang Thanks! The database changes look good to me.
@abdwdd Can you do the database maintainer review on this MR?