Sami Hiltunen activity

Sami Hiltunen commented on epic #1745 at GitLab.com / GitLab Infrastructure Team

2026-03-19T21:09:42Z

total hours spent this week by all contributors: 40

🎉 achievements:

Claim classification is now supported in Topology Service:
- Support claim classification in Classify (gitlab-org/cells/topology-service!488 - merged)
Connecting the path-based routing handlers to Topology Service is pretty much complete but pending reviews. There's an MR applying some refactors open after which I'll push the main MR into review.
- Use Hono context through HTTP router (gitlab-org/cells/http-router!1118)

▶️ next:

There's on going work on introducing snapshot testing into HTTP router. This will make it easier to test how the path based rules match the thousands of Rails routes.
- Cover path-based routing rules with snapshot te... (gitlab-com/gl-infra/tenant-scale/cells-infrastructure/team#620)
With the implementation largely in place, define the actual path-based routing rules.
- Define the path-based routing rules (gitlab-com/gl-infra/tenant-scale/cells-infrastructure/team#610)

Sami Hiltunen commented on merge request !1010 at GitLab.org / cells / HTTP Router

2026-03-19T13:48:35Z

I've opened Documentation on HTTP routers impact on path an... (gitlab-com/gl-infra/tenant-scale/cells-infrastructure/team#706) to track this issue. Let's continue there.

Sami Hiltunen opened issue #706: Documentation on HTTP routers impact on path and token updates at GitLab.com / GitLab Infrastructure Team / GitLab Tenant Sc...

2026-03-19T13:47:37Z

Sami Hiltunen commented on merge request !490 at GitLab.org / cells / Topology Service

2026-03-19T13:38:28Z

@tkhandelwal3 this updates the documentation around Classify. Could you please take a look? 🙏

Sami Hiltunen commented on issue #704 at GitLab.com / GitLab Infrastructure Team / GitLab Tenant Scale / Cells Infrastructure team / Cells Infrastructure tea...

2026-03-19T12:18:41Z

@tkhandelwal3 I agree, and I'm of the opinion that Topology Service would be better off doing minimal validation on the routing records if it needs to do so at all. Ultimately it is up to Rails to decide what is valid data.

Sami Hiltunen pushed to project branch smh-update-classify-documentation at GitLab.org / cells / Topology Service

2026-03-19T12:00:48Z

Sami Hiltunen (a6f797e8) at 19 Mar 12:00

Update Classify RPC documentation

Sami Hiltunen opened merge request !490: Update Classify RPC documentation at GitLab.org / cells / Topology Service

2026-03-19T11:38:00Z

Update Classify RPC documentation

We've currently split the Classify RPC documentation in a Markdown file and not documented the protobuf definitions themselves. Documenting the definitions will also move the documentation into the generated files, and surface it in the editors. In addition, documenting all the types in the markdown is redundant as the protobuf definitions naturally document the types.

Move most of the RPC documentation into the proto definitions, and just keep the curl examples in the markdown.

Sami Hiltunen pushed new project branch smh-update-classify-documentation at GitLab.org / cells / Topology Service

2026-03-19T11:32:57Z

Sami Hiltunen (b4083b6f) at 19 Mar 11:32

Update Classify RPC documentation

Sami Hiltunen pushed to project branch smh-connect-topology-service at GitLab.org / cells / HTTP Router

2026-03-18T16:16:52Z

Sami Hiltunen (ebfe862b) at 18 Mar 16:16

Test falling back to JSON rules when a claim is not found

... and 20 more commits

Sami Hiltunen commented on merge request !488 at GitLab.org / cells / Topology Service

2026-03-16T16:27:03Z

I think we would need to do it beforehand, e.g. a external person is adding a claim for: !487 (merged), this claim will be used for routing, so unless we explicitly mention it in the above doc folks working in monolith won't know that there is another place they have to update this in.

@tkhandelwal3 I guess I was mostly coming from the angle that adding claims is a separate concern from using them for routing. We'd have to implement the routing logic/rules to use that new claim. That's the part when we'd need Classify to support the new claim type. We don't currently document how we'd make use of the new claim.

That's probably something we should extend the documentation to cover eventually, including how to update path rules in HTTP router. I've created Create developer documentation for defining new... (gitlab-com/gl-infra/tenant-scale/cells-infrastructure/team#701) to track this work.

On a related note, I can't think of a field that we are just claiming for global uniquness and won't be used for routing, I don't think such fields would exist, can you think of any use case for it?

Claims are purely used for routing as far as I know, so I don't think we have such cases.

Sami Hiltunen commented on merge request !488 at GitLab.org / cells / Topology Service

2026-03-16T16:27:03Z

Sure, I've opened an issue here: Replace Bucket message with Claim message (gitlab-com/gl-infra/tenant-scale/cells-infrastructure/team#700).

Sami Hiltunen commented on merge request !488 at GitLab.org / cells / Topology Service

2026-03-16T16:27:02Z

That is true, the alternative would be to use stings in the claim type too and consolidate everything to be claim, that would mean the client would need to send "1" instead 1.

@tkhandelwal3 if we type everything as strings, then there would be no practical difference to the Bucket type. The bucket types everything as a string.

Given that in the DB we are storing them as raw bytes, for my learning / understanding, why would we want to use strings or type specific attributes?

The type makes it clear what is the expected value of a field. Project ID being int64 clearly denotes this should be an int64 value. Using a string also brings questions around how is this value encoded in string. Is it decimal, hex or something else? These can of course be documented but using the appropriate type clearly signals the appropriate type of the value and prevents mistakes through compile time type checks.

Other than clarity and avoiding mistakes, using the appropriate type enables using the most efficient encoding of a given type. 9223372036854775807 is 20 bytes as a string but 8 bytes when encoded as a string.

It's also worth noting that Topology Service is already enforcing these values are integers through regex validation and parsing the integers. Using the appropriate type means to implement we don't need to manually implement type checks and parsing logic.

Given that in the DB we are storing them as raw bytes, for my learning / understanding, why would we want to use strings or type specific attributes?

For API clarity, efficiency and compile time type safety. The storage model is a different concern from how we model the API, and we can keep using the existing approach there.

Sami Hiltunen commented on merge request !488 at GitLab.org / cells / Topology Service

2026-03-16T16:27:02Z

@tkhandelwal3 to get rid of the "claim" container, we'd have to flatten the message into the in to the oneof in ClassifyRequest:

message ClassifyRequest {
  ClassifyType type = 2;
  string value = 3;

  // classification_key is the key to classify.
  oneof classification_key {
    string organization_path = 1;
    int64 organization_id = 2;
    string namespace_path = 3;
    int64 namespace_id    = 4;
    // rest of claims ...
  }
}

Flattening the fields here loses the Claim message and would mean we can't reuse the definition elsewhere. If it is a separate message, we can use it here, and also in the write side RPCs that create claims where it would replace Bucket.

The first option in your example on the other hand seems to put the claim type in the ClassifyType enum, and I'm assuming we'd keep having the string value = 3 for the value. The downside with this is that it's essentially saying that all claims are strings, and we lose the ability to type them with the most fitting type.

Extending a bit we could end up with something like below.

message ClassifyRequest {
  ClassifyType type = 2;
  oneof value {
    string string_value = 2;
    int64 int_value = 3;
  }
}

This would allow typing the claims as strings, integers or whatever is the appropriate type. The downside here is that the API doesn't really say what is the right value to use for each claim. I can set the type as organization_path and send an int_value. Modeling the API with a oneof that associates a type with the namespace directly as done in this MR prevents that sort of mistakes.

Sami Hiltunen opened issue #701: Create developer documentation for defining new path-based routing rules at GitLab.com / GitLab Infrastructure Team / GitLab...

2026-03-16T16:26:03Z