total hours spent this week by all contributors: 40
I've opened Documentation on HTTP routers impact on path an... (gitlab-com/gl-infra/tenant-scale/cells-infrastructure/team#706) to track this issue. Let's continue there.
@tkhandelwal3 this updates the documentation around Classify. Could you please take a look?
@tkhandelwal3 I agree, and I'm of the opinion that Topology Service would be better off doing minimal validation on the routing records if it needs to do so at all. Ultimately it is up to Rails to decide what is valid data.
Sami Hiltunen (a6f797e8) at 19 Mar 12:00
Update Classify RPC documentation
Update Classify RPC documentation
We've currently split the Classify RPC documentation in a Markdown file and not documented the protobuf definitions themselves. Documenting the definitions will also move the documentation into the generated files, and surface it in the editors. In addition, documenting all the types in the markdown is redundant as the protobuf definitions naturally document the types.
Move most of the RPC documentation into the proto definitions, and just keep the curl examples in the markdown.
Sami Hiltunen (b4083b6f) at 19 Mar 11:32
Update Classify RPC documentation
Sami Hiltunen (ebfe862b) at 18 Mar 16:16
Test falling back to JSON rules when a claim is not found
... and 20 more commits
I think we would need to do it beforehand, e.g. a external person is adding a claim for: !487 (merged), this claim will be used for routing, so unless we explicitly mention it in the above doc folks working in monolith won't know that there is another place they have to update this in.
@tkhandelwal3 I guess I was mostly coming from the angle that adding claims is a separate concern from using them for routing. We'd have to implement the routing logic/rules to use that new claim. That's the part when we'd need Classify to support the new claim type. We don't currently document how we'd make use of the new claim.
That's probably something we should extend the documentation to cover eventually, including how to update path rules in HTTP router. I've created Create developer documentation for defining new... (gitlab-com/gl-infra/tenant-scale/cells-infrastructure/team#701) to track this work.
On a related note, I can't think of a field that we are just claiming for global uniquness and won't be used for routing, I don't think such fields would exist, can you think of any use case for it?
Claims are purely used for routing as far as I know, so I don't think we have such cases.
Sure, I've opened an issue here: Replace Bucket message with Claim message (gitlab-com/gl-infra/tenant-scale/cells-infrastructure/team#700).
That is true, the alternative would be to use
stingsin the claim type too and consolidate everything to be claim, that would mean the client would need to send "1" instead 1.
@tkhandelwal3 if we type everything as strings, then there would be no practical difference to the Bucket type. The bucket types everything as a string.
Given that in the DB we are storing them as raw bytes, for my learning / understanding, why would we want to use strings or type specific attributes?
The type makes it clear what is the expected value of a field. Project ID being int64 clearly denotes this should be an int64 value. Using a string also brings questions around how is this value encoded in string. Is it decimal, hex or something else? These can of course be documented but using the appropriate type clearly signals the appropriate type of the value and prevents mistakes through compile time type checks.
Other than clarity and avoiding mistakes, using the appropriate type enables using the most efficient encoding of a given type. 9223372036854775807 is 20 bytes as a string but 8 bytes when encoded as a string.
It's also worth noting that Topology Service is already enforcing these values are integers through regex validation and parsing the integers. Using the appropriate type means to implement we don't need to manually implement type checks and parsing logic.
Given that in the DB we are storing them as raw bytes, for my learning / understanding, why would we want to use strings or type specific attributes?
For API clarity, efficiency and compile time type safety. The storage model is a different concern from how we model the API, and we can keep using the existing approach there.
@tkhandelwal3 to get rid of the "claim" container, we'd have to flatten the message into the in to the oneof in ClassifyRequest:
message ClassifyRequest {
ClassifyType type = 2;
string value = 3;
// classification_key is the key to classify.
oneof classification_key {
string organization_path = 1;
int64 organization_id = 2;
string namespace_path = 3;
int64 namespace_id = 4;
// rest of claims ...
}
}
Flattening the fields here loses the Claim message and would mean we can't reuse the definition elsewhere. If it is a separate message, we can use it here, and also in the write side RPCs that create claims where it would replace Bucket.
The first option in your example on the other hand seems to put the claim type in the ClassifyType enum, and I'm assuming we'd keep having the string value = 3 for the value. The downside with this is that it's essentially saying that all claims are strings, and we lose the ability to type them with the most fitting type.
Extending a bit we could end up with something like below.
message ClassifyRequest {
ClassifyType type = 2;
oneof value {
string string_value = 2;
int64 int_value = 3;
}
}
This would allow typing the claims as strings, integers or whatever is the appropriate type. The downside here is that the API doesn't really say what is the right value to use for each claim. I can set the type as organization_path and send an int_value. Modeling the API with a oneof that associates a type with the namespace directly as done in this MR prevents that sort of mistakes.