Rafael Henchen (985e966e) at 20 Mar 01:15
Rafael Henchen (2cc5859e) at 20 Mar 01:15
Merge branch 'duo-edit-20260320-011416' into 'master'
... and 1 more commit
Removes the following nodes from dbre-toolkit/inventory/gprd-ci.yml:
Rafael Henchen (985e966e) at 20 Mar 01:14
Remove patroni-ci-v17-14 through v17-19 nodes from gprd-ci inventory
@alexander-sosna OK, so Google's documentation on "default" values does not seem to be working. I calculated at #21592 (comment 3172729683) that for a 16 TB disk the IOPS would be (6 × 16,384) + 3,000 = 101,304 and the throughput would be (1.5 × 16,384) + 140 = 24,716 MiB/s (which would be capped at 2,400 MiB/s).
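The arithmetic above can be reproduced with a short sketch. The coefficients (6 and 3,000 for IOPS, 1.5 and 140 for throughput) and the 2,400 MiB/s cap are simply the figures quoted in the comment, not values confirmed against Google's documentation. The function names and the assumption that the size is expressed in GiB are illustrative only.

```python
# Sketch of the calculation quoted in the comment above.
# All constants come from that comment; names and the GiB unit are assumptions.

def default_iops(size_gib: int) -> int:
    """Default IOPS for a disk of the given size, per the quoted formula."""
    return 6 * size_gib + 3_000


def uncapped_throughput_mib_s(size_gib: int) -> float:
    """Throughput in MiB/s before any cap, per the quoted formula."""
    return 1.5 * size_gib + 140


def default_throughput_mib_s(size_gib: int, cap_mib_s: int = 2_400) -> float:
    """Throughput in MiB/s after applying the cap mentioned in the comment."""
    return min(uncapped_throughput_mib_s(size_gib), cap_mib_s)


size_gib = 16 * 1024  # the 16,384 figure used in the comment for a 16 TB disk

print(default_iops(size_gib))               # 101304
print(uncapped_throughput_mib_s(size_gib))  # 24716.0
print(default_throughput_mib_s(size_gib))   # 2400
```

Comparing these expected values with what the provisioned disk actually reports would show whether the documented defaults are being applied.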
IMO the ideal is to have:
Resolved
We modified the CR plan just to remove Phase 3, which was launching nodes 14-19.
So the objective of this CR is now only to deploy 3x C4 nodes and switch the Primary over to C4, which should be enough to provide high availability for the CI Primary node.
Resolved: this should not be a blocker, as we decided to reduce the deployment to 3 nodes of type c4-highmem-192.
The c4-highmem-144 is only available on Intel Granite Rapids.
So we'll try to provision the 3 nodes as c4-highmem-192, which is available on Emerald Rapids.
We couldn't launch 6th-gen Intel Granite Rapids instances in zones B and D; it worked only in zone C. So apparently there's an issue with Google's fleet provisioning for Intel Granite Rapids.
╷
│ Error: Error creating instance: googleapi: Error 400: C4 VM does not support minCpuPlatform Intel Granite Rapids., badRequest
│
│ with module.patroni-ci-v17.google_compute_instance.instance_with_attached_disk["11"],
│ on .terraform/modules/patroni-ci-v17/instance.tf line 339, in resource "google_compute_instance" "instance_with_attached_disk":
│ 339: resource "google_compute_instance" "instance_with_attached_disk" {
│
╵
╷
│ Error: Error creating instance: googleapi: Error 400: C4 VM does not support minCpuPlatform Intel Granite Rapids., badRequest
│
│ with module.patroni-ci-v17.google_compute_instance.instance_with_attached_disk["12"],
│ on .terraform/modules/patroni-ci-v17/instance.tf line 339, in resource "google_compute_instance" "instance_with_attached_disk":
│ 339: resource "google_compute_instance" "instance_with_attached_disk" {
│
╵
During the CR execution we had issues with hyperdisk quota and decided to split the node deployment into two MRs:
First MR to create nodes 11-13, which are the critical nodes that provide the capacity for us to switch the Primary node over to C4: https://ops.gitlab.net/gitlab-com/gl-infra/config-mgmt/-/merge_requests/13552
Second MR to create nodes 14-19 which are the Replica-only nodes: https://ops.gitlab.net/gitlab-com/gl-infra/config-mgmt/-/merge_requests/13553
Rafael Henchen (7ef43470) at 19 Mar 07:23
Rafael Henchen (8484c93d) at 19 Mar 07:23
Merge branch 'duo-edit-20260319-045356' into 'master'
... and 2 more commits
Updates the patroni-ci inventory to reference v17 nodes with an expanded list including:
Total of 21 nodes replacing the previous 6 v16 nodes.