Right now our Service Discovery DNS TTL is set to one minute. Until that is changed, services might not be able to reach a service discovery endpoint during deployments, because callers can keep using a cached DNS record for up to a minute after the target has been replaced.
Here's an example:
- Service A calls Service B
- A new version of Service B is deployed, meaning a new Fargate instance is created and the old one goes away
- Service A still holds the cached IP address of the now-gone Service B Fargate instance for up to 60 seconds
- If Service B's task count is 1, Service A can't reach Service B for up to 1 minute
I think we can mitigate this by lowering the TTL to somewhere between 10 and 30 seconds.
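To make the effect of the TTL concrete, here is a rough sketch that simulates the scenario above with a toy client-side DNS cache. It is only an illustration under stated assumptions: the name `service-b.local`, both IP addresses, and the `CachingResolver` and `stale_seconds` helpers are hypothetical, and the model assumes Service A resolves once per second and that the deployment lands one second after Service A last cached the record (close to the worst case).

```python
from dataclasses import dataclass

NAME = "service-b.local"  # hypothetical service discovery name
OLD_IP = "10.0.0.1"       # hypothetical IP of the old Service B task
NEW_IP = "10.0.0.2"       # hypothetical IP of the new Service B task


@dataclass
class CachedRecord:
    ip: str
    expires_at: int  # simulated clock, in seconds


class CachingResolver:
    """Toy client cache: only re-queries once the cached record's TTL has expired."""

    def __init__(self, ttl_seconds: int, registry: dict):
        self.ttl_seconds = ttl_seconds
        self.registry = registry  # stands in for the service discovery DNS answers
        self.cache = {}

    def resolve(self, name: str, now: int) -> str:
        record = self.cache.get(name)
        if record is None or now >= record.expires_at:
            record = CachedRecord(self.registry[name], now + self.ttl_seconds)
            self.cache[name] = record
        return record.ip


def stale_seconds(ttl_seconds: int, deploy_at: int = 1) -> int:
    """Count the seconds in which Service A still gets the old IP after the deploy."""
    registry = {NAME: OLD_IP}
    resolver = CachingResolver(ttl_seconds, registry)
    stale = 0
    for now in range(deploy_at + ttl_seconds + 1):
        if now == deploy_at:
            registry[NAME] = NEW_IP  # new task registered, old task gone
        if resolver.resolve(NAME, now) != registry[NAME]:
            stale += 1
    return stale


if __name__ == "__main__":
    for ttl in (60, 30, 10):
        print(f"TTL {ttl}s: stale for about {stale_seconds(ttl)}s")
```

In this model the stale window is always just under the TTL, so the window shrinks in step with the TTL. It does not account for anything else that might cache the record longer than the TTL.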