JAVA-3168 Copy node info for contact points on initial node refresh only from first match by endpoint #2007
Merged
tolbertam merged 1 commit into apache:4.x on Jan 31, 2025
Conversation
tolbertam
reviewed
Jan 29, 2025
core/src/main/java/com/datastax/oss/driver/internal/core/session/PoolManager.java
tolbertam
reviewed
Jan 29, 2025
core/src/main/java/com/datastax/oss/driver/internal/core/session/PoolManager.java
jahstreet
commented
Jan 30, 2025
core/src/main/java/com/datastax/oss/driver/internal/core/metadata/InitialNodeListRefresh.java
tolbertam
approved these changes
Jan 30, 2025
tolbertam (Contributor) left a comment:
Looks great! Excellent work!
adutra
reviewed
Jan 30, 2025
core/src/main/java/com/datastax/oss/driver/internal/core/metadata/InitialNodeListRefresh.java
Force-pushed from d77f414 to 0506c81
adutra
approved these changes
Jan 30, 2025
…nly from first match by endpoint patch by Alex Sasnouskikh; reviewed by Andy Tolbert and Alexandre Dura for JAVA-3168
Force-pushed from 0506c81 to 7b732d7
Contributor (Author): Squashed commits, PTAL.
Contributor: This is great, thank you @jahstreet!
https://datastax-oss.atlassian.net/browse/JAVA-3168
The DataStax Java driver for Apache Cassandra supports FixedHostNameAddressTranslator since v4.15.0 (Sep 19, 2022). This address translator plugin allows Cassandra clients to connect to a Cassandra database running in a different (private) network, e.g. Kubernetes, via a load balancer.
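For context, enabling the translator looks roughly like this in the driver's application.conf. This is a sketch: the exact keys (class, advertised-hostname) are my recollection of the driver's reference configuration, and the hostname is a placeholder; verify against your driver version.

```hocon
datastax-java-driver {
  advanced.address-translator {
    # Short class name, resolved against the driver's internal package.
    class = FixedHostNameAddressTranslator
    # Every node's broadcast RPC address is rewritten to this single
    # (load balancer) hostname, keeping the original port.
    advertised-hostname = "cassandra-lb.example.com"
  }
}
```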
On the initial (control) connection to Cassandra, the driver queries the cluster for its topology and fetches the node IDs together with their IP addresses (Pod IPs in a private K8s network). These addresses are "translated" by the address translator to the configured URL pointing to the load balancer. The client therefore "thinks" it has connected to all Cassandra nodes, even though the node addresses are all the same.
[Figure: Connecting to Cassandra behind a load balancer with FixedHostNameAddressTranslator]

After that, the driver opens a connection pool to every discovered Cassandra node. And since all the node addresses now point to the load balancer, all of the connections are opened to it. We can calculate the total number of connections as:
client_app_count * cassandra_node_count * connections_per_node_count

By default, the driver sets connections_per_node_count to 1 (advanced.connection.pool), but for data-intensive applications it is configured to a higher value. For example, the Apache Spark Cassandra Connector overrides it with the number of available JVM CPUs (spark.cassandra.connection.localConnectionsPerExecutor). So it is not unusual to have a Spark job with 64 executors (each running a separate instance of the Cassandra driver), each with 16 CPU cores, connecting to a 100-node Cassandra cluster, which produces 64 * 100 * 16 = 102'400 connections. And that is a single application. We must therefore be careful when configuring the load balancer not to exceed its limits on the number of connections, open files, memory, etc.

Once the driver has fetched the cluster topology and translated the node addresses, it applies an "optimization": it creates an instance of the Node class for every contact point address used on the initial (control) connection and reuses those instances for any translated node address that matches a contact point. So in the scenario where the contact point is the load balancer address and the address translator also translates every node address to the load balancer address, the driver metadata maps all node IDs to the same Node instance, e.g.:
node_id_1 -> LB_Node, node_id_2 -> LB_Node, node_id_3 -> LB_Node, …

The connection pool is then initialized for each node (or rather, multiple times for the same Node), but with a bug: after a pool is created, it is put into a map with the Node as the key, and when the Node is the same instance for all pools (LB_Node), we end up with a map holding a single pool, because each put for the same key overwrites the previous value. As a result, the driver has created pools for every discovered Cassandra node, but all except one of them are leaked: they keep running in the background, holding their connections alive, while never being used.
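The overwrite can be reproduced in a few lines of standalone Java. Node and Pool here are simplified stand-ins for the driver's internal classes, not its actual API; the point is only that a map keyed by a shared instance retains one entry:

```java
import java.util.HashMap;
import java.util.Map;

public class PoolLeakSketch {
  // Minimal stand-ins; identity-based equals/hashCode, like the driver's Node.
  static class Node { final String endpoint; Node(String e) { endpoint = e; } }
  static class Pool { final Node node; Pool(Node n) { node = n; } }

  // Simulates the pre-fix behavior: one shared Node instance keys every pool.
  static int survivingPools(int discoveredNodes) {
    Node lbNode = new Node("lb.example.com:9042"); // single LB/contact point address
    Map<Node, Pool> pools = new HashMap<>();
    for (int i = 0; i < discoveredNodes; i++) {
      // A fresh pool is created per discovered node, but keyed by the same
      // Node instance, so each put overwrites the previous entry.
      pools.put(lbNode, new Pool(lbNode));
    }
    return pools.size();
  }

  public static void main(String[] args) {
    // 3 discovered nodes, but only 1 pool survives; the other 2 leak.
    System.out.println(survivingPools(3));
  }
}
```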
With this change, the driver still reuses the contact point Nodes, but fills in the missing node info on the initial node refresh only from the first match with a node info by endpoint. This ensures we always create separate Node instances for nodes sharing the same EndPoint, which down the line protects us from leaking the connection (channel) pools in PoolManager.
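The intended behavior can be sketched as a minimal model of the initial node refresh. The class names, fields, and the refresh method below are simplified stand-ins for illustration, not the driver's real signatures:

```java
import java.util.ArrayList;
import java.util.List;

public class InitialRefreshSketch {
  // Simplified stand-ins for the driver's NodeInfo and Node.
  static class NodeInfo {
    final String endpoint; final String hostId;
    NodeInfo(String e, String h) { endpoint = e; hostId = h; }
  }
  static class Node {
    final String endpoint; String hostId;
    Node(String e) { endpoint = e; }
  }

  // After the fix: the contact-point Node is filled in from the FIRST
  // matching NodeInfo only; every other NodeInfo with the same endpoint
  // gets its own fresh Node instance.
  static List<Node> refresh(Node contactPoint, List<NodeInfo> infos) {
    List<Node> nodes = new ArrayList<>();
    boolean contactPointUsed = false;
    for (NodeInfo info : infos) {
      if (!contactPointUsed && info.endpoint.equals(contactPoint.endpoint)) {
        contactPoint.hostId = info.hostId; // copy node info into the reused Node
        nodes.add(contactPoint);
        contactPointUsed = true;
      } else {
        Node fresh = new Node(info.endpoint); // separate instance, same endpoint
        fresh.hostId = info.hostId;
        nodes.add(fresh);
      }
    }
    return nodes;
  }

  public static void main(String[] args) {
    Node lb = new Node("lb:9042");
    List<NodeInfo> infos = List.of(
        new NodeInfo("lb:9042", "id1"),
        new NodeInfo("lb:9042", "id2"),
        new NodeInfo("lb:9042", "id3"));
    List<Node> nodes = refresh(lb, infos);
    System.out.println(nodes.get(0) == lb);           // true: first match reuses the contact point
    System.out.println(nodes.get(1) == nodes.get(2)); // false: distinct Node instances
  }
}
```

Because each node ID now maps to a distinct Node instance, the pool map in PoolManager keeps one entry per discovered node and nothing leaks.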