
Commit 79ce4ed
JoelKatz authored and committed
Document cluster configuration and monitoring (RIPD-732)
1 parent e3a7aa0

1 file changed: src/ripple/app/peers/README.md (89 additions & 0 deletions)
# Ripple Clustering #

A cluster consists of more than one Ripple server under common
administration that share load information, distribute cryptography
operations, and provide greater response consistency.

Cluster nodes are identified by their public node keys. Cluster nodes
exchange information about endpoints that are imposing load upon them.
Cluster nodes share information about their internal load status. Cluster
nodes do not have to verify the cryptographic signatures on messages
received from other cluster nodes.

## Configuration ##

A server's public key can be determined from the output of the `server_info`
command. The key is the `pubkey_node` value, a text string beginning with
the letter `n`. The key is maintained across runs in a database.

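The `pubkey_node` value can be pulled out of the `server_info` JSON. A
minimal sketch, assuming an abbreviated response (the node key below is made
up for illustration; real output contains many more fields):

```python
import json

# Hypothetical, heavily abbreviated `server_info` response.
response = json.loads("""
{
  "result": {
    "info": {
      "pubkey_node": "n9KorY8QtTdRx7TVDpwnG9NvyxsDwHUKUEeDLY3AkiGncVaSXZi5"
    }
  }
}
""")

pubkey = response["result"]["info"]["pubkey_node"]
assert pubkey.startswith("n")  # node public keys begin with the letter `n`
print(pubkey)
```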
Cluster members are configured in the `rippled.cfg` file under
`[cluster_nodes]`. Each member is configured on a line beginning with the
node's public key, optionally followed by a space and a friendly name.

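For example, a spoke's `rippled.cfg` might contain a section like the
following (the keys and friendly names are hypothetical):

```
[cluster_nodes]
n9KorY8QtTdRx7TVDpwnG9NvyxsDwHUKUEeDLY3AkiGncVaSXZi5 hub1
n9MqiExBcoG19UXwoLjBJnhsxEhAZMuWwJDRdkyDz1EkEkwzQTNt hub2
```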
Because cluster members can introduce other cluster members, it is not
necessary to configure every cluster member on every other cluster member.
If a hub-and-spoke arrangement is used, it is sufficient to configure every
cluster member on the hub(s) and to configure only the hubs on the spokes.
That is, each spoke does not need to be configured on every other spoke.

New spokes can be added as follows:

- In the new spoke's `[cluster_nodes]`, include each hub's public node key
- Start the spoke server and determine its public node key
- Configure each hub with the new spoke's public key
- Restart each hub, one by one
- Restart the spoke

## Transaction Behavior ##

When a transaction is received from a cluster member, several normal checks
are bypassed:

Signature checking is bypassed because we trust that a cluster member would
not relay a transaction with an incorrect signature. Validators may wish to
disable this feature, accepting the additional load in exchange for the
additional security of checking each transaction's signature themselves.

Local transaction checks are also bypassed. For example, a server will not
reject a transaction from a cluster peer because the fee does not meet its
current relay fee. It is preferable to keep the cluster in agreement, so
that confirmation from one cluster member more reliably indicates the
transaction's acceptance by the cluster.

## Server Load Information ##

Cluster members exchange information on their server's load level. The load
level is essentially the multiplier applied to the normal fee levels to
obtain the server's fee for relaying transactions.

A server's effective load level, the one it uses to determine its relay
fee, is the highest of its local load level, the network load level, and the
cluster load level. The cluster load level is the median of the load levels
reported by the cluster members.

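That rule can be sketched as follows (the numbers are assumed for
illustration and do not come from rippled; this is the described policy,
not rippled's implementation):

```python
from statistics import median

def effective_load_level(local, network, cluster_reports):
    """Effective load level: the highest of the local load level, the
    network load level, and the cluster load level, where the cluster
    level is the median of the levels reported by cluster members."""
    cluster = median(cluster_reports) if cluster_reports else 0
    return max(local, network, cluster)

# With a local level of 256, a network level of 256, and cluster members
# reporting 256, 1024, and 512, the cluster level (median) is 512, so the
# effective level is 512.
print(effective_load_level(256, 256, [256, 1024, 512]))  # -> 512
```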
## Gossip ##

Gossip is the mechanism by which cluster members share information about
endpoints (typically IPv4 addresses) that are imposing unusually high load
on them. The endpoint load manager takes gossip into account to reduce the
amount of load an endpoint is permitted to impose on the local server
before it is warned, disconnected, or banned.

Suppose, for example, that an attacker controls a large number of IP
addresses and can use them to send enough requests to overload a server.
Without gossip, he could use these same addresses to overload all the
servers in a cluster. With gossip, if he uses the same IP address to
impose load on more than one server, he will find that the amount of load
he can impose before being disconnected is much lower.

## Monitoring ##

The `peers` command reports on the status of the cluster. The `cluster`
object contains one entry for each member of the cluster (whether configured
or introduced by another cluster member). The `age` field is the number of
seconds since that server was last heard from. If a server is reporting an
elevated cluster fee, that fee is reported as well.

In the `peers` object, cluster members will contain a `cluster` field set to
`true`.
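Putting this together, an abbreviated `peers` response might look something
like the following (the key, friendly name, and exact field layout are
illustrative assumptions, not authoritative output):

```json
{
  "cluster": {
    "n9KorY8QtTdRx7TVDpwnG9NvyxsDwHUKUEeDLY3AkiGncVaSXZi5": {
      "tag": "hub1",
      "age": 4
    }
  },
  "peers": [
    {
      "public_key": "n9KorY8QtTdRx7TVDpwnG9NvyxsDwHUKUEeDLY3AkiGncVaSXZi5",
      "cluster": true
    }
  ]
}
```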
