[7] graph/db+autopilot: improve efficiency of autopilot methods that use the ForEachNode/ForEachChannel pattern#10127
Conversation
There was a problem hiding this comment.
Summary of Changes
Hello @ellemouton, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly refactors the autopilot subserver's interaction with the channel graph database. The primary goal is to eliminate an inefficient ForEachNode { ForEachChannel()} pattern, which was particularly slow for native SQL backends. By moving to a more optimized batch fetching approach, I've improved the efficiency of methods that calculate network statistics, such as those used by GetNetworkInfo and GetNodeMetrics.
Highlights
- Refactored Graph Traversal: I've introduced a new
ForEachNodesChannelsmethod in theautopilot.ChannelGraphinterface. This new method allows for iterating through nodes and their associated channels in a single, more efficient batch operation, replacing the nestedForEachNodeandForEachChannelcalls. - Updated Node Representation: The
dbNodestruct in the autopilot package has been simplified to directly store public keys and addresses, removing its dependency ongraphdb.NodeRTxand theForEachChannelmethod. - Database Interface Modernization: The underlying
graph/dbinterfaces (NodeRTx,V1Store) have been updated. TheNodeRTxinterface has been removed entirely, and theForEachNodeandForEachNodeCachedmethods now directly pass*models.LightningNodeor include address data, streamlining database interactions. - Performance Improvements: The changes leverage the
ForEachNodeCachedmethod from the graph database, which is designed for efficient batch retrieval, leading to better performance for graph-related operations within the autopilot. - Code Simplification: By removing the
ForEachChannelmethod from theautopilot.Nodeinterface and its implementations, the code has become cleaner and less prone to inefficient database access patterns.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Code Review
This pull request is a great step towards improving performance by refactoring away from the inefficient ForEachNode { ForEachChannel()} pattern. The introduction of ForEachNodesChannels on the ChannelGraph interface is a solid architectural improvement. The changes are extensive and well-executed across both the autopilot and graph/db packages.
I've found a couple of bugs, one of which could impact the autopilot's decision-making process, and another in a test helper. I've also included some suggestions to improve function comments to align with the provided style guide.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request successfully refactors the inefficient ForEachNode { ForEachChannel()} pattern within the autopilot subserver, replacing it with a more efficient ForEachNodesChannels approach. This is a significant performance improvement, particularly for SQL backends, and impacts network statistic calculations. The changes are consistently applied across interfaces, implementations, and tests. The core logic appears sound, and I have one minor style suggestion regarding function argument indentation to align with the provided style guide.
da86865 to
d0e10ef
Compare
In preparation for removing the ForEachNode/ForEachChannel call pattern, we refactor the autopilot subserver by simplifying the `ChannelEdge` such that it no longer returns the `Node` type which has a `ForEachChannel` method on it. The aim is to completely remove the `ForEachChannel` usage.
In this commit, we remove the need for the NodeRTx interface to have the FetchNode method on it.
guggero
left a comment
There was a problem hiding this comment.
Very nice, awesome that we can remove the NodeRTx use completely. Only nits, otherwise LGTM 🎉
Here we adjust the ForEachNodeCached graph DB method to pass in a node's addresses into the provided call-back if requested. This will allow us to improve the performance of node/channel iteration in the autopilot subserver.
This is in preparation for removing the ForEachChannel method from the Node interface.
And instead make use of the new ForEachNodesChannels method which uses a much more efficient method for iterating through nodes&channels.
|
Cc @guggero for override merge - both ci fails seem unrelated |
As was shown in #10123, the pattern of using the
existing
ForEachNode { ForEachChannel()}pattern to iterate through all channels for everynode is very inefficient for native SQL backends. That PR then updated the
ForEachNodeCachedmethod to stop using this old pattern and instead use batch fetching to collect node & channel data.
However, the
autopilotstill makes use of theForEachNode { ForEachChannel()}pattern. So thisPR refactors things in that subserver such that that pattern is no longer used & to instead use the updated
ForEachNodeCachedmethod which we have shown (in previous PR) is now much more efficient.Performance Improvements
This is important since those autopilot methods are used to calculate network stats for calls like GetNodeMetrics. For example: the calculation of betweeness centrality of a mainnet graph before this change took 3m9s, now with this change, it takes 14s.
Depends on #10123
Part of #9795