[graph mig 2]: graph/db: migrate graph channels and policies from kvdb to SQL #10050
Summary of Changes
Hello @ellemouton, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request continues the migration of LND's graph database from a key-value store to a SQL database, focusing on the transfer of channel and channel policy data. My changes ensure that this critical data is accurately moved while handling potential inconsistencies from the legacy database gracefully. I've also added comprehensive testing to validate the integrity and completeness of the migrated data.
Highlights
- Graph Database Migration: I've expanded the ongoing graph database migration to specifically include channels and channel policies, moving them from the existing KVDB to the new SQL database. This is the second part of a multi-stage migration effort.
- Data Validation and Handling: I've implemented robust validation for `ExtraOpaqueData` within channels and policies during the migration process. This ensures that entries with invalid TLV streams are gracefully skipped and logged with warnings, preventing the migration from halting due to historical data inconsistencies.
- Refactored KVDB Interaction: I've adjusted the `kv_store.go` logic to perform TLV validation at the public `UpdateEdgePolicy` entry point. This change allows internal test helpers to bypass this validation, enabling the recreation and testing of scenarios involving pre-existing invalid data.
- Comprehensive Testing: I've introduced extensive new test cases and helper functions (`makeTestChannel`, `makeTestPolicy`, `fetchAllChannelsAndPolicies`) to thoroughly verify the correctness and integrity of channel and policy migration. This includes specific tests for various edge cases and data configurations, such as invalid TLV data and missing fields.
Code Review
This pull request effectively expands the graph migration to include channels and their policies. The code is well-structured, and the addition of comprehensive tests, including those for edge cases with invalid data, is commendable. My feedback focuses on minor improvements to enhance code clarity and maintainability, such as correcting a typo in a test helper, refining error messages for better debugging, and ensuring proper error handling for invalid TLV data.
graph/db/sql_migration.go
Outdated
    return fmt.Errorf("unable to marshal extra opaque "+
        "data for channel %d: %w %v", scid, err,
        channel.ExtraOpaqueData)
Similar to a previous comment, the error wrapping here can be improved. The format string `...: %w %v` is unconventional, as it's generally better to have `%w` at the end of the format string. `channel.ExtraOpaqueData` is already logged in the `Warnf` call, so it might be redundant to include it in the error message that gets propagated up.
    return fmt.Errorf("unable to marshal extra opaque data for channel %d: %w", scid, err)

d62aa8f to cb959bd
ef5f7ed to 3867d13
(will fix the release notes link on next push)
Here, we move TLV validation for the KVStore out of `updateEdgePolicy` so that our tests can re-use `updateEdgePolicy` to write policies with invalid TLV (which was possible before the recently added TLV sanity check). This lets us test that our SQL migration behaves correctly for these cases.
Here we factor out some of the crud code in TestEdgePolicyMissingMaxHtcl so that we can re-use it later on.
3867d13 to e840cf8

/gemini review
    checkPolicy := func(expPolicy,
        migPolicy *models.ChannelEdgePolicy) error {

        switch {
        // Both policies are nil, nothing to compare.
        case expPolicy == nil && migPolicy == nil:
            return nil

        // One of the policies is nil, but the other is not.
        case expPolicy == nil || migPolicy == nil:
            return fmt.Errorf("expected both policies to be "+
                "non-nil. Got expPolicy: %v, "+
                "migPolicy: %v", expPolicy, migPolicy)

        // Both policies are non-nil, we can compare them.
        default:
        }

        if len(expPolicy.ExtraOpaqueData) == 0 {
            expPolicy.ExtraOpaqueData = nil
        }
        if len(migPolicy.ExtraOpaqueData) == 0 {
            migPolicy.ExtraOpaqueData = nil
        }

        return sqldb.CompareRecords(
            *expPolicy, *migPolicy, "channel policy",
        )
    }
The current logic in checkPolicy does not correctly handle cases where a policy is intentionally skipped during migration due to invalid TLV data. If an expected policy (expPolicy) is non-nil but has invalid data, it will be skipped by migChanPolicy, resulting in a nil migrated policy (migPolicy). The current check will incorrectly fail this as an error, preventing the migration from completing successfully for users with such data. The logic should be updated to account for this case: if expPolicy has invalid TLV, it should assert that migPolicy is nil.
    // If we don't expect a policy, we shouldn't have a migrated one.
    if expPolicy == nil {
        if migPolicy != nil {
            return fmt.Errorf("expected nil policy, but got one")
        }
        return nil
    }

    // We expect a policy. Check if it should have been migrated.
    _, err := marshalExtraOpaqueData(expPolicy.ExtraOpaqueData)
    isInvalid := errors.Is(err, ErrParsingExtraTLVBytes)

    // If the policy from kvdb is invalid, we expect it to be skipped.
    if isInvalid {
        if migPolicy != nil {
            return fmt.Errorf("policy with invalid TLV was migrated")
        }
        return nil
    }

    // The policy from kvdb is valid, so we expect it to be migrated.
    if migPolicy == nil {
        return fmt.Errorf("valid policy was not migrated")
    }

    if len(expPolicy.ExtraOpaqueData) == 0 {
        expPolicy.ExtraOpaqueData = nil
    }
    if len(migPolicy.ExtraOpaqueData) == 0 {
        migPolicy.ExtraOpaqueData = nil
    }

    return sqldb.CompareRecords(
        *expPolicy, *migPolicy, "channel policy",
    )
hmmm indeed.
wonder why this wasn't caught by the unit test... looking
ah, ok it is handled, just in an embedded way:
policies are loaded from kvdb using `getChannelMap`, which has this case:

    // We don't want a single policy with bad TLV data to stop us
    // from loading the rest of the data, so we just skip this
    // policy. This is for backwards compatibility since we did not
    // use to validate TLV data in the past before persisting it.
    case errors.Is(err, ErrParsingExtraTLVBytes):
        return nil

so when loading the channel policy from kvdb, it is already skipped if the extra TLV bytes are not valid. So we actually won't have the case where we need to deal with a channel policy with invalid TLV bytes in the migration.
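The skip-on-load behaviour described in this reply can be sketched as follows. This is a simplified stand-in, not `getChannelMap` itself: `decodePolicy`, `loadPolicy`, the `policy` type and the local `errParsingExtra` sentinel are hypothetical, and the "first byte is 0xff means corrupt" rule exists only to trigger the error path.

```go
package main

import (
	"errors"
	"fmt"
)

// errParsingExtra is a local stand-in for the sentinel named in the thread.
var errParsingExtra = errors.New("error parsing extra TLV bytes")

// policy is a hypothetical, minimal policy record.
type policy struct {
	FeeBase int64
}

// decodePolicy is a hypothetical decoder. Empty input means no policy was
// stored; a leading 0xff byte simulates unparseable extra TLV bytes.
func decodePolicy(raw []byte) (*policy, error) {
	switch {
	case len(raw) == 0:
		return nil, nil
	case raw[0] == 0xff:
		return nil, errParsingExtra
	default:
		return &policy{FeeBase: int64(raw[0])}, nil
	}
}

// loadPolicy turns a TLV parsing failure into "no policy" instead of an
// error, so callers further down only ever see nil or a valid policy.
func loadPolicy(raw []byte) (*policy, error) {
	p, err := decodePolicy(raw)
	switch {
	case errors.Is(err, errParsingExtra):
		return nil, nil
	case err != nil:
		return nil, err
	}
	return p, nil
}

func main() {
	for _, raw := range [][]byte{{0x05}, {0xff}, nil} {
		p, err := loadPolicy(raw)
		fmt.Println(p != nil, err)
	}
}
```

Because the invalid policy is already nil by the time the comparison runs, a check like `checkPolicy` sees "both nil" rather than "expected non-nil, got nil", which is the situation the reply describes.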
I'll add a comment to make this clear.
Thank you for the detailed explanation, @ellemouton. I understand now that the forEachChannel function already filters out channel policies with invalid TLV data, ensuring that nil policies are passed to migChanPolicy (and subsequently checkPolicy) in those specific cases. This means the scenario I highlighted, where a non-nil expPolicy would have invalid TLV and a nil migPolicy, is indeed prevented by the upstream logic.
Your plan to add a clarifying comment to make this explicit is a good approach, as it will help future readers understand this crucial data flow and why the checkPolicy function doesn't need to handle that specific invalid TLV case directly.
cc @bhandras - I haven't re-requested review since no logic has changed here. But just want to flag in case
Great catch by gemini, but I agree that the existing solution whereby we ignore the policy from the original record upon loading is sufficient.
In this commit, the `MigrateGraphToSQL` function is expanded to migrate the channel and channel policy data. Both of these have the special case where the kvdb store records may contain invalid TLV. If we encounter a channel with invalid TLV, we skip it and its policies. If we encounter a policy with invalid TLV, we skip it. The `TestMigrateGraphToSQL` and `TestMigrationWithChannelDB` tests are updated accordingly.
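The two skip rules in this commit message differ in scope, which a small sketch may make clearer: an invalid channel takes its policies with it, while an invalid policy is dropped on its own. The types and the `migrate` function below are hypothetical, and a boolean `ValidTLV` flag stands in for actual TLV parsing.

```go
package main

import "fmt"

// legacyPolicy and legacyChannel are hypothetical records; ValidTLV stands
// in for the outcome of parsing the record's extra opaque data.
type legacyPolicy struct {
	Node     int // 1 or 2: which side of the channel the policy belongs to
	ValidTLV bool
}

type legacyChannel struct {
	ID       uint64
	ValidTLV bool
	Policies []legacyPolicy
}

// result lists what ended up in the destination store.
type result struct {
	Channels []uint64
	Policies []string // formatted as "<channel ID>/<node>"
}

// migrate applies the two rules: skip an invalid channel together with all
// of its policies, and skip an invalid policy without touching its channel.
func migrate(chans []legacyChannel) result {
	var res result
	for _, c := range chans {
		if !c.ValidTLV {
			continue // the channel's policies are never visited
		}
		res.Channels = append(res.Channels, c.ID)

		for _, p := range c.Policies {
			if !p.ValidTLV {
				continue
			}
			res.Policies = append(
				res.Policies, fmt.Sprintf("%d/%d", c.ID, p.Node),
			)
		}
	}
	return res
}

func main() {
	res := migrate([]legacyChannel{
		{ID: 1, ValidTLV: true, Policies: []legacyPolicy{{1, true}, {2, true}}},
		{ID: 2, ValidTLV: false, Policies: []legacyPolicy{{1, true}, {2, true}}},
		{ID: 3, ValidTLV: true, Policies: []legacyPolicy{{1, false}, {2, true}}},
	})
	fmt.Println(res.Channels) // [1 3]
	fmt.Println(res.Policies) // [1/1 1/2 3/2]
}
```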
e840cf8 to 640caef

/gemini review
Code Review
The pull request expands the graph migration from KVDB to SQL to include channels and channel policies. The changes are well-structured and include comprehensive tests for both normal operation and various edge cases, such as handling of invalid data. The migration logic includes verification steps to ensure data integrity.
This expands the graph migration code & tests to cover channels and channel policies.
Please see #10025 for the final result we are aiming for here.
Part of #9795
Depends on #10036