Fix CSVLogger: read existing learning_stats.csv file when resuming previous training #3177
Previously, when resuming training, the CSVLogger created a new learning_stats.csv file for the second training run, overwriting previously saved training metrics. This commit addresses feature request DeepLabCut#3176: the logger now reads any previously saved log metrics and appends new epochs to the learning_stats.csv file.
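For readers who want to see the resume behaviour in isolation, here is a minimal sketch. It is not the code from this PR: `ResumableCSVLogger` and its `log`/`save` methods are hypothetical names, and the only assumption taken from the discussion is that rows are indexed by a 'step' column while all other columns are left unconstrained.

```python
import csv
import tempfile
from pathlib import Path


class ResumableCSVLogger:
    """Hypothetical stand-in for a CSV logger; not DeepLabCut's CSVLogger."""

    def __init__(self, path):
        self.path = Path(path)
        # Rows are keyed by 'step' so a re-logged step updates the old row.
        self.rows = {}
        if self.path.exists():
            with self.path.open(newline="") as f:
                for row in csv.DictReader(f):
                    self.rows[int(row["step"])] = row

    def log(self, step, metrics):
        row = self.rows.setdefault(step, {"step": step})
        row.update(metrics)

    def save(self):
        # Column-agnostic: the header is the union of all keys seen so far.
        fieldnames = ["step"]
        for step in sorted(self.rows):
            for key in self.rows[step]:
                if key not in fieldnames:
                    fieldnames.append(key)
        # 'w' mode: the full table is rewritten on every save.
        with self.path.open("w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
            writer.writeheader()
            for step in sorted(self.rows):
                writer.writerow(self.rows[step])


with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "learning_stats.csv"

    first_run = ResumableCSVLogger(csv_path)
    first_run.log(1, {"train_loss": 0.9})
    first_run.log(2, {"train_loss": 0.7, "val_loss": 0.8})
    first_run.save()

    # Simulate resuming: a new logger instance picks up the saved rows.
    resumed = ResumableCSVLogger(csv_path)
    resumed.log(3, {"train_loss": 0.6})
    resumed.save()

    with csv_path.open(newline="") as f:
        saved_rows = list(csv.DictReader(f))
```

After the second run, `saved_rows` holds steps 1 to 3, with an empty `val_loss` cell for the steps that never logged it.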
|
closed in favor of #3179 |
The header and previous rows are appended on every save call. This is unintended behavior. Changing back to 'write' mode instead of 'append' mode.
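To illustrate why 'append' mode misbehaves when each save call writes the header plus every row held in memory, here is a small sketch. `save_all` and `count_lines_after_saves` are hypothetical helpers written for this illustration, not functions from the PR.

```python
import csv
import tempfile
from pathlib import Path

FIELDNAMES = ["step", "loss"]


def save_all(path, rows, mode):
    """Write the header plus every in-memory row, using the given file mode."""
    with open(path, mode, newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)


def count_lines_after_saves(mode):
    """Simulate three epochs, calling save_all after each one."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "learning_stats.csv"
        rows = []
        for step in (1, 2, 3):
            rows.append({"step": step, "loss": 1.0 / step})
            save_all(path, rows, mode)
        return len(path.read_text().splitlines())


append_lines = count_lines_after_saves("a")  # 2 + 3 + 4 = 9 lines
write_lines = count_lines_after_saves("w")   # 1 header + 3 rows = 4 lines
```

With 'a' the file accumulates a repeated header and repeated rows on each call, while 'w' leaves exactly one header and one row per step.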
|
Looks good! Should we consider adding more validation in case the previous output is somehow broken or interrupted, or to account for potential formatting changes in the future? Or do you feel that would be redundant for now and the current solution is enough? |
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
I don't think it would be a good idea to put specific constraints on the values or data types that should be in the existing CSV, except for the 'step' index. The CSVLogger is currently quite flexible and even logs different metrics for different epoch types (e.g. validation epochs). I think the benefits would be limited if we dropped this flexible, column-agnostic behavior of the CSVLogger. |
If an exception occurs during reading (e.g. a corrupted CSV, or a CSV that is missing the 'step' column), the reader now starts with an empty CSV instead of keeping the rows that were read successfully from the existing CSV.
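A rough sketch of this all-or-nothing fallback follows. It is an illustration under assumptions, not the PR's implementation: `load_previous_rows` is a hypothetical function, and treating a non-integer 'step' value as corruption is a choice made only for this example.

```python
import csv
import tempfile
from pathlib import Path


def load_previous_rows(path):
    """Return previously logged rows, or [] if the file cannot be trusted."""
    path = Path(path)
    if not path.exists():
        return []
    try:
        with path.open(newline="") as f:
            reader = csv.DictReader(f)
            if reader.fieldnames is None or "step" not in reader.fieldnames:
                raise ValueError("missing 'step' column")
            rows = []
            for row in reader:
                # Only 'step' is validated; other columns stay unconstrained.
                row["step"] = int(row["step"])
                rows.append(row)
            return rows
    except (OSError, ValueError, TypeError, csv.Error):
        # All-or-nothing: rows parsed before the failure are discarded.
        return []


with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.csv"
    good.write_text("step,loss\n1,0.9\n2,0.7\n")
    no_step = Path(tmp) / "no_step.csv"
    no_step.write_text("epoch,loss\n1,0.9\n")
    corrupted = Path(tmp) / "corrupted.csv"
    corrupted.write_text("step,loss\n1,0.9\nabc,0.7\n")

    good_rows = load_previous_rows(good)
    no_step_rows = load_previous_rows(no_step)
    corrupted_rows = load_previous_rows(corrupted)
```

Note that `corrupted_rows` is empty even though the first data row parsed fine, which matches the behaviour described above of starting from an empty CSV rather than keeping a partial history.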
|
@deruyter92 Sounds good, thanks! |
|
Thanks @C-Achard! Ok, @MMathisLab, please have a look if you agree. In that case these changes can be merged! |
|
@MMathisLab let me know what you think! I think it could be merged. |
_load_existing_data method to CSVLogger