nmutto/rhone_discharge

Geneve, Halle de l'Ile discharge forecast MLP model

This repository contains a Python script and configuration file to train, evaluate, and visualize a model that predicts future values of hydrological time-series at the Geneve, Halle de l'Ile station using a multilayer perceptron (MLP).

The model learns to forecast a fixed number of hours into the future based on:

  • Historical station measurements
  • Time-of-day, day-of-month, and month features

Files

File             Description
train_model.py   Main training, evaluation, and plotting script
utils.py         Routine functions
config.toml      All model, training, and data configuration parameters
environment.yml  Conda environment packages
README.md        This documentation

What the Code Does

The script performs the following steps:

  1. Reads hyperparameters and paths from config.toml
  2. Loads time-series data from an HDF5 file
  3. Builds an MLP model in PyTorch
  4. Trains the model (or loads an existing one)
  5. Predicts values for the next n_hours_prediction hours
  6. Evaluates performance on validation/test data
  7. Saves:
    • The trained model
    • Loss curves
    • Prediction vs truth plots
    • Diagnostic statistics

The model predicts a vector of future values (e.g., the next 24 hours) for multiple hydrological stations simultaneously.
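
One way to picture the "vector of future values" is as a set of input/target pairs built from the hourly series. The sketch below is an assumption about how such pairs could be formed, not the script's actual data pipeline: it pairs the feature row at hour t with the target values for hours t+1 to t+n, uses a single target series for simplicity, and the name `make_forecast_pairs` is hypothetical.

```python
def make_forecast_pairs(features, target, n_hours_prediction):
    """Pair the feature row at hour t with the next n target values."""
    pairs = []
    # Stop early enough that every target vector has the full length.
    last_start = len(target) - n_hours_prediction
    for t in range(last_start):
        x = features[t]
        y = target[t + 1 : t + 1 + n_hours_prediction]
        pairs.append((x, y))
    return pairs


# Toy hourly data: one feature per hour and a target series of six values.
toy_features = [[hour] for hour in range(6)]
toy_target = [10, 11, 12, 13, 14, 15]
pairs = make_forecast_pairs(toy_features, toy_target, n_hours_prediction=3)
print(pairs[0])
```

With `n_hours_prediction = 24`, each target vector would hold 24 hourly values, and the last 24 hours of the series would serve only as targets.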


Configuration Overview (config.toml)

Model Architecture

[model]
dim_hidden = 1024
n_layers = 8
activation = "LeakyReLU"

Training Parameters

[training]
epochs = 1000
batch_size = 512
learning_rate = 0.001
patience = 50
min_delta = 0.001
loss = "HuberLoss"
weight_decay = 0
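
The `patience` and `min_delta` parameters suggest early stopping on the validation loss. The sketch below shows one common form of that rule, assuming an absolute improvement threshold; the exact criterion used in `train_model.py` is not documented here, and the function name `stopping_epoch` is hypothetical.

```python
def stopping_epoch(val_losses, patience, min_delta):
    """Return the 0-based epoch at which training would stop, or None."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            # Improvement larger than min_delta: record it and reset the counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


losses = [1.0, 0.5, 0.4995, 0.4999, 0.4998]
print(stopping_epoch(losses, patience=3, min_delta=0.001))
```

In this toy run the loss stalls after the second epoch, so a patience of 3 stops training at epoch index 4. With `patience = 50`, as in the configuration above, the same losses would not trigger a stop.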

Run Settings

[run]
seed = 2234
gpu = 1
data_fraction = 1
train_split = 0.8
valid_split = 0.1
n_hours_prediction = 24
read_model = 0
station_features = ["2009", "2606", "2174", "2170", "2027"]
time_features = ["Hour", "Day", "Month"]
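
To make the split fractions concrete, the sketch below divides a sequence into training, validation, and test parts. It assumes a chronological split and that the remainder after `train_split` and `valid_split` (here 0.1) forms the test set; this README does not say whether the script shuffles the data, and the function names are hypothetical.

```python
def split_sizes(n_samples, train_split, valid_split):
    """Return (n_train, n_valid, n_test); the test set takes the remainder."""
    n_train = int(n_samples * train_split)
    n_valid = int(n_samples * valid_split)
    n_test = n_samples - n_train - n_valid
    return n_train, n_valid, n_test


def chronological_split(samples, train_split, valid_split):
    """Slice samples in order: earliest for training, latest for testing."""
    n_train, n_valid, _ = split_sizes(len(samples), train_split, valid_split)
    train = samples[:n_train]
    valid = samples[n_train : n_train + n_valid]
    test = samples[n_train + n_valid :]
    return train, valid, test


hours = list(range(1000))
train, valid, test = chronological_split(hours, train_split=0.8, valid_split=0.1)
print(len(train), len(valid), len(test))
```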

Input/Output

[input_output]
input_data = "/path/to/data.hdf5"
output_folder = "/path/to/results"
model_name = "best_model"
overwrite = 1
n_check_plots = 100

Requirements

The easiest way to run the code is to create the conda environment from environment.yml:

conda env create -f environment.yml

This creates an environment named torch, which you can activate by running:

conda activate torch

Usage example

Open the config.toml file and set:

input_data = "/absolute/path/to/your/data.hdf5"
output_folder = "/absolute/path/where/results/should/be/saved"

If you created the conda environment, activate it. Then, from the project directory, run:

python train_model.py config.toml

Once the script finishes, output_folder contains the following structure:

output_folder/
+-- config_used.toml
+-- best_model.pth
+-- feature_scaler.joblib
+-- target_scaler.joblib
|
+-- plots/
|   +-- Performance.pdf
|   +-- Target_vs_prediction.pdf
|
+-- std.txt
+-- mean_relative_difference_over_time.txt
+-- loss_values.txt
