Description
Prepare the labeled dataset for machine learning by cleaning, formatting, and structuring it for training. This step ensures high-quality input data for the model. All outputs must be stored inside the inference/ folder.
Tasks
- Create script prepare_data.py inside inference/
- Read dataset from inference/labeled_flight_data.csv
- Remove unnecessary columns (e.g., timestamp if not used for training)
- Handle missing or null values (drop or fill appropriately)
- Encode event_label into numerical format (e.g., label encoding)
- Select input features: altitude, heading, vertical_speed, velocity, roll, pitch, yaw, g_force
- Separate features (X) and labels (y)
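- Split dataset into training and testing sets (e.g., an 80/20 split)
- Normalize or scale feature values (fit the scaler on the training set only, then apply it to both sets, so test data does not leak into training)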
- Save and push the processed dataset to inference/dataset
- Push code to inference/
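
A minimal sketch of how prepare_data.py could cover the tasks above, using only the Python standard library so it runs without extra dependencies (pandas and scikit-learn would be a reasonable alternative). Several choices here are assumptions, not requirements of this ticket: rows with missing values are dropped rather than filled, labels are encoded in alphabetical order, scaling is standardization (zero mean, unit variance), and the output file names train.csv and test.csv are hypothetical.

```python
import csv
import math
import random
from pathlib import Path

FEATURES = ["altitude", "heading", "vertical_speed", "velocity",
            "roll", "pitch", "yaw", "g_force"]
LABEL = "event_label"


def load_rows(path):
    """Read the CSV, keep only the selected features and the label.

    Other columns (e.g. timestamp) are ignored. Rows with a missing,
    non-numeric, or NaN feature, or an empty label, are dropped (assumption).
    """
    rows = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            try:
                x = [float(row[name]) for name in FEATURES]
            except (KeyError, TypeError, ValueError):
                continue
            label = (row.get(LABEL) or "").strip()
            if label and not any(math.isnan(v) for v in x):
                rows.append((x, label))
    return rows


def encode_labels(labels):
    """Label encoding: map each distinct label to an integer (alphabetical order)."""
    mapping = {name: i for i, name in enumerate(sorted(set(labels)))}
    return [mapping[label] for label in labels], mapping


def train_test_split(X, y, test_ratio=0.2, seed=42):
    """Shuffle indices with a fixed seed and split them 80/20 by default."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_test = int(round(len(idx) * test_ratio))
    test, train = idx[:n_test], idx[n_test:]
    return ([X[i] for i in train], [X[i] for i in test],
            [y[i] for i in train], [y[i] for i in test])


def fit_scaler(X_train):
    """Compute per-feature mean and standard deviation from the training set only."""
    n = len(X_train)
    columns = list(zip(*X_train))
    means = [sum(col) / n for col in columns]
    stds = []
    for col, m in zip(columns, means):
        std = (sum((v - m) ** 2 for v in col) / n) ** 0.5
        stds.append(std or 1.0)  # constant column: avoid dividing by zero
    return means, stds


def scale(X, means, stds):
    """Standardize every row with the statistics returned by fit_scaler."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]


def save_split(path, X, y):
    """Write scaled features and the encoded label to a CSV file."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(FEATURES + [LABEL])
        for row, label in zip(X, y):
            writer.writerow(row + [label])


def prepare(src="inference/labeled_flight_data.csv", out_dir="inference/dataset"):
    """Run the full pipeline and return the label-to-integer mapping."""
    rows = load_rows(src)
    X = [x for x, _ in rows]
    y, mapping = encode_labels([label for _, label in rows])
    X_train, X_test, y_train, y_test = train_test_split(X, y)
    means, stds = fit_scaler(X_train)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    save_split(out / "train.csv", scale(X_train, means, stds), y_train)  # hypothetical name
    save_split(out / "test.csv", scale(X_test, means, stds), y_test)     # hypothetical name
    return mapping
```

Calling prepare() from the repository root reads the labeled CSV and writes the two split files under inference/dataset. The returned mapping is worth keeping alongside the dataset so that predicted integers can be translated back to event labels later.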
Acceptance Criteria