We built our project in Python, using Pandas and NumPy for data manipulation, Scikit-learn for a Random Forest model, and XGBoost for gradient boosting. We also used visualization tools to better understand feature distributions and model behavior. Our workflow involved cleaning and aligning the dataset, engineering seasonal weather features, splitting the data into training and testing sets, training our models, and evaluating them using RMSE and \( R^2 \). Mathematically, we evaluated prediction error using the root mean squared error:

$$ \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2} $$

where \( y_i \) is the observed yield, \( \hat{y}_i \) is the predicted yield, and \( n \) is the number of samples being evaluated.
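As a rough illustration of the two metrics named above, here is a minimal NumPy sketch. The helper names `rmse` and `r_squared` and the toy numbers are ours for illustration only; they are not the project's code or data. The mean-only baseline shows the reference point that \( R^2 \) measures against.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error: sqrt(mean((y - y_hat)^2))."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation around the mean
    return float(1.0 - ss_res / ss_tot)


# Made-up yields, for illustration only (not the project's data).
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]

# A baseline that always predicts the mean yield scores R^2 = 0 by construction.
baseline = [float(np.mean(y_true))] * len(y_true)

print("model:   ", rmse(y_true, y_pred), r_squared(y_true, y_pred))
print("baseline:", rmse(y_true, baseline), r_squared(y_true, baseline))
```

An \( R^2 \) near zero means the model does little better than always predicting the mean, which is why comparing against such a baseline is useful before reaching for a more complex model.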
Through this process, we learned that building the models was not the hardest part; preparing the data was. The most challenging aspect of the project was formatting and editing the dataset so that the models could actually run correctly. Weather data had to align precisely with the correct harvest year, units needed to be standardized, missing values had to be handled carefully, and features had to be aggregated properly. Small inconsistencies often caused errors or misleading results. We quickly realized that data cleaning and preprocessing accounted for the majority of our work, as is often the case in real-world machine learning.
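The preprocessing steps described above can be sketched in Pandas roughly as follows. Everything specific here is an assumption made for illustration: the column names (`temp_f`, `precip_mm`, `yield_t_ha`), the toy values, and especially the rule that weather from October onward counts toward the next year's harvest. The actual dataset and alignment rule may well differ.

```python
import pandas as pd

# Hypothetical daily weather records (column names and values are made up).
weather = pd.DataFrame({
    "date": pd.to_datetime(
        ["2019-11-15", "2020-04-10", "2020-07-01", "2020-07-02", "2020-12-05"]
    ),
    "temp_f": [41.0, 59.0, 86.0, None, 32.0],   # one missing reading
    "precip_mm": [5.0, 12.0, 0.0, 3.0, 8.0],
})
yields = pd.DataFrame({"harvest_year": [2020, 2021], "yield_t_ha": [3.2, 3.5]})

# Standardize units: Fahrenheit -> Celsius.
weather["temp_c"] = (weather["temp_f"] - 32.0) * 5.0 / 9.0

# Assumed alignment rule: October-December weather belongs to next year's harvest.
month = weather["date"].dt.month
weather["harvest_year"] = weather["date"].dt.year + (month >= 10).astype(int)

# Map each month to a meteorological season.
season_of_month = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}
weather["season"] = month.map(season_of_month)

# Aggregate to one row per harvest year; mean() skips missing readings by default.
seasonal = (
    weather.groupby(["harvest_year", "season"])
    .agg(temp_c_mean=("temp_c", "mean"), precip_mm_sum=("precip_mm", "sum"))
    .unstack("season")
)
seasonal.columns = [f"{stat}_{season}" for stat, season in seasonal.columns]
seasonal = seasonal.reset_index()

# validate= raises if a harvest year is duplicated on either side of the join.
features = yields.merge(seasonal, on="harvest_year", how="left", validate="one_to_one")
print(features)
```

Seasons with no weather records for a given harvest year come out as NaN rather than zero, so the gaps stay visible and can be handled deliberately before training.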
Midway through the hackathon, two of our teammates had to leave, which significantly increased the pressure on our remaining team members. We had to redistribute responsibilities quickly and adapt our workflow. While stressful, this challenge strengthened our collaboration and pushed us to learn faster. We divided responsibilities across data preprocessing, model tuning, and visualization to stay efficient under time constraints.
Ultimately, this project taught us that machine learning is not just about optimizing algorithms; it is about understanding data, validating assumptions, and interpreting results honestly. We gained hands-on experience in data manipulation, feature engineering, model evaluation, and visualization, while also learning the importance of baselines and the reality that more complex models do not guarantee better explanatory power. Even though our models did not strongly predict yields, the process gave us deeper insight into the relationship between climate and agriculture and strengthened our technical and collaborative skills.
Note: The YouTube video is frozen. Please refer to the website.
Built With
- claude
- databricks
- python