Inspiration
Global warming is an existential threat to humanity, but it can be mitigated if society transitions to renewable energy sources. Chevron wants to lead investment in that transition by identifying states where renewable energy investment is high. Using this dataset, we can help Chevron pinpoint the best states for renewable energy investment and thereby help accelerate the energy transition.
What it does
This project narrows down the top states in which Chevron should consider investing in renewable energy projects by predicting the amount of renewable energy investment dollars each state receives in 2020.
How we built it
After cleaning the data, we implemented standard machine-learning models (linear regression, gradient boosting, and random forest) to predict the amount of renewable energy investment dollars each state receives for the upcoming year. We then measured the accuracy of each model by comparing its predictions to the actual amount of renewable energy investment dollars each state received. Finally, we visualized each model's results on a US map using geopandas and decided which model was the best predictor by plotting the residuals for each state.
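The comparison described above can be sketched roughly as follows. This is a minimal illustration and not our actual notebook: it assumes scikit-learn is available, and the feature matrix, target values, and hyperparameters are synthetic placeholders rather than the Chevron dataset.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 200 rows, 4 made-up features, one target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X @ np.array([3.0, -1.5, 0.5, 2.0]) + rng.normal(scale=0.5, size=200)

# Hold out 25% of the rows to play the role of the "actual" values.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The three model families named in the write-up, with default-ish settings.
models = {
    "linear_regression": LinearRegression(),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

residuals = {}  # actual minus predicted, one array per model
mae = {}        # mean absolute error, one number per model
for name, model in models.items():
    model.fit(X_train, y_train)
    residuals[name] = y_test - model.predict(X_test)
    mae[name] = float(np.mean(np.abs(residuals[name])))

# Pick the model with the smallest mean absolute error on the held-out rows.
best_model = min(mae, key=mae.get)
print(mae)
print("best:", best_model)
```

With real data, each residual would be tied to a state, so the per-model residual arrays could be joined to a state-boundary GeoDataFrame and drawn on a map.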
Challenges we ran into
As an all-freshman team, we had several struggles due to our lack of experience. Specifically, we struggled to create predictions from the limited datasets. Most of us have very limited statistics experience, which made it difficult to figure out where to start within the dataset. We spent hours on a whiteboard drafting a course of action for understanding the dataset. In addition, we struggled to find external datasets to complement our predictions, which limited how we could manipulate the data to generate predictions.
Accomplishments that we're proud of
This was our first datathon, and we were able to complete the project as a team. The amount all of us learned over the course of the challenge cannot be overstated. We specifically chose the Chevron track over the beginner track because we were interested in how the energy transition affects our lives and in Chevron's commitment to invest in renewable energy. Our success came from the dedication of our entire team. Our model correctly predicted 7 of the 10 highest-value states in the test data.
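One way to compute a "how many of the top states did we get right" score like the one above is a simple top-k overlap. The sketch below is only an illustration: the function name `top_k_overlap` is hypothetical, and the state labels and dollar values are made up rather than taken from our results.

```python
def top_k_overlap(predicted, actual, k=10):
    """Count how many of the k highest actual entries also appear
    among the k highest predicted entries."""
    top_pred = set(sorted(predicted, key=predicted.get, reverse=True)[:k])
    top_actual = set(sorted(actual, key=actual.get, reverse=True)[:k])
    return len(top_pred & top_actual)


# Hypothetical values for five placeholder states (not real data).
actual = {"A": 90, "B": 80, "C": 70, "D": 60, "E": 50}
predicted = {"A": 85, "B": 40, "C": 75, "D": 65, "E": 70}

# Actual top 3 is {A, B, C}; predicted top 3 is {A, C, E}; they share A and C.
print(top_k_overlap(predicted, actual, k=3))  # prints 2
```

With k=10 and one entry per state, the same function would produce a count comparable to our 7 out of 10.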
What we learned
This was our first real experience with data science modeling. Many members of our group had never worked with numpy or pandas, so the hands-on experience helped us grow our skill sets. One member taught themselves how to plot data with geopandas, opening up a whole new set of data visualization tools we could use.
What's next for Ecoficiency
In the future, we hope to supplement our given dataset with a second or possibly third dataset to help us increase the accuracy of our predictions.