Ecoficiency

Inspiration

Global warming is an existential threat to humanity, but it can be mitigated if society transitions to renewable energy sources. Chevron is looking to lead investment in that transition by targeting states where renewable energy investment is already high. Using this dataset, we can help Chevron pinpoint the best states for renewable energy investment and thereby help accelerate the energy transition.

What it does

This project narrows down the top states Chevron should consider for its renewable energy projects by predicting the renewable energy investment dollars each state receives in 2020.

How we built it

After cleaning the data, we implemented machine-learning models such as linear regression, gradient boosting, and random forest to predict the renewable energy investment dollars each state receives in the upcoming year. We then measured each model's accuracy by comparing its predictions to the actual investment dollars each state received. Finally, we visualized each model's results on a US map using geopandas and decided which model was the best predictor by plotting the residuals for each state.
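The comparison described above can be sketched roughly as follows. This is a minimal illustration, not our actual notebook: the data is synthetic, the feature count and train/test split are assumptions, and mean absolute error is used here as one possible way to summarize the residuals.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the cleaned dataset (assumption): 200 rows, 4 numeric features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The three model families named in the write-up, with default-ish settings.
models = {
    "linear_regression": LinearRegression(),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

residuals = {}
mae = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    # Residual = actual minus predicted, one value per held-out row.
    residuals[name] = y_test - predictions
    mae[name] = mean_absolute_error(y_test, predictions)

# Pick the model with the smallest average absolute residual.
best_model = min(mae, key=mae.get)
print(mae)
print("best:", best_model)
```

In a real run, each row would be a state and the residuals could be joined to a state geometry table for mapping.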

Challenges we ran into

As an all-freshman team, we struggled in several areas because of our lack of experience. In particular, we had trouble creating predictions from the limited datasets. Most of us have very limited statistics experience, which made it difficult to know where to start within the dataset. We spent hours on a whiteboard drafting a course of action for understanding the dataset. We also struggled to find external datasets to complement our predictions, which limited how we could manipulate the data to generate predictions.

Accomplishments that we're proud of

This was our first datathon, and we were able to complete the project as a team. The amount all of us learned over the course of the challenge cannot be overstated. We specifically chose the Chevron track over the beginner track because we were interested in how the energy transition affects our lives and in Chevron's commitment to investing in renewable energy. Our success came from the dedication of our entire team. Our model successfully predicted 7 of the 10 highest-value states in the test data.
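One way to compute a "top states" hit count like the one above is to compare the set of the k highest actual values with the set of the k highest predicted values. The sketch below assumes that interpretation; the function name `top_k_overlap` and the sample values are made up for illustration and are not from our dataset.

```python
def top_k_overlap(actual, predicted, k=10):
    """Count entries that are in both the actual top k and the predicted top k."""
    top_actual = set(sorted(actual, key=actual.get, reverse=True)[:k])
    top_predicted = set(sorted(predicted, key=predicted.get, reverse=True)[:k])
    return len(top_actual & top_predicted)


# Hypothetical values keyed by placeholder labels, not real state figures.
actual = {"A": 90, "B": 80, "C": 70, "D": 10, "E": 5}
predicted = {"A": 85, "B": 20, "C": 75, "D": 60, "E": 1}

hits = top_k_overlap(actual, predicted, k=3)
print(hits)  # A and C are in both top-3 sets
```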

What we learned

This was our first real experience with data science modeling. Many members of our group had never worked with numpy or pandas, so the hands-on experience helped us grow our skill sets. One member taught themselves how to plot data with geopandas, opening up a whole new set of data visualization tools we could use.

What's next for Ecoficiency

In the future, we hope to supplement our given dataset with a second or possibly third dataset to help us increase the accuracy of our predictions.

Built With

geopandas
numpy
pandas
python
seaborn
sklearn

Submitted to

Rice Datathon 2023
- Winner Best Social Impact

Created by

I worked on the prediction algorithms, including developing and testing our random forest, gradient boosting, elastic net, and other regression models we implemented. I also worked to analyze the error associated with our result and interpret our findings.

Mason Weiss
I worked on data visualization. Specifically, I created the United States heatmap using Geopandas. I was also in charge of editing the video. This was my first real experience with prediction modeling, so I learned how to use scikit-learn and became more comfortable with pandas and NumPy.

Joel Villarino
I primarily focused on the format and layout of the presentation, but I also helped formulate our action plan for clarifying the data and helped create a few of our early machine-learning algorithms.

ryan-gan
I worked on research, graphing, and implementation. I had not previously used many of the packages involved, such as Pandas or Seaborn, and had to learn how they worked in order to manage and visualize the data efficiently. I was also exposed to prediction algorithms such as Gradient Boosting and Random Forest.

James Foxworth

Updates

Joel Villarino started this project — Jan 29, 2023 02:05 AM EST