This problem was taken from MachineHack, a platform that hosts data science hackathons.
It is a regression problem. I used the Extra Trees Regressor algorithm to predict the output, which gave a final RMSE of 0.1951 and an R2 of 0.9616.
This project was the first time I used the 'PyCaret' library. PyCaret lets you write less code and compare different models at the same time with just a few lines of code. I really loved working with the library.
Website for the hackathon: https://www.machinehack.com/hackathons/power_plant_energy_output_prediction_weekend_hackathon_13/overview
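For readers unfamiliar with the two metrics reported above, here is a minimal sketch of how RMSE and R2 are defined, written in plain Python. The function names `rmse` and `r2_score` and the sample values are hypothetical and chosen only for illustration; they are not taken from the hackathon data or from the project code.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical values for illustration only (not from the dataset).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]

print("RMSE:", rmse(y_true, y_pred))
print("R2:", r2_score(y_true, y_pred))
```

A lower RMSE means smaller prediction errors in the units of the target, while an R2 closer to 1 means the model explains more of the variance in the target.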
The dataset was collected from a Combined Cycle Power Plant over 6 years (2006-2011), when the power plant was set to work at full load. The features are hourly averages of four variables, Temperature (T), Ambient Pressure (AP), Relative Humidity (RH), and Exhaust Vacuum (V), which are used to predict the net hourly electrical energy output (PE) of the plant. A combined-cycle power plant (CCPP) is composed of gas turbines (GT), steam turbines (ST), and heat recovery steam generators.
In a CCPP, the electricity is generated by gas and steam turbines, which are combined in one cycle, and is transferred from one turbine to another. While the Vacuum is collected from and has an effect on the Steam Turbine, the other three ambient variables affect the GT performance.
Temperature (T) in the range 1.81-37.11 °C
Ambient Pressure (AP) in the range 992.89-1033.30 millibar
Relative Humidity (RH) in the range 25.56-100.16 %
Exhaust Vacuum (V) in the range 25.36-81.56 cm Hg
Net hourly electrical energy output (PE) in the range 420.26-495.76 MW
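
The ranges listed above can serve as a quick sanity check on incoming rows. The following is a minimal sketch, assuming each row is a dictionary keyed by the abbreviations T, AP, RH, V and PE in the original units; the actual column names and scaling in the hackathon files may differ, and the helper `out_of_range` and the sample rows are hypothetical.

```python
# Documented (min, max) range for each variable, copied from the list above.
RANGES = {
    "T": (1.81, 37.11),       # degrees Celsius
    "AP": (992.89, 1033.30),  # millibar
    "RH": (25.56, 100.16),    # percent
    "V": (25.36, 81.56),      # cm Hg
    "PE": (420.26, 495.76),   # MW
}


def out_of_range(row):
    """Return the names of variables whose value lies outside its documented range."""
    return [
        name
        for name, (low, high) in RANGES.items()
        if name in row and not (low <= row[name] <= high)
    ]


# Hypothetical rows for illustration only (not from the dataset).
ok_row = {"T": 20.0, "AP": 1010.0, "RH": 60.0, "V": 50.0, "PE": 450.0}
bad_row = {"T": 45.0, "AP": 1010.0, "RH": 60.0, "V": 50.0, "PE": 450.0}

print(out_of_range(ok_row))
print(out_of_range(bad_row))
```

A row that reports any variable outside these bounds may point to a unit mismatch or to data that has been rescaled, so it is worth inspecting before training.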