This project analyzes and visualizes a dataset of top-rated movies. Using Python, it explores trends in popularity, ratings, release years, and voting patterns to better understand what makes a movie successful.
- Data Cleaning: handled missing values and formatted release dates.
- Exploratory Data Analysis (EDA):
  - Popularity vs. Average Ratings (scatter plot)
  - Distribution of Movies by Release Year
  - Ratings trends across different years
  - Box plots for numeric features (popularity, vote counts, ratings, etc.)
  - Vote Count vs. Average Rating relationship
  - Top 10 Most Popular Movies (bar chart)
  - Movies Released Per Year (line chart)
  - Top 10 Movies by Highest Vote Count
- Interactive Visualizations: built with Plotly for dynamic exploration.
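
The data cleaning step listed above could look roughly like the sketch below. It is illustrative only: the column names (`title`, `release_date`, `popularity`, `vote_average`, `vote_count`), the inline sample rows, and the policy of dropping rows without a usable date are all assumptions, not taken from the notebook or from `movie 3.csv`.

```python
import io

import pandas as pd

# Tiny inline sample standing in for the real CSV.
# Column names and values are assumed for illustration.
SAMPLE_CSV = """title,release_date,popularity,vote_average,vote_count
Movie A,1994-09-23,85.2,8.7,24000
Movie B,1972-03-14,40.1,8.5,
Movie C,,120.5,8.5,30000
Movie D,not a date,15.0,7.9,900
"""


def clean_movies(df: pd.DataFrame) -> pd.DataFrame:
    """Parse release dates and handle missing values (one possible policy)."""
    out = df.copy()
    # Missing or unparseable dates become NaT instead of raising an error.
    out["release_date"] = pd.to_datetime(out["release_date"], errors="coerce")
    # Drop rows that lack a usable release date or rating.
    out = out.dropna(subset=["release_date", "vote_average"])
    # Treat a missing vote count as zero votes.
    out["vote_count"] = out["vote_count"].fillna(0).astype(int)
    # A separate year column makes per-year grouping easier later.
    out["release_year"] = out["release_date"].dt.year
    return out.reset_index(drop=True)


movies = clean_movies(pd.read_csv(io.StringIO(SAMPLE_CSV)))
print(movies[["title", "release_year", "vote_count"]])
```

To try this on the real dataset, replace the `io.StringIO(SAMPLE_CSV)` argument with the CSV path and adjust the column names to match the file.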
- Python
- Pandas: data manipulation
- Matplotlib & Seaborn: static visualization
- Plotly Express: interactive charts
- Movies with higher popularity do not always have the best ratings.
- The number of movies released per year has shown an increasing trend.
- Some movies stand out with very high vote counts, indicating wide audience engagement.
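
Each of these observations can be checked with a few lines of Pandas. The sketch below uses invented toy data and assumed column names, so its numbers say nothing about the real dataset; it only shows the kind of query behind each insight.

```python
import pandas as pd

# Toy data invented for illustration; column names are assumptions.
movies = pd.DataFrame(
    {
        "title": ["A", "B", "C", "D", "E", "F"],
        "release_year": [2000, 2005, 2005, 2010, 2010, 2010],
        "popularity": [10, 90, 30, 50, 20, 70],
        "vote_average": [8.9, 7.1, 8.4, 7.8, 8.6, 7.5],
        "vote_count": [500, 30000, 1200, 800, 2500, 900],
    }
)

# Insight 1: does popularity track rating? Compare the leaders and the correlation.
most_popular = movies.loc[movies["popularity"].idxmax(), "title"]
best_rated = movies.loc[movies["vote_average"].idxmax(), "title"]
corr = movies["popularity"].corr(movies["vote_average"])

# Insight 2: number of movies released per year, in chronological order.
per_year = movies["release_year"].value_counts().sort_index()

# Insight 3: movies with the highest vote counts.
top_votes = movies.nlargest(3, "vote_count")[["title", "vote_count"]]

print(most_popular, best_rated, round(corr, 2))
print(per_year)
print(top_votes)
```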
- Clone this repository:

      git clone https://github.com/your-username/top-rated-movie-analysis.git
      cd top-rated-movie-analysis

- Install dependencies:

      pip install -r requirements.txt

- Open Jupyter Notebook and run:

      jupyter notebook "Top Rated Movie Analysis (Minor Project -2).ipynb"

Project Structure:

    Top Rated Movie Analysis/
    ├── Top Rated Movie Analysis (Minor Project -2).ipynb   # Main notebook
    ├── movie 3.csv                                         # Dataset (local)
    ├── requirements.txt                                    # Dependencies
    └── README.md                                           # Project documentation