A Complex Network Analysis Approach in Cricket: Analyzing Batsmen and Bowlers in the IPL

Quantifying individual performance in cricket is essential for team selection and for understanding how players contribute to team success. This project applies concepts from social network analysis (SNA) to the performance of batsmen and bowlers using a ball-by-ball dataset. The dataset has four parts: (1) batsmen facing specific bowlers, (2) overall batting averages for batsmen, (3) bowlers bowling to specific batsmen, and (4) overall bowling averages for bowlers. Building on previous research, the project constructs weighted, directed networks of interactions between batsmen and bowlers. For batsmen, a performance index is calculated from the runs scored against each bowler relative to that bowler's career bowling average. Similarly, a quality index is calculated for bowlers from their dismissals of each batsman relative to that batsman's career batting average. A one-mode projected network is also generated to compare the relative importance of players, and the PageRank algorithm is applied to score the importance of each player in the network. Our results show that Virat Kohli and Lasith Malinga are the most successful batsman and bowler, respectively, in the history of the Indian Premier League (IPL).
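
The network idea above can be illustrated with a minimal sketch. This is not the project's implementation. It assumes that the performance index is a batsman's average against a bowler divided by that bowler's career bowling average, that edges point from bowler to batsman, and that PageRank uses a damping factor of 0.85. The player names and numbers are hypothetical.

```python
from collections import defaultdict


def performance_index(avg_vs_bowler, bowler_career_avg):
    """Batting average against one bowler, relative to the bowler's career average."""
    return avg_vs_bowler / bowler_career_avg


def pagerank(edges, damping=0.85, tol=1e-10, max_iter=200):
    """Weighted PageRank by power iteration.

    edges maps (source, target) -> positive weight.
    """
    nodes = sorted({node for edge in edges for node in edge})
    out_weight = defaultdict(float)
    for (src, _dst), weight in edges.items():
        out_weight[src] += weight

    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0)
        base = (1 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {node: base for node in nodes}
        for (src, dst), weight in edges.items():
            new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        if sum(abs(new_rank[n] - rank[n]) for n in nodes) < tol:
            return new_rank
        rank = new_rank
    return rank


# Hypothetical inputs: (batsman, bowler) -> batting average against that bowler.
avg_vs = {
    ("batsman_A", "bowler_X"): 40.0,
    ("batsman_B", "bowler_X"): 10.0,
    ("batsman_A", "bowler_Y"): 30.0,
    ("batsman_B", "bowler_Y"): 20.0,
}
career_bowling_avg = {"bowler_X": 25.0, "bowler_Y": 20.0}

# Assumed direction: each bowler "endorses" the batsmen who score well against them.
edges = {
    (bowler, batsman): performance_index(avg, career_bowling_avg[bowler])
    for (batsman, bowler), avg in avg_vs.items()
}
ranks = pagerank(edges)
```

In this toy network batsman_A receives the larger share of each bowler's outgoing weight, so batsman_A ends up with the higher score. A bowler network could be sketched the same way by reversing the edge direction and using the quality index as the weight.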

Setting Up the Environment

Before running the project, activate the virtual environment and install the required libraries.

On Windows:

Open a terminal or command prompt and navigate to your project directory.
Activate the virtual environment. In Git Bash, use the following command:

source env/Scripts/activate

In Command Prompt, run env\Scripts\activate instead.

Once activated, the terminal prompt will likely change to indicate you're working within the environment.
Install the required libraries listed in the requirements.txt file using:

pip install -r requirements.txt

Project Flow: Data Extraction and Analysis

This section outlines the steps involved in extracting cricket ball-by-ball data, processing it, and generating relevant statistics. It assumes the virtual environment and required libraries are set up as detailed in the "Setting Up the Environment" section.

Data Acquisition

Download Ball-by-Ball Data: Download the data from https://cricsheet.org/ in YAML format. This data will be used for subsequent processing.

Directory Structure

Create Folders: Organize your project with the following directory structure in the root directory:
- tests: Stores the downloaded ball-by-ball data in YAML format (one file per match).
- processed: Holds the intermediate processed data for each match (four files per match).
- teamwise: Contains compiled data for all matches across teams.
- stats: Houses additional statistics derived from the processed data.
- results (New Folder): Stores the output generated by analysis scripts.
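
The folders listed above can also be created from Python. The following is a small sketch, with a hypothetical helper name (create_project_folders) that is not part of the project; it only uses the folder names given in the list.

```python
from pathlib import Path

# Folder names taken from the directory structure described above.
PROJECT_FOLDERS = ("tests", "processed", "teamwise", "stats", "results")


def create_project_folders(root="."):
    """Create each project folder under root, leaving existing folders untouched."""
    created = []
    for name in PROJECT_FOLDERS:
        folder = Path(root) / name
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

Calling create_project_folders(".") from the project root creates any folders that are missing and is safe to run more than once.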

Script Execution Flow

The following scripts work together to process the data and generate statistics:

extract.py: Parses a YAML file containing ball-by-ball data for a single match and extracts the required data.
process.py: Acts as the driver for extract.py. It iterates over files in the tests folder, processes each match, handles errors, and logs results. Output is placed in the processed folder.
compile.py: Compiles data from all matches into a consolidated format for each team. It also generates a general_stats.log file summarizing the data. Output goes into the teamwise folder.
realise_pib_qib.py (Optional): Calculates additional data required for building specific network models (PIB and QIB). This script takes data from the teamwise folder as input and writes the results to the stats folder.
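
To make the extraction and driver steps more concrete, here is a hedged sketch of how they might fit together. It is not the code in extract.py or process.py. It assumes the Cricsheet YAML layout in which each innings holds a list of deliveries keyed by ball number, each with batsman, bowler, and runs fields and an optional wicket mapping, and it assumes PyYAML is installed for the driver. The function names, the set of dismissal kinds excluded from the bowler's tally, and the sample match are illustrative assumptions.

```python
import logging
from collections import defaultdict
from pathlib import Path

# Dismissal kinds not credited to the bowler (an assumption of this sketch).
NOT_BOWLER_WICKETS = {"run out", "retired hurt", "obstructing the field"}


def extract_matchups(match):
    """Aggregate runs and dismissals per (batsman, bowler) pair for one parsed match."""
    stats = defaultdict(lambda: {"runs": 0, "dismissals": 0})
    for innings in match.get("innings", []):
        for details in innings.values():          # e.g. {"1st innings": {...}}
            for delivery in details.get("deliveries", []):
                for ball in delivery.values():    # e.g. {0.1: {...}}
                    pair = stats[(ball["batsman"], ball["bowler"])]
                    pair["runs"] += ball["runs"]["batsman"]
                    wicket = ball.get("wicket")
                    if (isinstance(wicket, dict)
                            and wicket.get("player_out") == ball["batsman"]
                            and wicket.get("kind") not in NOT_BOWLER_WICKETS):
                        pair["dismissals"] += 1
    return dict(stats)


def process_folder(source="tests"):
    """Parse every YAML file in source, logging failures instead of stopping."""
    import yaml  # PyYAML; imported here so extract_matchups works without it

    results = {}
    for path in sorted(Path(source).glob("*.yaml")):
        try:
            with path.open(encoding="utf-8") as handle:
                results[path.name] = extract_matchups(yaml.safe_load(handle))
        except Exception:
            logging.exception("Failed to process %s", path.name)
    return results


# Hypothetical two-ball match in the assumed layout.
sample_match = {"innings": [{"1st innings": {"team": "Team A", "deliveries": [
    {0.1: {"batsman": "batsman_A", "bowler": "bowler_X",
           "runs": {"batsman": 4, "extras": 0, "total": 4}}},
    {0.2: {"batsman": "batsman_A", "bowler": "bowler_X",
           "runs": {"batsman": 0, "extras": 0, "total": 0},
           "wicket": {"kind": "bowled", "player_out": "batsman_A"}}},
]}}]}
matchups = extract_matchups(sample_match)
```

Keeping the parsing logic separate from the file loop means a malformed file only produces a log entry, and the per-match function can be checked against a small in-memory example like sample_match.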

Data Analysis

Once the data is extracted and processed, you can perform further analysis using the provided Python scripts in the project directory. These scripts typically take data from the processed, teamwise, or stats folders and generate visualizations or additional statistics stored in the results folder.

Here's an overview of some analysis scripts:

batting_analysis.py: Analyzes batting performance metrics.
bowling_analysis.py: Analyzes bowling performance metrics.
pib_qib_convergence_plot.py: (Optional) Creates convergence plots for PIB and QIB networks (if applicable).
linear_regression_batting.py: Performs linear regression analysis on batting data.
linear_regression_bowling.py: Performs linear regression analysis on bowling data.
batsmen_subgraph.py: (Optional) Creates a subgraph focusing on specific batsmen (if applicable).

Note: The specific analysis performed by each script depends on the project's goals. Refer to the script documentation (if available) for detailed information.
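
As a rough illustration of the regression step, the sketch below fits a straight line by ordinary least squares. The README does not state which variables the regression scripts use, so pairing a network score with a batting average is an assumption, and the data and the function name fit_line are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: career batting average vs. a network-based score.
batting_average = [20.0, 25.0, 30.0, 35.0, 40.0]
network_score = [0.05, 0.06, 0.07, 0.08, 0.09]
slope, intercept = fit_line(batting_average, network_score)
```

A positive slope here would suggest that the network score tends to rise with the conventional average, which is one way to sanity-check a network-based ranking against familiar statistics.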

The results generated by these scripts will be saved in the results folder. You can then explore these results to gain insights into player performance, team dynamics, and other aspects of cricket data.

By following these steps and understanding what each script does, you can extract, process, and analyze cricket data for your project.
