This project scrapes publication authors from the Agricultural University of Georgia research publications page and builds an interactive co-authorship network using Python, NetworkX, and PyVis.
- 🔍 Web Scraper
  Uses Selenium to scrape authors from the university's research publications page.
- 📄 Author Extraction
  Extracts co-authors from publication text using regex and stores them in a text file.
- 🔗 Co-Authorship Graph
  Builds an undirected graph where nodes are authors and edges represent co-authorships.
- 🌐 Interactive Visualization
  Generates:
  - A main co-authorship network as an HTML file.
  - Individual co-author graphs for each author.
  - A dropdown in the main graph to easily open individual author graphs.
- 🎨 Edge Weight Coloring
  Edge color intensity represents the strength (number) of co-authorships.
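
To illustrate the edge weight coloring idea, here is a minimal sketch of mapping a co-authorship count to a color intensity. The helper name `weight_to_hex`, the base color, and the 0.2 floor are assumptions made for this example; the project's own script may compute colors differently.

```python
def weight_to_hex(weight, max_weight, base_rgb=(31, 119, 180)):
    """Blend from white toward base_rgb as the weight grows.

    Hypothetical helper: names and color values are illustrative only.
    """
    if max_weight <= 0:
        raise ValueError("max_weight must be positive")
    # Clamp the ratio to [0.2, 1.0] so the weakest edges stay visible.
    ratio = max(0.2, min(1.0, weight / max_weight))
    # Move each channel from 255 (white) toward the base color.
    channels = [round(255 - (255 - c) * ratio) for c in base_rgb]
    return "#{:02x}{:02x}{:02x}".format(*channels)


if __name__ == "__main__":
    for w in (1, 3, 5):
        print(w, weight_to_hex(w, max_weight=5))
```

The resulting hex string could then be passed as the `color` option when adding an edge in PyVis, with the strongest co-authorship drawn in the full base color.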
your_project/
├── scrape.py
├── agruni_coauthorship.py
├── driver/
│ └── chromedriver
├── co_authorship.txt # Output of scraper
├── agruni_coauthorship_graph.html
├── individual_author_graphs/
│ └── *_graph.html
This project requires ChromeDriver to run the scraper (scrape.py).
Please download the appropriate version of ChromeDriver for your OS and Chrome browser version, then place it in the driver/ directory or update the path in the script accordingly.
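
As a rough sketch of the setup above, the snippet below checks that the driver binary is where the scraper expects it before launching Chrome. The function names `resolve_driver_path` and `launch_chrome` are hypothetical and not taken from scrape.py; the Selenium calls assume Selenium 4.

```python
from pathlib import Path


def resolve_driver_path(project_dir, filename="chromedriver"):
    """Return the path to driver/<filename>, or raise if it is missing.

    Hypothetical helper. On Windows the binary is usually chromedriver.exe.
    """
    candidate = Path(project_dir) / "driver" / filename
    if not candidate.is_file():
        raise FileNotFoundError(f"ChromeDriver not found at {candidate}")
    return candidate


def launch_chrome(driver_path):
    """Start Chrome using the given ChromeDriver binary (Selenium 4 style)."""
    # Imported here so the path check above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    return webdriver.Chrome(service=Service(executable_path=str(driver_path)))


if __name__ == "__main__":
    path = resolve_driver_path(Path(__file__).parent)
    browser = launch_chrome(path)
    browser.quit()
```

Failing early with a clear message tends to be easier to debug than a Selenium startup error caused by a missing or misplaced driver.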
1️⃣ Run scrape.py
- Launches Chrome via Selenium.
- Scrapes publication data.
- Extracts author names and saves them line by line to co_authorship.txt.
Note: The raw output may need slight manual cleaning because typos on the source website can break the expected pattern and affect how all authors are separated.
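
The extraction step can be pictured with a small regex sketch. It assumes the author field separates names with commas, semicolons, `&`, or the word "and"; the actual pattern in scrape.py is not shown here and may differ. The name `split_authors` is hypothetical.

```python
import re

# Assumed separators between author names: ",", ";", "&", or the word "and".
_SEPARATORS = re.compile(r"\s*(?:,|;|&|\band\b)\s*")


def split_authors(author_field):
    """Split one publication's author field into a list of clean names."""
    parts = _SEPARATORS.split(author_field)
    # Collapse repeated whitespace and drop empty fragments.
    return [re.sub(r"\s+", " ", p).strip() for p in parts if p.strip()]


if __name__ == "__main__":
    print(split_authors("A. Smith, B. Jones and C. Lee"))
```

A pattern like this also shows why manual cleaning may be needed: a missing comma or a name written as "Last, First" on the source page would be split in the wrong place.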
2️⃣ Run agruni_coauthorship.py
- Reads authors.txt (the cleaned .txt file).
- Creates a graph with:
  - Nodes = Authors.
  - Edges = Co-authorships.
- Builds the interactive co-authorship network and saves it as agruni_coauthorship_graph.html.
- Generates separate interactive graphs for each author under individual_author_graphs/.
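
The graph-building step can be sketched as follows. This example assumes the authors of each publication are already available as a list of names; how agruni_coauthorship.py actually groups the lines of the text file is not specified here. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def count_coauthor_pairs(publications):
    """Count how often each pair of authors appears on the same publication.

    `publications` is an iterable of author-name lists (assumed input shape).
    """
    pair_counts = Counter()
    for authors in publications:
        # Sort and deduplicate so (A, B) and (B, A) map to the same key.
        unique = sorted(set(authors))
        for a, b in combinations(unique, 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def to_networkx(pair_counts):
    """Turn pair counts into an undirected, weighted NetworkX graph."""
    import networkx as nx  # third-party; imported lazily

    graph = nx.Graph()
    for (a, b), weight in pair_counts.items():
        graph.add_edge(a, b, weight=weight)
    return graph


if __name__ == "__main__":
    pubs = [["A", "B", "C"], ["B", "A"]]
    print(count_coauthor_pairs(pubs))
```

Storing the count as the `weight` edge attribute keeps the number of shared publications available for later styling, such as edge color intensity.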
3️⃣ View the Graphs
- Open agruni_coauthorship_graph.html in a browser.
- Use the dropdown to view individual author co-authorship networks.
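
For readers curious how such a dropdown might be produced, below is one possible sketch that generates a `<select>` element linking to the per-author files. The file-naming rule in `author_to_filename` and both function names are assumptions for illustration; the project may name files and inject the dropdown differently.

```python
import html
import re


def author_to_filename(author):
    """Map an author name to '<slug>_graph.html' (assumed naming rule)."""
    slug = re.sub(r"[^\w]+", "_", author).strip("_")
    return f"{slug}_graph.html"


def build_dropdown(authors, folder="individual_author_graphs"):
    """Return HTML for a dropdown that opens an author's graph in a new tab."""
    options = ['<option value="">Select an author</option>']
    for author in sorted(authors):
        href = f"{folder}/{author_to_filename(author)}"
        options.append(
            f'<option value="{html.escape(href)}">{html.escape(author)}</option>'
        )
    return (
        '<select onchange="if (this.value) window.open(this.value, \'_blank\');">\n'
        + "\n".join(options)
        + "\n</select>"
    )


if __name__ == "__main__":
    print(build_dropdown(["B. Jones", "A. Smith"]))
```

The generated markup could be inserted into the main graph's HTML file after PyVis writes it.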
Python Packages
- selenium
- networkx
- pyvis
- matplotlib