This project uses Python's DB-API to build a reporting tool that queries a database to analyse the web server log of a news website.
- Install Python
- Install VirtualBox
- Install and set up Vagrant
- Start up the Virtual Machine after configuring Vagrant
- Download the data from here
- Unzip the downloaded file to extract a file called newsdata.sql
- Put newsdata.sql into the vagrant folder
- Log in to the VM using
vagrant ssh
- Load the site's data into your local database using the command
- Run
psql news
- Create the following views in the database; logs.py queries them when it runs:
A view for the number of bad requests each day
create view bad_requests
as select date(time) as day, count(status) as requests
from log where status != '200 OK'
group by day
order by requests desc;
A view for the number of total requests each day
create view total_requests
as select date(time) as day, count(status) as requests
from log
group by day
order by requests desc;
- Run logs.py to see the result of the queries
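
As a rough illustration of how a DB-API script could combine the two views, here is a minimal sketch. It is not the actual logs.py. To stay self-contained it uses Python's built-in sqlite3 module and an in-memory database as a stand-in for the PostgreSQL news database. The sample rows, the status strings, the `daily_error_rates` helper, and the idea of joining the views to get a daily percentage of bad requests are all assumptions made for the example.

```python
import sqlite3

# Stand-in for the PostgreSQL "news" database: an in-memory SQLite database
# with a tiny, made-up log table (only the two columns the views use).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table log (time text, status text)")
cur.executemany(
    "insert into log (time, status) values (?, ?)",
    [
        ("2016-07-01 08:00:00", "200 OK"),
        ("2016-07-01 09:30:00", "200 OK"),
        ("2016-07-01 10:15:00", "200 OK"),
        ("2016-07-01 11:45:00", "404 NOT FOUND"),
        ("2016-07-02 08:00:00", "200 OK"),
        ("2016-07-02 09:00:00", "404 NOT FOUND"),
    ],
)

# The two views, written in the same shape as in the steps above.
cur.execute("""
    create view bad_requests as
    select date(time) as day, count(status) as requests
    from log where status != '200 OK'
    group by day
    order by requests desc
""")
cur.execute("""
    create view total_requests as
    select date(time) as day, count(status) as requests
    from log
    group by day
    order by requests desc
""")


def daily_error_rates(cursor):
    """Hypothetical helper: percentage of bad requests for each day.

    Days without any bad request are left out, because the inner join
    only keeps days that appear in both views.
    """
    cursor.execute("""
        select total_requests.day,
               100.0 * bad_requests.requests / total_requests.requests
        from total_requests
        join bad_requests on bad_requests.day = total_requests.day
        order by total_requests.day
    """)
    return cursor.fetchall()


rates = daily_error_rates(cur)
for day, percent in rates:
    print(f"{day}: {percent:.1f}% bad requests")

conn.close()
```

Against the real database, the same cursor pattern (connect, execute, fetchall, close) would apply with a PostgreSQL DB-API driver in place of sqlite3; only the connection call would differ.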