This repository was archived by the owner on Feb 2, 2022. It is now read-only.

Local cache of scraped data #95

@mscarey

Description

Issue Type

  • Feature request

Current Behavior

Unless I'm mistaken, running scrAPD from the command line multiple times re-scrapes the entire APD news site on each run.

Expected Behavior

scrAPD could cache the results of the scrape locally when it's run once. Then, by default on subsequent runs, scrAPD could construct its output from the local cache, and then hit the APD site only to scrape reports newer than the newest record in the cache.

If I'm mistaken and it's possible to use a cache already, then my request would be to update that part of the documentation.

Possible Solution

The main reason for the feature request is to reduce the number of calls to the APD site. Even if the traffic isn't overwhelming for them, it seems like a better practice to have the ability to control it with caching.

My best idea for implementing a cache would be for the CLI to create a SQLite database in a local directory that's not under version control. (This may sound hypocritical coming from me, but I don't want to put too much more personally identifiable information on github.) So, maybe using SQLAlchemy?
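To make the idea concrete, here is a minimal sketch of such a cache using the stdlib `sqlite3` module (SQLAlchemy would work the same way, just with a model class instead of raw SQL). The table name, column names, and case-ID format are hypothetical, not anything scrAPD currently uses:

```python
import sqlite3

def open_cache(path=":memory:"):
    # Hypothetical schema: one row per APD fatality report.
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS reports (
               case_id TEXT PRIMARY KEY,
               crash_date TEXT NOT NULL,  -- ISO 8601, so text sorts chronologically
               raw_html TEXT              -- optional: full report text
           )"""
    )
    return conn

def newest_cached_date(conn):
    # Subsequent runs would only scrape reports newer than this date.
    row = conn.execute("SELECT MAX(crash_date) FROM reports").fetchone()
    return row[0]  # None when the cache is empty

def cache_report(conn, case_id, crash_date, raw_html=""):
    # INSERT OR REPLACE lets a re-scrape overwrite a stale copy of a report.
    conn.execute(
        "INSERT OR REPLACE INTO reports VALUES (?, ?, ?)",
        (case_id, crash_date, raw_html),
    )
    conn.commit()
```

On each run the CLI would call `newest_cached_date()` first and stop paging through the APD news site once it reaches reports at or before that date. Storing the database under something like `~/.scrapd/` (rather than the working tree) keeps it out of version control by default.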

I'm not sure whether it would be better to cache just the output data, or to cache the entire text of each police report. If it's the latter, then when you re-ran the CLI you'd need options to (1) reuse the output data you already have, (2) re-parse the police reports in the local cache, or (3) re-download the police reports and then parse them.
