
Hadron/legda

legda

legda is a small asyncio-based data retrieval app for government documents.

Architecture

  • Document and DocumentInstance (legda/models.py) are SQLAlchemy models for persistent state.
  • RetrievalPlugin (legda/plugins/base.py) defines the plugin contract:
    • get_documents(datetime) -> list[Document]
    • get_document(Document, DocumentInstance) -> bytes
    • download_document(Document) -> list[DocumentInstance]
  • FederalRegisterRetriever (legda/plugins/federal_register.py) provides shared Federal Register retrieval logic.
  • ExecutiveOrderRetriever (legda/plugins/executive_orders.py) builds on FederalRegisterRetriever to retrieve executive orders.
  • OmbMemoRetriever (legda/plugins/omb_memos.py) scrapes OMB memoranda from the White House guidance index.
  • OpmFederalRegisterRetriever (legda/plugins/opm_federal_register.py) builds on FederalRegisterRetriever to retrieve OPM actions.
  • OpmMemoRetriever (legda/plugins/opm_memos.py) scrapes published OPM CHCOC memos.
  • CongressRetriever (legda/plugins/congress.py) discovers passed legislation and downloads bill XML from Congress.gov.
  • DiscoveryDriver (legda/drivers/discovery.py) runs all plugins and upserts discovered documents.
  • DownloadDriver (legda/drivers/download.py) fetches documents with status=to_fetch.
  • FetchDate (legda/models.py) stores per-plugin successful discovery timestamps.
  • Plugin registration lives in legda/plugins/__init__.py via a plugin-key to class map.
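
The plugin contract and the key-to-class registration described above can be sketched roughly as follows. This is a minimal illustration, not the code in the repository: the dataclasses stand in for the SQLAlchemy models, their fields are assumed, the methods are assumed to be coroutines because the app is asyncio-based, and ExamplePlugin, PLUGINS, and discover_all are hypothetical names.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import datetime, timezone


# Stand-ins for the SQLAlchemy models in legda/models.py (fields are assumed).
@dataclass
class Document:
    document_id: str
    document_type: str
    status: str = "to_fetch"


@dataclass
class DocumentInstance:
    document_id: str
    extension: str


class RetrievalPlugin(ABC):
    """Assumed shape of the plugin contract; coroutine methods are a guess."""

    @abstractmethod
    async def get_documents(self, since: datetime) -> list[Document]: ...

    @abstractmethod
    async def get_document(self, document: Document, instance: DocumentInstance) -> bytes: ...

    @abstractmethod
    async def download_document(self, document: Document) -> list[DocumentInstance]: ...


class ExamplePlugin(RetrievalPlugin):
    """Hypothetical plugin that returns canned data instead of calling a remote API."""

    async def get_documents(self, since):
        return [Document("example-1", "example")]

    async def get_document(self, document, instance):
        return b"<xml/>"

    async def download_document(self, document):
        return [DocumentInstance(document.document_id, "xml")]


# Hypothetical plugin-key to class map, in the spirit of legda/plugins/__init__.py.
PLUGINS: dict[str, type[RetrievalPlugin]] = {"example": ExamplePlugin}


async def discover_all(since: datetime) -> dict[str, list[Document]]:
    """Call get_documents on every registered plugin and group results by plugin key."""
    keys = list(PLUGINS)
    # Running the plugins concurrently is an illustrative choice, not a documented behavior.
    results = await asyncio.gather(*(PLUGINS[k]().get_documents(since) for k in keys))
    return dict(zip(keys, results))


discovered = asyncio.run(discover_all(datetime(2025, 1, 1, tzinfo=timezone.utc)))
```

With a map like this, adding a source would amount to writing one subclass and adding one entry to the map, which is presumably why registration is kept in a single place.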

Run

python3 -m pip install -r requirements.txt
cat > .env <<'EOF'
CONGRESS_API_KEY=your_key_here
EOF
python3 -m legda

This discovers documents and downloads them into:

  • data/<document_type>/<publication_date>-<document_id>.<extension>
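
As a rough illustration of the layout above, a path could be assembled like this. The helper name build_output_path, the ISO date format, and the sample values are assumptions for the example; the actual formatting in legda may differ.

```python
from datetime import date
from pathlib import Path


def build_output_path(
    document_type: str,
    publication_date: date,
    document_id: str,
    extension: str,
    root: Path = Path("data"),
) -> Path:
    """Build data/<document_type>/<publication_date>-<document_id>.<extension>."""
    # Assumption: the publication date is rendered as YYYY-MM-DD.
    filename = f"{publication_date.isoformat()}-{document_id}.{extension}"
    return root / document_type / filename


# Hypothetical sample values, not taken from the repository.
path = build_output_path("executive_orders", date(2025, 1, 20), "2025-01234", "pdf")
```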

Discovery resumes from each plugin's last successful fetch timestamp in fetch_dates; a plugin with no recorded timestamp starts from DEFAULT_DISCOVERY_SINCE in legda/constants.py.
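
The fallback can be pictured with a small sketch. Here a plain dict stands in for the fetch_dates table, the value given to DEFAULT_DISCOVERY_SINCE is a placeholder, and discovery_since is a hypothetical helper name.

```python
from datetime import datetime, timezone

# Placeholder value; the real constant lives in legda/constants.py.
DEFAULT_DISCOVERY_SINCE = datetime(2025, 1, 1, tzinfo=timezone.utc)


def discovery_since(fetch_dates: dict[str, datetime], plugin_key: str) -> datetime:
    """Return the plugin's last successful fetch time, or the default if none is recorded."""
    return fetch_dates.get(plugin_key, DEFAULT_DISCOVERY_SINCE)


# A dict standing in for rows of the fetch_dates table (plugin key -> timestamp).
fetch_dates = {"executive_orders": datetime(2025, 6, 1, tzinfo=timezone.utc)}

known = discovery_since(fetch_dates, "executive_orders")
new = discovery_since(fetch_dates, "congress")
```

Since the timestamps are described as recording successful discovery, a failed run would presumably be retried from the same starting point.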

About

Governmental Data Acquisition code:
