AdOps Shield: Ads.txt Validator & Analyzer

A production-ready Streamlit analytics toolkit for validating, auditing, and operationalizing ads.txt and app-ads.txt supply-path declarations at scale.

Important

This tool focuses on syntax-level and structural validation of ads.txt and app-ads.txt records aligned with common IAB-style formatting expectations (Domain, Publisher ID, Account Type, optional Certification ID).

Features

Multi-source ingestion pipeline:
- Accepts user-provided domains/URLs.
- Accepts uploaded local .txt files.
Smart URL normalization:
- Adds https:// when protocol is missing.
- Appends /app-ads.txt when a raw domain/root URL is provided.
Parser for canonical record format:
- Extracts Domain, Publisher_ID, Account_Type, Certification_ID, and trailing inline comments.
IAB-style syntax validation checks:
- Detects insufficient comma-separated fields (minimum 3 required).
- Detects invalid account types (must be DIRECT or RESELLER).
Error observability workflow:
- Captures line-level parse errors with original line content and reason.
- Displays errors as a navigable table in the UI.
Analytics-ready data model:
- Creates a structured pandas.DataFrame for valid records.
- Generates normalized values (e.g., lowercase domains, uppercase account types).
Operational dashboarding:
- KPI cards for total valid rows, unique partners, DIRECT, and RESELLER counts.
- Pie chart for account type distribution.
- Horizontal bar chart for top supply partners.
Data explorer and export:
- Full-table browser with text search by domain or publisher ID.
- One-click CSV export of the filtered view.
Lightweight architecture:
- Clear separation of concerns between parsing logic (adops_logic.py) and UI (app.py).

Tip

Use this as a first-line QA utility in AdOps workflows before reconciling seller declarations against deeper business rules or external partner allowlists.

Tech Stack & Architecture

Core Stack

Language: Python 3.10+
Web UI: Streamlit
Data Processing: Pandas
Charting: Plotly Express
HTTP Client: Requests

Project Structure

Ads.txt-Validator-Analyzer/
├── app.py                 # Streamlit interface and dashboard orchestration
├── adops_logic.py         # Fetching, parsing, validation, and stats engine
├── requirements.txt       # Runtime dependencies
├── LICENSE                # Apache-2.0 license text
├── README.md              # Project documentation
└── .github/
    └── FUNDING.yml        # Sponsorship metadata

Key Design Decisions

Separation of concerns
- app.py owns interaction and visualization.
- adops_logic.py owns data retrieval, validation, and parsing.
- This keeps parser logic reusable for CLI, API, or batch integrations.
Fail-soft parsing strategy
- Invalid lines are collected into an error list instead of terminating parsing.
- Valid lines still produce analytics and exportable datasets.
Schema-first normalization
- Parser emits a fixed column model for compatibility with BI workflows.
- Account types are normalized to uppercase and domains to lowercase.
Operational UI ergonomics
- Sidebar input for source selection.
- In-page KPI + charts + searchable table for quick triage.
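
The fail-soft strategy described above can be sketched as a loop that records problems instead of raising. This is an illustrative outline only, not the code in adops_logic.py: the function names, the error-dict keys (`Line`, `Content`, `Error`), and the choice to skip blank and comment-only lines silently are all assumptions.

```python
from typing import Callable, Optional


def parse_fail_soft(text: str, check: Callable[[str], Optional[str]]):
    """Collect valid lines and line-level errors without aborting.

    `check` returns None for an acceptable line, or a reason string.
    """
    valid, errors = [], []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # assumption: blank/comment-only lines are not errors
        reason = check(line)
        if reason is None:
            valid.append(line)
        else:
            # Record the problem and keep going instead of raising.
            errors.append({"Line": line_no, "Content": raw, "Error": reason})
    return valid, errors


def too_few_fields(line: str) -> Optional[str]:
    """Example check: require at least three comma-separated fields."""
    if line.count(",") >= 2:
        return None
    return "Insufficient parameters (minimum 3 required)"
```

Because the check is passed in as a callable, the same loop could serve a CLI or batch job with different rule sets.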

flowchart TD
    A[User Input] --> B{Source Type}
    B -->|URL/Domain| C[fetch_from_url]
    B -->|Text File| D[Read Uploaded Content]
    C --> E[Raw ads.txt/app-ads.txt Content]
    D --> E
    E --> F[parse_content]
    F --> G[Valid Records DataFrame]
    F --> H[Line-level Validation Errors]
    G --> I[get_stats]
    G --> J[Plotly Visualizations]
    G --> K[Search + CSV Export]
    H --> L[Error Table in UI]
    I --> M[KPI Metrics]
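
As a rough illustration of the `get_stats` step in the flow above, the KPI values could be derived as follows. The project works on a pandas.DataFrame; this sketch uses plain dictionaries so it runs without dependencies. The function name `summarize`, the result keys, and the reading of "unique partners" as distinct `Domain` values are assumptions.

```python
from collections import Counter


def summarize(records):
    """Compute dashboard-style KPIs from a list of valid record dicts."""
    types = Counter(r["Account_Type"] for r in records)
    return {
        "total_valid": len(records),
        # Assumption: a "partner" is identified by its Domain value.
        "unique_partners": len({r["Domain"] for r in records}),
        "direct": types["DIRECT"],
        "reseller": types["RESELLER"],
    }
```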

Note

The current parser implements practical syntax checks and normalization. It is intentionally lightweight and does not implement every possible ads.txt semantic enforcement rule.

Getting Started

Prerequisites

Python 3.10 or newer
pip (latest recommended)
Internet connectivity (for URL-based fetching)

Installation

Clone the repository:

git clone https://github.com/your-username/ads.txt-validator-analyzer.git
cd ads.txt-validator-analyzer

Create and activate a virtual environment:

python -m venv .venv
source .venv/bin/activate

Install dependencies:

pip install --upgrade pip
pip install -r requirements.txt

Run the application:

streamlit run app.py

Open the local URL shown by Streamlit (typically http://localhost:8501).

Testing

This repository does not currently ship an automated test suite. The checks below confirm that the modules compile and that the parser behaves as expected on sample input.

Syntax check (byte-compiles both modules; imports are not executed):

python -m compileall app.py adops_logic.py

Manual validation in the Streamlit interface:

streamlit run app.py

Optional parser smoke test, run from a shell via a heredoc:

python - <<'PY'
from adops_logic import AdsTxtParser

sample = """
google.com, pub-123, DIRECT, f08c47fec0942fa0
invalid.example, pub-456, WRONG
# comment-only line
"""

parser = AdsTxtParser()
df, errors = parser.parse_content(sample)
print(df)
print(errors)
print(parser.get_stats(df))
PY

Warning

If you add CI later, include deterministic tests for parser edge cases (comments, malformed rows, mixed casing, and missing optional fields) to prevent silent regressions.

Deployment

Production Run (Single-Instance)

Use Streamlit’s standard runtime with explicit host/port settings:

streamlit run app.py --server.address 0.0.0.0 --server.port 8501

Containerization (Recommended)

The repository does not include a Dockerfile. After adding your own, build and run the image:

docker build -t adops-shield:latest .
docker run --rm -p 8501:8501 adops-shield:latest

CI/CD Integration Guidelines

In your CI pipeline, add stages for:

Dependency installation (pip install -r requirements.txt)
Static/syntax checks (python -m compileall ...)
Optional parser smoke tests
Container image build and push

Caution

For internet-facing deployments, place this app behind TLS termination and standard reverse-proxy protections. Because URL fetching acts on user-supplied input, apply request limits and timeouts and monitor outbound requests.
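
One possible first-line guard for user-supplied URLs is sketched below. It is not part of the project: the function name `is_fetchable` is hypothetical, and the check only rejects non-HTTP schemes, `localhost`, and IP literals that are not globally routable. It does not resolve DNS, so a hostname that points at an internal address would still pass.

```python
import ipaddress
from urllib.parse import urlsplit


def is_fetchable(url: str) -> bool:
    """Reject URLs that obviously target local or internal hosts."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = parts.hostname  # lowercased by urlsplit; None if missing
    if not host or host == "localhost":
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return True  # a DNS name, not an IP literal; not resolved here
    # Loopback, private, and link-local addresses are not global.
    return ip.is_global
```

A check like this would run after URL normalization and before the HTTP request is made.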

Usage

Launch the UI

streamlit run app.py

Analyze from URL

Open the sidebar and select Load from URL.
Enter either:
- a root domain (example.com), or
- a full path (https://example.com/ads.txt).
Click Fetch Data.
Inspect KPIs, chart insights, and syntax errors.
Export filtered records via Download CSV.

Analyze from File Upload

Select Upload File in the sidebar.
Upload an ads.txt/app-ads.txt-style .txt file.
Review parsed output and validation report.
Filter by domain or publisher ID and export to CSV.
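
The search and export steps can be approximated outside the UI. The sketch below assumes case-insensitive substring matching and uses the standard csv module rather than pandas; the function names and the exported column order are illustrative, not taken from app.py.

```python
import csv
import io

COLUMNS = ("Domain", "Publisher_ID", "Account_Type", "Certification_ID")


def filter_records(records, query):
    """Case-insensitive substring match on Domain or Publisher_ID."""
    q = query.strip().lower()
    if not q:
        return list(records)
    return [
        r for r in records
        if q in r["Domain"].lower() or q in r["Publisher_ID"].lower()
    ]


def to_csv(records, columns=COLUMNS):
    """Serialize records to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()
```

Calling `to_csv(filter_records(records, "google"))` would export only the rows that match the search term, mirroring the filtered-view export in the dashboard.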

Programmatic Parsing Example

from adops_logic import AdsTxtParser

raw_text = """
google.com, pub-0000000000000000, DIRECT, f08c47fec0942fa0
appnexus.com, 12345, RESELLER
invalid.com, account-1, PARTNER
"""

parser = AdsTxtParser()

# Parse content into structured rows + validation errors.
df, errors = parser.parse_content(raw_text)

# Compute aggregate stats for dashboards or downstream reporting.
stats = parser.get_stats(df)

print("Valid rows:")
print(df)
print("\nErrors:")
print(errors)
print("\nStats:")
print(stats)

Expected behavior:

The first two records are accepted as valid rows.
The third record is flagged due to invalid account type.
stats is calculated only from valid rows.

Configuration

Input and Fetch Behavior

| Option | Current Behavior | Notes |
| --- | --- | --- |
| URL protocol | Auto-prepends `https://` if missing | In `fetch_from_url` |
| Default path | Appends `/app-ads.txt` if URL does not end with `ads.txt`/`app-ads.txt` | Supports quick root-domain input |
| HTTP timeout | `10` seconds | Uses `requests.get(..., timeout=10)` |
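
The normalization rules in the table above can be sketched as a small pure function. This is not the actual `fetch_from_url` code: the helper name `normalize_ads_url` is hypothetical, and making the checks case-insensitive and trimming a trailing slash are assumptions.

```python
def normalize_ads_url(raw: str) -> str:
    """Hypothetical helper mirroring the documented normalization rules."""
    url = raw.strip()
    # Rule 1: prepend https:// when no protocol is given.
    if not url.lower().startswith(("http://", "https://")):
        url = "https://" + url
    # Rule 2: append /app-ads.txt unless the URL already ends with
    # ads.txt (a suffix check that also covers app-ads.txt).
    if not url.lower().endswith("ads.txt"):
        url = url.rstrip("/") + "/app-ads.txt"
    return url
```

Keeping normalization separate from the network call makes the rules easy to test without internet access.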

Validation Behavior

| Rule | Requirement | Failure Output |
| --- | --- | --- |
| Field count | At least 3 comma-separated fields | `Insufficient parameters (minimum 3 required)` |
| Account type | Must be `DIRECT` or `RESELLER` | `Invalid Account Type: ...` |
| Comment handling | Inline `#` comments stripped from parse payload | Preserved in `Comment` column |
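
To make the rules above concrete, here is a minimal single-line validator. It is a sketch under stated assumptions rather than the project's parser: the function name `validate_line` is hypothetical, the account type is compared case-insensitively, and the text after `Invalid Account Type:` is assumed to be the offending value as written.

```python
VALID_ACCOUNT_TYPES = {"DIRECT", "RESELLER"}


def validate_line(line: str):
    """Validate one ads.txt line.

    Returns (record, None) for a valid line, (None, reason) for an
    invalid one, and (None, None) for blank or comment-only lines.
    """
    # Split off the inline comment before parsing the payload.
    payload, _, comment = line.partition("#")
    payload = payload.strip()
    if not payload:
        return None, None
    fields = [f.strip() for f in payload.split(",")]
    if len(fields) < 3:
        return None, "Insufficient parameters (minimum 3 required)"
    account_type = fields[2].upper()  # assumption: case-insensitive match
    if account_type not in VALID_ACCOUNT_TYPES:
        return None, f"Invalid Account Type: {fields[2]}"
    record = {
        "Domain": fields[0].lower(),
        "Publisher_ID": fields[1],
        "Account_Type": account_type,
        "Certification_ID": fields[3] if len(fields) > 3 else "",
        "Comment": comment.strip(),
    }
    return record, None
```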

Environment Configuration

This project currently has no mandatory .env file. If you need environment-driven behavior, consider introducing the following pattern:

# Example optional runtime settings for future extension
APP_PORT=8501
APP_HOST=0.0.0.0
FETCH_TIMEOUT_SECONDS=10

# Example startup with explicit Streamlit flags
streamlit run app.py --server.address 0.0.0.0 --server.port 8501

Note

A future enhancement could externalize fetch timeout and default path behavior into typed config for reproducible multi-environment deployments.
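
One way such typed configuration might look is sketched below. Nothing here exists in the project today: the `Settings` class and `load_settings` function are hypothetical, and the variable names follow the optional example settings shown earlier in this section.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical typed runtime settings with documented defaults."""
    host: str = "0.0.0.0"
    port: int = 8501
    fetch_timeout_seconds: float = 10.0


def load_settings(env=None) -> Settings:
    """Build Settings from environment variables, falling back to defaults."""
    env = os.environ if env is None else env
    return Settings(
        host=env.get("APP_HOST", Settings.host),
        port=int(env.get("APP_PORT", Settings.port)),
        fetch_timeout_seconds=float(
            env.get("FETCH_TIMEOUT_SECONDS", Settings.fetch_timeout_seconds)
        ),
    )
```

Accepting the environment as a parameter keeps the loader testable; the host and port would still need to be passed to Streamlit through its own flags.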

License

This project is licensed under the Apache License 2.0. See LICENSE for full terms.

Contacts & Community Support

Support the Project

If you find this tool useful, consider leaving a star on GitHub or supporting the author directly.
