A production-ready Streamlit analytics toolkit for validating, auditing, and operationalizing ads.txt and app-ads.txt supply-path declarations at scale.
Important
This tool focuses on syntax-level and structural validation of ads.txt and app-ads.txt records aligned with common IAB-style formatting expectations (Domain, Publisher ID, Account Type, optional Certification ID).
- Features
- Tech Stack & Architecture
- Getting Started
- Testing
- Deployment
- Usage
- Configuration
- License
- Contacts & Community Support
- Multi-source ingestion pipeline:
  - Accepts user-provided domains/URLs.
  - Accepts uploaded local `.txt` files.
- Smart URL normalization:
  - Adds `https://` when protocol is missing.
  - Appends `/app-ads.txt` when a raw domain/root URL is provided.
- Parser for canonical record format:
  - Extracts `Domain`, `Publisher_ID`, `Account_Type`, `Certification_ID`, and trailing inline comments.
- IAB-style syntax validation checks:
  - Detects insufficient comma-separated fields (minimum 3 required).
  - Detects invalid account types (must be `DIRECT` or `RESELLER`).
- Error observability workflow:
  - Captures line-level parse errors with original line content and reason.
  - Displays errors as a navigable table in the UI.
- Analytics-ready data model:
  - Creates a structured `pandas.DataFrame` for valid records.
  - Generates normalized values (e.g., lowercase domains, uppercase account types).
- Operational dashboarding:
  - KPI cards for total valid rows, unique partners, `DIRECT`, and `RESELLER` counts.
  - Pie chart for account type distribution.
  - Horizontal bar chart for top supply partners.
- Data explorer and export:
  - Full-table browser with text search by domain or publisher ID.
  - One-click CSV export of the filtered view.
- Lightweight architecture:
  - Clear separation of concerns between parsing logic (`adops_logic.py`) and UI (`app.py`).
Tip
Use this as a first-line QA utility in AdOps workflows before reconciling seller declarations against deeper business rules or external partner allowlists.
- Language: Python 3.10+
- Web UI: Streamlit
- Data Processing: Pandas
- Charting: Plotly Express
- HTTP Client: Requests
Ads.txt-Validator-Analyzer/
├── app.py # Streamlit interface and dashboard orchestration
├── adops_logic.py # Fetching, parsing, validation, and stats engine
├── requirements.txt # Runtime dependencies
├── LICENSE # Apache-2.0 license text
├── README.md # Project documentation
└── .github/
    └── FUNDING.yml # Sponsorship metadata
- Separation of concerns
  - `app.py` owns interaction and visualization.
  - `adops_logic.py` owns data retrieval, validation, and parsing.
  - This keeps parser logic reusable for CLI, API, or batch integrations.
- Fail-soft parsing strategy
  - Invalid lines are collected into an error list instead of terminating parsing.
  - Valid lines still produce analytics and exportable datasets.
- Schema-first normalization
  - Parser emits a fixed column model for compatibility with BI workflows.
  - Account types are normalized to uppercase and domains to lowercase.
- Operational UI ergonomics
  - Sidebar input for source selection.
  - In-page KPI + charts + searchable table for quick triage.
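The fail-soft and schema-first principles can be sketched in plain Python. This is a minimal illustration, not the code in `adops_logic.py`: the function name `parse_lines`, the `ParseError` tuple, and the dict-based rows are assumptions made for the example. Only the column names, the minimum-field rule, and the `DIRECT`/`RESELLER` check come from this README.

```python
from typing import NamedTuple

# Fixed column model: every valid row carries exactly these keys.
COLUMNS = ("Domain", "Publisher_ID", "Account_Type", "Certification_ID", "Comment")
VALID_ACCOUNT_TYPES = {"DIRECT", "RESELLER"}


class ParseError(NamedTuple):
    line_number: int
    content: str
    reason: str


def parse_lines(text: str) -> tuple[list[dict[str, str]], list[ParseError]]:
    """Hypothetical fail-soft parser: bad lines are recorded, never raised."""
    rows: list[dict[str, str]] = []
    errors: list[ParseError] = []
    for number, raw in enumerate(text.splitlines(), start=1):
        # Split off an inline comment; skip blank and comment-only lines.
        payload, _, comment = raw.partition("#")
        payload = payload.strip()
        if not payload:
            continue
        fields = [field.strip() for field in payload.split(",")]
        if len(fields) < 3:
            errors.append(ParseError(number, raw, "Insufficient parameters (minimum 3 required)"))
            continue
        account_type = fields[2].upper()
        if account_type not in VALID_ACCOUNT_TYPES:
            errors.append(ParseError(number, raw, f"Invalid Account Type: {fields[2]}"))
            continue
        # Normalize casing and fill the optional certification ID with "".
        values = (
            fields[0].lower(),
            fields[1],
            account_type,
            fields[3] if len(fields) > 3 else "",
            comment.strip(),
        )
        rows.append(dict(zip(COLUMNS, values)))
    return rows, errors


sample = """Google.com, pub-123, direct, f08c47fec0942fa0 # primary
invalid.example, pub-456, WRONG
broken-line
# comment-only line
"""
rows, errors = parse_lines(sample)
print(rows)
print(errors)
```

Because errors are accumulated rather than raised, one malformed line never hides the valid records around it, and the fixed key set means downstream consumers can rely on every column being present.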
flowchart TD
A[User Input] --> B{Source Type}
B -->|URL/Domain| C[fetch_from_url]
B -->|Text File| D[Read Uploaded Content]
C --> E[Raw ads.txt/app-ads.txt Content]
D --> E
E --> F[parse_content]
F --> G[Valid Records DataFrame]
F --> H[Line-level Validation Errors]
G --> I[get_stats]
G --> J[Plotly Visualizations]
G --> K[Search + CSV Export]
H --> L[Error Table in UI]
I --> M[KPI Metrics]
Note
The current parser implements practical syntax checks and normalization. It is intentionally lightweight and does not implement every possible ads.txt semantic enforcement rule.
- Python 3.10 or newer
- `pip` (latest recommended)
- Internet connectivity (for URL-based fetching)
- Clone the repository:
git clone https://github.com/your-username/ads.txt-validator-analyzer.git
cd ads.txt-validator-analyzer
- Create and activate a virtual environment:
python -m venv .venv
source .venv/bin/activate
- Install dependencies:
pip install --upgrade pip
pip install -r requirements.txt
- Run the application:
streamlit run app.py
- Open the local URL shown by Streamlit (typically `http://localhost:8501`).
This repository currently does not ship with a formal automated test suite, but you can run the following quality checks to validate runtime health and parser behavior.
- Syntax checks (byte-compiles both modules without running them):
python -m compileall app.py adops_logic.py
- Manual validation in the Streamlit interface:
streamlit run app.py
- Optional parser smoke test (Python fed through a shell heredoc):
python - <<'PY'
from adops_logic import AdsTxtParser
sample = """
google.com, pub-123, DIRECT, f08c47fec0942fa0
invalid.example, pub-456, WRONG
# comment-only line
"""
parser = AdsTxtParser()
df, errors = parser.parse_content(sample)
print(df)
print(errors)
print(parser.get_stats(df))
PY
Warning
If you add CI later, include deterministic tests for parser edge cases (comments, malformed rows, mixed casing, and missing optional fields) to prevent silent regressions.
Use Streamlit’s standard runtime with explicit host/port settings:
streamlit run app.py --server.address 0.0.0.0 --server.port 8501
A minimal Docker deployment command (after adding your own Dockerfile):
docker build -t adops-shield:latest .
docker run --rm -p 8501:8501 adops-shield:latest
In your CI pipeline, add stages for:
- Dependency installation (`pip install -r requirements.txt`)
- Static/syntax checks (`python -m compileall ...`)
- Optional parser smoke tests
- Container image build and push
Caution
For internet-facing deployments, place this app behind TLS termination and standard reverse-proxy protections; URL fetching accepts external input and should be monitored with request limits/timeouts.
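One way to reduce the risk from user-supplied URLs is to screen them before any request is made. The sketch below is an assumption-laden illustration using only the standard library; `is_fetchable` is a hypothetical helper and is not part of the existing `fetch_from_url`. It checks the scheme and rejects IP literals in non-public ranges, but it does not resolve DNS, so a hostname that points at an internal address would still pass.

```python
import ipaddress
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def is_fetchable(url: str) -> bool:
    """Hypothetical pre-fetch guard; not part of the existing fetch_from_url."""
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:
        # Malformed URLs (for example, broken IPv6 brackets) are rejected.
        return False
    if parts.scheme not in ALLOWED_SCHEMES or not host:
        return False
    if host == "localhost":
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal, so treat it as a regular hostname (DNS is not checked).
        return True
    # Reject loopback, private, link-local, and other non-public literals.
    return address.is_global


for candidate in (
    "https://example.com/ads.txt",
    "http://127.0.0.1/ads.txt",
    "https://10.0.0.5/app-ads.txt",
    "ftp://example.com/ads.txt",
):
    print(candidate, is_fetchable(candidate))
```

A guard like this complements, rather than replaces, the reverse-proxy protections and request timeouts mentioned above.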
streamlit run app.py
- Open the sidebar and select Load from URL.
- Enter either:
  - a root domain (`example.com`), or
  - a full path (`https://example.com/ads.txt`).
- Click Fetch Data.
- Inspect KPIs, chart insights, and syntax errors.
- Export filtered records via Download CSV.

- Select Upload File in the sidebar.
- Upload an `ads.txt`/`app-ads.txt`-style `.txt` file.
- Review parsed output and validation report.
- Filter by domain or publisher ID and export to CSV.
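The filter-then-export step can be approximated outside the UI. The following is a rough stand-in that works on plain dictionaries with the standard library; the app itself operates on a DataFrame inside Streamlit, and this README does not show that code. The helper names `filter_rows` and `to_csv`, and the choice of case-insensitive substring matching, are assumptions.

```python
import csv
import io

FIELDNAMES = ["Domain", "Publisher_ID", "Account_Type", "Certification_ID", "Comment"]


def filter_rows(rows, query):
    """Keep rows whose domain or publisher ID contains the query (case-insensitive)."""
    needle = query.strip().lower()
    if not needle:
        return list(rows)
    return [
        row for row in rows
        if needle in row["Domain"].lower() or needle in row["Publisher_ID"].lower()
    ]


def to_csv(rows):
    """Serialize rows to CSV text with a fixed header order."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    {"Domain": "google.com", "Publisher_ID": "pub-123", "Account_Type": "DIRECT",
     "Certification_ID": "f08c47fec0942fa0", "Comment": ""},
    {"Domain": "appnexus.com", "Publisher_ID": "12345", "Account_Type": "RESELLER",
     "Certification_ID": "", "Comment": ""},
]
# Only the filtered view is exported, mirroring the Download CSV behavior.
print(to_csv(filter_rows(rows, "google")))
```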
from adops_logic import AdsTxtParser
raw_text = """
google.com, pub-0000000000000000, DIRECT, f08c47fec0942fa0
appnexus.com, 12345, RESELLER
invalid.com, account-1, PARTNER
"""
parser = AdsTxtParser()
# Parse content into structured rows + validation errors.
df, errors = parser.parse_content(raw_text)
# Compute aggregate stats for dashboards or downstream reporting.
stats = parser.get_stats(df)
print("Valid rows:")
print(df)
print("\nErrors:")
print(errors)
print("\nStats:")
print(stats)
Expected behavior:
- The first two records are accepted as valid rows.
- The third record is flagged due to invalid account type.
- `stats` is calculated only from valid rows.
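To make the last point concrete, here is a small sketch of how such aggregates might be computed from valid rows only. It is not the real `get_stats`, whose return shape this README does not document: `summarize`, the key names, and the choice to count unique partners by domain are all assumptions.

```python
from collections import Counter


def summarize(rows, top_n=5):
    """Hypothetical stand-in for get_stats, computed from valid rows only."""
    by_type = Counter(row["Account_Type"] for row in rows)
    by_partner = Counter(row["Domain"] for row in rows)
    return {
        "total_valid": len(rows),
        "unique_partners": len(by_partner),
        "direct": by_type["DIRECT"],
        "reseller": by_type["RESELLER"],
        # (domain, count) pairs, most frequent first, for a top-partners chart.
        "top_partners": by_partner.most_common(top_n),
    }


valid_rows = [
    {"Domain": "google.com", "Account_Type": "DIRECT"},
    {"Domain": "google.com", "Account_Type": "RESELLER"},
    {"Domain": "appnexus.com", "Account_Type": "RESELLER"},
]
print(summarize(valid_rows))
```

Rejected lines never reach this function, so the KPI values cannot be skewed by malformed records.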
| Option | Current Behavior | Notes |
|---|---|---|
| URL protocol | Auto-prepends `https://` if missing | In `fetch_from_url` |
| Default path | Appends `/app-ads.txt` if URL does not end with `ads.txt`/`app-ads.txt` | Supports quick root-domain input |
| HTTP timeout | 10 seconds | Uses `requests.get(..., timeout=10)` |
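
The first two rows of this table can be expressed as a short function. This is a re-creation from the documented behavior, not the body of `fetch_from_url`; the name `normalize_url`, the whitespace and trailing-slash handling, and the case-insensitive checks are assumptions.

```python
def normalize_url(raw: str) -> str:
    """Hypothetical re-creation of the documented URL normalization rules."""
    url = raw.strip()
    # Rule 1: prepend https:// when no protocol is given.
    if not url.lower().startswith(("http://", "https://")):
        url = "https://" + url
    # Rule 2: append the default path unless the URL already targets a file.
    # "app-ads.txt" also ends with "ads.txt", so one suffix check covers both.
    if not url.lower().endswith("ads.txt"):
        url = url.rstrip("/") + "/app-ads.txt"
    return url


for raw in ("example.com", "example.com/", "https://example.com/ads.txt"):
    print(raw, "->", normalize_url(raw))
```

Note that a bare domain defaults to `app-ads.txt`, so web publishers checking `ads.txt` need to pass the full path.
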
| Rule | Requirement | Failure Output |
|---|---|---|
| Field count | At least 3 comma-separated fields | `Insufficient parameters (minimum 3 required)` |
| Account type | Must be `DIRECT` or `RESELLER` | `Invalid Account Type: ...` |
| Comment handling | Inline `#` comments stripped from parse payload | Preserved in `Comment` column |
This project currently has no mandatory .env file. If you need environment-driven behavior, consider introducing the following pattern:
# Example optional runtime settings for future extension
APP_PORT=8501
APP_HOST=0.0.0.0
FETCH_TIMEOUT_SECONDS=10
# Example startup with explicit Streamlit flags
streamlit run app.py --server.address 0.0.0.0 --server.port 8501
Note
A future enhancement could externalize fetch timeout and default path behavior into typed config for reproducible multi-environment deployments.
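As one possible shape for that enhancement, the sketch below loads the illustrative variables into a frozen dataclass. Nothing here exists in the project today: the `Settings` class and `from_env` method are hypothetical, and `APP_HOST`, `APP_PORT`, and `FETCH_TIMEOUT_SECONDS` are the example names from this section rather than variables that Streamlit or the app reads on its own.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical typed runtime settings with the README's defaults."""
    host: str = "0.0.0.0"
    port: int = 8501
    fetch_timeout_seconds: float = 10.0

    @classmethod
    def from_env(cls, env=None):
        """Build settings from an environment mapping, falling back to defaults."""
        env = os.environ if env is None else env
        return cls(
            host=env.get("APP_HOST", cls.host),
            # int() and float() raise ValueError early on malformed values.
            port=int(env.get("APP_PORT", cls.port)),
            fetch_timeout_seconds=float(
                env.get("FETCH_TIMEOUT_SECONDS", cls.fetch_timeout_seconds)
            ),
        )


settings = Settings.from_env({"APP_PORT": "9000", "FETCH_TIMEOUT_SECONDS": "5"})
print(settings)
```

Passing the mapping explicitly keeps the loader easy to test, while calling `Settings.from_env()` with no argument reads the real process environment.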
This project is licensed under the Apache License 2.0. See LICENSE for full terms.
If you find this tool useful, consider leaving a star on GitHub or supporting the author directly.