The SAFER FMCSA DOT Crawler collects structured, high-quality data on U.S. motor carriers directly from FMCSA’s public “Company Snapshot” pages. It streamlines large-scale data gathering for transportation analytics, compliance workflows, and lead generation. This repository provides a reliable, filter-driven way to extract consistent DOT records at scale.
Created by Bitbash to showcase our approach to scraping and automation.
This project automates the extraction of FMCSA carrier information and returns each carrier profile as a structured JSON object. It replaces manual collection of DOT records with a fast, resilient, filter-friendly crawler, and is aimed at logistics companies, insurance providers, data analysts, researchers, and teams building transportation datasets.
- Fetches detailed FMCSA “Company Snapshot” records in bulk.
- Supports DOT range, registration date filters, and fleet attributes.
- Provides optional extended fields such as emails, crash statistics, and safety ratings.
- Emits clean, mostly flat JSON (list fields and inspection or crash statistics are nested) suited to analytics pipelines.
- Designed for large-scale, long-running extraction tasks.
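As a rough illustration of filter-driven targeting, the sketch below expands a DOT range into batches and applies a fleet-attribute check to parsed records. The option names (`dotStart`, `dotEnd`, `minPowerUnits`) and both helper functions are hypothetical and are not taken from this repository's input schema; only the `power_units` field comes from the output described in this README.

```javascript
// Hypothetical filter options; the crawler's real input schema may differ.
const filters = { dotStart: 2802000, dotEnd: 2802024, minPowerUnits: 10 };

// Lazily yield batches of DOT numbers so a large range never sits in memory at once.
function* dotBatches(start, end, batchSize) {
  if (!Number.isInteger(start) || !Number.isInteger(end) || start > end) {
    throw new RangeError("start must be an integer less than or equal to end");
  }
  for (let lo = start; lo <= end; lo += batchSize) {
    const hi = Math.min(lo + batchSize - 1, end);
    const batch = [];
    for (let dot = lo; dot <= hi; dot++) batch.push(String(dot));
    yield batch;
  }
}

// Keep a parsed record only if it satisfies the fleet-attribute filter.
// Records without a numeric power_units value are dropped.
function matchesFleetFilter(record, { minPowerUnits = 0 } = {}) {
  return Number(record.power_units) >= minPowerUnits;
}

const batches = [...dotBatches(filters.dotStart, filters.dotEnd, 10)];
const lastBatch = batches[batches.length - 1];
console.log(batches.length, batches[0][0], lastBatch[lastBatch.length - 1]);
// prints: 3 2802000 2802024
```

Using a generator keeps memory use bounded for wide ranges, which fits the long-running extraction use case described above.
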
| Feature | Description |
|---|---|
| Comprehensive Data Extraction | Captures legal names, phone numbers, addresses, cargo types, fleet size, inspections, and more. |
| Filter-Based Targeting | Pull only the carriers you need using DOT ranges and registration date filters. |
| High Reliability | Automatic retries, session rotation, and back-off logic help avoid throttling interruptions. |
| Scalable Performance | Optimized for millions of records with memory-efficient streaming. |
| Premium Mode Support | Optionally extracts emails, crash statistics, and safety ratings. |
| Cleaned Output | Phone numbers normalized, dates formatted, and records standardized for downstream use. |
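The retry and back-off behavior listed under High Reliability can be pictured with a minimal sketch like the one below. It assumes a plain exponential back-off with a cap; the function names, default delays, and retry count are illustrative assumptions, not the values used in `src/utils/request.js`, and session rotation is not shown.

```javascript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelay(attempt, baseMs = 500, maxMs = 30000) {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run an async task, retrying on failure with growing delays.
// `task` is any function that returns a promise, for example a page fetch.
async function withRetry(task, { retries = 4, baseMs = 500, maxMs = 30000 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task(attempt);
    } catch (err) {
      lastError = err;
      // Wait only if another attempt will follow.
      if (attempt < retries) await sleep(backoffDelay(attempt, baseMs, maxMs));
    }
  }
  throw lastError;
}

console.log([0, 1, 2, 3, 7].map((n) => backoffDelay(n)));
// prints: [ 500, 1000, 2000, 4000, 30000 ]
```

A caller would wrap each snapshot request, for example `withRetry(() => fetchPage(url))`, where `fetchPage` stands in for whatever request helper is in use. Adding random jitter to the delay is a common refinement when many workers retry at once.
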
| Field Name | Field Description |
|---|---|
| DOT_num | DOT registration number. |
| entity_type | Carrier business type. |
| legal_name | Official legal company name. |
| dba_name | “Doing Business As” name. |
| mcs150_date | MCS-150 update date in MM-DD-YY format. |
| mcs150_mileage | Annual mileage reported. |
| mcs150_mileage_year | Year of the mileage report. |
| mc_mx_ff_numbers | Carrier MC/MX/FF registration IDs. |
| phone | Primary phone number (digits only). |
| cell_phone | Mobile contact number if available. |
| physical_address | Full physical business address. |
| mailing_address | Full mailing address. |
| power_units | Count of power units (trucks, tractors). |
| drivers | Total active drivers. |
| truck_units | Truck units count. |
| bus_units | Bus units count. |
| fleet_size | Fleet size bucket/category. |
| cargo_carried | Array of cargo types transported. |
| carrier_operation | Interstate or intrastate operation classification. |
| operation_classification | Operational categories. |
| company_officer_1 | Primary company officer. |
| company_officer_2 | Secondary officer. |
| DUNS_num | Dun & Bradstreet identifier. |
| email | Public email address (premium). |
| inspections_us | U.S. inspection statistics. |
| inspections_ca | Canadian inspection stats. |
| crashes_us | U.S. crash stats (premium). |
| crashes_ca | Canadian crash stats (premium). |
| safety_rating | Safety rating summary (premium). |
{
  "DOT_num": "2802023",
  "entity_type": "CARRIER",
  "legal_name": "Example Logistics LLC",
  "dba_name": "Example Trucks",
  "mcs150_date": "09-06-24",
  "mcs150_mileage": "120000",
  "mcs150_mileage_year": "2024",
  "mc_mx_ff_numbers": "MC123456",
  "phone": "5551234567",
  "cell_phone": "5559876543",
  "physical_address": "123 Main St, Springfield, IL, 62701, US",
  "mailing_address": "PO Box 456, Springfield, IL, 62701, US",
  "power_units": 50,
  "drivers": 75,
  "truck_units": 45,
  "bus_units": 0,
  "fleet_size": "45-55",
  "cargo_carried": ["General Freight", "Building Materials"],
  "carrier_operation": ["Interstate"],
  "operation_classification": ["For Hire"],
  "company_officer_1": "John Smith",
  "company_officer_2": "Jane Doe",
  "DUNS_num": "012345678",
  "email": "[email protected]",
  "inspections_us": {
    "driver": "12",
    "vehicle": "8",
    "hazmat": "2",
    "iep": "0"
  }
}
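Because the sample record nests `inspections_us` and stores some fields as arrays, loading it into a CSV or a single database table usually needs a flattening step. The sketch below shows one possible approach; `flattenRecord` and its underscore-joined column names are assumptions made for illustration and are not part of this repository.

```javascript
// Flatten nested objects into underscore-joined keys and join arrays into one string.
function flattenRecord(record, prefix = "") {
  const row = {};
  for (const [key, value] of Object.entries(record)) {
    const name = prefix ? `${prefix}_${key}` : key;
    if (Array.isArray(value)) {
      row[name] = value.join("; ");
    } else if (value !== null && typeof value === "object") {
      Object.assign(row, flattenRecord(value, name));
    } else {
      row[name] = value;
    }
  }
  return row;
}

// Trimmed version of the sample record above.
const sample = {
  DOT_num: "2802023",
  cargo_carried: ["General Freight", "Building Materials"],
  inspections_us: { driver: "12", vehicle: "8", hazmat: "2", iep: "0" },
};

const flat = flattenRecord(sample);
console.log(Object.keys(flat).join(","));
// prints: DOT_num,cargo_carried,inspections_us_driver,inspections_us_vehicle,inspections_us_hazmat,inspections_us_iep
```

Joining arrays with `"; "` keeps one row per carrier; a pipeline that needs one row per cargo type would expand the array instead.
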
SAFER FMCSA DOT Crawler/
├── src/
│   ├── index.js
│   ├── crawler/
│   │   ├── fetchSnapshot.js
│   │   ├── parseSnapshot.js
│   │   └── filters.js
│   ├── utils/
│   │   ├── formatters.js
│   │   ├── request.js
│   │   └── validation.js
│   ├── outputs/
│   │   └── writer.js
│   └── config/
│       └── settings.example.json
├── data/
│   ├── sample-input.json
│   └── sample-output.json
├── package.json
├── LICENSE
└── README.md
- Logistics sales teams use it to identify carriers in specific regions or fleet sizes so they can build targeted outreach lists.
- Insurance analysts use it to evaluate carrier operations, inspections, and crash data to improve risk scoring.
- Market researchers use it to measure competitor presence and regional fleet distribution.
- Compliance departments use it to monitor carrier safety ratings and regulatory updates.
- Data engineering teams use it to populate transportation datasets for analytics dashboards.
Does this tool support filtering by multiple attributes at once? Currently, the crawler performs best with a primary filter per run, such as a DOT range or registration date. Combining multiple filters may reduce output volume or slow down performance.
Are inactive carriers included in the results? No. Only carriers with an active status are returned, which keeps results consistent and relevant.
Does the output include cleaned and normalized fields? Yes. Phone numbers contain digits only, dates are normalized, and addresses follow a consistent formatting scheme.
Can I extract crash statistics and safety ratings? Yes, these are available through the optional premium fields.
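The normalization mentioned in the answers above (digits-only phone numbers, dates in MM-DD-YY) could look roughly like the following. This is a sketch under assumptions: the helper names are hypothetical, and the raw input formats shown, `(555) 123-4567` and `MM/DD/YYYY`, are assumed rather than confirmed by this README.

```javascript
// Strip everything except digits, e.g. "(555) 123-4567" becomes "5551234567".
function normalizePhone(raw) {
  return String(raw ?? "").replace(/\D/g, "");
}

// Convert an assumed MM/DD/YYYY source date to the MM-DD-YY form used in the output.
// Only the shape is checked; returns null when the input does not match.
function normalizeDate(raw) {
  const match = /^(\d{1,2})\/(\d{1,2})\/(\d{4})$/.exec(String(raw ?? "").trim());
  if (!match) return null;
  const [, month, day, year] = match;
  return `${month.padStart(2, "0")}-${day.padStart(2, "0")}-${year.slice(2)}`;
}

console.log(normalizePhone("(555) 123-4567")); // prints: 5551234567
console.log(normalizeDate("9/6/2024"));        // prints: 09-06-24
```

Returning `null` for unrecognized dates, rather than passing the raw string through, makes malformed values easy to spot downstream.
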
- Primary Metric: Processes an average of 8,000–12,000 carrier snapshots per hour depending on DOT range density.
- Reliability Metric: Achieves a 98%+ successful retrieval rate during long-running sessions with automated back-off.
- Efficiency Metric: Maintains low memory usage through streaming output, enabling multi-million-record extractions.
- Quality Metric: Produces 95%+ field completeness across standard fields due to robust parsing and normalization.
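For planning purposes, the quoted throughput and retrieval figures translate into run-time estimates with simple arithmetic. The helper below is a hypothetical illustration that treats 8,000 and 12,000 snapshots per hour as bounds and 98% as the success rate; actual results will vary with DOT range density.

```javascript
// Estimate wall-clock hours and the failure ceiling for a run of `recordCount` snapshots.
function estimateRun(recordCount, minPerHour = 8000, maxPerHour = 12000, successRate = 0.98) {
  return {
    bestCaseHours: recordCount / maxPerHour,
    worstCaseHours: recordCount / minPerHour,
    // Upper bound on failed retrievals at the stated success rate.
    maxFailures: Math.round(recordCount * (1 - successRate)),
  };
}

const estimate = estimateRun(1000000);
console.log(estimate.bestCaseHours.toFixed(1), estimate.worstCaseHours.toFixed(1), estimate.maxFailures);
// prints: 83.3 125.0 20000
```

In other words, a one-million-record extraction would take roughly three and a half to just over five days of continuous running at these rates.
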
