Bureau of Transportation Statistics | U.S. Department of Transportation
This repository contains the design documents, data collection instruments, analytical schemas, and statistical analysis code for the National Cargo Theft Data Collection Program -- a proposed federal statistical program to produce the first comprehensive, publicly accountable national estimates of cargo theft prevalence, total economic loss, and the dark figure of unreported crime in the United States.
Cargo theft costs the American freight transportation system hundreds of millions of dollars annually in direct losses, with total economic impact -- including insurance costs, security expenditures, operational disruption, and investigation -- estimated to be several times higher. Yet no single authoritative data source exists to measure the true scope of the problem. Federal data through the FBI's National Incident-Based Reporting System (NIBRS) are incomplete due to voluntary participation and limited cargo-specific detail. Private-sector databases such as Verisk CargoNet provide richer operational intelligence but are proprietary and membership-dependent. Insurance data capture financial losses but lack operational context. The result is a fragmented data ecosystem that leaves policymakers without the evidence base needed to allocate resources effectively.
This program, developed under the authority of 49 U.S.C. Section 111 and in response to DOT RFI Docket DOT-OST-2025-1326, deploys a multi-source data architecture integrating carrier surveys, insurance industry questionnaires, enhanced law enforcement reporting, and industry database partnerships through a unified incident schema. The repository provides the complete methodological and operational foundation for establishing a permanent national cargo theft statistical program.
cargo-theft/
├── README.md # This file
├── docs/
│ ├── data_sources_review.md # Comprehensive review of existing cargo theft data sources
│ ├── methodology.md # Statistical methodology for data collection and estimation
│ └── report_framework.md # Report template for Congress/DOT leadership
├── instruments/
│ ├── carrier_survey.md # Form BTS CT-1: Motor Carrier and Logistics Provider Survey
│ ├── insurer_questionnaire.md # Form BTS CT-2: Insurance Industry Questionnaire
│ └── law_enforcement_reporting_guide.md # Form BTS CT-3 Guide: Law Enforcement Reporting Guide
├── schemas/
│ ├── incident_schema.json # Unified Cargo Theft Incident schema (JSON Schema Draft 2020-12, v1.0)
│ ├── data_dictionary.md # Field-by-field documentation of the incident schema
│ └── source_crosswalk.md # Mapping of data elements across all five source types
├── analysis/
│ ├── requirements.txt # Python dependencies for analysis scripts
│ ├── data_loader.py # Data ingestion and validation pipeline
│ ├── prevalence_estimation.py # Capture-recapture and prevalence estimation models
│ ├── loss_valuation.py # Economic loss valuation and total impact estimation
│ ├── geographic_analysis.py # Spatial analysis and geographic distribution
│ ├── trend_analysis.py # Time-series trend analysis and forecasting
│ └── generate_report_tables.py # Generates all tables for the report framework
└── references/
  └── bibliography.md # Annotated bibliography and reference materials
docs/ -- Core program documentation, including the comprehensive review of all existing cargo theft data sources (federal, private sector, insurance, international), the full statistical methodology governing the data collection program, and the report framework template structured as a formal report to Congress.
instruments/ -- The three data collection instruments designed for OMB clearance under the Paperwork Reduction Act. Form BTS CT-1 targets motor carriers and logistics providers (45-minute burden). Form BTS CT-2 targets cargo and inland marine insurers (60-minute burden). The Form BTS CT-3 Guide provides law enforcement agencies with instructions for enhanced NIBRS cargo theft reporting (15-minute burden per incident).
schemas/ -- The technical data infrastructure. The unified incident schema defines the standard record format for integrating data from all five source types (NIBRS, carrier surveys, insurance claims, voluntary reports, industry databases). The data dictionary documents every field. The source crosswalk maps each source's native data elements to the unified schema.
analysis/ -- Python analysis pipeline for processing collected data and producing the statistical estimates and report tables. Modules cover data loading and validation, prevalence estimation using capture-recapture methods, economic loss valuation, geographic analysis, trend analysis, and automated report table generation.
references/ -- Annotated bibliography and reference materials supporting the program design.
The following figures summarize the current state of cargo theft in the United States, drawn from the best available industry data. These figures underscore the urgency of establishing a comprehensive federal statistical program.
| Metric | Value | Source |
|---|---|---|
| Reported cargo theft events (2025) | 3,594 | Verisk CargoNet |
| Total reported losses (2025) | $725 million | Verisk CargoNet |
| Average loss per incident (2025) | $273,990 | Verisk CargoNet |
| Year-over-year increase in total losses | 60% | Verisk CargoNet |
| Year-over-year increase in average loss | 36% (from $202,364 in 2024) | Verisk CargoNet |
| Cargo theft increase (2024) | 27% | NICB |
| Projected additional increase (2025) | 22% | NICB |
| Top states by incident volume | CA, TX, FL, IL, GA | CargoNet / NICB |
| Top targeted commodities | Food/beverage, metals, electronics, pharmaceuticals | CargoNet |
| NIBRS participating agencies | 16,000+ | FBI CJIS |
| Estimated program budget (18-month cycle) | $1.8M -- $2.9M | BTS estimate |
These reported figures almost certainly understate the true scope of the problem. The actual prevalence of cargo theft is unknown because of the dark figure -- the proportion of incidents never reported to any data source.
- Python 3.10 or later
- pip package manager
# Clone the repository
git clone https://github.com/[org]/cargo-theft.git
cd cargo-theft
# Install Python dependencies for analysis
pip install -r analysis/requirements.txt

The analysis scripts are designed to process collected survey and administrative data. Each module can be run independently or through the report table generator.
# Generate all report tables (runs the full pipeline)
python analysis/generate_report_tables.py
# Run individual analysis modules
python analysis/data_loader.py # Load and validate source data
python analysis/prevalence_estimation.py # Estimate prevalence and dark figure
python analysis/loss_valuation.py # Calculate economic loss estimates
python analysis/geographic_analysis.py # Produce geographic distributions
python analysis/trend_analysis.py # Analyze trends over time

Note: The analysis scripts require input data from the data collection instruments. Until data collection is complete, the scripts operate on synthetic or placeholder data for development and testing purposes.
For a comprehensive understanding of the program, read the documents in this order:
1. docs/data_sources_review.md -- Understand the current data landscape and its limitations
2. docs/methodology.md -- Learn the statistical design of the data collection program
3. instruments/ -- Review the three data collection instruments
4. schemas/ -- Examine the unified data architecture
5. docs/report_framework.md -- See the final report structure
The program employs a multi-source data architecture that integrates five independent data streams:
- FBI NIBRS -- Federal law enforcement incident records flagged with Data Element 2A (Cargo Theft)
- Carrier Surveys (Form BTS CT-1) -- Stratified probability sample of motor carriers from the FMCSA Motor Carrier Census
- Insurance Claims (Form BTS CT-2) -- Aggregate claims data from major cargo and inland marine insurers
- Voluntary Industry Reports -- Direct submissions from carriers, shippers, and logistics providers
- Industry Databases -- Structured data sharing with CargoNet, NICB, and other intelligence providers
The estimation framework uses:
- Stratified sampling with design-based weighting and nonresponse adjustment for population estimates
- Capture-recapture methods applied to source overlap to estimate the dark figure of unreported theft
- Multi-source triangulation to validate estimates across independent data streams
- Commodity-specific valuation using wholesale price indices for consistent loss measurement
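As an illustration of the capture-recapture idea listed above, the following is a minimal sketch of a two-source estimate using Chapman's bias-corrected form of the Lincoln-Petersen estimator. It is not taken from analysis/prevalence_estimation.py, and the program's actual model may differ (for example, a multi-source log-linear model). The function names and all counts are hypothetical, and the sketch assumes the two sources are independent and that records can be linked without error.

```python
def chapman_estimate(n1: int, n2: int, m: int) -> float:
    """Chapman's bias-corrected two-source estimate of the total number of incidents.

    n1, n2: incidents recorded by source 1 and source 2
    m:      incidents matched in both sources
    """
    if min(n1, n2, m) < 0 or m > min(n1, n2):
        raise ValueError("overlap must be between 0 and min(n1, n2)")
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


def dark_figure(n1: int, n2: int, m: int) -> float:
    """Estimated share of incidents that appear in neither source."""
    total = chapman_estimate(n1, n2, m)
    if total == 0:
        return 0.0
    observed = n1 + n2 - m  # distinct incidents seen by at least one source
    return max(0.0, 1 - observed / total)


# Hypothetical counts, for illustration only (not program data).
n_police, n_insurer, n_both = 400, 300, 120
total = chapman_estimate(n_police, n_insurer, n_both)
share_unseen = dark_figure(n_police, n_insurer, n_both)
print(f"estimated incidents: {total:.0f}, unobserved share: {share_unseen:.1%}")
```

A smaller overlap between sources pushes the estimated total up, which is why the quality of record matching matters as much as the raw counts.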
All data elements are standardized through the unified incident schema (schemas/incident_schema.json), with source-specific mappings documented in the source crosswalk (schemas/source_crosswalk.md).
The full methodology is documented in docs/methodology.md.
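The design-based weighting step in the estimation framework can be sketched as follows. This is a simplified illustration, assuming a base weight of N_h / n_h per stratum and a simple within-stratum response-rate adjustment; the class, field names, and numbers are hypothetical, and the adjustment cells actually used are defined in docs/methodology.md.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    name: str
    population: int  # N_h: carriers on the sampling frame
    sampled: int     # n_h: carriers selected
    responded: int   # r_h: carriers that returned the form
    incidents: int   # thefts reported by respondents


def final_weight(s: Stratum) -> float:
    """Base weight times a within-stratum nonresponse adjustment."""
    if s.sampled <= 0 or s.responded <= 0:
        raise ValueError(f"stratum {s.name!r} has no sampled or responding units")
    base = s.population / s.sampled            # inverse selection probability
    nonresponse_adj = s.sampled / s.responded  # inverse response rate
    return base * nonresponse_adj


def estimate_total(strata: list[Stratum]) -> float:
    """Weighted estimate of incidents across all strata."""
    return sum(final_weight(s) * s.incidents for s in strata)


# Hypothetical strata, for illustration only.
strata = [
    Stratum("small carriers", population=1000, sampled=100, responded=50, incidents=20),
    Stratum("large carriers", population=200, sampled=100, responded=80, incidents=60),
]
print(estimate_total(strata))
```

In this toy example each responding small carrier stands in for 20 carriers on the frame, and each responding large carrier for 2.5.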
The program draws on and integrates the following data sources. A comprehensive assessment of each source's strengths, limitations, and coverage is provided in docs/data_sources_review.md.
- FBI NIBRS -- National Incident-Based Reporting System with cargo theft flag (Data Element 2A). Primary federal law enforcement data. Voluntary participation; limited cargo-specific detail.
- BTS Freight Analysis Framework (FAF) -- Provides freight flow denominators (value, tonnage, ton-miles) for calculating theft rates by mode, commodity, and geography.
- DOT Office of the Secretary -- Issued RFI Docket DOT-OST-2025-1326 (September 2025) on cargo theft data collection.
- Verisk CargoNet -- Largest industry cargo theft database in North America. Subscription-based.
- National Insurance Crime Bureau (NICB) -- Insurance-industry cargo theft intelligence. Membership-dependent.
- Overhaul -- Quarterly U.S. cargo theft reports with spatial and temporal analysis.
- BSI / TT Club -- Annual global cargo theft intelligence report.
- American Transportation Research Institute (ATRI) -- Published cargo theft cost research (October 2025).
- Motor Carrier Survey (Form BTS CT-1) -- Industry-reported theft incidents, security costs, underreporting assessment
- Insurance Questionnaire (Form BTS CT-2) -- Aggregate cargo theft claims data from insurers
- Law Enforcement Reports (Form BTS CT-3 Guide) -- Enhanced NIBRS reporting with supplemental cargo data elements
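To show how the FAF freight flow denominators listed above might be used, here is a minimal sketch that expresses theft losses as a rate per million dollars of freight moved. The function name, the choice of scale, and the figures are assumptions for illustration; they are not values from FAF or from any source cited here.

```python
def theft_rate_per_million(loss_usd: float, freight_value_usd: float) -> float:
    """Theft losses per $1 million of freight value moved."""
    if freight_value_usd <= 0:
        raise ValueError("freight value must be positive")
    return loss_usd / freight_value_usd * 1_000_000


# Hypothetical commodity figures (loss, freight value), for illustration only.
flows = {
    "electronics": (5_000_000, 20_000_000_000),
    "food_beverage": (8_000_000, 80_000_000_000),
}
rates = {name: theft_rate_per_million(loss, value) for name, (loss, value) in flows.items()}
for name, rate in rates.items():
    print(f"{name}: ${rate:,.0f} lost per $1M of freight")
```

Normalizing by freight value lets commodities with very different volumes be compared on the same scale.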
49 U.S.C. Section 111 -- Bureau of Transportation Statistics. Authorizes BTS to collect, compile, analyze, and publish information on the state of transportation in the United States, including data on freight transportation and conditions affecting the movement of goods.
USA PATRIOT Improvement and Reauthorization Act of 2005 (H.R. 3199; Pub. L. 109-177), Section 307(e). Directed the Attorney General to ensure that reports of cargo theft are reflected as a separate category within the Uniform Crime Reporting system, leading to the creation of NIBRS Data Element 2A.
Confidential Information Protection and Statistical Efficiency Act of 2018 (CIPSEA), 44 U.S.C. Section 3572. All respondent data collected under this program are protected under CIPSEA. Data are used for statistical purposes only and cannot be disclosed in identifiable form or used for any non-statistical purpose, including law enforcement, regulation, or taxation. Violations carry penalties of up to $250,000 in fines, five years imprisonment, or both.
All data collection instruments require clearance from the Office of Management and Budget under the Paperwork Reduction Act (44 U.S.C. Chapter 35) prior to deployment. OMB control numbers are pending.
| Form | Title | Target Population | Burden | Reference |
|---|---|---|---|---|
| BTS CT-1 | Motor Carrier and Logistics Provider Survey | Motor carriers, brokers, 3PLs, freight forwarders | 45 min | instruments/carrier_survey.md |
| BTS CT-2 | Insurance Industry Questionnaire | Cargo and inland marine insurers | 60 min | instruments/insurer_questionnaire.md |
| BTS CT-3 Guide | Law Enforcement Reporting Guide | State, local, and federal LE agencies | 15 min/incident | instruments/law_enforcement_reporting_guide.md |
The unified incident schema (schemas/incident_schema.json) standardizes cargo theft data elements across all five source types into a single analytical framework. Key features:
- JSON Schema Draft 2020-12 format for programmatic validation
- 17 required top-level fields and over 120 total data elements
- Source-agnostic design accommodating records from NIBRS, carrier surveys, insurance claims, voluntary reports, and industry databases
- Enumerated classifications for theft type, commodity, location, transportation mode, and supply chain stage
- Provenance tracking with source identification, ingestion timestamp, and quality indicators
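The kind of check the schema enables can be sketched with the standard library alone. The fragment below is hypothetical: its field names and enumerated values are placeholders, not the contents of schemas/incident_schema.json, and the hand-rolled function covers only required fields and enumerations. It is not a full JSON Schema Draft 2020-12 validator.

```python
# Hypothetical schema fragment; the real definitions live in schemas/incident_schema.json.
SCHEMA_FRAGMENT = {
    "required": ["incident_id", "source_type", "incident_date"],
    "properties": {
        "source_type": {
            "enum": [
                "nibrs",
                "carrier_survey",
                "insurance_claim",
                "voluntary_report",
                "industry_database",
            ]
        },
    },
}


def check_record(record: dict, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passed these checks."""
    problems = []
    for field in schema.get("required", []):
        if field not in record:
            problems.append(f"missing required field: {field}")
    for field, rules in schema.get("properties", {}).items():
        allowed = rules.get("enum")
        if allowed is not None and field in record and record[field] not in allowed:
            problems.append(f"invalid value for {field}: {record[field]!r}")
    return problems


good = {"incident_id": "X-001", "source_type": "nibrs", "incident_date": "2025-03-14"}
bad = {"incident_id": "X-002", "source_type": "newspaper"}
print(check_record(good, SCHEMA_FRAGMENT))
print(check_record(bad, SCHEMA_FRAGMENT))
```

For production use, a dedicated JSON Schema validator that supports Draft 2020-12 would replace this function and check the complete schema.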
Supporting documentation:
- Data Dictionary: schemas/data_dictionary.md -- Business definitions, valid values, and source mappings for every field
- Source Crosswalk: schemas/source_crosswalk.md -- Element-by-element mapping across all five source types
This repository supports the design phase of the National Cargo Theft Data Collection Program. Contributions, feedback, and subject-matter expertise are welcome in the following areas:
- Data source identification and assessment
- Statistical methodology review
- Survey instrument design and cognitive testing
- Schema and data dictionary refinement
- Analysis pipeline development and testing
- Review the existing documentation, particularly docs/data_sources_review.md and docs/methodology.md
- Open an issue describing the proposed contribution or feedback
- For code changes, submit a pull request with a clear description of the change and its rationale
- All contributions must be consistent with the program's statistical standards and CIPSEA confidentiality requirements
Bureau of Transportation Statistics
U.S. Department of Transportation
1200 New Jersey Avenue SE
Washington, DC 20590
- Cargo Theft Program: [email protected]
- General Inquiries: (202) XXX-XXXX
- DOT Freight Page: https://www.transportation.gov/freight/cargotheft
- DOT RFI: Docket DOT-OST-2025-1326, "Protecting America's Supply Chain from Cargo Theft" (September 2025)
- FBI NIBRS Addendum for Submitting Cargo Theft Data (January 2010)
- USA PATRIOT Improvement and Reauthorization Act of 2005 (Pub. L. 109-177), Section 307(e)
- 49 U.S.C. Section 111 -- Bureau of Transportation Statistics
- CIPSEA (44 U.S.C. Section 3572)
- Verisk CargoNet Annual Cargo Theft Intelligence Report, 2025
- NICB Cargo Theft Annual Report, 2024
- ATRI, The Costs of Cargo Theft, October 2025
For the full annotated bibliography, see references/bibliography.md.
This project is a work product of the United States Government. As such, it is not subject to copyright protection within the United States under 17 U.S.C. Section 105.