bryancasler/cargo-theft


National Cargo Theft Data Collection Program

Bureau of Transportation Statistics | U.S. Department of Transportation


Overview

This repository contains the design documents, data collection instruments, analytical schemas, and statistical analysis code for the National Cargo Theft Data Collection Program -- a proposed federal statistical program to produce the first comprehensive, publicly accountable national estimates of cargo theft prevalence, total economic loss, and the dark figure of unreported crime in the United States.

Cargo theft costs the American freight transportation system hundreds of millions of dollars annually in direct losses, with total economic impact -- including insurance costs, security expenditures, operational disruption, and investigation -- estimated to be several times higher. Yet no single authoritative data source exists to measure the true scope of the problem. Federal data through the FBI's National Incident-Based Reporting System (NIBRS) are incomplete due to voluntary participation and limited cargo-specific detail. Private-sector databases such as Verisk CargoNet provide richer operational intelligence but are proprietary and membership-dependent. Insurance data capture financial losses but lack operational context. The result is a fragmented data ecosystem that leaves policymakers without the evidence base needed to allocate resources effectively.

This program, developed under the authority of 49 U.S.C. Section 111 and in response to DOT RFI Docket DOT-OST-2025-1326, deploys a multi-source data architecture integrating carrier surveys, insurance industry questionnaires, enhanced law enforcement reporting, and industry database partnerships through a unified incident schema. The repository provides the complete methodological and operational foundation for establishing a permanent national cargo theft statistical program.


Project Structure

cargo-theft/
├── README.md                                    # This file
├── docs/
│   ├── data_sources_review.md                   # Comprehensive review of existing cargo theft data sources
│   ├── methodology.md                           # Statistical methodology for data collection and estimation
│   └── report_framework.md                      # Report template for Congress/DOT leadership
├── instruments/
│   ├── carrier_survey.md                        # Form BTS CT-1: Motor Carrier and Logistics Provider Survey
│   ├── insurer_questionnaire.md                 # Form BTS CT-2: Insurance Industry Questionnaire
│   └── law_enforcement_reporting_guide.md       # Form BTS CT-3 Guide: Law Enforcement Reporting Guide
├── schemas/
│   ├── incident_schema.json                     # Unified Cargo Theft Incident schema (JSON Schema Draft 2020-12, v1.0)
│   ├── data_dictionary.md                       # Field-by-field documentation of the incident schema
│   └── source_crosswalk.md                      # Mapping of data elements across all five source types
├── analysis/
│   ├── requirements.txt                         # Python dependencies for analysis scripts
│   ├── data_loader.py                           # Data ingestion and validation pipeline
│   ├── prevalence_estimation.py                 # Capture-recapture and prevalence estimation models
│   ├── loss_valuation.py                        # Economic loss valuation and total impact estimation
│   ├── geographic_analysis.py                   # Spatial analysis and geographic distribution
│   ├── trend_analysis.py                        # Time-series trend analysis and forecasting
│   └── generate_report_tables.py                # Generates all tables for the report framework
└── references/
    └── bibliography.md                          # Annotated bibliography and reference materials

Directory Descriptions

docs/ -- Core program documentation, including the comprehensive review of all existing cargo theft data sources (federal, private sector, insurance, international), the full statistical methodology governing the data collection program, and the report framework template structured as a formal report to Congress.

instruments/ -- The three data collection instruments designed for OMB clearance under the Paperwork Reduction Act. Form BTS CT-1 targets motor carriers and logistics providers (45-minute burden). Form BTS CT-2 targets cargo and inland marine insurers (60-minute burden). The Form BTS CT-3 Guide provides law enforcement agencies with instructions for enhanced NIBRS cargo theft reporting (15-minute burden per incident).

schemas/ -- The technical data infrastructure. The unified incident schema defines the standard record format for integrating data from all five source types (NIBRS, carrier surveys, insurance claims, voluntary reports, industry databases). The data dictionary documents every field. The source crosswalk maps each source's native data elements to the unified schema.

analysis/ -- Python analysis pipeline for processing collected data and producing the statistical estimates and report tables. Modules cover data loading and validation, prevalence estimation using capture-recapture methods, economic loss valuation, geographic analysis, trend analysis, and automated report table generation.

references/ -- Annotated bibliography and reference materials supporting the program design.


Key Statistics: The Cargo Theft Data Landscape

The following figures summarize the current state of cargo theft in the United States, drawn from the best available industry and federal sources. They underscore the urgency of establishing a comprehensive federal statistical program.

| Metric | Value | Source |
|---|---|---|
| Reported cargo theft events (2025) | 3,594 | Verisk CargoNet |
| Total reported losses (2025) | $725 million | Verisk CargoNet |
| Average loss per incident (2025) | $273,990 | Verisk CargoNet |
| Year-over-year increase in total losses | 60% | Verisk CargoNet |
| Year-over-year increase in average loss | 36% (from $202,364 in 2024) | Verisk CargoNet |
| Cargo theft increase (2024) | 27% | NICB |
| Projected additional increase (2025) | 22% | NICB |
| Top states by incident volume | CA, TX, FL, IL, GA | CargoNet / NICB |
| Top targeted commodities | Food/beverage, metals, electronics, pharmaceuticals | CargoNet |
| NIBRS participating agencies | 16,000+ | FBI CJIS |
| Estimated program budget (18-month cycle) | $1.8M -- $2.9M | BTS estimate |

These reported figures almost certainly understate the true scope of the problem. The actual prevalence of cargo theft is unknown because of the dark figure -- the proportion of incidents never reported to any data source.


Getting Started

Prerequisites

  • Python 3.10 or later
  • pip package manager

Installation

# Clone the repository
git clone https://github.com/[org]/cargo-theft.git
cd cargo-theft

# Install Python dependencies for analysis
pip install -r analysis/requirements.txt

Running the Analysis Pipeline

The analysis scripts are designed to process collected survey and administrative data. Each module can be run independently or through the report table generator.

# Generate all report tables (runs the full pipeline)
python analysis/generate_report_tables.py

# Run individual analysis modules
python analysis/data_loader.py              # Load and validate source data
python analysis/prevalence_estimation.py    # Estimate prevalence and dark figure
python analysis/loss_valuation.py           # Calculate economic loss estimates
python analysis/geographic_analysis.py      # Produce geographic distributions
python analysis/trend_analysis.py           # Analyze trends over time

Note: The analysis scripts require input data from the data collection instruments. Until data collection is complete, the scripts operate on synthetic or placeholder data for development and testing purposes.

Reading the Documentation

For a comprehensive understanding of the program, read the documents in this order:

  1. docs/data_sources_review.md -- Understand the current data landscape and its limitations
  2. docs/methodology.md -- Learn the statistical design of the data collection program
  3. instruments/ -- Review the three data collection instruments
  4. schemas/ -- Examine the unified data architecture
  5. docs/report_framework.md -- See the final report structure

Methodology Overview

The program employs a multi-source data architecture that integrates five independent data streams:

  1. FBI NIBRS -- Federal law enforcement incident records flagged with Data Element 2A (Cargo Theft)
  2. Carrier Surveys (Form BTS CT-1) -- Stratified probability sample of motor carriers from the FMCSA Motor Carrier Census
  3. Insurance Claims (Form BTS CT-2) -- Aggregate claims data from major cargo and inland marine insurers
  4. Voluntary Industry Reports -- Direct submissions from carriers, shippers, and logistics providers
  5. Industry Databases -- Structured data sharing with CargoNet, NICB, and other intelligence providers

The estimation framework uses:

  • Stratified sampling with design-based weighting and nonresponse adjustment for population estimates
  • Capture-recapture methods applied to source overlap to estimate the dark figure of unreported theft
  • Multi-source triangulation to validate estimates across independent data streams
  • Commodity-specific valuation using wholesale price indices for consistent loss measurement

All data elements are standardized through the unified incident schema (schemas/incident_schema.json), with source-specific mappings documented in the source crosswalk (schemas/source_crosswalk.md).

The full methodology is documented in docs/methodology.md.
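As a rough illustration of design-based weighting with nonresponse adjustment, the sketch below estimates a population total from a stratified sample. It assumes equal-probability sampling within each stratum and a simple weighting-class adjustment (sampled divided by responded); the actual procedure is defined in docs/methodology.md and may differ. The `Stratum` class, its fields, and the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    name: str
    population: int  # carriers in the sampling frame for this stratum
    sampled: int     # carriers selected into the sample
    responded: int   # carriers that returned a usable survey
    incidents: int   # theft incidents reported by respondents


def estimate_total_incidents(strata: list[Stratum]) -> float:
    """Weighted estimate of total incidents across all strata."""
    total = 0.0
    for s in strata:
        if s.sampled <= 0 or s.responded <= 0:
            raise ValueError(f"stratum {s.name!r} has no usable responses")
        base_weight = s.population / s.sampled         # inverse selection probability
        nonresponse_adjustment = s.sampled / s.responded
        total += base_weight * nonresponse_adjustment * s.incidents
    return total


# Hypothetical strata, for illustration only.
strata = [
    Stratum("small carriers", population=1000, sampled=100, responded=50, incidents=10),
    Stratum("large carriers", population=200, sampled=100, responded=80, incidents=40),
]
estimated_total = estimate_total_incidents(strata)
```

The adjustment treats nonrespondents in a stratum as similar to respondents in that stratum, which is an assumption the methodology would need to test.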


Data Sources

The program draws on and integrates the following data sources. A comprehensive assessment of each source's strengths, limitations, and coverage is provided in docs/data_sources_review.md.

Federal Government Sources

  • FBI NIBRS -- National Incident-Based Reporting System with cargo theft flag (Data Element 2A). Primary federal law enforcement data. Voluntary participation; limited cargo-specific detail.
  • BTS Freight Analysis Framework (FAF) -- Provides freight flow denominators (value, tonnage, ton-miles) for calculating theft rates by mode, commodity, and geography.
  • DOT Office of the Secretary -- Issued RFI Docket DOT-OST-2025-1326 (September 2025) on cargo theft data collection.
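To show how a freight flow denominator turns a loss figure into a rate, here is a minimal sketch that expresses stolen value per $1 million of freight moved, by commodity. The function name, the choice of rate unit, and all numbers are hypothetical; they are not taken from FAF or from the analysis scripts.

```python
def theft_rates_per_million(
    losses_usd: dict[str, float], freight_value_usd: dict[str, float]
) -> dict[str, float]:
    """Stolen value per $1 million of freight value, keyed by commodity."""
    rates = {}
    for commodity, loss in losses_usd.items():
        denominator = freight_value_usd.get(commodity)
        if denominator is None or denominator <= 0:
            raise ValueError(f"no usable freight value for {commodity!r}")
        rates[commodity] = loss * 1_000_000 / denominator
    return rates


# Hypothetical inputs, for illustration only.
losses = {"electronics": 5_000_000}
freight = {"electronics": 2_000_000_000}
rates = theft_rates_per_million(losses, freight)
```

The same pattern applies to rates by mode or geography, provided the numerator and denominator use the same classification.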

Private Sector Sources

  • Verisk CargoNet -- Largest industry cargo theft database in North America. Subscription-based.
  • National Insurance Crime Bureau (NICB) -- Insurance-industry cargo theft intelligence. Membership-dependent.
  • Overhaul -- Quarterly U.S. cargo theft reports with spatial and temporal analysis.
  • BSI / TT Club -- Annual global cargo theft intelligence report.
  • American Transportation Research Institute (ATRI) -- Published cargo theft cost research (October 2025).

Program-Collected Data

  • Motor Carrier Survey (Form BTS CT-1) -- Industry-reported theft incidents, security costs, underreporting assessment
  • Insurance Questionnaire (Form BTS CT-2) -- Aggregate cargo theft claims data from insurers
  • Law Enforcement Reports (Form BTS CT-3 Guide) -- Enhanced NIBRS reporting with supplemental cargo data elements

Legal Authority

Statutory Basis

49 U.S.C. Section 111 -- Bureau of Transportation Statistics. Authorizes BTS to collect, compile, analyze, and publish information on the state of transportation in the United States, including data on freight transportation and conditions affecting the movement of goods.

USA PATRIOT Improvement and Reauthorization Act of 2005 (H.R. 3199), Section 7201. Directed the Attorney General to ensure that reports of cargo theft are reflected as a separate category within the Uniform Crime Reporting system, leading to the creation of NIBRS Data Element 2A.

Data Protection

Confidential Information Protection and Statistical Efficiency Act of 2018 (CIPSEA), 44 U.S.C. Section 3572. All respondent data collected under this program are protected under CIPSEA. Data are used for statistical purposes only and cannot be disclosed in identifiable form or used for any non-statistical purpose, including law enforcement, regulation, or taxation. Violations carry penalties of up to $250,000 in fines, up to five years' imprisonment, or both.

OMB Clearance

All data collection instruments require clearance from the Office of Management and Budget under the Paperwork Reduction Act (44 U.S.C. Chapter 35) prior to deployment. OMB control numbers are pending.


Data Collection Instruments

| Form | Title | Target Population | Burden | Reference |
|---|---|---|---|---|
| BTS CT-1 | Motor Carrier and Logistics Provider Survey | Motor carriers, brokers, 3PLs, freight forwarders | 45 min | instruments/carrier_survey.md |
| BTS CT-2 | Insurance Industry Questionnaire | Cargo and inland marine insurers | 60 min | instruments/insurer_questionnaire.md |
| BTS CT-3 Guide | Law Enforcement Reporting Guide | State, local, and federal LE agencies | 15 min/incident | instruments/law_enforcement_reporting_guide.md |

Unified Data Schema

The unified incident schema (schemas/incident_schema.json) standardizes cargo theft data elements across all five source types into a single analytical framework. Key features:

  • JSON Schema Draft 2020-12 format for programmatic validation
  • 17 required top-level fields and over 120 total data elements
  • Source-agnostic design accommodating records from NIBRS, carrier surveys, insurance claims, voluntary reports, and industry databases
  • Enumerated classifications for theft type, commodity, location, transportation mode, and supply chain stage
  • Provenance tracking with source identification, ingestion timestamp, and quality indicators
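The kind of check the schema enables can be sketched with the standard library alone. The field names and enumeration values below are hypothetical stand-ins, not the contents of schemas/incident_schema.json; only the five source types themselves come from this README. A real pipeline would more likely load the schema file and run a JSON Schema validator that supports Draft 2020-12.

```python
# Hypothetical subset of required fields; the real schema defines 17.
REQUIRED_FIELDS = ("incident_id", "source_type", "incident_date")

# The five source types named in this README; the spellings are assumptions.
SOURCE_TYPES = {
    "nibrs",
    "carrier_survey",
    "insurance_claim",
    "voluntary_report",
    "industry_database",
}


def validate_record(record: dict) -> list[str]:
    """Return a list of problems found in one incident record (empty if none)."""
    errors = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if name not in record
    ]
    source_type = record.get("source_type")
    if source_type is not None and source_type not in SOURCE_TYPES:
        errors.append(f"invalid source_type: {source_type!r}")
    return errors


# Hypothetical records, for illustration only.
ok = validate_record(
    {"incident_id": "X1", "source_type": "nibrs", "incident_date": "2025-01-15"}
)
bad = validate_record({"source_type": "blog"})
```

Returning a list of problems rather than raising on the first one lets a loader report every issue in a record at once.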

Supporting documentation:

  • Data Dictionary: schemas/data_dictionary.md -- Business definitions, valid values, and source mappings for every field
  • Source Crosswalk: schemas/source_crosswalk.md -- Element-by-element mapping across all five source types
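A crosswalk of this kind can be represented as a per-source mapping from native field names to unified field names. The sketch below is illustrative only: every field name in it is hypothetical, and the real mappings are those documented in schemas/source_crosswalk.md.

```python
# Hypothetical native-to-unified field mappings for two source types.
CROSSWALK = {
    "carrier_survey": {
        "q12_loss_amount": "loss_value_usd",
        "q3_state": "incident_state",
    },
    "insurance_claim": {
        "paid_amount": "loss_value_usd",
        "loss_state": "incident_state",
    },
}


def to_unified(source_type: str, native: dict) -> dict:
    """Rename a source record's fields to unified names, dropping unmapped ones."""
    mapping = CROSSWALK[source_type]  # KeyError for an unknown source type
    unified = {"source_type": source_type}  # keep provenance on the record
    for native_name, unified_name in mapping.items():
        if native_name in native:
            unified[unified_name] = native[native_name]
    return unified


# Hypothetical claim record, for illustration only.
claim = to_unified(
    "insurance_claim",
    {"paid_amount": 125000, "loss_state": "TX", "adjuster": "n/a"},
)
```

Keeping the mapping as data rather than code means a new source can be added by extending the table, without changing the conversion function.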

Contributing

This repository supports the design phase of the National Cargo Theft Data Collection Program. Contributions, feedback, and subject-matter expertise are welcome in the following areas:

  • Data source identification and assessment
  • Statistical methodology review
  • Survey instrument design and cognitive testing
  • Schema and data dictionary refinement
  • Analysis pipeline development and testing

How to Contribute

  1. Review the existing documentation, particularly docs/data_sources_review.md and docs/methodology.md
  2. Open an issue describing the proposed contribution or feedback
  3. For code changes, submit a pull request with a clear description of the change and its rationale
  4. All contributions must be consistent with the program's statistical standards and CIPSEA confidentiality requirements

Contact

Bureau of Transportation Statistics
U.S. Department of Transportation
1200 New Jersey Avenue SE
Washington, DC 20590


References

  • DOT RFI: Docket DOT-OST-2025-1326, "Protecting America's Supply Chain from Cargo Theft" (September 2025)
  • FBI NIBRS Addendum for Submitting Cargo Theft Data (January 2010)
  • USA PATRIOT Improvement and Reauthorization Act of 2005, Section 7201
  • 49 U.S.C. Section 111 -- Bureau of Transportation Statistics
  • CIPSEA (44 U.S.C. Section 3572)
  • Verisk CargoNet Annual Cargo Theft Intelligence Report, 2025
  • NICB Cargo Theft Annual Report, 2024
  • ATRI, The Costs of Cargo Theft, October 2025

For the full annotated bibliography, see references/bibliography.md.


License

This project is a work product of the United States Government. As such, it is not subject to copyright protection within the United States under 17 U.S.C. Section 105.
