The paper analyzes 45,807 abstracts from ten Epidemiology and Public Health journals and classifies whether each abstract contains a policy claim. The objective is descriptive. The study quantifies trends in the prevalence of policy claims by time, country, journal, field, and study design, with classification performed using a large language model plus human validation.
- Journals: Ten established epidemiology and public health journals that publish original empirical research. The journal list extended prior manual evaluations and was finalized after discussion among the authors.
- Time window: 1990 to 2024, spanning periods before and during the rise of the policy impact agenda.
- Source and fields: Abstracts and metadata were retrieved through the Scopus API. Retrieved fields included publication year, keywords, citation counts, and corresponding author country.
- Inclusion criteria: Records classified as research articles. Additional filtering removed non-empirical content such as systematic reviews and commentaries.
- Definition: A policy claim is a concluding abstract statement that calls for policy attention or action, ranging from explicit recommendations to broader implications for policy.
- Model: DeepSeek V3.1 was run at low temperature to improve determinism. Prompts were designed to identify both explicit and implicit policy recommendations.
- Aim: The classification was used to map policy claims at scale for descriptive purposes. The study does not assess the validity of individual claims.
- Primary measures: Prevalence of policy claims by year, country, journal, keywords/topics, and study design.
- Deliverables: Summary tables and figures for the manuscript and supplementary materials.
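The primary measures are simple shares of abstracts containing a policy claim within each stratum. A minimal sketch of the per-year prevalence computation (the record layout is illustrative, not the repository's actual data format):

```python
from collections import defaultdict

def prevalence_by_year(records):
    """Compute the share of abstracts with a policy claim per year.

    records: iterable of (year, policy_claim) pairs, policy_claim in {0, 1}.
    Returns {year: prevalence} sorted by year.
    """
    counts = defaultdict(lambda: [0, 0])  # year -> [claim count, total]
    for year, claim in records:
        counts[year][0] += claim
        counts[year][1] += 1
    return {year: claims / total
            for year, (claims, total) in sorted(counts.items())}
```

The same grouping logic extends to country, journal, or study design by swapping the grouping key.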
Due to licensing restrictions, the full set of Scopus abstracts cannot be shared; not all publishers enable free sharing of abstracts (see https://i4oa.org).
Derived datasets containing publicly available bibliographic metadata (DOI, title, journal, publication year, keywords, and corresponding author country) and large language model classifications are provided in data/analysis/, together with all analysis code in code/. Researchers with Scopus access can reproduce the complete corpus using the included identifiers.
The cost and time needed to process such a large number of abstracts depend on LLM compute / API costs; with the DeepSeek API, for example, the analysis cost ~$3 and took ~10 hours of processing time. Since DeepSeek V3.1 is open-weight, this or other open-weight models can also be run on local hardware with sufficient memory.
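A minimal sketch of the classification call, assuming an OpenAI-compatible DeepSeek chat-completions endpoint; the prompt wording, model name, and YES/NO answer format here are illustrative, not the study's exact prompt:

```python
import json
import urllib.request

# Illustrative prompt, not the study's actual prompt.
PROMPT = (
    "Does the following abstract conclude with a policy claim, i.e. a call "
    "for policy attention or action, whether an explicit recommendation or a "
    "broader implication for policy? Answer with one word: YES or NO.\n\n"
    "Abstract:\n{abstract}"
)

def build_request(abstract: str, api_key: str) -> urllib.request.Request:
    """Build a chat-completions request at low temperature for determinism."""
    body = {
        "model": "deepseek-chat",   # assumed model identifier
        "temperature": 0.0,         # low temperature to improve determinism
        "messages": [{"role": "user",
                      "content": PROMPT.format(abstract=abstract)}],
    }
    return urllib.request.Request(
        "https://api.deepseek.com/chat/completions",
        data=json.dumps(body).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )

def parse_label(answer: str) -> int:
    """Map the model's YES/NO answer to a binary policy-claim indicator."""
    return 1 if answer.strip().upper().startswith("YES") else 0
```

In practice each abstract's parsed label is stored alongside its DOI so that repeated runs (for the concordance analyses) can be compared record by record.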
├── code              # data processing, LLM classification, validation, and analysis scripts
├── concordance       # repeated LLM run outputs for concordance analyses
├── data
│   ├── analysis      # derived analytic datasets
│   └── json_files    # raw JSON files; not publicly available (requires Scopus access)
├── figures           # main and supplementary figures
├── table             # validation files and exported main/supplementary tables
└── docs
    └── paper_draft   # manuscript files
The analysis follows the sequence laid out in the code/ directory:

1. Download metadata: query Scopus for each journal over 1990-2024 and save abstracts and metadata fields, including publication year, keywords, citation counts, and corresponding author country.
2. Clean corpus: restrict the dataset to research articles and remove non-empirical items, systematic reviews, and commentaries to produce a de-duplicated, analysis-ready corpus.
3. Classify policy claims: run DeepSeek V3.1 at low temperature on each abstract using the study prompt and generate a binary indicator for the presence of a policy claim.
4. Human validation: draw samples for blinded human review and compute agreement metrics against model outputs to assess the reliability of the automated classification.
5. Primary analyses: estimate prevalence by year, country, journal, field, and study design; generate time series, country rankings, and journal contrasts.
6. Keyword analyses: describe variation in claim rates across keywords and examine changes over time by topic.
7. Reporting: export figures and tables for the manuscript and supplementary materials.
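The agreement metrics in the human-validation step can be computed directly from paired human and model labels. A minimal sketch using percent agreement and Cohen's kappa (the specific metric choice is an assumption, not stated in the source):

```python
def agreement_metrics(human, model):
    """Percent agreement and Cohen's kappa for two binary label sequences."""
    n = len(human)
    # Observed agreement: share of items where the two raters match.
    po = sum(h == m for h, m in zip(human, model)) / n
    # Expected chance agreement from each rater's marginal positive rate.
    p_h = sum(human) / n
    p_m = sum(model) / n
    pe = p_h * p_m + (1 - p_h) * (1 - p_m)
    kappa = (po - pe) / (1 - pe) if pe < 1 else 1.0
    return po, kappa
```

For example, labels `[1, 0, 1, 0]` (human) versus `[1, 0, 0, 0]` (model) give 75% observed agreement and a kappa of 0.5, since half of the expected agreement is attributable to chance.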
David Bann¹
Mengyao Wang²

1. Centre for Longitudinal Studies, University College London, UK
2. Department of Biostatistics, Yale University, US