RBiblioSynth: Bibliometric Analysis and Report Automation

Overview

RBiblioSynth is a modular R framework for bibliometric analysis and systematic review automation, with outputs aimed at the standards of IEEE Q1 journals. It provides:

M0 Data Orchestrator: Multi-source loading, merging, PRISMA automation
M1 Main Information: Descriptive analysis, author metrics, citation networks, topic modeling
M2 Annual Production: 30+ growth models, 8 forecasting methods, changepoint detection
M3 Countries: Spatial statistics, economic correlations, collaboration networks

The framework combines the bibliometrix R package, custom R6 classes, and formal statistical methods to produce IEEE-style outputs (plots, tables, LaTeX reports).

Key Features

Beyond bibliometrix

| Feature | bibliometrix | RBiblioSynth |
|---------|--------------|--------------|
| Growth Models | 2-3 | 30+ (including Bass, Gompertz, Weibull, Richards, von Bertalanffy, MMF) |
| Forecasting | Basic | 8 methods (ARIMA, SARIMA, ETS, TBATS, Prophet, State Space, Naive, Ensemble) |
| Spatial Statistics | None | Full suite (Moran's I, Geary's C, LISA, Getis-Ord Gi*) |
| Economic Correlation | None | GDP, HDI, R&D integration |
| Bootstrap CIs | None | BCa method for all metrics |
| Hypothesis Tests | Limited | 40+ formal tests with FDR correction |

Unique Capabilities

30+ Growth Models: including Bass diffusion, Gompertz, Weibull, Richards, von Bertalanffy, and Morgan-Mercer-Flodin (MMF), with model comparison
Ensemble Forecasting: AIC-weighted model averaging
Spatial Bibliometrics: First implementation of Moran's I, LISA for bibliometric data
PRISMA Automation: Automatic PRISMA 2020 flow diagrams from JSON/YAML specs
Bootstrap Confidence Intervals: BCa (bias-corrected and accelerated) method for robust inference
QAP Network Tests: Statistical significance for collaboration networks
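As a rough illustration of AIC-weighted model averaging, the sketch below turns AIC values into Akaike weights and uses them to combine point forecasts. The AIC values, the forecasts, and the helper name `akaike_weights` are all made up for the example; this is not RBiblioSynth's internal code.

```r
# Hypothetical AIC values and next-year point forecasts for three models
aic       <- c(arima = 210.4, ets = 212.1, naive = 219.8)
forecasts <- c(arima = 134,   ets = 128,   naive = 120)

# Akaike weights: w_i = exp(-0.5 * delta_i) / sum_j exp(-0.5 * delta_j),
# where delta_i is the AIC difference to the best (lowest-AIC) model
akaike_weights <- function(aic) {
  delta   <- aic - min(aic)
  rel_lik <- exp(-0.5 * delta)
  rel_lik / sum(rel_lik)
}

w        <- akaike_weights(aic)
ensemble <- sum(w * forecasts)  # weighted average of the point forecasts

print(round(w, 3))
print(ensemble)
```

The weights sum to one and favor the lowest-AIC model, so the ensemble forecast always lies between the smallest and largest individual forecast.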

Quick Start

# Load the framework by sourcing its bootstrap script from the repository root
source("R/core/bootstrap.R")

# Define data sources
sources <- list(
  scopus = list(file = "data/scopus.bib", db = "scopus", format = "bibtex")
)

# Run M0 Data Orchestrator
m0_result <- run_m0(sources)
bib_data <- m0_get_bib_data(m0_result)

# Run M1 Main Information
m1_result <- run_m1(bib_data)

# Run M2 Annual Production
annual_data <- m0_get(m0_result, "annual")
m2_result <- run_m2(annual_data)

# Run M3 Countries
m3_result <- run_m3(bib_data)

Module Details

M0: Data Orchestrator

Multi-source loading (Scopus, WoS, OpenAlex, Generic CSV)
Automatic deduplication (DOI + fuzzy title matching)
PRISMA 2020 diagram generation
Quality validation (ORCID, DOI, email extraction)
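The deduplication step (DOI match first, then fuzzy title matching) can be sketched in base R as follows. The function names, the toy records, and the 10% relative edit-distance threshold are assumptions made for illustration, not the package's actual implementation.

```r
# Toy records merged from several sources (hypothetical data)
records <- data.frame(
  doi   = c("10.1000/abc", "10.1000/ABC", NA, NA, "10.1000/xyz"),
  title = c("Deep learning for bibliometrics",
            "Deep Learning for Bibliometrics",
            "A survey of citation networks",
            "A survey of citation network",
            "Topic models in science mapping"),
  stringsAsFactors = FALSE
)

# Lower-case, strip punctuation, collapse whitespace
normalize_title <- function(x) {
  x <- gsub("[^a-z0-9 ]", "", tolower(x))
  trimws(gsub("\\s+", " ", x))
}

deduplicate <- function(df, max_rel_dist = 0.1) {
  # Step 1: drop repeated DOIs (case-insensitive); keep records without a DOI
  doi_key <- tolower(df$doi)
  df <- df[is.na(doi_key) | !duplicated(doi_key), , drop = FALSE]

  # Step 2: drop later records whose normalized title is within
  # max_rel_dist (Levenshtein distance / longer title length) of an earlier one
  titles <- normalize_title(df$title)
  len    <- nchar(titles)
  rel    <- adist(titles) / pmax(outer(len, len, pmax), 1)
  keep   <- rep(TRUE, nrow(df))
  for (i in seq_len(nrow(df))) {
    if (!keep[i]) next
    later <- which(seq_len(nrow(df)) > i & rel[i, ] <= max_rel_dist)
    keep[later] <- FALSE
  }
  df[keep, , drop = FALSE]
}

unique_records <- deduplicate(records)
print(unique_records$title)
```

The pairwise distance matrix grows quadratically with the number of records, so a real pipeline would likely block on year or first author before comparing titles.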

M1: Main Information

Author productivity and indices (h-index, g-index, m-quotient, i10)
Citation analysis with distribution fitting
Topic modeling (LDA with coherence/perplexity)
Bradford's Law and Lotka's Law analysis
Keyword co-occurrence networks
Kleinberg burst detection
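For readers unfamiliar with the author indices listed above, here is a minimal base R sketch of their standard definitions. The function names and the citation vector are hypothetical and only illustrate the arithmetic.

```r
# Citation counts for one author's papers (hypothetical data)
citations    <- c(25, 8, 5, 3, 3, 1, 0)
years_active <- 6

# h-index: largest h such that h papers have at least h citations each
h_index <- function(cites) {
  cs <- sort(cites, decreasing = TRUE)
  sum(cs >= seq_along(cs))
}

# g-index: largest g such that the top g papers have at least g^2 citations
# in total (capped here at the number of papers)
g_index <- function(cites) {
  cs <- sort(cites, decreasing = TRUE)
  sum(cumsum(cs) >= seq_along(cs)^2)
}

# i10-index: number of papers with at least 10 citations
i10_index <- function(cites) sum(cites >= 10)

# m-quotient: h-index divided by years since first publication
m_quotient <- function(cites, years) h_index(cites) / years

print(c(h   = h_index(citations),
        g   = g_index(citations),
        i10 = i10_index(citations),
        m   = m_quotient(citations, years_active)))
```

For this vector the indices are h = 3, g = 6, i10 = 1, and m = 0.5.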

M2: Annual Production

30+ Growth Models: including Bass, Gompertz, Weibull, Richards, von Bertalanffy, MMF
8 Forecasting Methods: ARIMA, SARIMA, ETS, TBATS, Prophet, State Space, Naive, Ensemble
Changepoint Detection: PELT, CUSUM, Binary Segmentation
Harmonic Analysis: FFT, Lomb-Scargle
Model Diagnostics: AIC/BIC comparison, cross-validation
15+ Statistical Tests: Normality, autocorrelation, heteroscedasticity
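To give a feel for CUSUM-based changepoint detection on annual production, the sketch below locates a single shift in the mean of a yearly series. It is a simplified textbook estimator with made-up data and a hypothetical function name; it performs no significance test and does not reflect how the module implements PELT or binary segmentation.

```r
# Hypothetical annual article counts with a level shift after 2014
annual <- data.frame(
  year     = 2010:2019,
  articles = c(10, 12, 11, 13, 12, 25, 27, 26, 28, 27)
)

# Single-changepoint CUSUM estimate for a shift in mean:
# the changepoint is where |cumulative sum of deviations from the mean| peaks
cusum_changepoint <- function(x) {
  s <- cumsum(x - mean(x))
  # The final cumulative sum is zero by construction, so exclude it
  k <- which.max(abs(s[-length(s)]))
  list(index = k, cusum = s)
}

cp <- cusum_changepoint(annual$articles)
last_year_before_shift <- annual$year[cp$index]
print(last_year_before_shift)  # 2014
```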

M3: Countries

Spatial Statistics: Moran's I, Geary's C, LISA, Getis-Ord Gi*
Economic Correlation: GDP, HDI, R&D expenditure
Collaboration Indices: Salton, Jaccard, Affinity
Temporal Dynamics: Rank mobility, Markov transitions, NELSOP
QAP Network Tests: Statistical significance for correlations
12 Formal Hypothesis Tests
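As a minimal sketch of the spatial statistics above, global Moran's I can be computed directly from a vector of country-level values and a spatial weights matrix. The four-country chain, the publication counts, and the function name `morans_i` are assumptions for illustration only.

```r
# Global Moran's I:
# I = (n / sum(w)) * sum_ij w_ij * z_i * z_j / sum_i z_i^2, with z = x - mean(x)
morans_i <- function(x, w) {
  stopifnot(is.matrix(w), nrow(w) == length(x), ncol(w) == length(x))
  z <- x - mean(x)
  (length(x) / sum(w)) * sum(w * outer(z, z)) / sum(z^2)
}

# Hypothetical publication counts for four countries arranged in a chain
# (A borders B, B borders C, C borders D)
pubs <- c(A = 120, B = 95, C = 30, D = 22)
w <- matrix(0, 4, 4, dimnames = list(names(pubs), names(pubs)))
w["A", "B"] <- w["B", "A"] <- 1
w["B", "C"] <- w["C", "B"] <- 1
w["C", "D"] <- w["D", "C"] <- 1

i_obs <- morans_i(pubs, w)
print(i_obs)  # positive: neighbouring countries have similar output
```

A value near zero would suggest no spatial pattern. Judging significance requires a permutation or analytical test on top of this point estimate.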

Roadmap (Modules)

Bibliometric (R6 in R)

M0: Data Orchestrator (load, merge, organize sources; PRISMA diagram & report)
M1: Main Information (overview, doc types, authors, citations, countries, keywords, Bradford, Lotka)
M2: Annual Production (30+ growth models, forecasting, changepoint, diagnostics)
M3: Countries (spatial stats, economic correlation, QAP tests, temporal dynamics)
M4: Institutions (collaboration networks, quadrants, indicators)
M5: Authors & Documents (h-index evolution, citation networks)
M6: Clustering & Themes (co-word, Louvain, topic evolution)
M7: Conceptual & Social Structure (collaboration networks, betweenness)
M8: Automated Bibliometric Report (JSON + CSV + plots)

LLM / AI Integration

M9: LLM Report Generator (combine bibliometric insights with text analysis)
M10: Zotero Integration (tagging, notes, thematic structuring)

Statistical Rigor

All modules include:

Bootstrap Confidence Intervals: BCa (bias-corrected and accelerated) method for all statistics
Multiple Testing Correction: FDR (Benjamini-Hochberg)
Effect Sizes: Cohen's d, eta-squared
Power Analysis: Sample size considerations
Formal Hypothesis Tests: 40+ tests with interpretations
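Two of the items above can be illustrated with standard R tooling: a BCa bootstrap interval via the `boot` package (a recommended package that ships with most R installations) and Benjamini-Hochberg correction via `p.adjust()` from base R. The data and p-values are synthetic, and this sketch makes no claim about how RBiblioSynth wires these steps together.

```r
library(boot)

set.seed(42)
citations <- rpois(40, lambda = 12)  # synthetic citation counts

# Statistic in the form boot() expects: function(data, resample_indices)
mean_stat <- function(data, idx) mean(data[idx])

boot_out <- boot(citations, statistic = mean_stat, R = 2000)
ci       <- boot.ci(boot_out, conf = 0.95, type = "bca")
bca_interval <- ci$bca[4:5]  # lower and upper BCa limits
print(bca_interval)

# Benjamini-Hochberg FDR correction for a family of hypothetical p-values
p_raw <- c(0.001, 0.012, 0.030, 0.047, 0.210)
p_fdr <- p.adjust(p_raw, method = "BH")
print(p_fdr)
```

With five tests, the raw p-value of 0.047 becomes 0.05875 after adjustment and would no longer pass a 0.05 threshold.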

Visualization

# Treemap for hierarchical display
render_m1_treemap(m1_data, type = "countries")

# Sankey diagram for collaboration flows
render_m3_sankey(m3_data, type = "country")

# Three-field plot (Authors-Keywords-Sources)
render_m1_three_field(data)

# Time series with forecast bands
m2_result$artifacts$plots$forecasting

Installation

# From GitHub
devtools::install_github("yourusername/RBiblioSynth")
library(RBiblioSynth)

Citation

@article{rbibliosynth2026,
  title = {RBiblioSynth: A Comprehensive R Framework for Bibliometric Analysis},
  author = {Mayorga},
  journal = {SoftwareX},
  year = {2026},
  doi = {10.1016/j.softx.2026.XXXXX}
}

License

MIT License

Acknowledgments

This package extends bibliometrix with advanced statistical methods and rigorously tested implementations suitable for IEEE Q1 journal publications.

