A lightweight R package for structured, reproducible data analysis projects focusing on convention over configuration.
Preview: During setup, you'll be asked to choose:
- Project type: `project` (full-featured), `course` (teaching), or `presentation` (single talk)
- Notebook format: Quarto `.qmd` (recommended) or RMarkdown `.Rmd`
- Git: Whether to initialize a `git` repository
- Package management: Whether to use renv for package management
Not sure? Choose the defaults. You can always change these later in config.yml.
# Install
curl -fsSL https://raw.githubusercontent.com/table1/framework/main/inst/bin/install-cli.sh | bash
# Create projects
framework new myproject
framework new slides presentation
framework new
See Command Line Interface for full details.
One-liner (macOS/Linux/Windows with Git Bash):
curl -fsSL https://raw.githubusercontent.com/table1/framework-project/main/new-project.sh | bash
This guides you through creating a new project without installing the CLI.
Clone the template and customize init.R to your preferences:
git clone https://github.com/table1/framework-project my-project
cd my-project
Open init.R in your favorite editor to set your project name, type, and options, then run it:
framework::init(
project_name = "MyProject",
type = "project", # or "course" or "presentation"
use_renv = FALSE,
default_notebook_format = "quarto",
author_name = "Your Name", # Allows auto-filling Notebook author (optional)
author_email = "[email protected]",
author_affiliation = "Johns Hopkins University"
)
# Then run your code from your IDE. Or save your changes and run:
source("init.R")
- project (default): Full-featured research projects with exploratory notebooks, production scripts, organized data management, and documentation
- course: Teaching materials with presentations, student notebooks, and example data
- presentation: Single talks or presentations with minimal overhead: just data, helper functions, and output
Not sure? Use type = "project". You can always delete directories you don't need; you won't break anything.
Example structure:
project/
├── notebooks/ # Exploratory analysis
├── scripts/ # Production pipelines
├── data/
│ ├── source/private/ # Raw data (gitignored)
│ ├── source/public/ # Public raw data
│ ├── cached/ # Computation cache (gitignored)
│ └── final/private/ # Results (gitignored)
├── functions/ # Custom functions
├── results/private/ # Analysis outputs (gitignored)
├── docs/ # Documentation
├── config.yml # Project configuration
├── framework.db # Metadata/tracking database
└── .env # Secrets (gitignored)
Framework reduces boilerplate and enforces best practices for data analysis:
- Project scaffolding: Standardized directories, config-driven setup
- Data management: Declarative data catalog, integrity tracking, encryption (on roadmap)
- Auto-loading: Load the packages you use in every file with one command; no more file juggling with your `library()` calls
- Pain-free `renv` integration: Use `renv` for reproducible package management without having to fight `renv` or babysit it.
- Caching: Smart caching for expensive computations
- Database helpers: PostgreSQL, SQLite with credential management
- Supported file formats: CSV, TSV, RDS, Stata (.dta), SPSS (.sav), SAS (.xpt, .sas7bdat)
When you run init(), Framework creates:
- Project structure: Organized directories (varies by type)
- Configuration files: `config.yml` and optional `settings/` files
- Git setup: `.gitignore` configured to protect private data
- Tooling: `.lintr`, `.styler.R`, `.editorconfig` for code quality
- Database: `framework.db` for metadata tracking
- Environment: `.env` template for secrets
library(framework)
scaffold() # Loads packages, functions, config, standardizes working directory
Via config:
# config.yml or settings/data.yml
data:
  source:
    private:
      survey:
        path: data/source/private/survey.dta
        type: stata
        locked: true
# Load using dot notation
df <- data_load("source.private.survey")
Direct path:
df <- data_load("data/my_file.csv") # CSV
df <- data_load("data/stata_file.dta") # Stata
df <- data_load("data/spss_file.sav") # SPSS
Statistical formats (Stata/SPSS/SAS) are loaded with their metadata stripped by default for safety. Use `keep_attributes = TRUE` to preserve labels.
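The dot-notation key used with `data_load()` above can be pictured as a walk through the nested `data:` section of the config. The following is a minimal base-R sketch of that idea, not Framework's actual implementation: `resolve_dot_key()` is a hypothetical helper, and the `config` list is written by hand to mirror what a YAML parser would return for the catalog entry shown earlier.

```r
# Hypothetical helper (not part of Framework): resolve a dotted key
# such as "source.private.survey" against a nested list.
resolve_dot_key <- function(config, key) {
  parts <- strsplit(key, ".", fixed = TRUE)[[1]]
  node <- config
  for (part in parts) {
    # Stop with a clear message as soon as one level is missing.
    if (!is.list(node) || is.null(node[[part]])) {
      stop("No config entry for '", key, "' (failed at '", part, "')")
    }
    node <- node[[part]]
  }
  node
}

# Hand-written stand-in for the parsed YAML catalog entry.
config <- list(
  data = list(
    source = list(
      private = list(
        survey = list(
          path = "data/source/private/survey.dta",
          type = "stata",
          locked = TRUE
        )
      )
    )
  )
)

entry <- resolve_dot_key(config$data, "source.private.survey")
entry$path  # the file path recorded for this catalog entry
```

A lookup like this fails early on a mistyped key instead of silently returning `NULL`, which is one reason a declarative catalog can be safer than scattered hard-coded paths.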
model <- get_or_cache("model_v1", {
expensive_model_fit(df)
}, expire_after = 1440) # Cache for 24 hours
# Save data
data_save(processed_df, "final.private.clean", type = "csv")
# Save analysis output
result_save("regression_model", model, type = "model")
# Save notebook (blinded)
result_save("report", file = "report.html", type = "notebook",
            blind = TRUE, public = FALSE)
# config.yml
connections:
  db:
    driver: postgresql
    host: !expr Sys.getenv("DB_HOST")
    database: !expr Sys.getenv("DB_NAME")
    user: !expr Sys.getenv("DB_USER")
    password: !expr Sys.getenv("DB_PASS")
df <- query_get("SELECT * FROM users WHERE active = true", "db")
Simple:
default:
  packages:
    - dplyr
    - ggplot2
  data:
    example: data/example.csv
Advanced: Split config into settings/ files:
default:
  data: settings/data.yml
  packages: settings/packages.yml
  connections: settings/connections.yml
  security: settings/security.yml
Use .env for secrets:
DB_HOST=localhost
DB_PASS=secret
DATA_ENCRYPTION_KEY=key123
Reference in config:
security:
  data_key: !expr Sys.getenv("DATA_ENCRYPTION_KEY")

| Function | Purpose |
|---|---|
| `scaffold()` | Initialize session (load packages, functions, config) |
| `data_load()` | Load data from path or config |
| `data_save()` | Save data with integrity tracking |
| `query_get()` | Execute SQL query, return data |
| `query_execute()` | Execute SQL command |
| `get_or_cache()` | Lazy evaluation with caching |
| `result_save()` | Save analysis output |
| `result_get()` | Retrieve saved result |
| `scratch_capture()` | Quick debug/temp file save |
| `renv_enable()` | Enable renv for reproducibility (opt-in) |
| `renv_disable()` | Disable renv integration |
| `packages_snapshot()` | Save package versions to renv.lock |
| `packages_restore()` | Restore packages from renv.lock |
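To make "lazy evaluation with caching" concrete, here is a small base-R sketch of the pattern behind a call like `get_or_cache()`. It is an illustration only: `cache_sketch()`, its `cache_dir` argument, and the `.rds` storage are assumptions, and treating `expire_after` as minutes is inferred from the "1440 = 24 hours" comment in the caching example rather than confirmed by the package.

```r
# Hypothetical stand-in for the caching pattern (not Framework's code).
# `expr` is only evaluated when there is no fresh cached value.
cache_sketch <- function(name, expr, expire_after = NULL,
                         cache_dir = file.path(tempdir(), "cache")) {
  dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
  path <- file.path(cache_dir, paste0(name, ".rds"))

  if (file.exists(path)) {
    # Age of the cached file in minutes (assumed unit).
    age_min <- as.numeric(difftime(Sys.time(), file.mtime(path), units = "mins"))
    if (is.null(expire_after) || age_min < expire_after) {
      return(readRDS(path))  # fresh hit: expr is never evaluated
    }
  }

  value <- expr  # forces the lazily evaluated argument
  saveRDS(value, path)
  value
}

# Demonstration with a counter to show the second call is served from disk.
demo_dir <- tempfile("cache_")
calls <- 0
slow_square <- function(x) {
  calls <<- calls + 1
  x^2
}

a <- cache_sketch("square_demo", slow_square(4), expire_after = 1440, cache_dir = demo_dir)
b <- cache_sketch("square_demo", slow_square(4), expire_after = 1440, cache_dir = demo_dir)
calls  # counts how many times slow_square() actually ran
```

Because R evaluates function arguments lazily, passing the expensive expression as an argument is enough to skip it on a cache hit; no quoting or explicit closures are needed.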
- Hash tracking - All data files tracked with SHA-256 hashes
- Locked data - Flag files as read-only; modifying a locked file raises an error
- Encryption - AES encryption for sensitive data/results
- Gitignore by default - Private directories auto-ignored
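The hash-tracking and locking ideas in the list above can be sketched in a few lines of base R. This is an assumption-laden illustration, not Framework's implementation: `file_hash()`, `track_file()`, and `check_file()` are hypothetical names, the registry is a plain list rather than `framework.db`, and the sketch uses MD5 via `tools::md5sum()` only because it ships with base R, whereas the list above specifies SHA-256.

```r
# Hypothetical helpers illustrating hash tracking and locked files.
# NOTE: MD5 is used here for portability; the README specifies SHA-256.
file_hash <- function(path) unname(tools::md5sum(path))

# Record the current hash of a file in a registry (a plain named list here).
track_file <- function(registry, path, locked = FALSE) {
  registry[[path]] <- list(hash = file_hash(path), locked = locked)
  registry
}

# Compare the file on disk with its recorded hash.
# Returns TRUE if it changed; errors if a locked file changed.
check_file <- function(registry, path) {
  entry <- registry[[path]]
  if (is.null(entry)) stop("File is not tracked: ", path)
  changed <- !identical(file_hash(path), entry$hash)
  if (changed && entry$locked) {
    stop("Locked file was modified: ", path)
  }
  changed
}

# Demonstration on a temporary file.
demo <- tempfile(fileext = ".csv")
writeLines(c("id,score", "1,10"), demo)
registry <- track_file(list(), demo, locked = TRUE)
unchanged <- check_file(registry, demo)

# Simulate an accidental edit to the locked file and capture the error.
writeLines(c("id,score", "1,99"), demo)
result <- tryCatch(check_file(registry, demo),
                   error = function(e) conditionMessage(e))
```

Storing the hash at registration time and re-checking on every load is what turns "read-only" from a convention into something the tooling can enforce.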
Framework includes optional renv integration (OFF by default):
# Enable renv for this project
renv_enable()
# Your packages are now managed by renv
# Use snapshot after installing new packages
packages_snapshot()
# Disable renv if you prefer
renv_disable()
Version pinning in config.yml:
packages:
  - dplyr # Latest from CRAN
  - [email protected] # Specific version
  - tidyverse/dplyr@main # GitHub with ref
See renv integration docs for details.
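The three entry forms in the version-pinning list above (bare name, name with a version, and GitHub repository with a ref) can be told apart with simple string handling. The sketch below shows one way to do that in base R; `parse_package_spec()` is a hypothetical helper, the returned field names are invented for illustration, and `somepkg@1.2.3` is a made-up example entry.

```r
# Hypothetical parser for package entries (not Framework's own code).
# Handles "pkg", "pkg@version", and "owner/repo@ref".
parse_package_spec <- function(spec) {
  at <- strsplit(spec, "@", fixed = TRUE)[[1]]
  target <- at[1]
  ref <- if (length(at) > 1) at[2] else NA_character_
  # A slash in the target is taken to mean a GitHub "owner/repo" source.
  is_github <- grepl("/", target, fixed = TRUE)
  list(
    name    = basename(target),
    source  = if (is_github) "github" else "cran",
    repo    = if (is_github) target else NA_character_,
    version = if (!is_github) ref else NA_character_,
    ref     = if (is_github) ref else NA_character_
  )
}

plain  <- parse_package_spec("dplyr")
pinned <- parse_package_spec("somepkg@1.2.3")  # illustrative entry only
github <- parse_package_spec("tidyverse/dplyr@main")
```

Keeping the version and the Git ref in separate fields avoids ambiguity later, since a string after `@` means different things for CRAN and GitHub sources.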
- Excel file support
- Quarto codebook generation