23. CLI reference

Use this reference when you already know you want the terminal interface and need the exact command surface. For IDs and registry lookups, continue with Catalogs and registries.

23.1 What it is for

ModSSC exposes a root CLI plus brick-specific entry points for datasets, sampling, preprocessing, graphs, augmentation, evaluation, and method registries. The commands are implemented as Typer apps and grouped by subsystem, so you can inspect one subsystem from the terminal without writing any Python. [1][2][3]

23.2 When to use

  • Use modssc when you want one entry point with shared logging, doctor, and access to every brick.
  • Use the direct entry points when you want shorter commands in shell scripts or when a workflow only touches one subsystem.
  • Use the Python API instead when you need in-process objects, custom control flow, or notebook-friendly inspection.

23.3 Minimal examples

Start with the root help and the environment diagnostic:

modssc --help
modssc doctor --json

Then move to the brick you need:

modssc datasets list
modssc preprocess steps list
modssc inductive methods list --available-only

23.4 Command map

Command                 | Use it for                                    | Read next
------------------------+-----------------------------------------------+-------------------------------------
modssc doctor           | inspect installed bricks and missing extras   | Optional extras and platform support
modssc datasets ...     | list, inspect, download, and cache datasets   | Manage datasets
modssc sampling ...     | create and validate split artifacts           | Create and reuse sampling splits
modssc preprocess ...   | inspect steps/models and run preprocess plans | Run preprocessing plans
modssc graph ...        | build graphs and graph-derived views          | Build graphs and views
modssc augmentation ... | inspect augmentation ops                      | Use data augmentation
modssc evaluation ...   | list metrics and score predictions            | Compute evaluation metrics
modssc inductive ...    | inspect inductive methods                     | Inductive tutorial
modssc transductive ... | inspect transductive methods                  | Transductive tutorial
modssc supervised ...   | inspect supervised baselines                  | Catalogs and registries

23.5 How the CLI is installed and invoked

The CLI entry points are declared in pyproject.toml and implemented in src/modssc/cli/. [1][2]
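
One way to check which of these entry points an installation actually registered is to query the installed package metadata. The sketch below uses only the Python standard library. The helper name `modssc_console_scripts` is hypothetical (it is not part of ModSSC), and it returns an empty list when no matching console script is installed.

```python
from __future__ import annotations

from importlib.metadata import entry_points


def modssc_console_scripts(prefix: str = "modssc") -> list[str]:
    """Return installed console-script names equal to `prefix` or starting with `prefix-`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: the result supports .select(group=...)
        scripts = eps.select(group="console_scripts")
    else:
        # Python 3.8/3.9: the result is a plain dict keyed by group name
        scripts = eps.get("console_scripts", [])
    names = {ep.name for ep in scripts}
    return sorted(n for n in names if n == prefix or n.startswith(prefix + "-"))


if __name__ == "__main__":
    for name in modssc_console_scripts():
        print(name)
```

If the list is empty, the package is probably not installed in the active environment, which is worth ruling out before debugging a "command not found" error.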

Primary entry point:

modssc --help
modssc --version

Direct entry points:

modssc-datasets --help
modssc-sampling --help
modssc-preprocess --help
modssc-graph --help
modssc-inductive --help
modssc-transductive --help
modssc-augmentation --help
modssc-evaluation --help

Use modssc when you want one command namespace with shared logging and doctor. Use the direct entry points when you prefer smaller wrappers around the same Typer apps. [1][2][3]
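
Because the direct entry points wrap the same apps, a root-style invocation can be rewritten mechanically into its direct form. This is a minimal sketch, assuming each direct entry point accepts the same subcommands and options as the matching `modssc <brick>` group. The helper name `to_direct` is hypothetical, and the brick set simply mirrors the direct entry points listed above.

```python
from __future__ import annotations

import shlex

# Bricks that have a direct entry point listed on this page.
DIRECT_BRICKS = {
    "datasets", "sampling", "preprocess", "graph",
    "inductive", "transductive", "augmentation", "evaluation",
}


def to_direct(command: str) -> str:
    """Rewrite 'modssc <brick> ...' as 'modssc-<brick> ...' when a direct entry point is listed."""
    argv = shlex.split(command)
    if len(argv) >= 2 and argv[0] == "modssc" and argv[1] in DIRECT_BRICKS:
        argv = [f"modssc-{argv[1]}", *argv[2:]]
    # Anything else (doctor, root options first, unlisted bricks) is returned unchanged.
    return shlex.join(argv)


if __name__ == "__main__":
    print(to_direct("modssc datasets list"))
    print(to_direct("modssc doctor --json"))
```

Commands that start with a root option such as `--log-level` are left as they are, since that option belongs to the root command.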

23.6 Commands and subcommands

23.6.1 modssc

  • Purpose: Root CLI that wires all bricks and provides doctor and --version. [3]
  • Syntax: modssc [--version] [--log-level <level>] <command> [OPTIONS]
  • Options:
      • --version: print the package version and exit
      • --log-level / --log: logging level (none, basic, detailed)
  • Examples:
modssc doctor
modssc --log-level detailed datasets list
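
The syntax line places `--log-level` before the subcommand, since it is an option of the root command. The sketch below builds such an argument list and rejects levels other than the three listed above. `root_argv` is a hypothetical helper for scripts that shell out to the CLI, not part of ModSSC.

```python
from __future__ import annotations

# Values listed for --log-level on this page.
LOG_LEVELS = ("none", "basic", "detailed")


def root_argv(subcommand: list[str], log_level: str | None = None) -> list[str]:
    """Build an argv list with the root --log-level option placed before the subcommand."""
    argv = ["modssc"]
    if log_level is not None:
        if log_level not in LOG_LEVELS:
            raise ValueError(f"log level must be one of {LOG_LEVELS}, got {log_level!r}")
        argv += ["--log-level", log_level]
    return argv + list(subcommand)


if __name__ == "__main__":
    print(root_argv(["datasets", "list"], log_level="detailed"))
```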

23.6.2 modssc doctor

  • Purpose: Report which optional CLI bricks are available and which extras are missing. [3]
  • Syntax: modssc doctor [--json]
  • Options:
      • --json: emit machine-readable JSON
  • Examples:
modssc doctor
modssc doctor --json
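
Because `--json` emits machine-readable output, a script can consume it directly. This is a minimal sketch under stated assumptions: it assumes the command prints a single JSON document on stdout and exits with status 0, and it does not rely on any particular keys, since the schema is not described on this page. `run_json` is a hypothetical helper.

```python
from __future__ import annotations

import json
import shutil
import subprocess
from typing import Any


def run_json(argv: list[str]) -> Any:
    """Run a command and parse its stdout as JSON; raises if the command exits non-zero."""
    completed = subprocess.run(argv, check=True, capture_output=True, text=True)
    return json.loads(completed.stdout)


if __name__ == "__main__":
    if shutil.which("modssc") is None:
        print("modssc is not on PATH")
    else:
        report = run_json(["modssc", "doctor", "--json"])
        # The schema is not documented here, so only show the top-level shape.
        if isinstance(report, dict):
            print(sorted(report))
        else:
            print(type(report).__name__)
```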

23.6.3 modssc datasets

  • Purpose: List, inspect, and download datasets plus manage cache. [4]
  • Syntax: modssc datasets <providers|list|info|download|cache> [OPTIONS]
  • Options (selected):
      • list --modalities <modality>
      • info --dataset <id>
      • download (--dataset <id> | --all), plus --force, --cache-dir, --ignore-missing-extras/--no-ignore-missing-extras, --skip-cached, --modalities
  • Examples:
modssc datasets list
modssc datasets info --dataset toy
modssc datasets download --dataset toy

23.6.4 modssc datasets cache

  • Purpose: Inspect and clean the dataset cache. [4]
  • Syntax: modssc datasets cache <ls|purge|gc> [OPTIONS]
  • Options:
      • ls --cache-dir <path>
      • purge <dataset_or_fp> [--fingerprint]
      • gc [--keep-latest/--no-keep-latest]
  • Examples:
modssc datasets cache ls
modssc datasets cache purge toy

23.6.5 modssc sampling

  • Purpose: Create and inspect deterministic SSL splits. [5]
  • Syntax: modssc sampling <create|show|validate> [OPTIONS]
  • Options:
      • create --dataset <id> --plan <file> --out <dir> [--seed <n>] [--overwrite]
      • show <split_dir>
      • validate <split_dir> --dataset <id>
  • Examples:
modssc sampling create --dataset toy --plan sampling_plan.yaml --out splits/toy
modssc sampling show splits/toy
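
Since `create` accepts `--seed` and `--out`, a script can prepare one split directory per seed. The sketch below only builds and prints the command lines, it does not run them. The seed values and the `seed_<n>` directory layout are illustrative assumptions, and `sampling_commands` is a hypothetical helper.

```python
from __future__ import annotations

import shlex


def sampling_commands(dataset: str, plan: str, out_root: str, seeds: list[int]) -> list[str]:
    """Return one 'modssc sampling create' command line per seed."""
    commands = []
    for seed in seeds:
        argv = [
            "modssc", "sampling", "create",
            "--dataset", dataset,
            "--plan", plan,
            "--out", f"{out_root}/seed_{seed}",  # assumed layout: one directory per seed
            "--seed", str(seed),
        ]
        commands.append(shlex.join(argv))
    return commands


if __name__ == "__main__":
    for line in sampling_commands("toy", "sampling_plan.yaml", "splits/toy", [0, 1, 2]):
        print(line)
```

Each resulting directory can then be checked with `modssc sampling show <split_dir>` or `modssc sampling validate <split_dir> --dataset <id>`.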

23.6.6 modssc preprocess

  • Purpose: Run preprocessing plans and inspect registries. [6]
  • Syntax: modssc preprocess <steps|models|run> [OPTIONS]
  • Options:
      • steps list [--json]
      • steps info <step_id>
      • models list [--modality <modality>] [--json]
      • models info <model_id>
      • run --plan <file> --dataset <id> [--seed <n>] [--no-cache] [--purge-unused]
  • Examples:
modssc preprocess steps list
modssc preprocess run --plan preprocess_plan.yaml --dataset toy

23.6.7 modssc graph

  • Purpose: Build graphs and graph-derived views; inspect caches. [7]
  • Syntax: modssc graph <build|views|cache> [OPTIONS]
  • Options (build, selected):
      • --dataset <id>
      • --spec <file> for a full graph spec [8]
      • --scheme knn|epsilon|anchor, --metric cosine|euclidean, --k, --radius, --backend auto|numpy|sklearn|faiss
      • --chunk-size, --n-anchors, --anchors-k, --anchors-method, --candidate-limit
      • --faiss-exact, --faiss-hnsw-m, --faiss-ef-search, --faiss-ef-construction
      • --seed, --cache, --cache-dir, --edge-shard-size, --resume
  • Examples:
modssc graph build --dataset toy --scheme knn --metric euclidean --k 8
modssc graph views build --dataset toy --views attr --views diffusion

23.6.8 modssc graph views

  • Purpose: Build graph-derived views and inspect the views cache. [7]
  • Syntax: modssc graph views <build|cache-ls> [OPTIONS]
  • Options (build, selected):
      • --dataset <id>
      • --views <name> (repeatable; attr, diffusion, struct)
      • --diffusion-steps, --diffusion-alpha
      • --struct-method, --struct-dim, --walk-length, --num-walks-per-node, --window-size, --p, --q
      • --scheme, --metric, --k-graph, --radius
  • Examples:
modssc graph views build --dataset toy --views attr --views diffusion --diffusion-steps 5
modssc graph views cache-ls

23.6.9 modssc graph cache

  • Purpose: Inspect or purge graph caches. [7]
  • Syntax: modssc graph cache <ls|purge>
  • Examples:
modssc graph cache ls
modssc graph cache purge

23.6.10 modssc augmentation

  • Purpose: List augmentation ops and inspect defaults. [9]
  • Syntax: modssc augmentation <list|info> [OPTIONS]
  • Options:
      • list [--modality <modality>]
      • info <op_id> [--as-json]
  • Examples:
modssc augmentation list --modality text
modssc augmentation info text.word_dropout --as-json

23.6.11 modssc evaluation

  • Purpose: List metrics and compute scores from .npy files. [10]
  • Syntax: modssc evaluation <list|compute> [OPTIONS]
  • Options:
      • list [--json]
      • compute --y-true <path> --y-pred <path> [--metric <name>] [--json]
  • Examples:
modssc evaluation list
modssc evaluation compute --y-true y_true.npy --y-pred y_pred.npy --metric accuracy
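
The `compute` subcommand reads its inputs from `.npy` files, which can be written with `numpy.save`. This is a minimal sketch, assuming NumPy is installed and that both files hold 1-D arrays of class labels with the same length. The shapes and dtypes the command actually accepts are not stated on this page, so treat that as an assumption. `write_label_files` is a hypothetical helper, and the file names match the example above.

```python
from __future__ import annotations

from pathlib import Path

import numpy as np


def write_label_files(directory: str | Path, y_true, y_pred) -> tuple[Path, Path]:
    """Save two equal-length 1-D label arrays as y_true.npy and y_pred.npy."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if y_true.ndim != 1 or y_true.shape != y_pred.shape:
        raise ValueError("expected two 1-D arrays of the same length")
    out = Path(directory)
    out.mkdir(parents=True, exist_ok=True)
    true_path = out / "y_true.npy"
    pred_path = out / "y_pred.npy"
    np.save(true_path, y_true)
    np.save(pred_path, y_pred)
    return true_path, pred_path


if __name__ == "__main__":
    # Toy labels for illustration only.
    print(write_label_files(".", [0, 1, 1, 0], [0, 1, 0, 0]))
```

With the two files in place, the `modssc evaluation compute` example above can be run against them.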

23.6.12 modssc inductive

  • Purpose: Inspect inductive method registry. [11]
  • Syntax: modssc inductive methods <list|info> [OPTIONS]
  • Options:
      • list [--all/--available-only]
      • info <method_id>
  • Examples:
modssc inductive methods list
modssc inductive methods info pseudo_label

23.6.13 modssc transductive

  • Purpose: Inspect transductive method registry. [12]
  • Syntax: modssc transductive methods <list|info> [OPTIONS]
  • Options:
      • list [--all/--available-only]
      • info <method_id>
  • Examples:
modssc transductive methods list
modssc transductive methods info label_propagation

23.6.14 modssc supervised

  • Purpose: List supervised baselines and their backends. [13]
  • Syntax: modssc supervised <list|info> [OPTIONS]
  • Options:
      • list [--available-only] [--json]
      • info <classifier_id>
  • Examples:
modssc supervised list --available-only
modssc supervised info logreg

23.7 Common mistakes

  • Running python -m bench.main ... after a PyPI-only install and expecting bench/ assets to exist locally. Use a source checkout for repository assets.
  • Expecting modssc datasets download --dataset <id> to ignore missing extras automatically. The --ignore-missing-extras flag exists, but a missing extra is still a dependency problem that you should resolve deliberately rather than skip over.
  • Looking for dataset IDs, step IDs, or method IDs in this page alone. Use Catalogs and registries for the full lists.
  • Treating modssc doctor as a full environment validator. It reports available bricks and missing extras, but it does not validate your benchmark config semantics.

Sources
  1. pyproject.toml
  2. src/modssc/cli/
  3. src/modssc/cli/app.py
  4. src/modssc/cli/datasets.py
  5. src/modssc/cli/sampling.py
  6. src/modssc/cli/preprocess.py
  7. src/modssc/cli/graph.py
  8. src/modssc/graph/specs.py
  9. src/modssc/cli/augmentation.py
  10. src/modssc/cli/evaluation.py
  11. src/modssc/cli/inductive.py
  12. src/modssc/cli/transductive.py
  13. src/modssc/cli/supervised.py