cd /Users/gk/Code
git clone --recurse-submodules https://github.com/GeorgeKontsevik/super-duper-disser.git super-duper-disser
cd /Users/gk/Code/super-duper-disser
chmod +x scripts/bootstrap_fresh_machine.sh
./scripts/bootstrap_fresh_machine.sh

What the script does:
- installs `uv` if missing
- tries to install `python3`, `curl`, and `git` automatically via the available package manager (`brew`, `apt`, `dnf`, `yum`, `pacman`, `zypper`, `winget`, `choco`)
- initializes git submodules
- creates the root orchestration env: `.venv`
- creates dedicated per-repo envs:
  - `blocksnet/.venv`
  - `connectpt/.venv` with forked `iduedu` available for preprocessing imports
  - `floor-predictor/.venv`
  - `segregation-by-design-experiments/.venv`
  - `iduedu-fork/.venv` from forked `GeorgeKontsevik/IduEdu`
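The dependency probing behind the bootstrap can be sketched as follows. This is illustrative only: the actual script is shell, and the function names here are hypothetical; only the tool and package-manager lists come from the description above.

```python
import shutil
from typing import Optional

# Tool and package-manager names taken from the bootstrap description above;
# the helper functions themselves are hypothetical.
REQUIRED_TOOLS = ("python3", "curl", "git")
PACKAGE_MANAGERS = ("brew", "apt", "dnf", "yum", "pacman", "zypper", "winget", "choco")


def missing_tools() -> list:
    """Return the required tools that are not currently on PATH."""
    return [tool for tool in REQUIRED_TOOLS if shutil.which(tool) is None]


def available_package_manager() -> Optional[str]:
    """Return the first package manager found on PATH, if any."""
    for pm in PACKAGE_MANAGERS:
        if shutil.which(pm):
            return pm
    return None
```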
cd /Users/gk/Code/super-duper-disser
PLACE="Saint Petersburg, Russia"
PYTHONPATH=/Users/gk/Code/super-duper-disser .venv/bin/python -m aggregated_spatial_pipeline.pipeline.run_joint --place "$PLACE" --buffer-m 5000 --street-grid-step 500
PYTHONPATH=/Users/gk/Code/super-duper-disser blocksnet/.venv/bin/python -m aggregated_spatial_pipeline.pipeline.run_pipeline2_prepare_solver_inputs --place "$PLACE"
PYTHONPATH=/Users/gk/Code/super-duper-disser .venv/bin/python -m aggregated_spatial_pipeline.pipeline.run_pipeline3_street_pattern_to_quarters --place "$PLACE"

cd /Users/gk/Code/super-duper-disser
PYTHONPATH=/Users/gk/Code/super-duper-disser MPLCONFIGDIR=/tmp/mpl-super-duper-disser \
.venv/bin/python scripts/run_random_50_cities_pipeline.py \
--cities-file simplemaps_worldcities_basicv1/worldcities.csv \
--min-population 800000 \
--sample-size 30 \
--seed 42 \
--buffer-m 10000 \
--street-grid-step 500 \
--pt-subway-stop-buffer-m 0 \
--pt-dependency-top-routes 30 \
--services hospital polyclinic school kindergarten \
--output-root aggregated_spatial_pipeline/outputs/batch_runs/random50_pop800k_10km

This runner now supports:
- on-the-fly population filtering via `--min-population`
- automatic skip for already completed cities in the same `--output-root` (`joint/<slug>/manifest_joint.json` and `joint_inputs/<slug>/pipeline_2/manifest_prepare_solver_inputs.json`)
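The skip check can be pictured with a small sketch. The helper below is hypothetical (the runner's actual implementation may differ), but the manifest paths follow the layout named above: a city counts as done only when both manifests already exist.

```python
from pathlib import Path

# Hypothetical helper mirroring the skip rule described above: a city slug is
# considered complete when both its joint manifest and its pipeline_2 manifest
# already exist under the chosen --output-root.
def already_completed(output_root: Path, slug: str) -> bool:
    joint_manifest = output_root / "joint" / slug / "manifest_joint.json"
    p2_manifest = (
        output_root / "joint_inputs" / slug / "pipeline_2"
        / "manifest_prepare_solver_inputs.json"
    )
    return joint_manifest.exists() and p2_manifest.exists()
```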
Batch runner:
Force full rebuild for all cities:
cd /Users/gk/Code/super-duper-disser
PYTHONPATH=/Users/gk/Code/super-duper-disser MPLCONFIGDIR=/tmp/mpl-super-duper-disser \
.venv/bin/python scripts/run_random_50_cities_pipeline.py \
--cities-file simplemaps_worldcities_basicv1/worldcities.csv \
--min-population 800000 \
--sample-size 30 \
--seed 42 \
--buffer-m 10000 \
--street-grid-step 500 \
--pt-subway-stop-buffer-m 0 \
--pt-dependency-top-routes 30 \
--services hospital polyclinic school kindergarten \
--output-root aggregated_spatial_pipeline/outputs/batch_runs/random50_pop800k_10km \
--no-cache

Retry only failed cities from an existing batch summary:
cd /Users/gk/Code/super-duper-disser
.venv/bin/python - <<'PY'
import json, os, subprocess
from pathlib import Path
summary = Path("aggregated_spatial_pipeline/outputs/batch_runs/random50_pop800k_10km/summary.json")
data = json.loads(summary.read_text(encoding="utf-8"))
env = dict(os.environ)
env["PYTHONPATH"] = f"{Path.cwd()}:{env.get('PYTHONPATH','')}".rstrip(":")
env.setdefault("MPLCONFIGDIR", "/tmp/mpl-super-duper-disser")
for row in data.get("results", []):
    if row.get("status") != "failed":
        continue
    slug = row.get("slug", "unknown")
    print(f"\n==> retry failed city: {slug}")
    joint_manifest = Path(row["joint_output_dir"]) / "manifest_joint.json"
    p2_manifest = Path(row["joint_input_dir"]) / "pipeline_2" / "manifest_prepare_solver_inputs.json"
    try:
        if not joint_manifest.exists():
            subprocess.run(row["commands"]["run_joint"], check=True, env=env)
        else:
            print("  skip run_joint (already has manifest_joint.json)")
        if not p2_manifest.exists():
            subprocess.run(row["commands"]["run_pipeline2_prepare_solver_inputs"], check=True, env=env)
        else:
            print("  skip pipeline2 (already has manifest_prepare_solver_inputs.json)")
        row["status"] = "ok_after_retry"
        row.pop("error", None)
    except Exception as exc:
        row["status"] = "failed"
        row["error"] = str(exc)
summary.write_text(json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8")
print(f"\nupdated summary: {summary}")
PY

Audit all accumulated outputs and classify which city bundles are complete, resumable, or just experimental/non-bundle artifacts:
cd /Users/gk/Code/super-duper-disser
.venv/bin/python scripts/audit_outputs_status.py \
--only-problematic \
--print-cities \
--write-json /tmp/sdd_outputs_audit.json \
--write-tsv /tmp/sdd_outputs_audit.tsv

This audit distinguishes:
- complete city bundles with `pipeline_2`
- phase-1-complete bundles that can go straight to `pipeline_2`
- resumable partial bundles (`early` / `mid` / `late`)
- `non_bundle_layout` experimental roots that are not full city bundles
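A minimal sketch of how such a classification can be derived from the manifests named earlier. The audit script's real logic is richer; `classify_bundle` and the artifact heuristic here are assumptions for illustration.

```python
from pathlib import Path

# Hypothetical classifier: infer bundle status from which manifests/artifacts
# exist. Manifest filenames match the ones used elsewhere in this guide;
# the "any artifact" heuristic for resumable bundles is an assumption.
def classify_bundle(bundle: Path) -> str:
    p2_manifest = bundle / "pipeline_2" / "manifest_prepare_solver_inputs.json"
    joint_manifest = bundle / "manifest_joint.json"
    if p2_manifest.exists():
        return "complete"          # pipeline_2 outputs present
    if joint_manifest.exists():
        return "phase1_complete"   # can go straight to pipeline_2
    if any(bundle.iterdir()):
        return "resumable_partial" # some phase-1 artifacts exist
    return "non_bundle_layout"     # nothing recognizable as a city bundle
```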
One project-level visualization tool now owns the default preview canvas and base map styling:
- aggregated_spatial_pipeline/visualization/map_canvas.py
- aggregated_spatial_pipeline/visualization/__init__.py
Use it instead of adding custom per-module matplotlib setup when a step already has:
- a boundary or circle canvas
- one or more `GeoDataFrame` layers
- a default background layer such as blocks
- a preview PNG output
Core helpers provided by the tool:
- `normalize_preview_gdf(...)`
- `clip_to_preview_boundary(...)`
- `apply_preview_canvas(...)`
- `legend_bottom(...)`
- `footer_text(...)`
- `save_preview_figure(...)`
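A typical step wires these helpers together roughly like this (pseudocode; the exact signatures live in `map_canvas.py` and may differ):

```text
gdf = normalize_preview_gdf(layer)            # harmonize CRS/geometry for preview
gdf = clip_to_preview_boundary(gdf, boundary)
ax = apply_preview_canvas(boundary)           # default canvas + base map styling
gdf.plot(ax=ax)                               # draw the step's own layers
legend_bottom(ax)
footer_text(ax, caption)
save_preview_figure(ax, output_png)
```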
Already wired into:
- aggregated_spatial_pipeline/pipeline/run_joint.py
- aggregated_spatial_pipeline/pipeline/run_pipeline2_prepare_solver_inputs.py
- aggregated_spatial_pipeline/pipeline/run_sm_imputation_external.py
Quick check after changes:
cd /Users/gk/Code/super-duper-disser
PYTHONPYCACHEPREFIX=/Users/gk/Code/super-duper-disser/.cache/pyc python3 -m py_compile \
aggregated_spatial_pipeline/visualization/__init__.py \
aggregated_spatial_pipeline/visualization/map_canvas.py \
aggregated_spatial_pipeline/pipeline/run_joint.py \
aggregated_spatial_pipeline/pipeline/run_pipeline2_prepare_solver_inputs.py \
aggregated_spatial_pipeline/pipeline/run_sm_imputation_external.py

If a preview changes visually, check the rendered PNGs in:
- `aggregated_spatial_pipeline/outputs/joint_inputs`
  - `.../preview_png/all_together/`
  - `.../preview_png/stages/<stage>/`
One shared runtime module now owns the default logger format and workspace-local cache settings:
Use it instead of per-file LOG_FORMAT, ad-hoc logger.remove() setup, or /tmp-based MPLCONFIGDIR defaults.
Core helpers:
- `configure_logger(...)`
- `repo_cache_dir(...)`
- `repo_mplconfigdir(...)`
- `ensure_repo_mplconfigdir(...)`
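A sketch of what the cache helpers likely do, assuming the point is to keep caches workspace-local instead of under /tmp. The bodies and exact signatures are assumptions; only the helper names come from the list above.

```python
import os
from pathlib import Path

# Illustrative stand-ins for the runtime helpers named above; the real
# module's signatures and cache layout may differ.
def repo_cache_dir(repo_root: Path) -> Path:
    """Workspace-local cache root instead of /tmp."""
    return repo_root / ".cache"


def ensure_repo_mplconfigdir(repo_root: Path) -> Path:
    """Create a matplotlib config dir inside the repo cache and export it."""
    mpl_dir = repo_cache_dir(repo_root) / "matplotlib"
    mpl_dir.mkdir(parents=True, exist_ok=True)
    os.environ.setdefault("MPLCONFIGDIR", str(mpl_dir))
    return mpl_dir
```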
Each major step can now be invoked directly for a city bundle or territory.
Phase 1 only:
cd /Users/gk/Code/super-duper-disser
PYTHONPATH=/Users/gk/Code/super-duper-disser \
.venv/bin/python -m aggregated_spatial_pipeline.pipeline.run_joint \
--place "Tartu, Estonia" \
--buffer-m 1000 \
--street-grid-step 300 \
--collect-only \
--no-cache

Floor enrichment only:
cd /Users/gk/Code/super-duper-disser
PYTHONPATH=/Users/gk/Code/super-duper-disser \
floor-predictor/.venv/bin/python -m aggregated_spatial_pipeline.pipeline.run_floor_predictor_external \
--place "Tartu, Estonia"Intermodal graph only:
cd /Users/gk/Code/super-duper-disser
PYTHONPATH=/Users/gk/Code/super-duper-disser \
iduedu-fork/.venv/bin/python -m aggregated_spatial_pipeline.intermodal_graph_data_pipeline.run_bundle_external \
--place "Tartu, Estonia"BlocksNet bundle only:
cd /Users/gk/Code/super-duper-disser
PYTHONPATH=/Users/gk/Code/super-duper-disser \
blocksnet/.venv/bin/python -m aggregated_spatial_pipeline.blocksnet_data_pipeline.run_bundle_external \
--place "Tartu, Estonia"ConnectPT bundle only:
cd /Users/gk/Code/super-duper-disser
PYTHONPATH=/Users/gk/Code/super-duper-disser \
connectpt/.venv/bin/python -m aggregated_spatial_pipeline.connectpt_data_pipeline.run_bundle_external \
--place "Tartu, Estonia" \
--modalities bus tram trolleybus subway

SM imputation only:
cd /Users/gk/Code/super-duper-disser
PYTHONPATH=/Users/gk/Code/super-duper-disser \
sm_imputation/.venv/bin/python -m aggregated_spatial_pipeline.pipeline.run_sm_imputation_external \
--place "Tartu, Estonia"Solver inputs and accessibility:
cd /Users/gk/Code/super-duper-disser
PYTHONPATH=/Users/gk/Code/super-duper-disser \
blocksnet/.venv/bin/python -m aggregated_spatial_pipeline.pipeline.run_pipeline2_prepare_solver_inputs \
--place tartu_estonia \
--placement-exact

Street-pattern transfer to blocks:
cd /Users/gk/Code/super-duper-disser
PYTHONPATH=/Users/gk/Code/super-duper-disser \
.venv/bin/python -m aggregated_spatial_pipeline.pipeline.run_pipeline3_street_pattern_to_quarters \
--place tartu_estonia

ConnectPT route generation on an existing city bundle:
cd /Users/gk/Code/super-duper-disser
PYTHONPATH=/Users/gk/Code/super-duper-disser \
connectpt/.venv/bin/python -m aggregated_spatial_pipeline.connectpt_data_pipeline.run_route_generator_external \
--place "Tartu, Estonia" \
--modality bus \
--replace-in-intermodal \
--recompute-accessibility

Batch phase 1 for many cities:
cd /Users/gk/Code/super-duper-disser
PYTHONPATH=/Users/gk/Code/super-duper-disser \
.venv/bin/python -m aggregated_spatial_pipeline.pipeline.run_phase1_batch \
--regions europe usa australia_oceania africa asia \
--limit-per-region 25 \
--buffer-m 1000 \
--street-grid-step 300 \
--no-cache

Regional city lists live here:
The root repo contains a few explicitly temporary guardrails. If you add a new workaround, document it here as well.
Current temporary behavior:
- `aggregated_spatial_pipeline/connectpt_data_pipeline/pipeline.py` keeps the standard intermodal-to-connectpt stop bridge distance at `IDUEDU_CONNECTPT_BRIDGE_DISTANCE_M = 30.0`.
- Extend the solver to support changing capacities of existing services.
- Combine that capacity-change scenario with the genetic solver workflow.
- Integrate `sm-imputer` into the main pipeline.
- Recompute accessibility, provision, and optimization outputs after `sm-imputer`.
- Build a new bus graph in `connectpt`.
- Recompute accessibility on top of the new `connectpt` bus graph.
- Add `connectpt` optimization over proposed PT links together with service optimization.
- Move from city-level runs to agglomeration-level runs with `arctic_access` to account for seasonal effects.
- Repeat the same agglomeration-level flow for Africa to account for climate and external-environment effects.