SutureBot: High-Level Policy and ACT Training

A comprehensive framework for training high-level policies and language-conditioned Action Chunking with Transformers (ACT) for autonomous surgical suturing.

📋 Overview

This repository provides tools for training both low-level and high-level policies for the SutureBot system, enabling autonomous surgical suturing through imitation learning and language conditioning.

🚀 Quick Start

Prerequisites

  • Python 3.8.10
  • CUDA-compatible GPU with roughly 21 GB+ of memory (e.g., an RTX 4090)
  • Ubuntu/Linux environment

Installation

  1. Clone the repository

  2. Set up Python environment

    conda create -n suturebot python=3.8.10
    conda activate suturebot
    pip install -r requirements.txt
  3. Optional: Audio processing setup

    # For Whisper integration
    sudo apt update && sudo apt install ffmpeg
    
    # For audio recording capabilities
    sudo apt install portaudio19-dev python3-pyaudio

Environment Configuration

Add these environment variables to your ~/.bashrc:

export PATH_TO_SUTUREBOT=/path/to/your/srth/directory
export PATH_TO_DATASET=/path/to/your/dataset/folders
export YOUR_CKPT_PATH="$PATH_TO_SUTUREBOT/model_ckpts"

Then reload your shell:

source ~/.bashrc
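Before launching training, it can help to confirm that these variables are actually visible to Python. The sketch below is illustrative only (the variable names come from the export lines above; the helper itself is not part of the codebase):

```python
import os

# The three variables exported in ~/.bashrc above.
REQUIRED_VARS = ["PATH_TO_SUTUREBOT", "PATH_TO_DATASET", "YOUR_CKPT_PATH"]

def check_environment() -> dict:
    """Return the resolved paths, raising if any variable is missing."""
    resolved = {}
    for name in REQUIRED_VARS:
        value = os.environ.get(name)
        if not value:
            raise EnvironmentError(f"{name} is not set; add it to ~/.bashrc")
        resolved[name] = value
    return resolved
```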

🎯 Training Pipeline

Project Structure

├── src/
│   ├── act/                             # Low-level policy implementation
│   │   ├── dvrk_scripts/
│   │   │   └── constants_dvrk.py        # Task configuration settings
│   │   ├── generic_dataset.py           # Dataset handling
│   │   ├── auto_training_suturing.py    # Training orchestration
│   │   ├── imitate_episodes.py          # Core training logic
│   │   └── img_aug.py                   # Image augmentation utilities
│   └── instructor/                      # High-level policy implementation
├── script/                              # Utility scripts
│   ├── calculate_std_mean.py            # Dataset normalization statistics
│   ├── suture_point_labeling.py         # Annotation tools
│   └── encode_instruction.py            # Text embedding generation

🔧 Low-Level Policy Training

Dataset Structure

Organize your training data according to this hierarchy:

$PATH_TO_DATASET/
├── [DATASET_NAME]/                      # Dataset root directory
│   ├── tissue_1/                        # Tissue sample 1
│   │   └── 1_[task_name]/               # Task directory
│   │       └── [episode_timestamp]/     # Episode (timestamped)
│   │           ├── left_img_dir/        # Left endoscope images
│   │           │   └── frame000000_left.jpg
│   │           ├── right_img_dir/       # Right endoscope images
│   │           │   └── frame000000_right.jpg
│   │           ├── endo_psm1/           # Right wrist camera
│   │           │   └── frame000000_psm1.jpg
│   │           ├── endo_psm2/           # Left wrist camera
│   │           │   └── frame000000_psm2.jpg
│   │           └── ee_csv.csv           # Kinematics data
│   └── tissue_2/                        # Additional tissue samples
│       └── ...                          # Same structure as above

Training Steps

Follow these steps to train your low-level policy:

1. Generate Text Embeddings

python script/encode_instruction.py \
    --dataset_dir $PATH_TO_DATASET/[DATASET_NAME] \
    --encoder distilbert \
    --from_count

This creates candidate_embeddings_distilbert.json with task name and direction correction embeddings.
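The exact JSON schema is defined by `encode_instruction.py`. Assuming it stores a mapping from instruction text to embedding vector, a downstream consumer could load and sanity-check the file like this (the schema and helper are assumptions for illustration):

```python
import json

def load_candidate_embeddings(path: str) -> dict:
    """Load instruction embeddings, assuming a {text: vector} JSON mapping."""
    with open(path) as f:
        candidates = json.load(f)
    # All embeddings should share one dimensionality (768 for DistilBERT).
    dims = {len(vec) for vec in candidates.values()}
    if len(dims) != 1:
        raise ValueError(f"inconsistent embedding sizes: {dims}")
    return candidates
```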

2. Calculate Dataset Statistics

python script/calculate_std_mean.py

Note: Configure tissue IDs and data directory in the script before running. Results are saved to script/chole/std_mean.txt.
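The statistics this script produces are per-channel means and standard deviations over the training images. The computation amounts to the following sketch (pixels scaled to [0, 1]; this is not the script itself):

```python
def channel_mean_std(images):
    """Per-channel mean/std over HxWx3 uint8 images, pixels scaled to [0, 1].

    `images` is a list of nested lists: images[i][row][col] = (r, g, b).
    """
    sums = [0.0, 0.0, 0.0]
    sq_sums = [0.0, 0.0, 0.0]
    count = 0
    for img in images:
        for row in img:
            for pixel in row:
                for c in range(3):
                    v = pixel[c] / 255.0
                    sums[c] += v
                    sq_sums[c] += v * v
                count += 1
    mean = [s / count for s in sums]
    # Population std via E[x^2] - E[x]^2.
    std = [(sq / count - m * m) ** 0.5 for sq, m in zip(sq_sums, mean)]
    return mean, std
```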

3. Configure Training Parameters

Edit src/act/dvrk_scripts/constants_dvrk.py with your task configuration:

| Parameter | Description | Example Value |
|-----------|-------------|---------------|
| `dataset_dir` | Path to your dataset | `$PATH_TO_DATASET/my_dataset` |
| `num_episodes` | Total episodes (from step 2) | `150` |
| `tissue_samples_ids` | Tissue IDs for training | `[1, 2, 3]` |
| `camera_names` | Cameras to use | `["left", "right", "left_wrist", "right_wrist"]` |
| `action_mode` | Action representation | `"hybrid"` (recommended) |
| `norm_scheme` | Normalization method | `"std"` (recommended) |
| `goal_condition_style` | Goal-condition mode (`"dot"`, `"map"`, or `"mask"`) | `"dot"` |
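The authoritative schema lives in `src/act/dvrk_scripts/constants_dvrk.py`; a representative task entry built from the parameters above might look like this (the surrounding dict name and exact field layout are assumptions):

```python
import os

# Hypothetical task entry mirroring the parameter table above; consult
# src/act/dvrk_scripts/constants_dvrk.py for the authoritative schema.
TASK_CONFIGS = {
    "my_suturing_task": {
        "dataset_dir": os.path.join(
            os.environ.get("PATH_TO_DATASET", "."), "my_dataset"),
        "num_episodes": 150,
        "tissue_samples_ids": [1, 2, 3],
        "camera_names": ["left", "right", "left_wrist", "right_wrist"],
        "action_mode": "hybrid",
        "norm_scheme": "std",
        "goal_condition_style": "dot",
    }
}
```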

4. Set Up Training Script

Configure auto_training_suturing.py with these key parameters:

| Parameter | Description | Recommended Value |
|-----------|-------------|-------------------|
| `task_name` | Task config name | From `constants_dvrk.py` |
| `policy_class` | Policy type | `"ACT"` |
| `batch_size` | Training batch size | `16` (requires ~21 GB GPU memory) |
| `num_epochs` | Training epochs | `1000` |
| `language_encoder` | Text encoder | `"distilbert"` |
| `image_encoder` | Vision backbone | `"efficientnet_b3film"` |
| `policy_level` | Policy level | `"low"` |

5. Start Training

python auto_training_suturing.py

Tip: The training script automatically handles interruptions and resumes from the last checkpoint.
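Resuming from the last checkpoint typically means scanning the checkpoint directory for the highest saved epoch. This is an illustrative sketch; the file-name pattern is an assumption, and the real resume logic lives in the training scripts:

```python
import os
import re

def latest_checkpoint(ckpt_dir: str):
    """Return (path, epoch) of the newest checkpoint, or (None, 0) if none.

    The naming pattern below is assumed for illustration only.
    """
    pattern = re.compile(r"policy_epoch_(\d+)\.ckpt$")
    best_path, best_epoch = None, 0
    for name in os.listdir(ckpt_dir):
        m = pattern.match(name)
        if m and int(m.group(1)) >= best_epoch:
            best_path = os.path.join(ckpt_dir, name)
            best_epoch = int(m.group(1))
    return best_path, best_epoch
```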

🧠 High-Level Policy Training

Data Preparation

  1. Configure Dataset Path

    • Set DATA_DIR in src/instructor/constants_daVinci.py
    • Define camera folder names and validation/test tissue splits
    • Labels are automatically extracted from directory names
  2. Dataset Statistics

    python script/chole/dataset_rgb_mean_std.py

    Alternative: Use ImageNet statistics if dataset-specific stats aren't available.
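When falling back to ImageNet statistics, the standard values are the widely used torchvision constants (not computed from this dataset):

```python
# Standard ImageNet per-channel statistics (RGB, pixel values in [0, 1]).
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb):
    """Normalize one RGB pixel (values in [0, 1]) the way ImageNet-pretrained
    backbones such as SwinT expect: (x - mean) / std per channel."""
    return [(v - m) / s for v, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)]
```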

Data Quality Control

  • Check that all recordings look good:
    • Use `script/chole/concatenate_all_tissue_demos.py` to create a video concatenating all demonstrations for one tissue, then verify that every demonstration looks correct and sits in the right task directory (occasionally a demonstration is saved in the previous task's folder, or vice versa).
    • Specifically check that the demonstrations are complete. If a demonstration started too late or ended too early, concatenating the task recordings produces erroneous results.
  • If a task recording started too early or runs too long, add an `indices_curated.json` file to the demonstration directory with the keys `"start"` and/or `"end"` giving the frame index of the curated start/end.
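A loader that honors this file would trim the frame range accordingly. The sketch below is illustrative and assumes the `"end"` index is inclusive; check the dataset code for the authoritative behavior:

```python
import json
import os

def apply_curated_indices(demo_dir: str, frames: list) -> list:
    """Trim a demonstration's frame list using indices_curated.json, if present.

    The file may contain "start" and/or "end" keys holding frame indices,
    as described above; a missing key leaves that side untrimmed.
    Treating "end" as inclusive is an assumption made for illustration.
    """
    curated_path = os.path.join(demo_dir, "indices_curated.json")
    if not os.path.exists(curated_path):
        return frames
    with open(curated_path) as f:
        curated = json.load(f)
    start = curated.get("start", 0)
    end = curated.get("end", len(frames) - 1)
    return frames[start:end + 1]
```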

Key Components

| File | Purpose |
|------|---------|
| `model_daVinci.py` | Temporal models (Transformer, etc.) |
| `backbone_models_daVinci.py` | Vision backbones (ResNet, SwinT) |
| `dataset_daVinci.py` | Dataset loading and augmentation |
| `train_daVinci.py` | Training orchestration |
| `instructor_pipeline.py` | Inference pipeline |

Training Command

python train_daVinci.py \
    --dataset_names [dataset_name] \
    --ckpt_dir ./model_ckpts/hl/suturing_hl_3 \
    --gpu 0 \
    --recovery_probability 0.6 \
    --batch_size 16 \
    --num_epochs 2000 \
    --lr 4e-4 \
    --min_lr 1e-5 \
    --lr_cycle 25 \
    --warmup_epochs 5 \
    --weight_decay 0.05 \
    --validation_interval 10 \
    --prediction_offset 15 \
    --history_len 4 \
    --save_ckpt_interval 5 \
    --history_step_size 30 \
    --one_hot_flag \
    --early_stopping_interval 300 \
    --seed 5 \
    --plot_val_images_flag \
    --max_num_images 5 \
    --cameras_to_use left_img_dir \
    --backbone_model swin-t \
    --model_init_weights imagenet \
    --image_dim 224 224 \
    --freeze_backbone_until none \
    --multitask_loss_weight 0.6 \
    --uniform_sampling_flag \
    --extra_repeated_phase_last_frame_sampling_flag \
    --extra_repeated_phase_last_frame_sampling_probability 0.15 \
    --add_center_crop_view_flag \
    --global_pool_image_features_flag \
    --dataset_mean_std_file_names "dataset_mean_std_camera_type='left_img_dir'_image_step_size=1.json" \
    --val_split_number 0 \
    --use_complexer_multitask_mlp_head_flag \
    --selected_multitasks dominant_moving_direction 
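The history-related flags determine which frames feed the temporal model: with `--history_len 4`, `--history_step_size 30`, and `--prediction_offset 15`, the model sees the current frame plus four past frames spaced 30 frames apart and predicts the phase 15 frames ahead. The sketch below is one illustrative reading of those flags (the exact sampling lives in `dataset_daVinci.py`, and whether `history_len` counts the current frame is an assumption here):

```python
def history_and_target_indices(t, history_len=4, history_step_size=30,
                               prediction_offset=15):
    """Frame indices observed at time t, plus the predicted target index."""
    # Oldest frame first: t - 4*30, t - 3*30, ..., t.
    observed = [t - i * history_step_size for i in range(history_len, -1, -1)]
    observed = [max(idx, 0) for idx in observed]  # clamp before episode start
    target = t + prediction_offset
    return observed, target
```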

📝 Notes

  • Experimental Features: The codebase contains experimental features that can be removed for simplification
  • Deprecated Files:
    • future_frame_predictor_model.py
    • hl_correction_publisher_ui_w_whisper.py
    • temporal_models.py (contains only TCN, not the final Transformer architecture)

🤝 Contributing

When contributing to this project, please ensure:

  • Code follows existing style conventions
  • New features are documented
  • Training configurations are tested before submission

📞 Support

For questions or issues, please contact: [email protected]

📄 Citation

@misc{suturebot2025,
  title        = {SutureBot: A Precision Framework and Benchmark for Autonomous End-to-End Suturing},
  author       = {Haworth, Jesse and Chen, Juo-Tung and Nelson, Nigel and Kim, Ji Woong and Moghani, Masoud and Finn, Chelsea and Krieger, Axel},
  year         = {2025},
  note         = {Under review at NeurIPS 2025 Datasets and Benchmarks Track},
  howpublished = {\url{https://huggingface.co/datasets/jchen396/suturebot}}
}
