jamesforhealth/SurgeryOCR

SurgeryOCR

Description

This project automates the extraction and labeling of critical numerical data displayed in ophthalmological surgery videos. Using computer vision (CV) and optical character recognition (OCR), the repository provides tools to:

  • Identify and extract numerical values appearing on the video feed.
  • Label these extracted numbers based on the specific stage of the surgery they correspond to.
  • Provide a structured dataset for further analysis of surgical procedures, potentially leading to insights for research, training, and quality improvement in ophthalmology.
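
The labeling idea in the list above can be pictured as an interval lookup: each extracted value carries a timestamp, and the stage whose time segment contains that timestamp becomes its label. The following is a minimal sketch under stated assumptions, not the repository's implementation. The stage names, boundary times, and the `label_reading` helper are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical stage segments: (start time in seconds, stage name), sorted by start.
STAGES = [(0.0, "stage_a"), (95.0, "stage_b"), (410.0, "stage_c")]
STARTS = [start for start, _ in STAGES]


def label_reading(timestamp: float) -> str:
    """Return the name of the stage whose segment contains `timestamp`."""
    # bisect_right finds the first segment that starts after the timestamp,
    # so the segment just before it is the one that contains the timestamp.
    index = bisect_right(STARTS, timestamp) - 1
    if index < 0:
        raise ValueError(f"timestamp {timestamp} precedes the first stage")
    return STAGES[index][1]


# Hypothetical OCR readings: (timestamp in seconds, recognized value).
readings = [(12.5, 350), (120.0, 42), (500.0, 18)]
labeled = [(t, value, label_reading(t)) for t, value in readings]
print(labeled)
```

A sorted list of start times keeps each lookup logarithmic in the number of stages, which matters little for a handful of stages but keeps the helper simple.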

Key Features

  • Video Analysis: Processes ophthalmological surgery videos as input.
  • On-Screen OCR: Employs OCR to recognize numerical data displayed on the video frames.
  • Stage Identification: Aims to identify different stages of the surgery to provide context to the extracted data.
  • Data Labeling: Labels the extracted numerical data according to the identified surgical stage.
  • Output: Generates structured output (e.g., .jsonl files) containing the extracted numerical data with their corresponding labels and timestamps.
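
For downstream analysis, JSONL output of this kind can be read one JSON object per line. The sketch below assumes a record layout for illustration only: the field names `timestamp`, `region`, `value`, and `stage`, as well as the sample values, are hypothetical and may not match the files the tools actually write.

```python
from __future__ import annotations

import json
from collections import defaultdict

# Hypothetical sample of the JSONL output; real field names may differ.
SAMPLE_JSONL = """\
{"timestamp": 12.5, "region": "region_1", "value": 350, "stage": "stage_a"}
{"timestamp": 13.0, "region": "region_1", "value": 352, "stage": "stage_a"}
{"timestamp": 120.0, "region": "region_2", "value": 42, "stage": "stage_b"}
"""


def load_records(text: str) -> list[dict]:
    """Parse JSONL text: one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def values_by_stage(records: list[dict]) -> dict[str, list]:
    """Group the recognized values by their stage label."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["stage"]].append(record["value"])
    return dict(grouped)


records = load_records(SAMPLE_JSONL)
print(values_by_stage(records))
```

To read a real file instead of the inline sample, pass the contents of the file to `load_records`, for example `load_records(Path("results.jsonl").read_text())`, where the file name is again only a placeholder.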

Installation

  1. Clone the repository:

    git clone <your-repository-url>
    cd SurgeryOCR
  2. Install dependencies:

    pip install -r requirements.txt

Usage

Follow these steps to analyze surgery videos. All commands assume you run them from the root of the project directory.

  1. Place Videos: Put all surgery videos (e.g., .mp4 files) into the data/ directory.

  2. Extract ROI Images: Pre-process the videos to extract region-of-interest (ROI) images. This step significantly speeds up the subsequent analysis and makes the UI more responsive. The following command processes every video in the data/ directory, extracts all configured ROIs, and saves full frames for faster UI loading.

    python extract_roi_images.py --video ./data --region all --save-full-frames
  3. Run OCR Analysis: Perform change detection and OCR on the extracted ROIs. This command processes all videos and generates .jsonl files with the OCR results.

    python surgery_analysis_process.py --video ./data --region all
  4. Analyze Surgery Stages: Run pattern matching to detect and segment different surgical stages based on the visual patterns in the ROIs.

    python stage_pattern_analysis.py --video ./data
  5. Review Results: Launch the graphical user interface to load a video and visually inspect the analysis results, including OCR data and surgical stage segments.

    python video_annotator_gui.py
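
Step 3 pairs change detection with OCR. The README does not say how changes are detected, so the following is only one common approach, sketched under assumptions: compare each ROI with the last ROI that was sent to OCR and re-run OCR only when the mean absolute pixel difference exceeds a threshold. The function names, the threshold of 5.0, and the tiny 4-pixel ROIs are all hypothetical.

```python
def mean_abs_diff(previous: bytes, current: bytes) -> float:
    """Mean absolute per-pixel difference between two same-sized grayscale ROIs."""
    if len(previous) != len(current):
        raise ValueError("ROIs must have the same number of pixels")
    if not current:
        return 0.0
    return sum(abs(a - b) for a, b in zip(previous, current)) / len(current)


def frames_needing_ocr(rois, threshold=5.0):
    """Yield indices of frames whose ROI differs from the last OCR'd ROI."""
    reference = None
    for index, roi in enumerate(rois):
        # The first frame always needs OCR; later frames only when they changed.
        if reference is None or mean_abs_diff(reference, roi) > threshold:
            reference = roi
            yield index


# Hypothetical 4-pixel grayscale ROIs: frame 1 is near-identical to frame 0,
# frame 2 shows a clear change, and frame 3 repeats frame 2.
rois = [
    bytes([10, 10, 10, 10]),
    bytes([11, 10, 10, 10]),
    bytes([200, 200, 10, 10]),
    bytes([200, 200, 10, 10]),
]
changed = list(frames_needing_ocr(rois))
print(changed)
```

Comparing against the last OCR'd ROI rather than the immediately preceding frame avoids missing slow drifts that never exceed the threshold between two consecutive frames.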
