EventVCOD: Towards Explainable Video Camouflaged Object Detection

Paper Supp License

Official implementation of "Towards Explainable Video Camouflaged Object Detection: SAM2 with Eventstream-Inspired Data" (AAAI 2026).

📋 Overview

EventVCOD introduces a novel framework for Video Camouflaged Object Detection (VCOD) by leveraging event camera-inspired data and Segment Anything Model 2 (SAM2). Our approach provides explainable detection through event-based motion representations that capture temporal dynamics invisible to standard RGB cameras.

EventVCOD Pipeline

Key Features

  • 🎯 SAM2-based Architecture: Fine-tuned SAM2 with custom prompt generators for VCOD
  • ⚡ Event-Inspired Data: Novel eventstream-like representation from RGB videos
  • 🎬 Video Understanding: Temporal coherence through event polarity (+/-) encoding
  • 🔍 Explainable Detection: Motion-based interpretable features
  • 📊 State-of-the-art Performance: Superior results on MoCA-Mask-Video and CAD-2016 benchmarks

πŸ—οΈ Architecture

Our framework consists of three main components:

  1. Event-Inspired Data Generation: Convert RGB frames to event-like representations with positive/negative polarities
  2. Prompt Generator: Dense embedding generator for SAM2 conditioning
  3. SAM2 Backbone: Fine-tuned image encoder with memory attention mechanism
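The three components above can be pictured as one chained forward pass. The sketch below is purely schematic: NumPy stand-ins with hypothetical names and shapes (`to_event_maps`, `prompt_generator`, `sam2_backbone` are not the repository's real modules, which live under `sam2/` and `training/model/`), meant only to show how the pieces connect.

```python
import numpy as np

def to_event_maps(frames, threshold=0.1):
    """Component 1 (stand-in): grayscale frames -> positive/negative event-like maps."""
    diffs = np.diff(np.stack(frames).astype(np.float32), axis=0)
    return (diffs > threshold).mean(axis=0), (diffs < -threshold).mean(axis=0)

def prompt_generator(pos, neg):
    """Component 2 (stand-in): pack both polarities into a dense prompt tensor."""
    return np.stack([pos, neg], axis=0)

def sam2_backbone(frames, dense_prompt):
    """Component 3 (stand-in): placeholder for the fine-tuned SAM2 forward pass."""
    return (dense_prompt.sum(axis=0) > 0).astype(np.float32)  # dummy mask

frames = [np.zeros((4, 4)), np.eye(4)]  # still background, then a "moving object"
mask = sam2_backbone(frames, prompt_generator(*to_event_maps(frames)))
```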

📦 Installation

Prerequisites

  • Python 3.8+
  • CUDA 12.1+
  • PyTorch 2.3.0+

Setup

```bash
# Clone the repository
git clone https://github.com/yourusername/EventVCOD.git
cd EventVCOD

# Install dependencies
pip install -r requirements.txt

# Install SAM2
pip install -e .
```

Dependencies

Main dependencies include:

  • torch>=2.3.0
  • torchvision>=0.18.0
  • opencv-python>=4.8.0
  • numpy>=1.24.2
  • Pillow>=9.4.0
  • tensorboardX>=2.6.2
  • timm==0.4.12

See requirements.txt for the complete list.

📊 Dataset Preparation

Download Datasets

All datasets and pre-processed event-like data are available at:

Dataset Structure

Organize your datasets as follows:

```
datasets/
├── MoCA-Video-Train_event/
│   ├── crab/
│   ├── flatfish_0/
│   └── ...
├── MoCA-Video-Test_event/
│   ├── arctic_fox/
│   ├── black_cat_1/
│   └── ...
└── CAD2016_event/
    └── ...
```

Generate Event-Inspired Data

To generate event-like representations from your own RGB videos:

```bash
cd data_manipulate
python eventflow_like_gen_claude.py --input_dir /path/to/videos --output_dir /path/to/output
```

🎯 Pre-trained Models

All pre-trained models (including SAM2 checkpoints and fine-tuned EventVCOD models) are available at:

Download the checkpoints and place them in the checkpoints/ directory.

🚀 Training

Single GPU Training

```bash
python train.py --config sam2/configs/sam2.1_training/sam2.1_hiera_b+_VCOD_finetune_tiny_adp0317_video_part2_SAM2_finetune30.yaml
```

Multi-GPU Distributed Training

```bash
# Example with 4 GPUs (torch.distributed.launch is deprecated in recent
# PyTorch releases; torchrun is the recommended replacement)
python -m torch.distributed.launch \
    --nproc_per_node=4 \
    --master_port=29500 \
    train.py \
    --config sam2/configs/sam2.1_training/sam2.1_hiera_b+_VCOD_finetune_tiny_adp0317_video_part2_SAM2_finetune30.yaml \
    --name eventvcod_exp \
    --tag experiment_v1
```

Training Configuration

Main training parameters in config files:

  • resolution: Input resolution (default: 1024)
  • train_batch_size: Batch size per GPU
  • num_frames: Number of frames per video clip
  • num_epochs: Total training epochs
  • base_lr: Base learning rate

Modify configurations in sam2/configs/sam2.1_training/ for different settings.

🧪 Testing & Evaluation

Run Inference

```bash
python test.py \
    --config sam2/configs/sam2.1/sam2.1_hiera_b+_VCOD_infer_modify.yaml \
    --model /path/to/checkpoint.pth
```

Evaluation Metrics

We evaluate using standard VCOD metrics:

  • S-measure (Sα): Structure similarity
  • E-measure (Eφ): Enhanced alignment measure
  • Weighted F-measure (Fwβ): Weighted precision-recall
  • MAE (M): Mean Absolute Error
  • mean Dice (mDice): Dice coefficient
  • mean IoU (mIoU): Intersection over Union
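The mask-level metrics above have simple closed forms. A minimal NumPy sketch (function names are illustrative, not the repository's API; the repo's actual evaluation lives in eval/PySODMetrics; S-measure and E-measure are structural metrics and need those full implementations):

```python
import numpy as np

def mae(pred, gt):
    """Mean absolute error between a [0, 1] prediction map and a binary mask."""
    return np.abs(pred - gt).mean()

def dice(pred, gt, eps=1e-8):
    """Dice coefficient on binary masks: 2|P ∩ G| / (|P| + |G|)."""
    inter = (pred * gt).sum()
    return (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)

def iou(pred, gt, eps=1e-8):
    """Intersection over union on binary masks: |P ∩ G| / |P ∪ G|."""
    inter = (pred * gt).sum()
    union = pred.sum() + gt.sum() - inter
    return (inter + eps) / (union + eps)
```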

MATLAB Evaluation

For comprehensive evaluation using MATLAB:

```
# For MoCA-Mask-Video dataset
cd eval
run main_MoCA.m

# For CAD-2016 dataset
run main_CAD.m
```

Python Evaluation

```bash
cd eval/PySODMetrics
python evaluate.py --pred_dir /path/to/predictions --gt_dir /path/to/ground_truth
```

📈 Results

MoCA-Mask-Video Test Set

| Method | Sα↑ | Fwβ↑ | Eφ↑ | M↓ | mDice↑ | mIoU↑ |
|--------|-----|------|-----|-----|--------|-------|
| RCRNet | .597 | .174 | .583 | .025 | .194 | .137 |
| PNS-Net | .576 | .134 | .536 | .038 | .189 | .133 |
| MG | .547 | .165 | .537 | .095 | .197 | .137 |
| SLT-Net | .656 | .357 | .785 | .021 | .387 | .310 |
| IMEX | .661 | .371 | .778 | .020 | .409 | .319 |
| TSP-SAM(M+P) | .673 | .400 | .766 | .012 | .421 | .345 |
| TSP-SAM(M+B) | .689 | .444 | .808 | .008 | .458 | .388 |
| ZoomNeXt(T=1) | .690 | .395 | .702 | .017 | .420 | .353 |
| ZoomNeXt(T=5) | .734 | .476 | .736 | .010 | .497 | .422 |
| EMIP | .669 | .374 | – | .017 | .424 | .326 |
| EMIP-L | .675 | .381 | – | .015 | .426 | .333 |
| EventVCOD (Ours) | .753 | .573 | .855 | .009 | .574 | .496 |

CAD-2016 Dataset

| Method | Sα↑ | Fwβ↑ | Eφ↑ | M↓ | mDice↑ | mIoU↑ |
|--------|-----|------|-----|-----|--------|-------|
| RCRNet | – | – | – | – | – | – |
| PNS-Net | .678 | .369 | .720 | .043 | .409 | .309 |
| MG | .613 | .370 | .537 | .070 | .351 | .260 |
| SLT-Net | .669 | .481 | .845 | .030 | .368 | .268 |
| IMEX | .684 | .452 | .813 | .033 | .469 | .370 |
| TSP-SAM(M+P) | .705 | .565 | .836 | .027 | .591 | .422 |
| TSP-SAM(M+B) | .751 | .628 | .865 | .021 | .603 | .496 |
| ZoomNeXt(T=1) | .721 | .525 | .759 | .024 | .523 | .436 |
| ZoomNeXt(T=5) | .757 | .593 | .865 | .020 | .509 | .510 |
| EMIP | .710 | .504 | – | .029 | .528 | .415 |
| EMIP-L | .719 | .514 | – | .028 | .536 | .425 |
| EventVCOD (Ours) | .802 | .717 | .887 | .023 | .717 | .615 |

Note: "–" marks a metric not reported in the original paper. Full quantitative results are available in the paper.

πŸ› οΈ Code Structure

```
EventVCOD/
├── sam2/                           # SAM2 core implementation
│   ├── modeling/                   # Model architectures
│   ├── configs/                    # Configuration files
│   │   ├── sam2.1_training/        # Training configs
│   │   └── ablation/               # Ablation study configs
│   └── utils/                      # Utility functions
├── training/                       # Training pipeline
│   ├── trainer.py                  # Main trainer
│   ├── trainer_supervision.py      # Supervised training
│   └── model/                      # Model definitions
├── datasets/                       # Dataset implementations
│   ├── datasets.py                 # Dataset loaders
│   └── transform_custom.py         # Data augmentation
├── data_manipulate/                # Event data generation
│   ├── eventflow_like_gen_claude.py       # Event generation
│   └── eventflow_p_n_visualization.ipynb  # Visualization
├── eval/                           # Evaluation scripts
│   ├── PySODMetrics/               # Python metrics
│   └── *.m                         # MATLAB evaluation
├── train.py                        # Training entry point
├── test.py                         # Testing entry point
└── utils.py                        # General utilities
```

🔬 Event-Inspired Data Generation

Our event data generation process:

  1. Frame Difference: Compute temporal derivatives between consecutive frames
  2. Polarity Assignment: Threshold-based positive/negative event classification
  3. Accumulation: Aggregate events over time windows
  4. Normalization: Scale to appropriate intensity ranges
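The four steps above can be sketched with plain NumPy. This is a toy illustration of the idea, assuming grayscale frames with values in [0, 1]; the repository's actual generator is data_manipulate/eventflow_like_gen_claude.py, and the function name below is hypothetical.

```python
import numpy as np

def frames_to_event_maps(frames, threshold=0.1):
    """Toy event-like conversion (hypothetical helper, not the repo's API):
    1. frame differencing, 2. polarity split by threshold,
    3. accumulation over the clip, 4. normalization to [0, 1]."""
    h, w = frames[0].shape
    pos_acc = np.zeros((h, w), dtype=np.float32)
    neg_acc = np.zeros((h, w), dtype=np.float32)
    for prev, curr in zip(frames[:-1], frames[1:]):
        diff = curr.astype(np.float32) - prev.astype(np.float32)  # 1. temporal derivative
        pos_acc += (diff > threshold)    # 2.-3. positive events (brightness increase)
        neg_acc += (diff < -threshold)   #       negative events (brightness decrease)
    for acc in (pos_acc, neg_acc):       # 4. scale each accumulated map to [0, 1]
        if acc.max() > 0:
            acc /= acc.max()
    return pos_acc, neg_acc
```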

Example usage:

```python
from data_manipulate.eventflow_like_gen_claude import generate_events

events_pos, events_neg = generate_events(
    video_path='path/to/video.mp4',
    threshold=0.2,
    output_dir='path/to/output'
)
```

See data_manipulate/eventflow_p_n_visualization.ipynb for visualization examples.

📄 License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

πŸ™ Acknowledgments

This work builds upon:

  • SAM2 - Meta's Segment Anything Model 2

Star ⭐ this repository if you find it helpful!
