MICLAB-BUPT/SGFormer

SGFormer: Semantic Graph Transformer for Point Cloud-based 3D Scene Graph Generation


1 Beijing University of Posts and Telecommunications   2 University of Rochester  

3D scene graph generation aims to parse a 3D scene into a structured graph of objects and their relationships. While recent methods leverage point clouds as input, they often overlook semantic richness and struggle to model long-range relational dependencies. To bridge this gap, we propose SGFormer — a novel Semantic Graph Transformer that injects enriched textual semantics (e.g., LLM-enhanced object descriptions) into a dual-layer architecture: a Graph Embedding Layer for structural reasoning and a Semantic Injection Layer for knowledge-aware message passing. SGFormer achieves state-of-the-art performance on the 3DSSG-O27R16 benchmark.

Release

  • 2024-02-15 🚀 SGFormer paper accepted by AAAI 2024!
  • 2024-01-10 💾 Code and model release for SGFormer!

3DSSG-O27R16 Dataset

Overview: We adopt the cleaned 3DSSG-O27R16 dataset introduced by SGGpoint (CVPR 2021), which enhances the original 3DSSG with:

  • Dense 10-dim point clouds (XYZ + RGB + normal + instance ID)
  • Full-scene graphs (not subgraphs)
  • 27 object classes (O27) and 16 structural relationship types (R16)
  • Removal of low-quality scans and comparative relations (e.g., more-comfortable-than)
  • Multi-class edge labeling (instead of multi-label)
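
As a rough illustration of the 10-dim point format listed above, the sketch below splits one point into named fields and groups point indices by instance ID. The column order (XYZ, RGB, normal, instance ID) is an assumption taken from the order of that bullet, and `ScenePoint`, `split_point`, and `group_by_instance` are hypothetical helpers, not part of this repository or of SGGpoint.

```python
from typing import Dict, List, NamedTuple, Sequence, Tuple


class ScenePoint(NamedTuple):
    """One point of a scene, split into its assumed 3 + 3 + 3 + 1 fields."""
    xyz: Tuple[float, ...]
    rgb: Tuple[float, ...]
    normal: Tuple[float, ...]
    instance_id: int


def split_point(row: Sequence[float]) -> ScenePoint:
    """Split a 10-value row into named fields (column order is an assumption)."""
    if len(row) != 10:
        raise ValueError(f"expected 10 values per point, got {len(row)}")
    return ScenePoint(
        xyz=tuple(row[0:3]),
        rgb=tuple(row[3:6]),
        normal=tuple(row[6:9]),
        instance_id=int(row[9]),
    )


def group_by_instance(rows: Sequence[Sequence[float]]) -> Dict[int, List[int]]:
    """Map each instance ID to the indices of the points that belong to it."""
    groups: Dict[int, List[int]] = {}
    for idx, row in enumerate(rows):
        groups.setdefault(split_point(row).instance_id, []).append(idx)
    return groups


# Three made-up points: two on instance 1, one on instance 2.
rows = [
    [0.0, 0.0, 0.0, 255, 0, 0, 0.0, 0.0, 1.0, 1],
    [0.1, 0.0, 0.0, 250, 5, 0, 0.0, 0.0, 1.0, 1],
    [2.0, 1.0, 0.5, 0, 0, 255, 0.0, 1.0, 0.0, 2],
]
print(group_by_instance(rows))  # {1: [0, 1], 2: [2]}
```

Grouping by instance ID is how per-object point sets would typically be formed before they become graph nodes; check the actual column order against the SGGpoint preprocessing output before relying on it.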

🔍 For dataset download and preprocessing details, please visit the SGGpoint dataset page.

Results

Evaluation Setup: We evaluate SGFormer on the 3DSSG-O27R16 validation set using standard scene-graph metrics: Recall@50 for node classification and Mean Recall@50 for edge (relationship) prediction.
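
To make the difference between the two metrics concrete, here is a minimal sketch of one common formulation: Recall@K as the fraction of samples whose ground-truth class is among the K highest-scoring classes, and Mean Recall@K as the same quantity computed per class and then averaged. This is an assumption about the definitions, not a copy of the evaluation script in this repository; the function names and toy scores are made up for illustration.

```python
from collections import defaultdict


def top_k_hit(scores, gt_label, k):
    """Return True if gt_label is among the k highest-scoring class indices."""
    ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
    return gt_label in ranked[:k]


def recall_at_k(all_scores, gt_labels, k):
    """Fraction of samples whose ground-truth class is in the top k."""
    hits = [top_k_hit(s, g, k) for s, g in zip(all_scores, gt_labels)]
    return sum(hits) / len(hits)


def mean_recall_at_k(all_scores, gt_labels, k):
    """Recall@k computed separately per ground-truth class, then averaged."""
    per_class = defaultdict(list)
    for s, g in zip(all_scores, gt_labels):
        per_class[g].append(top_k_hit(s, g, k))
    return sum(sum(h) / len(h) for h in per_class.values()) / len(per_class)


# Toy example: three samples of a frequent class 0, one sample of a rare class 1.
scores = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.8, 0.1, 0.1],
    [0.5, 0.4, 0.1],  # ground truth is class 1, but class 0 scores highest
]
labels = [0, 0, 0, 1]

print(recall_at_k(scores, labels, k=1))       # 0.75
print(mean_recall_at_k(scores, labels, k=1))  # 0.5
```

In the toy example the single rare-class error costs a quarter of Recall@1 but half of Mean Recall@1, which is why class-averaged recall is often preferred for long-tailed relationship labels.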

✨ SGFormer outperforms prior methods by a clear margin, especially in relationship understanding, thanks to its semantic-aware transformer design.

Run Your Own Evaluation

Dataset

Follow the instructions at SGGpoint Dataset Guide to obtain 3DSSG-O27R16. Place the data under data/3DSSG/.

Installation

conda create --name sgformer python=3.8
conda activate sgformer

git clone https://github.com/MICLAB-BUPT/SGFormer.git
cd SGFormer

pip install -r requirements.txt

Training

CUDA_VISIBLE_DEVICES=5 python -m main --mode train --config /home/lcs/tpami2025/config/SGFormer.json --exp exp_76_test \
        --model_name Mmgnet --continue_learning_mode none --root /home/lcs/tpami2025/data/3DSSG_subset \
        --dataset_annotation_type 160O26R \
        --obj_label_path /home/lcs/tpami2025/data/3DSSG_subset/classes.txt \
        --rel_label_path /home/lcs/tpami2025/data/3DSSG_subset/relationships.txt \
        --num_workers 8 --task_type PredCls
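
The command above uses absolute paths from the authors' machine. As a hedged sketch, the same argument list can be assembled relative to your own checkout. The flags and values are copied from the command above. The helper `build_train_command`, the placeholder `/path/to/SGFormer`, and the assumption that `config/` and `data/3DSSG_subset/` sit under one project root are illustrative only.

```python
import os
from pathlib import PurePosixPath


def build_train_command(project_root, exp_name, data_subdir="data/3DSSG_subset"):
    """Assemble the training argv with paths derived from one project root."""
    root = PurePosixPath(project_root)
    data_root = root / data_subdir
    return [
        "python", "-m", "main",
        "--mode", "train",
        "--config", str(root / "config" / "SGFormer.json"),
        "--exp", exp_name,
        "--model_name", "Mmgnet",
        "--continue_learning_mode", "none",
        "--root", str(data_root),
        "--dataset_annotation_type", "160O26R",
        "--obj_label_path", str(data_root / "classes.txt"),
        "--rel_label_path", str(data_root / "relationships.txt"),
        "--num_workers", "8",
        "--task_type", "PredCls",
    ]


# "/path/to/SGFormer" is a placeholder; replace it with your checkout location.
cmd = build_train_command("/path/to/SGFormer", "exp_76_test")

# Select the GPU through the environment, as the shell command does.
env = dict(os.environ, CUDA_VISIBLE_DEVICES="0")

print(" ".join(cmd))
# To launch training: subprocess.run(cmd, env=env, check=True)
```

Keeping every path derived from one root makes it harder for the config, data, and label files to drift apart when the project is moved to another machine.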

Inference

CUDA_VISIBLE_DEVICES=3 python inference.py --config /home/lcs/tpami2025/config/SGFormer.json --exp exp_66 \
        --model_name SGFormer --CKPT_PATH /data_3/lcs/tpami2025/workdir --num_workers 8 \
        --root /home/lcs/tpami2025/data/3DSSG_subset --inference_num 67 \
        --obj_label_path /home/lcs/tpami2025/data/3DSSG_subset/classes.txt \
        --rel_label_path /home/lcs/tpami2025/data/3DSSG_subset/relationships.txt \
        --use_VLM_description --use_triplet --dataset_annotation_type 160O26R

Acknowledgement

Our evaluation code is built upon VL-SAT. We thank its authors for releasing their code.

Citation

If you find our paper and code useful in your research, please consider giving us a star ⭐ and citing our work 📝 :)

@inproceedings{lv2024sgformer,
  title={SGFormer: Semantic Graph Transformer for Point Cloud-Based 3D Scene Graph Generation},
  author={Lv, Changsheng and Qi, Mengshi and Li, Xia and Yang, Zhengyuan and Ma, Huadong},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={38},
  number={5},
  pages={4035--4043},
  year={2024}
}

About

Code for the AAAI 2024 paper "SGFormer: Semantic Graph Transformer for Point Cloud-based 3D Scene Graph Generation"
