
[CVPR 2025] Chebyshev Attention Depth Permutation Texture Network with Latent Texture Attribute Loss (CAPTN)



🏛️ CAPTN Architecture & LTA Loss

We introduce the Chebyshev Attention Depth Permutation Texture Network (CAPTN), a novel approach for texture representation and recognition that outperforms state-of-the-art methods in terms of efficiency, accuracy, and interpretability. Additionally, we propose the Latent Texture Attribute (LTA) Loss, a multi-objective loss function that enhances discriminative representation learning by jointly optimizing for classification accuracy, spatial texture preservation, and orderless consistency.

*(Figure: CAPTN architecture and LTA Loss overview)*
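To make the multi-objective structure of the LTA Loss concrete, here is a minimal NumPy sketch of a loss that jointly optimizes the three stated objectives: classification accuracy, spatial texture preservation, and orderless consistency. The term definitions and the weights `w_spatial` and `w_orderless` are illustrative assumptions, not the paper's exact formulation; see `src/loss.py` for the actual implementation.

```python
import numpy as np

def cross_entropy(logits, label):
    # Numerically stable softmax cross-entropy for a single example.
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

def lta_loss(logits, label, spatial_feat, spatial_recon, latent_attrs,
             w_spatial=0.5, w_orderless=0.5):
    """Sketch of a multi-objective texture loss (weights are hypothetical):
    classification + spatial texture preservation + orderless consistency."""
    l_cls = cross_entropy(logits, label)
    # Spatial preservation: penalize deviation between backbone spatial
    # features and their latent-attribute-based reconstruction.
    l_spatial = np.mean((spatial_feat - spatial_recon) ** 2)
    # Orderless consistency: latent attributes should agree with their
    # order-invariant (mean-pooled) aggregate across spatial positions.
    pooled = latent_attrs.mean(axis=0)
    l_orderless = np.mean((latent_attrs - pooled) ** 2)
    return l_cls + w_spatial * l_spatial + w_orderless * l_orderless
```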

🔍 Components of CAPTN and LTA Loss

(a) SLTM: extracts a small patch from an input texture image; (b) $\mathrm{D}^2\mathrm{P}$: feature maps from the backbone undergo depth permutation to diversify the feature space; (c) LCP: enhanced Latent Texture Attributes undergo orderless aggregation and are fitted with a learnable Chebyshev polynomial function; (d) TFA: uses spatial texture frequencies to compute spatial attention; (e) SLAR: represents spatial information for latent attributes and backbone features. SLTM and SLAR are used to compute the Latent Texture Attribute Loss. $\|$ denotes concatenation, $\times$ denotes multiplication of a matrix by a scalar, and $\cdot$ denotes matrix multiplication.

*(Figure: components of CAPTN and the LTA Loss)*
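The Chebyshev polynomial fit in LCP can be sketched with the standard three-term recurrence $T_0(x)=1$, $T_1(x)=x$, $T_k(x)=2x\,T_{k-1}(x)-T_{k-2}(x)$. In the snippet below, `coeffs` plays the role of the learnable parameters; the function name and arguments are illustrative, not the project's API.

```python
import numpy as np

def chebyshev_features(x, degree, coeffs):
    """Evaluate sum_k coeffs[k] * T_k(x) via the Chebyshev recurrence.
    x is expected to lie in [-1, 1]; coeffs stands in for the learnable
    coefficients fitted by LCP (names are illustrative)."""
    t_prev = np.ones_like(x)          # T_0(x) = 1
    out = coeffs[0] * t_prev
    if degree >= 1:
        t_curr = x                    # T_1(x) = x
        out = out + coeffs[1] * t_curr
        for k in range(2, degree + 1):
            # T_k = 2x * T_{k-1} - T_{k-2}
            t_prev, t_curr = t_curr, 2 * x * t_curr - t_prev
            out = out + coeffs[k] * t_curr
    return out
```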

📁 Folder Structure

```
CAPTN/
├── conf/
│   └── dataset/
│       ├── common/
│       │   └── architecture.yaml
│       └── dtd.yaml
├── dataloader/
│   ├── __init__.py
│   └── dtd.py
├── src/
│   ├── __init__.py
│   ├── backbone.py
│   ├── build.py
│   ├── loss.py
│   ├── model.py
│   ├── test.py
│   ├── train.py
│   └── utils.py
├── .gitignore
├── main.py
├── config.py
└── README.md
dataset/
└── texture_material/
    └── dtd/
        ├── images/
        │   ├── banded/
        │   │   ├── banded_###.jpg
        │   │   └── ...
        │   ├── blotchy/
        │   │   ├── blotchy_###.jpg
        │   │   └── ...
        │   └── ...
        └── labels/
            ├── train1.txt
            ├── val1.txt
            ├── test1.txt
            └── ...
```
Create the dataset/ folder in the parent directory of the current CAPTN/ folder, so that dataset/ and CAPTN/ are at the same directory level:

```
mkdir -p ../dataset/texture_material
```

[Optional] Create the following directories in the project root to store logs and snapshots:

```
mkdir -p ./log ./snapshot
```

📦 Data Download

Download datasets from the following links: GTOS, GTOS_MOBILE, DTD, KTH-TIPS2-b, FMD

🛠️ Installation

Ensure you have Python 3.8.16 and the following packages installed:

```
hydra-core==1.3.2
mlflow==2.9.2
torch==1.12.1+cu113 torchvision==0.13.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
torch-geometric==2.3.1
timm==0.6.7
numpy==1.24.4
Pillow==10.4.0
```

💻 Start MLflow server

Start the MLflow tracking server on a remote machine. The server is then reachable at the specified IP address (<remote_host>) and port (<remote_port>); you can create an SSH tunnel to access it locally.

```
mlflow server --host <remote_host> --port <remote_port>
ssh -N -f -L <local_host>:<local_port>:<remote_host>:<remote_port> <username>@<headnode_ip>
```

⚙️ Configuration

To change hyperparameters, backbone selection, or architecture settings, edit the appropriate YAML files located in the conf directory:

conf/dataset/dtd.yaml
conf/dataset/common/architecture.yaml
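For orientation, a Hydra dataset config might look like the sketch below. Only the keys that appear in this README's command-line overrides (`accelerator.device`, `training.split`, `backbone`) and the `common/architecture.yaml` group are grounded in the repository; everything else is a hypothetical placeholder, not the project's actual schema.

```yaml
# Hypothetical sketch of conf/dataset/dtd.yaml — key names beyond the
# README's documented overrides are illustrative only.
defaults:
  - common: architecture

backbone: convnext_nano

accelerator:
  device: '0'

training:
  split: 1
```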

🚀 Train & Test

Run the main Python script, specifying the GPU device index, split number, and backbone (with size):

```
python main.py \
       accelerator.device='0' \
       training.split=1 \
       backbone='convnext_nano'
```

💬 Contact

If you have any questions, feel free to reach out at: [email protected]

🙏 Acknowledgement

The code for the DataLoaders was sourced from pytorch-material-classification.

📚 Citation

If you find our work useful in your research, please consider citing our publication:

```
@InProceedings{Evani_2025_CVPR,
    author    = {Evani, Ravishankar and Rajan, Deepu and Mao, Shangbo},
    title     = {Chebyshev Attention Depth Permutation Texture Network with Latent Texture Attribute Loss},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {23423-23432}
}
```

About

[CVPR 2025] Official implementation of Chebyshev Attention Depth Permutation Texture Network with Latent Texture Attribute Loss
