This is the official repository containing the implementation of the paper "MAGIC-TBR: Multiview Attention Fusion for Transformer-based Bodily Behavior Recognition in Group Settings" (https://doi.org/10.1145/3581783.3612858).
To extract the Video Swin Transformer features, please refer to: https://github.com/SwinTransformer/Video-Swin-Transformer
To extract the LaViLa features, follow: https://github.com/facebookresearch/LaViLa
