3rd Year Undergraduate Researcher | NIT Kurukshetra, India
Published in Springer & Elsevier | Targeting CVPR · MICCAI · NeurIPS · ICLR
I design efficient deep learning architectures for tasks where compute and memory are real constraints — not afterthoughts.
My research centers on three threads:
- 3D Medical Image Segmentation — building lightweight transformer-based models (RefineFormer3D, LightMedSeg) that match or exceed SOTA on BraTS & Synapse benchmarks while dramatically cutting GFLOPs and parameters
- Efficient Attention & State Mechanisms — designing novel sub-quadratic attention alternatives (Selective Holographic State) that trade full-rank attention for structured, hardware-friendly representations without sacrificing expressivity
- Event-Based Vision & Sparse Graph Learning — leveraging the asynchronous, sparse nature of event camera data through GNN-based representations (SCAF) that process spatiotemporal structure without dense frame reconstruction
My work is driven by a single question: how do you get the most out of a model that's small enough to actually deploy?
"Efficient by design. Principled by necessity."
PyTorch · CUDA · Vision Transformers · 3D Attention · BraTS2023 · Synapse
Problem: Transformer-based 3D segmentation models (SwinUNETR, nnFormer, TransBTS) deliver strong accuracy but at prohibitive compute — limiting real-world clinical deployment.
Approach: A lightweight encoder–decoder with anatomical priors injected as learned spatial constraints, paired with a coarse-to-fine refinement module. Memory-efficient 3D attention via custom CUDA kernels. Ablations across attention variants, prior injection depth, and decoder topology.
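The CUDA kernels themselves are project-specific, but the memory argument behind chunked 3D attention can be sketched in a few lines. This is a minimal NumPy illustration of the general technique (all names are mine, not the RefineFormer3D implementation): processing queries in chunks keeps peak memory at O(chunk × N) instead of materializing the full N × N score matrix.

```python
import numpy as np

def chunked_attention(q, k, v, chunk=1024):
    """Softmax attention computed over query chunks so the full
    N x N score matrix is never materialized at once."""
    n, d = q.shape
    out = np.empty_like(v)
    for s in range(0, n, chunk):
        scores = q[s:s + chunk] @ k.T / np.sqrt(d)       # (chunk, N)
        scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
        w = np.exp(scores)
        w /= w.sum(axis=-1, keepdims=True)
        out[s:s + chunk] = w @ v
    return out

# Voxels of a small 3D feature volume, flattened to a token sequence
rng = np.random.default_rng(0)
n, d = 8 * 8 * 8, 32                                     # 512 voxels, 32 channels
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
y = chunked_attention(q, k, v, chunk=128)
print(y.shape)  # (512, 32)
```

The output is exact (identical to unchunked attention); only the memory/compute schedule changes, which is what makes a fused-kernel version attractive for volumetric inputs.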
Results:
- Outperforms published SOTA on BraTS2023 and Synapse benchmarks
- Significant reduction in GFLOPs and parameter count vs. transformer baselines
- Full ablation suite validating each architectural decision
Status: Manuscript in preparation → CVPR 2026 Workshop / MICCAI 2026
PyTorch · 3D CNNs · Spatial Anchoring · Medical Imaging · BraTS · Synapse
Problem: Even "lightweight" segmentation models assume GPU-class hardware. There is no strong baseline for truly resource-constrained 3D medical segmentation (mobile, edge, embedded clinical devices).
Approach: Introduces learned spatial anchors — a small set of trainable volumetric reference points that guide the decoder without requiring full-resolution feature maps at every layer. Eliminates skip connection overhead while preserving anatomical localization.
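As a rough illustration of the anchor idea (a sketch under my own assumptions, not the LightMedSeg design): decoder voxels gather context from a small set of K trainable anchors by distance-based affinity, so the per-layer cost is O(N·K) rather than the O(N·M) of a full-resolution skip connection.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def anchor_guided_decode(feats, anchor_pos, anchor_emb, voxel_pos, tau=0.1):
    """Inject context from K anchors into N decoder voxels: O(N*K) cost,
    no full-resolution skip feature map required."""
    # squared distance between every voxel and every anchor -> (N, K)
    d2 = ((voxel_pos[:, None, :] - anchor_pos[None, :, :]) ** 2).sum(-1)
    w = softmax(-d2 / tau, axis=-1)          # nearer anchors weigh more
    return feats + w @ anchor_emb            # additive context injection

rng = np.random.default_rng(1)
N, K, C = 4 ** 3, 8, 16                      # 64 voxels, 8 anchors, 16 channels
voxel_pos = rng.uniform(0, 1, (N, 3))
anchor_pos = rng.uniform(0, 1, (K, 3))       # trainable in the real model
anchor_emb = rng.standard_normal((K, C))     # trainable in the real model
feats = rng.standard_normal((N, C))
out = anchor_guided_decode(feats, anchor_pos, anchor_emb, voxel_pos)
print(out.shape)  # (64, 16)
```

Because the anchors carry the anatomical localization signal, the decoder can run on downsampled features alone, which is where the memory savings come from.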
Contribution: Achieves competitive Dice scores at a fraction of the memory footprint of existing lightweight baselines. Evaluated on BraTS & Synapse with full FLOPs/parameter/latency profiling.
Status: Architecture finalized → benchmarking phase → targeting MICCAI 2026
PyTorch · Custom CUDA · State Space Models · Attention Mechanisms · Sub-quadratic Complexity
Problem: Standard softmax attention scales as O(N²) in sequence length, making it impractical for high-resolution volumetric inputs (3D scans, long event streams). State-space alternatives such as Mamba (built on the S6 selective-scan mechanism) buy that efficiency at a cost in expressivity.
Approach: Selective Holographic State (SHS) — a novel mechanism that represents attention as a low-rank holographic projection over a structured state matrix. Selectively retains interaction patterns based on input-conditioned gating, achieving sub-quadratic complexity without losing global context modeling.
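SHS itself is unpublished, so the following is only a generic illustration of the complexity argument: a low-rank, input-gated linear-attention sketch (all names and shapes are my assumptions, not the SHS mechanism). Tokens are compressed into an r-dimensional state (r ≪ N) and read back out through a per-token gate, giving O(N·r·d) cost instead of O(N²·d).

```python
import numpy as np

def gated_lowrank_mixing(x, Wq, Wk, Wv, Wg):
    """Sub-quadratic token mixing: summarize all tokens into a small
    (r, d) state, then let each token read it through a learned gate."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv                 # (N, r), (N, r), (N, d)
    gate = 1.0 / (1.0 + np.exp(-(x @ Wg)))           # input-conditioned gate (N, r)
    phi = lambda t: np.maximum(t, 0.0) + 1e-6        # positive feature map
    state = phi(k).T @ v                             # (r, d) structured state
    norm = phi(k).sum(0)                             # (r,) normalizer
    q_eff = phi(q) * gate                            # selectively retained channels
    return (q_eff @ state) / (q_eff @ norm)[:, None]

rng = np.random.default_rng(2)
N, d, r = 256, 32, 8
x = rng.standard_normal((N, d))
Wq, Wk, Wg = (0.1 * rng.standard_normal((d, r)) for _ in range(3))
Wv = 0.1 * rng.standard_normal((d, d))
y = gated_lowrank_mixing(x, Wq, Wk, Wv, Wg)
print(y.shape)  # (256, 32)
```

The state matrix is global (every token contributes to it), which is how such mechanisms keep long-range context while dropping the quadratic pairwise score computation.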
Contribution: Designed as a drop-in replacement for self-attention in vision transformer backbones; evaluated on classification, segmentation, and sequence modeling benchmarks.
Status: Core implementation complete → ablation and benchmarking → targeting ICLR / NeurIPS 2026
PyTorch Geometric · GNNs · Event Cameras · Sparse Graphs · Asynchronous Vision
Problem: Event cameras produce asynchronous, sparse spike streams that dense CNN/transformer pipelines cannot efficiently process — they either reconstruct frames (losing temporal precision) or apply attention to padded dense tensors (wasting compute on empty regions).
Approach: SCAF (Sparse-to-Dense Causal Attention Framework) models event streams directly as dynamic sparse graphs. Nodes represent active pixels; edges encode spatiotemporal proximity. A novel temporal grouping mechanism clusters nodes by activity burst patterns to prevent neighborhood explosion during message passing.
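The graph-construction step above can be sketched directly (a minimal illustration with hypothetical radii, not the SCAF implementation): events arrive time-sorted, so causal edges only ever point from earlier to later events within a spatiotemporal window.

```python
import numpy as np

def build_event_graph(events, r_xy=2.0, r_t=0.05):
    """Connect events (x, y, t) within spatial radius r_xy and temporal
    window r_t; edges point forward in time (causal). Assumes t sorted."""
    xy, t = events[:, :2], events[:, 2]
    n = len(events)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):                    # j is strictly later
            if t[j] - t[i] > r_t:
                break                                 # outside temporal window
            if np.hypot(*(xy[j] - xy[i])) <= r_xy:
                edges.append((i, j))                  # causal edge: early -> late
    return np.array(edges, dtype=int).reshape(-1, 2)

rng = np.random.default_rng(3)
n = 200
events = np.column_stack([
    rng.integers(0, 32, (n, 2)).astype(float),       # pixel coords, 32x32 sensor
    np.sort(rng.uniform(0.0, 1.0, n)),               # sorted timestamps (s)
])
edges = build_event_graph(events)
print(edges.shape)  # (num_edges, 2)
```

Each node's degree is bounded by local activity rather than sensor resolution, which is what a burst-aware grouping mechanism then exploits to keep message passing tractable.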
Contribution: Processes raw event streams without frame reconstruction; +15% over TGN baselines on large-scale temporal graph benchmarks (Wikipedia dataset, 5M+ nodes).
Status: Validation phase → targeting CVPR 2026 / NeurIPS 2026
Hardware: RTX 5080 (16GB VRAM) · 64GB RAM · CUDA 12.8 · PyTorch 2.9+
- 📘 [Paper Title] — Journal Name, Springer, 2025. [DOI]
- 📗 [Paper Title] — Journal Name, Elsevier, 2025. [DOI]
Manuscripts in preparation: CVPR 2026 Workshop · MICCAI 2026 · NeurIPS 2026
| Project | Contribution |
|---|---|
| PyTorch Lightning | [Add your PR title + link here] |
| OpenCV | [Add your PR title + link here] |
- 🔬 Research Internships (Summer 2026) — University labs, AI research institutes, industrial research groups
- 📝 Research Collaborations — 3D medical imaging, efficient attention, event-based vision, sparse GNNs
- 💬 Discussions — Architecture design, sub-quadratic attention, state space models, CUDA optimization


