Wayner Barrios

PhD Researcher in AI

Teaching machines to see, understand, and reason like humans

About

Researching the intersection of vision & language

I work at the frontier where machines learn to see, reason, and understand the world like we do.

Ph.D. expected Winter 2026 at Dartmouth College, advised by SouYoung Jin. Dissertation committee: Soroush Vosoughi, Nikhil Singh, and Juan Carlos Niebles. My work focuses on Multimodal Large Language Models, evaluation, multimodal reasoning, and multimodal fusion through learnable masks and cross-modal alignment.

I've built AI and machine learning systems for enterprise and government clients, including computer vision pipelines, large-scale data processing, and real-time analytics. My research collaborations include DARPA/IARPA, CMU, KAUST, Adobe Research, Samsung Research, Northeastern University, Mount Sinai, Lunenfeld Institute, Universidad del Norte, EAFIT, Universidad CES, and Universidad de Antioquia, among others. I am the founder of Wiqonn, a LATAM-based initiative delivering research-backed AI solutions for real-world impact.

For details about my recent research, visit Google Scholar. For previous work experience, get in touch.

Research Focus
Multimodal AI Deep Learning LLMs Video Understanding Post-Training Alignment Large-Scale Pretraining Efficient Models Multimodal Reasoning

Open Source Projects

vLLM-MLX

OpenAI-compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support on M1/M2/M3/M4 chips.

Apple Silicon MLX LLM Inference Multimodal

DGX Spark Fine-tune LLM

LLM fine-tuning with LoRA + NVFP4/MXFP8 quantization on NVIDIA DGX Spark (Blackwell GB10).

Blackwell · LoRA · Quantization

Guidance Video Grounding

Official PyTorch implementation of an ICCV 2023 paper on moment retrieval in long videos.

ICCV 2023 · PyTorch · Video

ActivityNet

Large-scale benchmark for human activity understanding in videos.

Benchmark Dataset

GeoNode

Open source geospatial platform. Contributed to service virtualization and networking.

Open Source · GIS
Tech Stack
Languages & Frameworks: Python · C++ · PyTorch · JAX · MLX
Acceleration: CUDA · ROCm · FPGA (HLS/Vitis)
Distributed Training: FSDP · DeepSpeed · Dynamo · Mixed Precision
Optimization: Quantization · Pruning · Distillation
Infrastructure: K8s · Docker · AWS · GCP · Vector DBs
Beyond Research
Dog lover: Tala, Luna, Raissa & Naia · Passionate traveler · Audiophile & musician · Water sports enthusiast