Teaching machines to see, understand, and reason like humans
About
Researching the intersection of vision & language
I work at the frontier where machines learn to see, reason, and understand the world like we do.
Ph.D. expected Winter 2026 at Dartmouth College, advised by SouYoung Jin. Dissertation committee: Soroush Vosoughi, Nikhil Singh, and Juan Carlos Niebles. My research focuses on Multimodal Large Language Models, spanning evaluation, multimodal reasoning, and multimodal fusion through learnable masks and cross-modal alignment.
I've built AI and machine learning systems for enterprise and government clients, including computer vision pipelines, large-scale data processing, and real-time analytics. I have collaborated on research with DARPA/IARPA, CMU, KAUST, Adobe Research, Samsung Research, Northeastern University, Mount Sinai, the Lunenfeld Institute, Universidad del Norte, EAFIT, Universidad CES, and Universidad de Antioquia, among others. I am also the founder of Wiqonn, a LATAM-based initiative delivering research-backed AI solutions for real-world impact.
vllm-mlx: A framework for Apple Silicon delivering 21-87% higher throughput than llama.cpp, with content-based prefix caching for multimodal workloads and speeds of up to 525 tokens/sec.
OpenAI-compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support on M1/M2/M3/M4 chips.
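Because the server speaks the OpenAI API, any standard OpenAI client can talk to it. The sketch below shows the general pattern with the official Python client; the port, model identifier, and image URL are illustrative placeholders, not vllm-mlx's actual defaults.

```python
# Minimal sketch: sending a multimodal chat request to a local
# OpenAI-compatible endpoint. The base_url, model name, and image URL
# are assumptions for illustration, not the project's real defaults.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="Qwen/Qwen2-VL-7B-Instruct",  # hypothetical model identifier
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in one sentence."},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```

The same request shape works for text-only models by dropping the image_url entry, which is what makes an OpenAI-compatible server convenient: existing tooling needs only a changed base_url.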