Nano1337/README.md

# Haoli Yin

Member of Technical Staff at Datology working across multimodal pretraining, evaluation systems, research infrastructure, and data-centric ML.

I build systems and research loops for training, evaluating, and improving multimodal models at scale. My recent work sits at the intersection of algorithmic data mixing, evaluation quality, distributed training and inference infrastructure, and agentic tooling for research.

## Current focus

  • Algorithmic multimodal pretraining data mixing
  • Curating evals for better signal and coverage
  • Research infrastructure for large-scale distributed training and vLLM-based eval inference
  • Data pipeline infrastructure for large-scale data processing and synthetic data generation using vLLM on Ray orchestrated by Kubernetes
  • Agentic research tooling, and harnesses with verifiable signals for autonomous research
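
The last two infrastructure bullets describe batched generation over a data pipeline. A minimal sketch of that shape (the vLLM worker call is stubbed out so this runs anywhere; the real pipeline uses vLLM workers on Ray scheduled by Kubernetes, and all names here are illustrative, not Datology's actual code):

```python
# Illustrative sketch only: in production the generate step is a vLLM
# worker on a Ray cluster; here it is a stub so the batching shape runs
# without GPUs or external dependencies.
from concurrent.futures import ThreadPoolExecutor

def generate_batch(prompts):
    # Stand-in for a vLLM worker's generate() call on one batch.
    return [f"synthetic({p})" for p in prompts]

def run_pipeline(prompts, batch_size=4, workers=2):
    # Split the corpus into batches and fan them out to parallel workers;
    # executor.map preserves batch order, so outputs stay aligned to inputs.
    batches = [prompts[i:i + batch_size] for i in range(0, len(prompts), batch_size)]
    with ThreadPoolExecutor(max_workers=workers) as ex:
        results = ex.map(generate_batch, batches)
    return [out for batch in results for out in batch]

outputs = run_pipeline([f"doc-{i}" for i in range(10)])
print(len(outputs))  # prints 10: one synthetic sample per input document
```

Swapping the stub for a Ray actor holding a vLLM engine keeps the same structure while moving each batch onto a GPU worker.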

## What I'm working on now

At Datology, I work on multimodal model development and the systems around it: training pipelines, eval workflows, large-scale data processing, and the infrastructure needed to iterate quickly on pretraining and post-training decisions.

I am especially interested in building tight loops between:

  • data mixture design and downstream capability
  • eval coverage and trustworthy research signals
  • research ideas and the infrastructure needed to test them quickly

## Selected public work

  • benchmark-dataloader: multimodal dataloader benchmarking for understanding throughput and systems bottlenecks
  • Multimodal Dataloaders Go Brrrrrrr: write-up on dataloader performance and why it matters for practical multimodal training
  • UniCat: stronger fusion baseline for multimodal re-identification, with the corresponding paper
  • SpecReFlow: official implementation of the SPIE Photonics West 2023 paper on reflection-aware video restoration
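
The core measurement behind dataloader benchmarking like benchmark-dataloader is just sustained samples per second over a loader. A minimal sketch, with a plain generator standing in for a real multimodal loader (names are illustrative, not the repo's actual API):

```python
# Minimal throughput-measurement sketch; fake_loader is a stand-in for a
# real multimodal dataloader yielding decoded batches.
import time

def fake_loader(num_batches=50, batch_size=32):
    for _ in range(num_batches):
        yield [b"sample"] * batch_size  # stand-in for decoded samples

def measure_throughput(loader):
    # Count every sample yielded and divide by wall-clock time.
    n, start = 0, time.perf_counter()
    for batch in loader:
        n += len(batch)
    elapsed = time.perf_counter() - start
    return n / elapsed  # samples per second

print(f"{measure_throughput(fake_loader()):.0f} samples/sec")
```

Against a real loader, comparing this number across decoding, sharding, and prefetch configurations is what surfaces the systems bottlenecks the write-up discusses.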

## Links

If you're working on multimodal pretraining, evaluation, or research infrastructure, feel free to connect.

## Pinned repositories

  1. UniCat (Python): Crafting a Stronger Fusion Baseline for Multimodal Re-Identification, accepted as a poster at the UniReps Workshop @ NeurIPS 2023
  2. GraFT (HTML): Gradual Fusion Transformer for Multimodal Re-Identification
  3. benchmark-dataloader (Python): Multimodal Dataloader Benchmarking
  4. ume-fakenews (Python): Multimodal Ensemble Model for Fake News Detection Benchmarking
  5. SpecReFlow (Python): The official implementation of SpecReFlow, SPIE Photonics West 2023
  6. vlm-clustering (Python): Topic Modeling of VLM-Generated Image Captions from the NSD dataset