zhyang2226
🏃
Always on the way

Highlights

  • Pro

Pinned

  1. OPA-DPO (Public)

    [CVPR 2025 (Oral)] Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key

    Python · 106 stars · 4 forks

  2. DMBP (Public)

    [ICLR 2024] DMBP: Diffusion Model-Based Predictor for Robust Offline Reinforcement Learning against State Observation Perturbations

    Python · 17 stars · 1 fork

  3. AR-Lopti (Public)

    [ICLR 2026] Do Not Let Low-Probability Tokens Over-Dominate in RL for LLMs

    Python · 41 stars · 2 forks