- 🔭 I'm currently working on speech synthesis (TTS) for game scenarios, focusing on controllable emotional TTS, audio-visual joint generation, and MIDI-based singing voice synthesis.
- 🌱 I'm currently learning multi-modal large language models, reinforcement learning for generation tasks (DPO/GRPO), advanced speech tokenization techniques, and exploring AI Agents for daily life integration.
- 👯 I'm looking to collaborate on projects related to speech/audio generation, multi-modal AI, and creative applications of AIGC in gaming or entertainment.
- 🤔 I'm looking for help with efficient data cleaning pipelines, scaling up model training, and exploring novel evaluation metrics for generative models.
- 💬 Ask me about speech synthesis, controllable TTS, audio-visual generation, AI Agents, or anything related to AI and technology.
- 📫 How to reach me: [email protected]
- 😄 Pronouns: He/Him
I may be slow to respond.
Pinned
- audio-evaluation-tool (Public, TypeScript): A web-based tool for comparing audio model outputs side-by-side with content diff and quality evaluation.

