I am a Ph.D. candidate in Computer Science at the University of North Carolina at Charlotte, supervised by Dr. Pu Wang in the GENIUS Lab. In industry, I work as a researcher with the Computer Vision teams at Amazon and Lowe's, where I am developing large-scale multimodal language models (MLLMs) to enhance operational efficiency and customer experience in complex, real-world environments. I also joined Google as a Student Researcher on the Extended Reality (AR/VR) team, where I work on advancing multimodal and generative AI for immersive technologies.
My research interests lie at the intersection of computer vision and generative AI, with a focus on 3D human modeling. Specifically, I work on 3D human pose estimation and mesh reconstruction via generative masked modeling. I'm also interested in developing multimodal motion synthesis frameworks that generate controllable, high-fidelity 3D human animations for real-time applications.
If you have any research opportunities, please feel free to reach out:
- Email: [email protected]
I like to play cricket, and I enjoy cooking.
- LinkedIn: m-usamasaleem

