About me
I am a Ph.D. candidate in Computer Science at the University of North Carolina at Charlotte, supervised by Dr. Pu Wang in the GENIUS Lab. In industry, I work as a researcher with the Computer Vision teams at Amazon and Lowe’s, where I am developing large-scale multimodal language models (MLLMs) to enhance operational efficiency and customer experience in complex, real-world environments. I have also joined Google as a Student Researcher on the Extended Reality (AR/VR) team, working on advancing multimodal and generative AI for immersive technologies.
Research Interests
My research interests lie at the intersection of computer vision and generative AI, with a focus on 3D human modeling. Specifically, I work on 3D human pose estimation and mesh reconstruction via generative masked modeling. I am also interested in developing multimodal motion synthesis frameworks that produce controllable, high-fidelity 3D human animations for real-time applications.
If you have any research opportunities or open positions, please feel free to reach out at msaleem2@charlotte.edu.
News
- Nov 2025: “LiveGesture: Streamable Co-Speech Gesture Generation Model” will be available on arXiv!
- Nov 2025: “Monocular Models are Strong Learners for Multi-View Human Mesh Recovery” will be available on arXiv!
- Nov 2025: “Walk Before You Dance: High-fidelity and Editable Dance Synthesis via Generative Masked Motion Prior” was accepted to AAAI 2026!
- October 2025: Joined Google as a Student Researcher on the Extended Reality (AR/VR) team!
- October 2025: MaskControl paper selected for Oral Presentation and 🏆 Award Candidate at ICCV 2025!
- Aug 2025: Available for Research Scientist / Engineer opportunities. Please reach out if there’s a good match.
- July 2025: Poster selected for the Amazon WWAS Science Fair in Seattle; presented a next-gen multimodal shopping demo to VPs!
- June 2025: Joined Amazon as Applied Scientist II Intern
- June 2025: “MaskHand: Generative Masked Modeling for Robust Hand Mesh Reconstruction in the Wild” was accepted to ICCV 2025!
- June 2025: “MaskControl: Spatio-Temporal Control for Masked Motion Synthesis” was accepted to ICCV 2025 (Oral)!
- April 2025: “Walk Before You Dance: High-fidelity and Editable Dance Synthesis via Generative Masked Motion Prior” is now available on arXiv.
- Dec 2024: “GenHMR: Generative Human Mesh Recovery” was accepted to AAAI 2025, presented in Philadelphia, and received a travel award.
- Oct 2024: “BioPose: Biomechanically-Accurate 3D Pose Estimation from Monocular Videos” was accepted to WACV 2025!
- July 2024: “BAMM: Bidirectional Autoregressive Motion Model” was accepted to ECCV 2024!
- Sept 2023: Joined Lowe’s as Research Lead of the Computer Vision UNCC Team
- June 2023: “Private Data Synthesis from Decentralized Non-IID Data” accepted to IJCNN 2023, presented in Queensland, Australia, and received a $5500 travel grant!
- April 2023: Presented at the SIAM International Conference on Data Mining (SDM’23) Doctoral Forum; awarded a $1400 NSF travel grant.
- July 2022: “Privacy Enhancement for Cloud-Based Few-Shot Learning” accepted to IJCNN 2022!
- Jan 2022: “DP-Shield: Face Obfuscation with Differential Privacy” accepted to EDBT 2022!
Work Experience
Student Researcher at Extended Reality (AR/VR) Team
Google, San Francisco, CA Nov. 2025 – Present
Applied Scientist II Intern
Amazon Inc., Boston, MA June 2025 – Present
Research Lead, Computer Vision UNCC Team
Lowe’s, Charlotte, NC Sept. 2023 – Present
Recent Publications
LiveGesture: Streamable Co-Speech Gesture Generation Model
Monocular Models are Strong Learners for Multi-View Human Mesh Recovery
Walk Before You Dance: High-fidelity and Editable Dance Synthesis via Generative Masked Motion Prior
MaskHand: Generative Masked Modeling for Robust Hand Mesh Reconstruction in the Wild
MaskControl: Spatio-Temporal Control for Masked Motion Synthesis
GenHMR: Generative Human Mesh Recovery
BioPose: Biomechanically-accurate 3D Pose Estimation from Monocular Videos
BAMM: Bidirectional Autoregressive Motion Model
Private Data Synthesis from Decentralized Non-IID Data
Privacy Enhancement for Cloud-Based Few-Shot Learning