Hi, this is Nahid. I am an independent researcher with the Cohere Labs community, working at the intersection of multimodal learning, computer vision, and embodied AI.
I recently created Maya, a multilingual multimodal LLM, and I develop models that perceive, reason, and act in the physical world.
My current interests include:
Spatial understanding in VLMs for real-world perception
Physics-aware world models
Multimodal learning
Simulation and embodied AI
Publications
Behind Maya: Building a Multilingual Vision-Language Model.
Nahid Alam et al.
CVPR 2025 Workshop (VLMs4All). arXiv · Google Scholar

Understanding and Mitigating Toxicity in Image-Text Pretraining Datasets: A Case Study on LLaVA.
Nahid Alam, Karthik Reddy Kanjula, Surya Guthikonda, Shayekh Islam.
CVPR 2025 Workshop (ReGenAI), Oral. arXiv · Google Scholar

Embedding Geometries of Contrastive Language-Image Pre-Training.
Jason Chuan-Chih Chou, Nahid Alam.
ECCV 2024 Workshop (Beyond Euclidean). arXiv · Google Scholar