Syed Talal Wasim
About Me
I am a PhD student, currently affiliated with the Computer Vision Group at the University of Bonn, Germany. I am supervised by Professor Dr. Jürgen Gall, and am working in the domain of Long-Term Multimodal Video Understanding.
Previously I was an Associate Researcher in computer vision, affiliated with the Intelligent Visual Analytics Lab (IVAL) at the Mohamed Bin Zayed University of Artificial Intelligence (MBZUAI). I was supervised by Dr. Salman Khan.
I completed my master’s degree in Image Processing and Computer Vision (IPCV) funded by the Erasmus Mundus Joint Master’s Degree (EMJMD) scholarship program. During the master’s program, I was fortunate to have interned at the Empathic Computing Lab supervised by Dr. Mark Billinghurst. I completed my master’s thesis in the CVLAB at EPFL supervised by Dr. Mathieu Salzmann.
I hold an undergraduate degree in Electrical Engineering, with a minor in computer science, from Habib University in Karachi, Pakistan.
My previous website, which lists my high-school, undergraduate and graduate courses and projects, can be found at talalwasim.weebly.com.
Research Interests
- Computer Vision: image and video understanding, action anticipation, multimodal learning
- Machine Learning: self-supervised learning, out-of-distribution generalization
News
- [Jul. 2025] Our paper titled "MixANT: Observation-dependent Memory Propagation for Stochastic Dense Action Anticipation" is accepted in ICCV 2025.
- [Feb. 2025] Three of our papers (Video-Panda, GroupMamba, and STING-BEE) have been accepted in CVPR 2025.
- [Dec. 2024] Our paper titled "Efficient Video Object Segmentation via Modulated Cross-Attention Memory" is accepted in WACV 2025.
- [Oct. 2024] New preprint released titled "Distillation-free Scaling of Large SSMs for Images and Videos".
- [Mar. 2024] Our paper titled "Video-GroundingDINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding" is accepted in CVPR 2024.
- [Feb. 2024] My student Muhammad Zain Yousuf's bachelor thesis titled "AR-VPT: Simple Auto-Regressive Prompts for Adapting Frozen ViTs to Videos" is accepted in VISAPP 2024.
- [Jan. 2024] I started a PhD at the University of Bonn, Germany working on Long-Term Multimodal Video Understanding, under the supervision of Professor Dr. Juergen Gall.
- [Oct. 2023] Our paper titled "Hardware Resilience Properties of Text-Guided Image Classifiers" is accepted in NeurIPS 2023.
- [Aug. 2023] Our paper titled "Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition" is accepted in ICCV 2023.
- [Aug. 2023] Our paper titled "Self-regulating Prompts: Foundational Model Adaptation without Forgetting" is accepted in ICCV 2023.
- [Jun. 2023] Our paper titled "Toward Automatic Typography Analysis: Serif Classification and Font Similarities" is accepted in the Journal of Data Mining in Digital Humanities (JDMDH).
- [Mar. 2023] Our paper titled "Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting" is accepted in CVPR 2023.
- [Jun. 2022] Our paper titled "Using Facial Micro-Expressions in Combination With EEG and Physiological Signals for Emotion Recognition" is accepted in Frontiers in Psychology.
- [Apr. 2022] I started working as a researcher at MBZUAI. I was supervised by Dr. Salman Khan, working on multimodal video understanding.
- [Jul. 2021] I was accepted in the ETH Robotics Summer School and Symposium.
- [Jun. 2021] I defended my master's thesis and graduated from the IPCV master's program.
- [May 2021] Our paper on synthetic data for object detection is accepted to the CVPR 2021 CV4Animals workshop.
- [Feb. 2021] I started my master's thesis in the CVLAB at EPFL supervised by Dr. Mathieu Salzmann. I worked on automated typography analysis on figurative content.
- [Jul. 2020] I started a remote research internship at the Empathic Computing Lab supervised by Dr. Mark Billinghurst.
- [Sep. 2019] I started my master's degree in Image Processing and Computer Vision (IPCV) funded by the Erasmus Mundus Joint Master's Degree (EMJMD) scholarship program.
- [Jun. 2019] I completed my undergraduate degree in Electrical Engineering with a minor in computer science. I graduated first in my class with the Dean's Medal.
Publications
-
ICCV
MixANT: Observation-dependent Memory Propagation for Stochastic Dense Action Anticipation
Syed Talal Wasim, Hamid Suleman, Olga Zatsarynna, Muzammal Naseer and Juergen Gall
ICCV, 2025
@inproceedings{wasim2025mixant,
  title={MixANT: Observation-dependent Memory Propagation for Stochastic Dense Action Anticipation},
  author={Syed Talal Wasim and Hamid Suleman and Olga Zatsarynna and Muzammal Naseer and Juergen Gall},
  booktitle={ICCV},
  year={2025}
}
-
CVPR
Video-Panda: Parameter-efficient Alignment for Encoder-free Video-Language Models
Jinhui Yi*, Syed Talal Wasim*, Yanan Luo*, Muzammal Naseer and Juergen Gall
CVPR, 2025
@inproceedings{yi2025vpanda,
  title={Video-Panda: Parameter-efficient Alignment for Encoder-free Video-Language Models},
  author={Jinhui Yi* and Syed Talal Wasim* and Yanan Luo* and Muzammal Naseer and Juergen Gall},
  booktitle={CVPR},
  year={2025}
}
-
CVPR
GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model
Abdelrahman Shaker, Syed Talal Wasim, Salman Khan, Juergen Gall and Fahad Shahbaz Khan
CVPR, 2025
@inproceedings{shaker2024groupmamba,
  title={GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model},
  author={Abdelrahman Shaker and Syed Talal Wasim and Salman Khan and Juergen Gall and Fahad Shahbaz Khan},
  booktitle={CVPR},
  year={2025}
}
-
CVPR
STING-BEE: Towards Vision-Language Model for Real-World X-ray Baggage Security Inspection
D. Velayudhan, A. Ahmed, M. Alansari, N. Gour, A. Behouch, T. Hassan, Syed Talal Wasim, N. Maalej, M. Naseer, J. Gall, M. Bennamoun, E. Damiani and N. Werghi
CVPR, 2025
@inproceedings{velayudhan2024sting,
  title={STING-BEE: Towards Vision-Language Model for Real-World X-ray Baggage Security Inspection},
  author={Divya Velayudhan and Abdelfatah Ahmed and Mohamad Alansari and Neha Gour and Abderaouf Behouch and Taimur Hassan and Syed Talal Wasim and Nabil Maalej and Muzammal Naseer and Juergen Gall and Mohammed Bennamoun and Ernesto Damiani and Naoufel Werghi},
  booktitle={CVPR},
  year={2025}
}
-
WACV
Efficient Video Object Segmentation via Modulated Cross-Attention Memory
Abdelrahman Shaker, Syed Talal Wasim, Martin Danelljan, Salman Khan, Ming-Hsuan Yang and Fahad Shahbaz Khan
WACV, 2025
@inproceedings{shaker2025mavos,
  title={Efficient Video Object Segmentation via Modulated Cross-Attention Memory},
  author={Abdelrahman Shaker and Syed Talal Wasim and Martin Danelljan and Salman Khan and Ming-Hsuan Yang and Fahad Shahbaz Khan},
  booktitle={WACV},
  year={2025}
}
-
Under Review
Distillation-free Scaling of Large SSMs for Images and Videos
Hamid Suleman*, Syed Talal Wasim*, Muzammal Naseer and Juergen Gall
Under Review
@article{suleman2024stablemamba,
  title={Distillation-free Scaling of Large SSMs for Images and Videos},
  author={Hamid Suleman* and Syed Talal Wasim* and Muzammal Naseer and Juergen Gall},
  journal={arxiv preprint, arxiv:2409.11867},
  year={2024}
}
-
CVPR
Video-GroundingDINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding
Syed Talal Wasim, Muzammal Naseer, Salman Khan, Ming-Hsuan Yang and Fahad Shahbaz Khan
CVPR, 2024
@inproceedings{wasim2024vgdino,
  title={Video-GroundingDINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding},
  author={Syed Talal Wasim and Muzammal Naseer and Salman Khan and Ming-Hsuan Yang and Fahad Shahbaz Khan},
  booktitle={CVPR},
  year={2024}
}
-
VISAPP
AR-VPT: Simple Auto-Regressive Prompts for Adapting Frozen ViTs to Videos
Muhammad Zain Yousuf, Syed Talal Wasim, Syed Nouman Hasany and Muhammad Farhan
VISAPP, 2024
@inproceedings{yousuf2024arvpt,
  title={AR-VPT: Simple Auto-Regressive Prompts for Adapting Frozen ViTs to Videos},
  author={Muhammad Zain Yousuf and Syed Talal Wasim and Syed Nouman Hasany and Muhammad Farhan},
  booktitle={VISAPP},
  year={2024}
}
-
NeurIPS
Hardware Resilience Properties of Text-Guided Image Classifiers
Syed Talal Wasim, Kabila Haile Soboka, Abdulrahman Mahmoud, Salman Khan, David Brooks and Gu-Yeon Wei
NeurIPS, 2023
@inproceedings{wasim2023textres,
  title={Hardware Resilience Properties of Text-Guided Image Classifiers},
  author={Syed Talal Wasim and Kabila Haile Soboka and Abdulrahman Mahmoud and Salman Khan and David Brooks and Gu-Yeon Wei},
  booktitle={NeurIPS},
  year={2023}
}
-
ICCV
Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition
Syed Talal Wasim*, Muhammad Uzair Khattak*, Muzammal Naseer, Salman Khan, Mubarak Shah and Fahad Shahbaz Khan
ICCV, 2023
@inproceedings{wasim2023vfn,
  title={Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition},
  author={Syed Talal Wasim* and Muhammad Uzair Khattak* and Muzammal Naseer and Salman Khan and Mubarak Shah and Fahad Shahbaz Khan},
  booktitle={ICCV},
  year={2023}
}
-
ICCV
Self-regulating Prompts: Foundational Model Adaptation without Forgetting
Muhammad Uzair Khattak*, Syed Talal Wasim*, Muzammal Naseer, Salman Khan, Ming-Hsuan Yang and Fahad Shahbaz Khan
ICCV, 2023
@inproceedings{khattak2023promptsrc,
  title={Self-regulating Prompts: Foundational Model Adaptation without Forgetting},
  author={Muhammad Uzair Khattak* and Syed Talal Wasim* and Muzammal Naseer and Salman Khan and Ming-Hsuan Yang and Fahad Shahbaz Khan},
  booktitle={ICCV},
  year={2023}
}
-
CVPR
Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting
Syed Talal Wasim, Muzammal Naseer, Salman Khan, Fahad Shahbaz Khan and Mubarak Shah
CVPR, 2023
@inproceedings{wasim2023vita,
  title={Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting},
  author={Syed Talal Wasim and Muzammal Naseer and Salman Khan and Fahad Shahbaz Khan and Mubarak Shah},
  booktitle={CVPR},
  year={2023}
}
-
JDMDH
Toward Automatic Typography Analysis: Serif Classification and Font Similarities
Syed Talal Wasim, Romain Collaud, Lara Défayes, Nicolas Henchoz, Mathieu Salzmann and Delphine Ribes
Journal of Data Mining in Digital Humanities, 2023
@article{wasim2023gest,
  title={Toward automatic typography analysis: serif classification and font similarities},
  author={Syed Talal Wasim and Romain Collaud and Lara Défayes and Nicolas Henchoz and Mathieu Salzmann and Delphine Ribes},
  journal={Journal of Data Mining in Digital Humanities (JDMDH)},
  year={2023}
}
-
Frontiers
Using Facial Micro-Expressions in Combination With EEG and Physiological Signals for Emotion Recognition
Nastaran Saffaryazdi, Syed Talal Wasim, Kuldeep Dileep, Alireza Farrokhi Nia, Suranga Nanayakkara, Elizabeth Broadbent and Mark Billinghurst
Frontiers in Psychology, 2022
@article{wasim2022ecl,
  title={Using facial micro-expressions in combination with EEG and physiological signals for emotion recognition},
  author={Nastaran Saffaryazdi and Syed Talal Wasim and Kuldeep Dileep and Alireza Farrokhi Nia and Suranga Nanayakkara and Elizabeth Broadbent and Mark Billinghurst},
  journal={Frontiers in Psychology},
  year={2022}
}
-
CVPRW
Sim-to-Real Transfer for Object Detection and Localization on Animals
Syed Talal Wasim, Syed N. Hasany, Kainat Abbasi, Huda Feroz, Anisa A. Ahmed, Mudasir H. Shaikh and Muhammad Farhan
CV4Animals Workshop, CVPR 2021
@inproceedings{wasim2021cv4animals,
  title={Sim-to-Real Transfer for Object Detection and Localization on Animals},
  author={Syed Talal Wasim and Syed N. Hasany and Kainat Abbasi and Huda Feroz and Anisa A. Ahmed and Mudasir H. Shaikh and Muhammad Farhan},
  booktitle={CV4Animals CVPR Workshop},
  year={2021}
}
Services
Journal Reviewers
- Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
- Transactions on Neural Networks and Learning Systems (TNNLS)
- Transactions on Image Processing (TIP)
- Transactions on Machine Learning Research (TMLR)
- International Journal of Computer Vision (IJCV)
- Pattern Recognition
Conference Reviewers
- Computer Vision (CVPR, ICCV, ECCV, WACV, ACCV)
- Artificial Intelligence and Machine Learning (NeurIPS, ICLR, ICML, AAAI)
Project Supervision
- Co-supervise undergraduate projects in computer vision at Habib University
- Co-supervise high-school students in Pakistan for the Intel International Science and Engineering Fair (ISEF)