Applied Scientist · Computer Vision · Multimodal AI · Synthetic Data
I build practical machine learning systems at the intersection of computer vision, natural language processing, and reasoning.
- Email: bjasani@alumni.cmu.edu
- LinkedIn: /in/bhavan-jasani
- Google Scholar: Scholar Profile
- Location: San Francisco Bay Area
About
I’m an Applied Scientist who brings research to production, specializing in computer vision and multimodal machine learning. My work focuses on two areas:
- Multi-modal learning — across images, text, video, audio, and structured data
- Synthetic data generation & annotation — with humans and AI in the loop to overcome scarce or hard-to-label data
Areas of expertise: multi-modal machine learning, synthetic data generation, document intelligence (layout-aware transformers), visual grounding, chart reasoning & visual question answering, and scalable training/inference.
I’m increasingly motivated to apply my skills in AI to healthcare, genomics, and drug discovery — with the broader goal of contributing to research and products that have real clinical and societal impact.
Experience
-
Amazon AWS AI Labs
Applied Scientist (Computer Vision Research)
Sep 2019 – Present
-
Carnegie Mellon University, Robotics Institute
Research Assistant (Multi-modal Emotion Recognition)
Oct 2017 – Aug 2019
-
Nanyang Technological University, Singapore
Research Assistant (Hardware-efficient Computer Vision)
Jan 2016 – May 2017
Selected Publications
-

Synthesize Step-by-Step: Tools, Templates and LLMs as Data Generators for Reasoning-Based Chart VQA
Conference on Computer Vision and Pattern Recognition (CVPR), 2024
B Jasani*, Z Li*, P Tang, S Ghadar
-

YORO: Lightweight End-to-End Visual Grounding
European Conference on Computer Vision (ECCV) Workshops, 2022
CH Ho, S Appalaraju, B Jasani, R Manmatha, N Vasconcelos
-

DocFormer: End-to-End Transformer for Document Understanding
International Conference on Computer Vision (ICCV), 2021
S Appalaraju, B Jasani, BU Kota, Y Xie, R Manmatha
-

End-to-End Visual Question Answering on Document Images
Amazon Machine Learning Conference (AMLC), 2021
B Jasani*, Y Xie*, R Manmatha
-

Exploiting Spatial Layout in Document Question Answering using Transformers
Amazon Machine Learning Conference (AMLC), 2021
Y Xie, B Pang, Y Zhang, B Jasani, V Mahadevan, R Manmatha
-

Are We Asking the Right Questions in MovieQA?
International Conference on Computer Vision (ICCV) Workshops, 2019
[Spotlight oral presentation]
B Jasani, R Girdhar, D Ramanan
-

Skeleton-based Zero Shot Action Recognition in Joint Pose-Language Semantic Space
arXiv, 2019
B Jasani, A Mazagonwalla
-

Automatic detection of human affective behavior in dyadic conversations
CMU RI Technical Report (Master's Thesis), 2019
B Jasani
-

Data-path unrolling with logic folding for area-time-efficient FPGA-based FAST corner detector
Journal of Real-Time Image Processing, 2019
SK Lam, T Lim, M Wu, B Cao, B Jasani
-

Threshold-Guided Design and Optimization for Harris Corner Detector Architecture
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), Journal, 2017
B Jasani, SK Lam, PK Meher, M Wu
-

Area-time efficient FAST corner detector using data-path transposition
IEEE Transactions on Circuits and Systems II: Express Briefs, Journal, 2017
SK Lam, T Lim, M Wu, B Cao, B Jasani
-

Accelerating Feature Detectors For Real-Time Vision-Based Applications
Bachelor's Thesis, 2016
B Jasani
Patents
-
Global Prompts with Linear Adapter Tuning for Regression-Free Model Update
US Patent 85,779,920 — filed 2023
B. Jasani, P. Tan, P. Zhu, R. Manmatha, V. Mahadevan, Y. Xie
-
Document Visual Question Answering with Multimodal Transformer Encoder–Decoder Models
US Patent 85,528,792 — filed 2022
B. Jasani, N. Sankaran, P. Zhu, R. Manmatha, Y. Xie
Academic Services
-
Program Committee
- International Conference on Document Analysis and Recognition (ICDAR), 2025
- Transferring and Adapting Source Knowledge in Computer Vision (TASK-CV) Workshop, ICCV 2019
-
Reviewer
- Conference on Computer Vision and Pattern Recognition (CVPR)
- International Conference on Computer Vision (ICCV)
- European Conference on Computer Vision (ECCV)
- Amazon Computer Vision Conference (ACVC)
- Amazon Research Awards
- Book chapters – “Data Augmentation with Python”, Packt Publishing, 2023
Curriculum Vitae
Download my latest CV (updated 2025):
Fun Stuff
I enjoy partner dancing (Fusion and Salsa), have a deep passion for aviation, and am working toward my private pilot license. In my free time, I also love attending meditation retreats and biking.