Mohammad Aflah Khan

Research Software Engineer @ MPI-SWS · OSS @ EleutherAI

Hi, I’m Aflah, a research software engineer at the Max Planck Institute for Software Systems. My work centers on deepening our understanding of large language models (LLMs) and rigorously evaluating their capabilities. I’m also passionate about the systems side of LLMs, with hands-on experience in large-scale pretraining and inference. In the past, I’ve contributed to projects targeting hate speech reduction and other NLP applications for social good.

Open to roles in research, research engineering, or backend engineering

Experience

Ongoing first (rest sorted by end date)

Max Planck Institute for Software Systems (MPI-SWS)

Research Software Engineer • April, 2024 — Present [Full Time] | Nov, 2023 — March, 2024 [Part time] | Aug, 2023 — Oct, 2023 [Intern]

Working under Dr Krishna Gummadi to explore different aspects of LLMs. Areas we have explored or are exploring include:

Optimizing pre-training and inference for LLMs
LLM memorization and the impact of Parameter-Efficient Fine-Tuning (PEFT) on memorization
Knowledge acquisition and evaluation of factual knowledge in LLMs
Built and currently maintain key internal tools: OpenChat (an internal chatbot), MaxCast (a research paper-to-podcast conversion service), and MaxChat (a document-based chat service). These services were developed from scratch, including hosting models on-premises and fine-tuning for optimal performance.
Published and submitted research to top-tier (A*) conferences

EleutherAI

Open Source Contributor • Dec, 2022 — Present

Currently working on the Multilingual Natural Instructions project to build a massive instruction tuning corpus for Hindi. Previously worked on:

Pythia - A Suite for Analyzing Large Language Models Across Training and Scaling (Accepted ICML'23) - Contributed substantially to the gender bias evals and intervention case study. The models have over 18 million downloads (as of April 2025).
Recite, Reconstruct, Recollect - Memorization in LMs as a Multifaceted Phenomenon (Accepted ICLR'25) - An intuitive taxonomy to classify memorized sequences and then build predictors based on these classes

Laboratory for Computational Social Systems (LCS2)

Undergraduate Student Researcher • June, 2021 — May, 2024

I've worked on a variety of projects, from hate speech normalization to designing recommendations for fine-tuning improved hate speech detectors. I also led the QUENCH project, a benchmark aimed at evaluating advanced reasoning abilities in large language models, with a particular emphasis on Indic contexts.

Goldman Sachs

Summer Analyst • May, 2023 — July, 2023

Worked in the Finance, Planning & Analysis Engineering division on revamping the department's central hub. Also built POCs based on user feedback to improve the search and access experience on the webapp. Received a return offer to join full time as an Analyst.

Google Summer of Code - TensorFlow

Open Source Developer • May, 2022 — Sept, 2022

Worked with Matthew Watson & Chen Qian on adding support for data augmentation layers to KerasNLP, a library in the Keras/TensorFlow ecosystem that aims to build industry-oriented NLP solutions. I also contributed several bug fixes and other utilities, such as tokenizers and the transformer encoder & decoder.

Publications

* indicates equal contribution

2026

In Agents We Trust, but Who Do Agents Trust? Latent Source Preferences Steer LLM Generations

Mohammad Aflah Khan, Mahsa Amani, Soumi Das, Bishwamittra Ghosh, Qinyuan Wu, Krishna P. Gummadi, Manish Gupta, Abhilasha Ravichander

IASEAI 2026 - The International Association for Safe & Ethical AI (An earlier version of this work was presented at R2-FM, ICML 2025)

Rote Learning Considered Useful: Generalizing over Memorized Data in LLMs

Qinyuan Wu, Soumi Das, Mahsa Amani, Bishwamittra Ghosh, Mohammad Aflah Khan, Krishna P. Gummadi, Muhammad Bilal Zafar

IASEAI 2026 - The International Association for Safe & Ethical AI (An earlier version of this work was presented at MemFM, ICML 2025)

Revisiting Privacy, Utility, and Efficiency Trade-offs when Fine-Tuning Large Language Models

Soumi Das, Camila Kolling, Mohammad Aflah Khan, Mahsa Amani, Bishwamittra Ghosh, Qinyuan Wu, Till Speicher, Krishna P. Gummadi

IASEAI 2026 - The International Association for Safe & Ethical AI

2025

TokenSmith: Streamlining Data Editing, Search, and Inspection for Large-Scale Language Model Training and Interpretability

Mohammad Aflah Khan*, Ameya Godbole*, Johnny Tian-Zheng Wei, Ryan Wang, James Flemings, Krishna Gummadi, Willie Neiswanger, Robin Jia

EMNLP 2025 System Demonstrations - The 2025 Conference on Empirical Methods in Natural Language Processing

Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon

USVSN Sai Prashanth, Alvin Deng, Kyle O'Brien, Jyothir S V, Mohammad Aflah Khan, Jaydeep Borkar, Christopher A. Choquette-Choo, Jacob Ray Fuehne, Stella Biderman, Tracy Ke, Katherine Lee, Naomi Saphra

ICLR 2025 - The Thirteenth International Conference on Learning Representations

Towards Reliable Latent Knowledge Estimation in LLMs: Zero-Prompt Many-Shot Based Factual Knowledge Extraction

Qinyuan Wu, Mohammad Aflah Khan, Soumi Das, Vedant Nanda, Bishwamittra Ghosh, Camila Kolling, Till Speicher, Laurent Bindschaedler, Krishna P Gummadi, Evimaria Terzi

WSDM 2025 - Proceedings of the 18th ACM International Conference on Web Search and Data Mining

QUENCH: Measuring the gap between Indic and Non-Indic Contextual General Reasoning in LLMs

Mohammad Aflah Khan*, Neemesh Yadav*, Sarah Masud, Md Shad Akhtar

COLING 2025 - Proceedings of the 31st International Conference on Computational Linguistics

In Agents We Trust, but Who Do Agents Trust? Latent Source Preferences Steer LLM Generations

Mohammad Aflah Khan, Mahsa Amani, Soumi Das, Bishwamittra Ghosh, Qinyuan Wu, Krishna P. Gummadi, Manish Gupta, Abhilasha Ravichander

R2-FM @ ICML 2025 - Workshop on Reliable and Responsible Foundation Models

Rethinking Memorization Measures in LLMs: Recollection vs. Counterfactual vs. Contextual Memorization

Bishwamittra Ghosh, Soumi Das, Qinyuan Wu, Mohammad Aflah Khan, Krishna P. Gummadi, Evimaria Terzi, Deepak Garg

MemFM @ ICML 2025 - The Impact of Memorization on Trustworthy Foundation Models

Rote Learning Considered Useful: Generalizing over Memorized Data in LLMs

Qinyuan Wu, Soumi Das, Mahsa Amani, Bishwamittra Ghosh, Mohammad Aflah Khan, Krishna P. Gummadi, Muhammad Bilal Zafar

MemFM @ ICML 2025 - The Impact of Memorization on Trustworthy Foundation Models

Preprints

Fractional Rotation, Full Potential? Investigating Performance and Convergence of Partial RoPE

Mohammad Aflah Khan, Krishna P. Gummadi, Manish Gupta, Abhilasha Ravichander

Under Review

Hubble: a Model Suite to Advance the Study of LLM Memorization

Johnny Tian-Zheng Wei*, Ameya Godbole*, Mohammad Aflah Khan*, Ryan Wang, Xiaoyuan Zhu, James Flemings, Nitya Kashyap, Krishna P. Gummadi, Willie Neiswanger, Robin Jia

Under Review

Understanding the Mechanics and Dynamics of Memorisation in Large Language Models: A Case Study with Random Strings

Till Speicher, Mohammad Aflah Khan, Qinyuan Wu, Vedant Nanda, Soumi Das, Bishwamittra Ghosh, Krishna P. Gummadi, Evimaria Terzi

2024

The Duality of Hope: A Critical Examination of Controversial Annotations in HopeEDI

Mohammad Aflah Khan*, Neemesh Yadav*, Diksha Sethi*, Raghav Sahni*

The Second Tiny Papers Track at ICLR 2024

Probing Critical Learning Dynamics of PLMs for Hate Speech Detection

Sarah Masud*, Mohammad Aflah Khan*, Vikram Goyal, Md Shad Akhtar, Tanmoy Chakraborty

EACL 2024 - Findings of the Association for Computational Linguistics

2023

Overview of the HASOC Subtracks at FIRE 2023: Detection of Hate Spans and Conversational Hate-Speech

Shrey Satapara, Sarah Masud, Hiren Madhu, Mohammad Aflah Khan, Md Shad Akhtar, Tanmoy Chakraborty, Sandip Modha, Thomas Mandl

FIRE 2023 - Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation

Overview of the HASOC Subtrack at FIRE 2023: Identification of Tokens Contributing to Explicit Hate in English by Span Detection

Sarah Masud, Mohammad Aflah Khan, Md. Shad Akhtar, Tanmoy Chakraborty

In Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation

Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling

Stella Biderman, Hailey Schoelkopf, Quentin Gregory Anthony, Herbie Bradley, Kyle O’Brien, Eric Hallahan, Mohammad Aflah Khan, Shivanshu Purohit, USVSN Sai Prashanth, Edward Raff, Aviya Skowron, Lintang Sutawika, Oskar van der Wal

ICML 2023 - The Fortieth International Conference on Machine Learning

The Art of Embedding Fusion: Optimizing Hate Speech Detection

Mohammad Aflah Khan*, Neemesh Yadav*, Mohit Jain, Sanyam Goyal

The First Tiny Papers Track at ICLR 2023

Beyond Negativity: Re-Analysis and Follow-Up Experiments on Hope Speech Detection

Neemesh Yadav*, Mohammad Aflah Khan*, Diksha Sethi, Raghav Sahni

The First Tiny Papers Track at ICLR 2023

2022

Proactively Reducing the Hate Intensity of Online Posts via Hate Speech Normalization

Sarah Masud, Manjot Bedi, Mohammad Aflah Khan, Md Shad Akhtar, Tanmoy Chakraborty

KDD 2022 - Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

Talks

[Paper Reading & Discussion] Metadata Conditioning Accelerates Language Model Pre-training (MeCo)

Max Planck Institute for Software Systems: MPI-SWS (Internal Paper Reading Group) • September, 2025

[Talk] LLMs at Scale

Max Planck Computing and Data Facility: MPCDF (AI Kick-off Workshop) • April, 2025

Max Planck Institute for the Science of Light (Hosted by Florian Marquardt) • April, 2025

[Talk] An Overview of DeepSeek-{V3/R1}

Max Planck Institute for Software Systems: MPI-SWS (Part of AI, Computing & Society Initiative) • February, 2025

[Talk] Democratizing and Accelerating Research with LLMs: Making Science More Accessible Whilst Finding Interesting Research Problems

Max Planck Institute for Security and Privacy: MPI-SP (Hosted by Meeyoung Cha) • December, 2024

[Demo + Lightning Talk] Empowering Research with Open-Access LLMs: From Tools to Copilots

AI, Computing & Society Initiative Launch Event (At Max Planck Institute for Software Systems: MPI-SWS) • December, 2024

[Paper Reading & Discussion] Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps

Max Planck Institute for Software Systems: MPI-SWS (Internal Paper Reading Group) • July, 2024

[Paper Reading & Discussion] Deduplicating Training Data Makes Language Models Better

Max Planck Institute for Software Systems: MPI-SWS (Internal Paper Reading Group) • May, 2024

[Talk] Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling

Max Planck Institute for Software Systems: MPI-SWS (Hosted by Krishna Gummadi) • July, 2023

Goldman Sachs (Internal NLP/IR Reading Group) • June, 2023

Organizing, Reviewing & Volunteering

The First Workshop on Large Language Model Memorization (L2M2) @ ACL'25

Program Committee Member • 2025

ACL Rolling Review (ARR), International Conference on Computational Linguistics (COLING), Workshop on Online Abuse and Harms (WOAH), The Technical Symposium on Computer Science Education (SIGCSE TS)

Served as a reviewer for the above mentioned conferences • 2023 Onwards

Organizer • 2023

Volunteer • 2022

Indraprastha Institute of Information Technology (IIIT-D)

B.Tech. in Computer Science and Engineering • 2020 — 2024

EleutherAI • September, 2023

Selected for Amazon ML Summer School

Amazon • 2022

All India Rank 491

JEE Mains Paper 2 • 2020

Top 0.66 Percentile

JEE Mains Paper 1 • 2020

All India Rank 130

Undergraduate Entrance Examination (UGEE) • 2020
