I do research in Machine Learning on various interesting aspects surrounding Large Language Models (LLMs). My work focuses on understanding and evaluating the abilities and limitations of LLMs with respect to their generalization, scaling, and reasoning behaviours. My hope is that the knowledge derived from my analysis work will help design more robust systems capable of better out-of-distribution (OOD) generalization, and eventually of exhibiting superintelligence.
Keywords: generalization, reasoning, scaling, evaluation, safety, analysis and interpretability
I graduated with a B.E. (Hons.) in Computer Science from BITS Pilani - Goa Campus, India in 2020. For more details about my background, refer to my CV. If you'd like to chat with me about my work or research in general, feel free to reach out!
Our Thoughtology paper investigating the reasoning chains-of-thought of the Large Reasoning Model DeepSeek-R1 is out!
Our paper on AI safety, investigating the transferability of adversarial triggers in LLMs, has been accepted to TACL!
I'm a visiting graduate student at the Simons Institute at UC Berkeley as part of their special year on LLMs and Transformers.
Our paper proposing SafeArena, a benchmark for evaluating the safety of autonomous web agents, is out!
Our paper proposing CHASE, a method to automatically generate challenging synthetic data for evaluating LLMs, is out!
Presented my AI2 internship work on evaluating code generation in LLMs at NAACL 2024 in Mexico City!
Universal Adversarial Triggers Are Not Universal
Nicholas Meade, Arkil Patel, Siva Reddy
Preprint
pdf
code
abstract
Evaluating In-Context Learning of Libraries for Code Generation
Arkil Patel, Siva Reddy, Dzmitry Bahdanau, Pradeep Dasigi
NAACL'24
pdf
code
abstract
Understanding In-Context Learning in Transformers and LLMs by Learning to Learn Discrete Functions
Satwik Bhattamishra, Arkil Patel, Phil Blunsom, Varun Kanade
ICLR'24 [Oral]
pdf
code
abstract
MAGNIFICo: Evaluating the In-Context Learning Ability of Large Language Models to Generalize to Novel Interpretations
Arkil Patel, Satwik Bhattamishra, Siva Reddy, Dzmitry Bahdanau
EMNLP'23 [Oral]
pdf
code
abstract
Simplicity Bias in Transformers and their Ability to Learn Sparse Boolean Functions
Satwik Bhattamishra, Arkil Patel, Varun Kanade, Phil Blunsom
ACL'23
pdf
code
abstract
When Can Transformers Ground and Compose: Insights from Compositional Generalization Benchmarks
Ankur Sikarwar, Arkil Patel, Navin Goyal
EMNLP'22 [Oral]
pdf
code
abstract
Revisiting the Compositional Generalization Abilities of Neural Sequence Models
Arkil Patel, Satwik Bhattamishra, Phil Blunsom, Navin Goyal
ACL'22
pdf
code
abstract
Are NLP Models really able to Solve Simple Math Word Problems?
Arkil Patel, Satwik Bhattamishra, Navin Goyal
NAACL'21
pdf
code
abstract
article
On the Computational Power of Transformers and its Implications in Sequence Modeling
Satwik Bhattamishra, Arkil Patel, Navin Goyal
CoNLL'20
pdf
code
abstract
VehicleChain: Blockchain-based Vehicular Data Transmission Scheme for Smart City
Arkil Patel, Naigam Shah, Trupil Limbasiya, Debasis Das
IEEE SMC'19
pdf
- Winter 2024: Teaching Assistant for COMP 596: From Natural Language to Data Science - McGill University
- Winter 2023: Teaching Assistant for COMP 596: From Natural Language to Data Science - McGill University
- Winter 2020: Teaching Assistant for BITS F312: Neural Networks and Fuzzy Logic - BITS Goa
- Winter 2019: Teaching Assistant for CS F415: Data Mining - BITS Goa