Hello! I am a Research Scientist at Reality Labs Research @ Meta, working on egocentric video-language models for wearable devices. Previously, I was a Ph.D. student advised by Dr. Virginia de Sa in the Cognitive Science department at UCSD, where I worked on computer vision and human vision. My main contributions from my Ph.D. research are (1) a bio-inspired adaptive RNN architecture that learns to dynamically scale its computation with input difficulty (see NeurIPS23), and (2) a study of human vision’s sensitivity to adversarial image perturbations (see NatComm23).
During my Ph.D., I interned at Google Brain, Facebook AI Research, and Qualcomm AI Research, working on adversarial images, human visual perception, self-supervised learning, and unsupervised video representation learning.
Humans solving algorithmic or reasoning problems typically exhibit solution times that grow as a function of problem difficulty. Adaptive recurrent neural networks have been shown to exhibit this property for various language-processing tasks. However, little work has been done to assess whether such adaptive computation also enables vision models to extrapolate solutions beyond their training distribution’s difficulty level, with prior work focusing on very simple tasks. In this study, we investigate a critical functional role of such adaptive processing: dynamically scaling computational resources conditional on input requirements, which allows for zero-shot generalization to difficulty levels not seen during training. We study this on two challenging visual reasoning tasks, PathFinder and Mazes. We combine convolutional recurrent neural networks (ConvRNNs) with a learnable halting mechanism based on Graves (2016) and explore implementations of such adaptive ConvRNNs (AdRNNs), ranging from tying weights across layers to more sophisticated, biologically inspired recurrent networks with long/short-range lateral connections and gating. We show that (1) AdRNNs learn to dynamically halt processing early (or late) to solve easier (or harder) problems, and (2) these RNNs zero-shot generalize to more difficult problem settings not shown during training by dynamically increasing the number of recurrent iterations at test time. Our study provides modeling evidence that recurrent processing offers a functional advantage: adaptively allocating compute conditional on input requirements, and hence generalizing to harder difficulty levels of a visual reasoning problem without additional training.
Feedforward deep neural networks have become the standard class of models in computer vision. Yet they possess a striking difference relative to their biological counterparts, which predominantly perform...
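The halting mechanism referenced above follows Graves’s (2016) Adaptive Computation Time (ACT). As a concrete illustration, here is a minimal PyTorch sketch of an ACT-style ConvRNN; the module and variable names are my own, and it omits details of the paper’s actual architectures (e.g., the ponder cost used during training and the biologically inspired lateral connections).

```python
# A minimal sketch of an adaptive ConvRNN with Graves (2016)-style Adaptive
# Computation Time (ACT) halting. Names are illustrative, not the paper's code.
import torch
import torch.nn as nn


class AdaptiveConvRNN(nn.Module):
    def __init__(self, channels: int, max_steps: int = 20, eps: float = 0.01):
        super().__init__()
        self.recurrent = nn.Conv2d(2 * channels, channels, 3, padding=1)
        self.halt = nn.Conv2d(channels, 1, 1)  # per-step halting logits
        self.max_steps = max_steps
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b = x.size(0)
        h = torch.zeros_like(x)
        cum_halt = x.new_zeros(b)      # accumulated halting probability
        remainder = x.new_ones(b)      # leftover mass given to the final step
        output = torch.zeros_like(x)
        running = x.new_ones(b, dtype=torch.bool)

        for step in range(self.max_steps):
            # One recurrent iteration over the (input, previous state) pair.
            h = torch.tanh(self.recurrent(torch.cat([x, h], dim=1)))
            # Spatially pooled, per-example halting probability.
            p = torch.sigmoid(self.halt(h)).mean(dim=(1, 2, 3))

            # An example halts once its cumulative probability crosses 1 - eps;
            # its final step is weighted by the remaining probability mass.
            halting_now = running & (cum_halt + p > 1 - self.eps)
            is_final = halting_now | (running & (step == self.max_steps - 1))
            weight = torch.where(is_final, remainder, p * running.float())

            output = output + weight.view(-1, 1, 1, 1) * h
            cum_halt = cum_halt + p * running.float()
            remainder = remainder - p * running.float()
            running = running & ~halting_now
            if not running.any():
                break  # everyone halted: easy inputs use fewer iterations
        return output
```

At test time, raising `max_steps` lets the network spend more recurrent iterations on inputs harder than any seen during training, which is the mechanism behind the zero-shot difficulty generalization described above.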
Nature Communications 2023
Subtle adversarial image manipulations influence both human and machine perception
Vijay Veerabadran, Josh Goldman, Shreya Shankar, and 8 more authors
Although artificial neural networks (ANNs) were inspired by the brain, ANNs exhibit a brittleness not generally observed in human perception. One shortcoming of ANNs is their susceptibility to adversarial perturbations: subtle modulations of natural images that result in changes to classification decisions, such as confidently mislabelling an image of an elephant, initially classified correctly, as a clock. In contrast, a human observer might well dismiss the perturbations as an innocuous imaging artifact. This phenomenon may point to a fundamental difference between human and machine perception, but it drives one to ask whether human sensitivity to adversarial perturbations might be revealed with appropriate behavioral measures. Here, we find that adversarial perturbations that fool ANNs similarly bias human choice. We further show that the effect is more likely driven by higher-order statistics of natural images to which both humans and ANNs are sensitive, rather than by the detailed architecture of the ANN.
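For readers unfamiliar with how such perturbations are crafted, the sketch below shows the classic fast gradient sign method (FGSM; Goodfellow et al., 2015), one standard attack. It is illustrative only; the study above used its own models and attack procedures, and the stand-in classifier here is a toy.

```python
# A minimal sketch of generating an adversarial perturbation with FGSM
# (Goodfellow et al., 2015). Illustrative only; not the paper's exact attack.
import torch
import torch.nn as nn
import torch.nn.functional as F


def fgsm(model: nn.Module, image: torch.Tensor, label: torch.Tensor,
         eps: float = 2 / 255) -> torch.Tensor:
    """One-step FGSM: perturb `image` (values in [0, 1]) to raise the loss."""
    image = image.clone().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Move each pixel by +/- eps along the sign of the loss gradient; the
    # result looks near-identical but can flip the model's decision.
    return (image + eps * image.grad.sign()).clamp(0, 1).detach()


# Toy usage with a stand-in classifier (any image classifier works here).
classifier = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
x = torch.rand(1, 3, 32, 32)
y = torch.tensor([3])
x_adv = fgsm(classifier, x, y)
```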
2021
NeurIPSW 2021
Bio-inspired learnable divisive normalization for ANNs
Vijay Veerabadran, Ritik Raina, and Virginia R De Sa
3rd Workshop on Shared Visual Representations in Human and Machine Intelligence (SVRHM), NeurIPS, Dec 2021
In this work, we introduce DivNormEI, a novel bio-inspired convolutional network that performs divisive normalization, a canonical cortical computation, along with lateral …
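For context, divisive normalization scales each unit’s response by the pooled activity of its neighbors, roughly y = x / (σ + Σ wx). Below is a minimal PyTorch sketch of that canonical computation; DivNormEI’s actual formulation, with separate learned excitatory and inhibitory lateral connections, is richer, and all names here are illustrative.

```python
# A minimal sketch of divisive normalization as a convolutional layer.
# The DivNormEI model in the paper is more elaborate; this is illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DivisiveNorm(nn.Module):
    """y_c = x_c / (sigma_c + local pool of rectified neighboring activity)."""

    def __init__(self, channels: int, kernel_size: int = 5):
        super().__init__()
        # Learnable pooling weights over space and channels.
        self.pool = nn.Conv2d(channels, channels, kernel_size,
                              padding=kernel_size // 2, bias=False)
        self.sigma = nn.Parameter(torch.ones(1, channels, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        drive = F.relu(x)
        # Absolute-valued weights keep the denominator a true suppressive pool.
        suppression = F.conv2d(drive, self.pool.weight.abs(),
                               padding=self.pool.padding[0])
        return drive / (self.sigma.abs() + suppression + 1e-6)


x = torch.randn(1, 16, 32, 32)
y = DivisiveNorm(16)(x)
```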
2020
CVPRW 2020
Adversarial distortion for learned video compression
Vijay Veerabadran, Reza Pourreza, and others
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Jun 2020
In this paper, we present a novel adversarial lossy video compression model. At extremely low bit-rates, standard video coding schemes suffer from unpleasant reconstruction artifacts …
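As a rough illustration of what “adversarial” means in this setting, the sketch below augments a rate-distortion compression objective with a GAN-style loss that pushes reconstructions toward looking natural at low bitrates. Every component here (the toy autoencoder, discriminator, and rate proxy) is an illustrative stand-in, not the paper’s model.

```python
# A generic sketch of the kind of objective behind adversarial learned
# compression: rate-distortion plus a GAN term. Illustrative stand-ins only.
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Conv2d(3, 64, 4, stride=4), nn.ReLU())
decoder = nn.Sequential(nn.ConvTranspose2d(64, 3, 4, stride=4), nn.Sigmoid())
disc = nn.Sequential(nn.Conv2d(3, 64, 4, stride=4), nn.ReLU(),
                     nn.Flatten(), nn.LazyLinear(1))

frames = torch.rand(2, 3, 64, 64)            # stand-in for video frames
latent = encoder(frames)
recon = decoder(latent)

distortion = F.mse_loss(recon, frames)       # reconstruction fidelity
rate = latent.abs().mean()                   # crude proxy for coding cost
adversarial = F.binary_cross_entropy_with_logits(
    disc(recon), torch.ones(2, 1))           # "fool the discriminator" term

lambda_rate, lambda_adv = 0.01, 0.1          # illustrative trade-off weights
generator_loss = distortion + lambda_rate * rate + lambda_adv * adversarial
```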
2018
NeurIPS 2018
Learning long-range spatial dependencies with horizontal gated recurrent units
Drew Linsley, Junkyung Kim, Vijay Veerabadran, and 1 more author
Proceedings of the 32nd International Conference on Neural Information Processing Systems, Dec 2018
Progress in deep learning has spawned great successes in many engineering applications. As a prime example, convolutional neural networks, a type of feedforward neural network …
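The core idea behind horizontal gated recurrent units is a gated recurrent update in which hidden states interact with their spatial neighbors through convolutions (“horizontal” connections). Below is a minimal convolutional-GRU sketch of that idea; the published hGRU has distinct excitatory and inhibitory stages and additional gain terms, so treat this as a simplified illustration with invented names.

```python
# A minimal convolutional GRU sketch of the idea behind the hGRU: gated
# recurrent updates where hidden states interact with spatial neighbors via
# convolutions. The published hGRU is more elaborate; this is illustrative.
import torch
import torch.nn as nn


class ConvGRUCell(nn.Module):
    def __init__(self, channels: int, kernel_size: int = 5):
        super().__init__()
        p = kernel_size // 2
        self.gates = nn.Conv2d(2 * channels, 2 * channels, kernel_size, padding=p)
        self.cand = nn.Conv2d(2 * channels, channels, kernel_size, padding=p)

    def forward(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        # Update gate z and reset gate r, both computed from (input, state).
        z, r = torch.sigmoid(self.gates(torch.cat([x, h], 1))).chunk(2, dim=1)
        h_tilde = torch.tanh(self.cand(torch.cat([x, r * h], 1)))
        return (1 - z) * h + z * h_tilde  # gated blend of old and new state


cell = ConvGRUCell(channels=16)
x = torch.randn(1, 16, 64, 64)   # feedforward drive (e.g., conv features)
h = torch.zeros_like(x)
for _ in range(8):               # unroll lateral interactions over time
    h = cell(x, h)
```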