[Scale ML] Minimally Training Video Foundation Model in the wild, from scratch
Speaker: Simo Ryu
Topic: Minimally Training Video Foundation Model in the wild, from scratch
Date: Tuesday, Oct 14
Time: 3:00–4:30 PM (ET)
Zoom: https://mit.zoom.us/j/91697262920 (password: mitmlscale)
Abstract:
In this talk, we share our experience training a >10B-parameter diffusion model from scratch, covering pre-processing, hardware efficiency, architecture, parameterization, and optimizer choices.
Bio:
Simo Ryu is a research lead at fal.ai.
HCI Seminar: From Perceptual Illusions to Beyond-Real Interactions in Extended Reality
Speaker: Parastoo Abtahi
Date: Tuesday, Oct 14
Time: 4:00–5:00 PM (ET)

Abstract:
Advances in audiovisual rendering have led to the commercialization of virtual reality (VR); however, haptic technology has not kept up with these advances. While haptic devices aim to bridge this gap by simulating the sensation of touch, many hardware limitations make realistic touch interactions in VR challenging. In my research, I explore how, by understanding human perception, we can design interactions that not only overcome the current limitations of VR hardware but also extend our abilities beyond what is possible in the real world.

In the first part of this talk, I will present my work on redirection illusions that leverage the limits of human perception to improve the perceived performance of encountered-type haptic devices in VR, such as the position accuracy of drones and the resolution of shape displays. In the second part, I will present a framework I have developed through the lens of sensorimotor control theory to argue for the exploration and evaluation of VR interactions that go beyond mimicking reality, what I call beyond-real interactions. Finally, I will share recent work from my group extending these interactions from virtual environments into physical spaces using augmented reality (AR) and robots.

Bio:
Parastoo Abtahi is an Assistant Professor of Computer Science at Princeton University. She leads Princeton’s Situated Interactions Lab (Ψ Lab), which explores virtual and augmented reality interactions grounded in human perception and cognition. Before joining Princeton, Parastoo was a visiting research scientist at Meta Reality Labs Research. She received her PhD in Computer Science from Stanford University, where she was a Gerald J. Lieberman Fellow, and her bachelor’s degree in Engineering Science and Electrical and Computer Engineering from the University of Toronto. Parastoo’s work has been recognized with a Google Research Scholar Award and best paper and honorable mention awards at top human-computer interaction venues, such as ACM CHI and UIST.

This talk will also be streamed over Zoom: https://mit.zoom.us/j/99397568751.
Introducing Algorithmic Thinking Theory for Foundation Models
Date: Tuesday, Oct 14
Time: 4:15–5:15 PM (ET)

Abstract:
The last few months have witnessed tremendous advances in Large Language Model (LLM) reasoning capabilities, with Gemini and GPT winning gold medals at the International Mathematical Olympiad (IMO) [1] and the International Collegiate Programming Contest (ICPC) [2]. Several papers have shown that inference-scaling techniques significantly improve the reasoning performance of LLMs, in particular for the IMO [3]. We will discuss these results, show how one can frame the problem as an optimization problem, relate it to the empirical results in [3], and derive optimal (algorithmic) thinking strategies. We will also discuss avenues for refining the model and improving inference-scaling methods.

[1] https://deepmind.google/discover/blog/advanced-version-of-gemini-with-deep-think-officially-achieves-gold-medal-standard-at-the-international-mathematical-olympiad/
[2] https://deepmind.google/discover/blog/gemini-achieves-gold-level-performance-at-the-international-collegiate-programming-contest-world-finals/
[3] https://arxiv.org/abs/2507.15855
From Data to Knowledge: Integrating Clinical and Molecular Data for Predictive Medicine
Date: Wednesday, Oct 15
Time: 11:30 AM–1:00 PM (ET)

Abstract:
Alzheimer’s disease (AD) remains one of the most pressing medical challenges, with limited therapeutic options and heterogeneous disease trajectories complicating diagnosis and treatment. Recent advances in computational biology and artificial intelligence (AI), together with the availability of rich molecular and clinical data, offer new opportunities to address these challenges by integrating molecular, clinical, and systems-level insights. In our recent studies, we developed a cell-type-directed, network-correcting approach to identify and prioritize rational drug combinations for AD, enabling targeted modulation of disease-relevant pathways across distinct cellular contexts (Li et al., Cell 2025). Complementarily, by leveraging large-scale electronic medical records (EMRs) integrated with biological knowledge networks, we demonstrated the ability to predict disease onset and progression while uncovering mechanistic insights into AD heterogeneity (Tang et al., Nature Aging 2024). Together, these complementary approaches illustrate the power of combining real-world clinical data, knowledge networks, and systems pharmacology to advance precision medicine for AD. This work highlights a paradigm shift toward AI-enabled, data-driven strategies that bridge molecular discovery and clinical application, ultimately informing novel therapeutic interventions and improving patient care.

Speaker Bio:
Marina Sirota is currently a Professor and the Interim Director of the Bakar Computational Health Sciences Institute at UCSF. Prior to that, she worked as a Senior Research Scientist at Pfizer, where she focused on developing precision-medicine strategies in drug discovery. She completed her PhD in Biomedical Informatics at Stanford University. Dr. Sirota’s research experience in translational bioinformatics spans nearly 20 years, during which she has co-authored over 170 scientific publications. Her research interests lie in developing computational integrative methods and applying these approaches in the context of disease diagnostics and therapeutics, with a special focus on women’s health. The Sirota laboratory is funded by NIA, NLM, NIAMS, Pfizer, March of Dimes, and the Burroughs Wellcome Fund. She was awarded the AMIA Young Investigator Award in 2017. She leads the UCSF March of Dimes Prematurity Research Center and co-directs ENACT, a center studying precision medicine for endometriosis. Dr. Sirota is also the founding director of the AI4ALL program at UCSF, which aims to introduce high school girls to applications of AI and machine learning in biomedicine.
Underscore_VC: How VCs Help Ventures Scale
Date: Wednesday, Oct 15
Time: 12:00–2:00 PM (ET)
Event URL: https://cap.csail.mit.edu/members/events/underscorevc-how-vcs-help-ventures-scale

Moderated by Lily Lyman, Partner @ Underscore_VC, join us for an afternoon discussion with the ventures Pagos Solutions, Transfyr, and TetraScience. Questions will range from idea formation and early-stage company building to seeking VC funds, the process of interacting with VCs like Underscore, lessons learned, and growing ventures. Come with your own questions and enjoy this great event!

Ari Chadda is a Member of Technical Staff at Transfyr, building the API layer for science to improve the distribution of scientific innovations to the real world. He was previously a Senior Data Scientist at In-Q-Tel, where he worked on building robust physical and digital autonomous systems. Ari holds an A.B. in Computer Science from Dartmouth College.

Patrick Grady serves as the Chairman and CEO of TetraScience, the world's Scientific Data and AI Cloud. TetraScience is committed to solving humanity’s grand challenges by radically accelerating and improving scientific breakthroughs. This begins with replatforming and reengineering the world's scientific data to enable AI-based discovery, development, and manufacturing.
Location: Zoom (register at https://mit.zoom.us/meeting/register/XTd1GacuTeuOVvN2R47ZmA)
FAST-CODE Seminar: When communicating more saves time -- reconsidering the role of computation-communication overlap
Speaker: Richard (Rich) Vuduc
Date: Wednesday, Oct 15
Time: 2:00–3:00 PM (ET)

Abstract:
When designing algorithms for distributed-memory systems, reducing communication volume is the main proxy for reducing execution time. However, this strategy neglects an important consideration: overlapping computation and communication. Such overlap is often treated as a performance-engineering afterthought, something you obviously should do but not something considered especially deep or impactful. To counter this view, I'll summarize two recent performance-engineering case studies where overlap, combined with fine-grained asynchronous messaging, saves time even at the cost of higher communication volume. The case studies come from bioinformatics (k-mer counting) and graph analytics (influence maximization), where algorithmic techniques are designed to match a runtime for asynchronous one-sided active messaging. I'll conclude with a recent theoretical result for distributed matrix multiplication that further suggests overlap may be more important than we generally think. This talk synthesizes work done jointly with Souvadra Hati, Akihiro Hayashi, Shubhendra Pal Singh, Vivek Sarkar (all at Georgia Tech), Mikhail Isaev (NVIDIA), and Srinivas Eswar (Argonne).

Bio:
Richard (Rich) Vuduc is a professor in the School of Computational Science and Engineering at Georgia Tech. His research lab, The HPC Garage, is interested in performance "by any means necessary," whether by algorithms, performance modeling and analysis, ninja programming, or exploiting novel hardware. These pursuits have helped produce one ACM Gordon Bell Prize winner, two more finalists, and a SIAG/SC Best Paper Prize, among other nods. He also co-directs the Center for Scientific Software Engineering at Georgia Tech, one of four centers worldwide that compose the Schmidt Sciences Virtual Institute for Scientific Software.

Contact: Bruce Hoppe, behoppe@mit.edu
Location: https://mit.zoom.us/meeting/register/XTd1GacuTeuOVvN2R47ZmA
Event URL: https://fastcode.org/events/fastcode-seminar/rich-vuduc/
Quality Control on Random Graphs in Sublinear Time
Date: Wednesday, Oct 15
Time: 4:00–5:00 PM (ET)

Abstract:
Many algorithms are designed to perform well on random instances. However, when running such an algorithm on a specific input, can we trust its output? We define a new class of problems whose formulation allows algorithms to benefit from the randomness of instances while remaining reliable on any given input. We call these "quality control" problems, specified by a distribution D and a real-valued quality function Q. We say that an input x is "high quality" if Q(x) is approximately 1, and assume that samples from D are high quality with high probability. The task is to accept x if it is drawn from D and reject x if Q(x) is far from 1.

We study this problem in the sublinear setting, where inputs are graphs and the algorithm has access to adjacency-matrix queries. Focusing on the case where D is a (sparse) Erdős–Rényi random graph model and Q is proportional to the number of k-cliques in a graph, we show that quality control can be solved superpolynomially faster than the related task of approximating Q on worst-case graphs in the sublinear-access model.

Joint work with Ronitt Rubinfeld (MIT) and Madhu Sudan (Harvard).
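The setup in the abstract above can be illustrated with a toy, decidedly non-sublinear sketch: sample a graph from G(n, p), define Q(x) as the triangle count (the k = 3 clique case) normalized by its Erdős–Rényi expectation, and accept only when Q(x) is close to 1. The function names and the tolerance threshold are illustrative assumptions, not the talk's actual sublinear algorithm.

```python
import itertools
import random

def er_graph(n, p, rng):
    """Sample an Erdos-Renyi G(n, p) graph as a set of edges (i < j)."""
    return {(i, j) for i, j in itertools.combinations(range(n), 2) if rng.random() < p}

def triangle_count(n, edges):
    """Exact triangle count (k = 3 cliques); exhaustive, not sublinear."""
    return sum(1 for a, b, c in itertools.combinations(range(n), 3)
               if (a, b) in edges and (a, c) in edges and (b, c) in edges)

def quality(n, p, edges):
    """Q(x): observed triangles over the G(n, p) expectation C(n, 3) * p^3."""
    expected = (n * (n - 1) * (n - 2) / 6) * p ** 3
    return triangle_count(n, edges) / expected

def quality_control(n, p, edges, tol=0.5):
    """Accept iff Q(x) is within tol of 1 (illustrative threshold)."""
    return abs(quality(n, p, edges) - 1) <= tol

rng = random.Random(0)
g = er_graph(200, 0.1, rng)
print(quality_control(200, 0.1, g))      # a typical G(n, p) sample concentrates near Q = 1
print(quality_control(200, 0.1, set()))  # the empty graph has Q = 0 and is rejected
```

The point of the problem formulation is that a real quality-control algorithm gets to exploit concentration under D (as the sample above does) without the cost of worst-case approximation of Q.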
ML Tea: Chain-of-Thought Degrades Abstention in LLMs, Unless Inverted / Context-aware sequence-to-function model of human gene regulation
Speakers: Emanuele Sansone and Ittai Rubinstein
Date: Wednesday, Oct 15
Time: 4:00–5:00 PM (ET)

Bios:
Abinitha is a second-year EECS PhD student in MIT LIDS. She is co-advised by Professors Marzyeh Ghassemi and Collin Stultz. Her research interests lie broadly in trustworthy machine learning, with a particular focus on sensitive domains like healthcare. She completed her B.S.E. in Operations Research and Financial Engineering at Princeton University.

Ekin is a PhD student at the Max Planck Institute for Molecular Genetics in Berlin, working in computational biology to uncover the regulatory code of the human genome. He is an MD and has previously worked on cancer biology and infectious diseases.

Abstracts:
(1) For Large Language Models (LLMs) to be reliably deployed, models must effectively know when not to answer: abstain. Chain-of-Thought (CoT) prompting has gained popularity for improving model performance by ensuring structured outputs that follow a logical sequence. In this paper, we first investigate how current abstention methods perform with CoT outputs, finding that direct use of reasoning traces can degrade the performance of existing abstention methods by more than 5%. As a result, we introduce a new framework for thinking about hallucinations in LLMs not as answering a question incorrectly but as LLMs answering the wrong question. Based on this framework, we develop a new class of state-of-the-art abstention methods called Trace Inversion. First, we generate the reasoning trace of a model. Based only on the trace, we then reconstruct the most likely query that the model responded to. Finally, we compare the initial query with the reconstructed query. A low similarity score between the initial query and the reconstructed query suggests that the model likely answered the wrong question, and the model is flagged to abstain. We perform extensive experiments and find impressive performance gains with our Trace Inversion methods.

(2) Sequence-to-function models have been very successful in predicting gene expression, chromatin accessibility, and epigenetic marks from DNA sequences alone. However, current state-of-the-art models have a fundamental limitation: they cannot extrapolate beyond the cell types and conditions included in their training dataset. Here, we introduce a new approach designed to overcome this limitation: Corgi, a context-aware sequence-to-function model that accurately predicts genome-wide gene expression and epigenetic signals, even in previously unseen cell types. We designed an architecture that strives to emulate the cell: Corgi integrates DNA sequence and trans-regulator expression to predict the coverage of multiple assays, including chromatin accessibility, histone modifications, and gene expression. We define trans-regulators as transcription factors, histone modifiers, transcriptional coactivators, and RNA-binding proteins, which directly modulate chromatin states, gene expression, and mRNA decay. Trained on a diverse set of bulk and single-cell human datasets, Corgi has robust predictive performance, approaching experimental-level accuracy in gene-expression prediction in previously unseen cell types, while also setting a new state of the art for joint cross-sequence and cross-cell-type epigenetic track prediction. Corgi can be used in practice to impute many assays, including DNA accessibility and histone ChIP-seq, from RNA-seq data.
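The three-step Trace Inversion pipeline described in abstract (1), generate a trace, invert it back to a query, then compare, can be sketched as follows. Here `generate_trace` and `reconstruct_query` are hypothetical stand-ins for LLM calls, and the `difflib`-based similarity and the 0.6 threshold are illustrative assumptions, not the authors' actual choices.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Illustrative stand-in for a learned query-similarity score."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def should_abstain(query: str, generate_trace, reconstruct_query,
                   threshold: float = 0.6) -> bool:
    """Trace Inversion sketch: abstain when the query reconstructed
    from the reasoning trace drifts far from the original query."""
    trace = generate_trace(query)                # step 1: reasoning trace
    guess = reconstruct_query(trace)             # step 2: most likely query, from trace alone
    return similarity(query, guess) < threshold  # step 3: low similarity => abstain

# Hypothetical stubs in place of real LLM calls:
faithful = lambda q: f"To answer '{q}' I will reason step by step."
on_topic = lambda t: t.split("'")[1]             # recovers the quoted query from the trace
off_topic = lambda t: "What is the capital of France?"

print(should_abstain("How many moons does Mars have?", faithful, on_topic))   # False
print(should_abstain("How many moons does Mars have?", faithful, off_topic))  # True
```

In the real method both the trace generation and the inversion would be model calls; the stubs just make the accept/abstain logic concrete.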
Compiling Any MIP* into a (Succinct) Classical Interactive Argument
Date: Friday, Oct 17
Time: 10:30 AM–12:00 PM (ET)

Abstract:
We present a generic compiler that converts any MIP* protocol into a succinct interactive argument where the communication and the verifier are classical, and where post-quantum soundness relies on the post-quantum sub-exponential hardness of the Learning with Errors (LWE) problem. Prior to this work, such a compiler for MIP* was given by Kalai, Lombardi, Vaikuntanathan and Yang (STOC 2022), but the post-quantum soundness of their compiler is still under investigation.

More generally, our compiler can be applied to any QIP protocol that is sound only against semi-malicious provers, which follow the prescribed protocol but may start with a malicious initial state. Our compiler consists of two steps. We first show that if a language L has a QIP with semi-malicious soundness, where the prover runs in time T, then L is in QMATIME(T). We then construct a succinct classical argument for any such language, where the communication complexity grows only poly-logarithmically with T, under the post-quantum sub-exponential hardness of LWE.

This is joint work with Yael Kalai.
Rethinking Caching: Algorithms and Bounds for Parallel Memory Systems [Zoom]
Speaker: Rathish Das
Date: Friday, Oct 17
Time: 1:00–2:00 PM (ET)

Abstract:
In this talk, I will introduce an algorithmic framework for analyzing paging in multi-core shared-memory systems. While the classical paging problem has been extensively studied and well understood in the sequential setting for decades, its parallel counterpart has remained largely unresolved for over twenty years. In the parallel paging scenario, p processors share a limited, fast-access cache of size k, and the challenge is to dynamically allocate cache space among the processors to minimize metrics such as average or maximum completion time. I will present matching upper and lower bounds of Θ(log p) on the competitive ratio, achievable with only constant-factor resource augmentation. Following this, I will highlight recent theoretical progress in paging for emerging memory architectures, particularly systems equipped with high-bandwidth memory (HBM).

Speaker Bio:
Rathish Das is an Assistant Professor in the Department of Computer Science at the University of Houston. Before joining UH, he held a faculty position at the University of Liverpool in the UK and was a postdoctoral fellow at the University of Waterloo. He earned his Ph.D. in Computer Science from Stony Brook University. His research focuses on leveraging approximation and randomization techniques in two key areas: (1) designing parallel and distributed algorithms for multiprocessor systems, and (2) developing algorithms for large-scale data analysis. His work has been recognized with three Outstanding Paper Awards at ACM SPAA.

This event is a special seminar affiliated with the Parallel Reading Group; subscribe to its mailing list for announcements of regular meetings.
Location: https://mit.zoom.us/j/94523742291
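As background for the sequential baseline the abstract contrasts with, here is a minimal simulator of classical LRU paging with a cache of size k, counting page faults. This is standard textbook material offered for context, not the speaker's parallel algorithm.

```python
from collections import OrderedDict

def lru_faults(requests, k):
    """Simulate LRU paging with cache size k; return the number of page faults."""
    cache = OrderedDict()   # keys ordered from least to most recently used
    faults = 0
    for page in requests:
        if page in cache:
            cache.move_to_end(page)        # hit: mark as most recently used
        else:
            faults += 1                    # miss: fetch the page
            if len(cache) >= k:
                cache.popitem(last=False)  # evict the least recently used page
            cache[page] = True
    return faults

print(lru_faults([1, 2, 3, 1, 2, 4, 1], k=3))  # 4 faults: pages 1, 2, 3, then 4 (evicting 3)
```

The parallel problem in the talk is harder precisely because the k slots must also be partitioned among p competing processors over time.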
LLMs unlock new paths to monetizing exploits
Speaker: Edoardo Debenedetti
Date: Monday, Oct 20
Time: 12:00–1:00 PM (ET)

Abstract:
We argue that large language models (LLMs) will soon alter the economics of cyberattacks. Instead of attacking the most commonly used software and monetizing exploits by targeting the lowest common denominator among victims, LLMs enable adversaries to launch tailored attacks on a user-by-user basis. On the exploitation front, instead of human attackers manually searching for one difficult-to-identify bug in a product with millions of users, LLMs can find thousands of easy-to-identify bugs in products with thousands of users. And on the monetization front, instead of generic ransomware that always performs the same attack (encrypt all your data and request payment to decrypt), an LLM-driven ransomware attack could tailor the ransom demand based on the particular content of each exploited device.

We show that these two attacks (and several others) are imminently practical using state-of-the-art LLMs. For example, we show that, without any human intervention, an LLM finds highly sensitive personal information in the Enron email dataset (e.g., an executive having an affair with another employee) that could be used for blackmail. While some of our attacks are still too expensive to scale widely today, the incentives to implement them will only increase as LLMs get cheaper. Thus, we argue that LLMs create a need for new defense-in-depth approaches.

Bio:
Edoardo Debenedetti is a fourth-year PhD student in Computer Science at ETH Zurich, advised by Prof. Florian Tramèr. His research focuses on real-world machine learning security and privacy. Most recently, he has been looking into the security of AI agents, working on evaluation frameworks and defenses. He is currently a Research Scientist Intern at Meta and recently worked as a Student Researcher at Google.

Zoom info: Meeting ID: 945 5603 5878, Password: 865039
Memory as a lens to understand learning and optimization
Date: Monday, Oct 20
Time: 4:00–5:00 PM (ET)

Abstract:
What is the role of memory in learning and optimization? The optimal convergence rates (measured in terms of the number of oracle queries or samples needed) for various optimization problems are achieved by computationally expensive optimization techniques, such as second-order methods and cutting-plane methods. We will discuss whether simpler, faster, and memory-limited algorithms such as gradient descent can achieve these optimal convergence rates for the prototypical optimization problem of minimizing a convex function with access to a gradient oracle. Our results hint at a perhaps curious dichotomy: it is not possible to significantly improve on the convergence rate of known memory-efficient techniques (which are linear-memory variants of gradient descent for many of these problems) without using substantially more memory (quadratic memory for many of these problems). Therefore, memory could be a useful discerning factor that provides a clear separation between 'efficient' and 'expensive' techniques. This perspective can also be applied to understand the mechanisms transformers use to solve certain algorithmic tasks. We will show empirical evidence that transformers learn to achieve second-order convergence rates for solving linear-regression tasks, leveraging some of the theory of optimization and memory-limited learning.

This is mostly based on joint work with Annie Marsden, Aaron Sidford, Gregory Valiant, Deqing Fu, Tian-Qi Chen and Robin Jia.
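For the prototypical problem mentioned above, minimizing a convex function given only gradient-oracle access, the linear-memory baseline is plain gradient descent: the algorithm stores nothing beyond the current iterate. The quadratic objective and step size below are illustrative choices, not from the talk.

```python
def gradient_descent(grad, x0, step, iters):
    """Minimize a convex function given only a gradient oracle.
    Memory use is linear in the dimension: just the current iterate x."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)                                      # one oracle query per step
        x = [xi - step * gi for xi, gi in zip(x, g)]     # move against the gradient
    return x

# Illustrative convex objective f(x) = sum_i (x_i - 1)^2, with gradient 2(x - 1).
grad = lambda x: [2 * (xi - 1) for xi in x]
x_star = gradient_descent(grad, [10.0, -5.0], step=0.1, iters=200)
print(x_star)  # converges to the minimizer, approximately [1.0, 1.0]
```

Second-order and cutting-plane methods query the same oracle but maintain quadratic-size state (e.g., a Hessian estimate or a localizing polytope), which is exactly the memory/query trade-off the talk examines.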
Visual Computing Seminar: PhysiOpt: Physics-Driven Shape Optimization for 3D Generative Models
Date: Tuesday, Oct 21
Time: 12:00–1:00 PM (ET)

Abstract:
Generative models have recently demonstrated impressive capabilities in producing high-quality 3D shapes from a variety of user inputs (e.g., text or images). However, generated objects often lack physical integrity. We introduce PhysiOpt, a differentiable physics optimizer designed to improve the physical behavior of 3D generative outputs, enabling them to transition from virtual designs to physically plausible, real-world objects. While most generative models represent geometry as continuous implicit fields, physics-based approaches often rely on the finite element method (FEM), requiring ad hoc mesh extraction to perform shape optimization. In addition, these methods are typically slow, limiting their integration into fast, iterative generative design workflows. Instead, we bridge the representation gap and propose a fast and effective differentiable simulation pipeline that optimizes shapes directly in the latent space of generative models using an intuitive and easy-to-implement differentiable mapping. This approach enables fast optimization while preserving semantic structure, unlike traditional methods relying on local mesh-based adjustments. We demonstrate the versatility of our optimizer across a range of shape priors, from global and part-based latent models to a state-of-the-art large-scale 3D generator, and compare it to a traditional mesh-based shape optimizer. Our method preserves the native representation and capabilities of the underlying generative model while supporting user-specified materials, loads, and boundary conditions. The resulting designs exhibit improved physical behavior, remain faithful to the learned priors, and are suitable for fabrication. We demonstrate the effectiveness of our approach on both virtual and fabricated objects.
Creating the next generation of genome analysis tools with deep learning
Date: Wednesday, Oct 22
Time: 11:30 AM–1:00 PM (ET)

Abstract:
Deep learning is fueling a revolution in genomics, enabling the development of a new generation of analysis tools that offer unprecedented accuracy. This talk presents a suite of deep learning models designed to address fundamental challenges in variant calling and generating high-quality genome assemblies. We begin with DeepVariant, a convolutional neural network that redefined the standard for germline variant calling, and its extension, DeepSomatic, which adapts this technology to the critical task of identifying low-frequency somatic mutations in cancer genomes. Moving from variant analysis to genome construction, we introduce DeepPolisher. This tool leverages a powerful Transformer-based architecture to significantly reduce errors in genome assemblies, providing a more accurate and reliable foundation for downstream research. Finally, we explore the future of variant calling by integrating these methods with emerging pangenome references. We demonstrate how a pangenome-aware approach allows for a more comprehensive survey of human genetic diversity, resolving variation in previously intractable regions of the genome. Together, these tools represent a cohesive framework that is building the next generation of genomic analysis, transforming our ability to accurately read and interpret the code of life.
Re-visiting Authorized Private Set Intersection: A New Privacy-Preserving Variant
Date: Thursday, Oct 23
Time: 4:00–5:00 PM (ET)

Abstract:
We revisit the problem of Authorized Private Set Intersection (APSI), which allows mutually untrusting parties to authorize their items with a trusted third-party judge before privately computing the intersection. We also initiate the study of Partial-APSI, a novel privacy-preserving generalization of APSI in which the client reveals only a subset of their items to a third-party semi-honest judge for authorization. Partial-APSI allows for partial verification of the set, preserving the privacy of the party whose items are being verified. Both APSI and Partial-APSI have a number of applications, including genome matching, ad conversion, and compliance with privacy policies such as the GDPR.
The Mysterious Query Complexity of Tarski Fixed Points
Date: Wednesday, Oct 29
Time: 4:00–5:00 PM (ET)

Abstract:
Tarski's fixed point theorem has extensive applications across many fields, including verification, semantics, game theory, and economics. Recently, the complexity of finding a Tarski fixed point has attracted significant attention. In this talk, I will introduce the problem of computing a Tarski fixed point over a grid $[N]^d$, highlight our recent progress toward a better understanding of it, and discuss the surprising journey and the mysteries surrounding its complexity.

Based on joint work with Xi Chen and Mihalis Yannakakis.
Myco: Unlocking Polylogarithmic Accesses in Metadata-Private Messaging
Speaker: Darya Kaviani
Date: Thursday, Oct 30
Time: 3:00–4:30 PM (ET)

Abstract:
As billions of people rely on end-to-end encrypted messaging, the exposure of metadata, such as communication timing and participant relationships, continues to deanonymize users. Asynchronous metadata-hiding solutions with strong cryptographic guarantees have historically been bottlenecked by quadratic O(N²) server computation in the number of users N, due to reliance on private information retrieval (PIR). We present Myco, a metadata-private messaging system that preserves strong cryptographic guarantees while achieving O(N log² N) efficiency. To achieve this, we depart from PIR and instead introduce an oblivious data structure through which senders and receivers privately communicate. To unlink reads and writes, we instantiate Myco in an asymmetric two-server distributed-trust model where clients write messages to one server tasked with obliviously transmitting these messages to another server, from which clients read. Myco achieves throughput improvements of up to 302x over multi-server and 2,219x over single-server state-of-the-art systems based on PIR.

Bio:
Darya Kaviani is a second-year PhD student at UC Berkeeley, advised by Raluca Ada Popa. Her research interests are in applied cryptography and systems security.

Zoom info: Meeting ID: 945 5603 5878, Password: 865039