We are a collocation of collaborators working on a diverse range of topics in computational linguistics, natural language processing and machine learning.
Credits to Afra for the lab logo and to Tim for the logo idea.
Current Foci
- Formal Aspects of Language Modeling
- Cognitive and (Psycho-)Linguistics
- Information Theory
- Computational Typology and Morphology
- Bias and Fairness in NLP Systems
- Algorithms for Parsing
- Interpreting Neural Representations of Language
- Computational Social Science
Publications
A Close Analysis of the Subset Construction
In this paper we examine the difficulty of finding an equivalent deterministic automaton when confronted with a non-deterministic one. While for some …
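For readers less familiar with the construction the paper analyzes, here is a minimal sketch of the classical subset (powerset) construction for an ε-free NFA, in which each reachable set of NFA states becomes a single DFA state. The dictionary-based encoding is our own illustrative choice, not the paper's formalization.

```python
from collections import deque

def subset_construction(alphabet, delta, start, accept):
    """Determinize an epsilon-free NFA via the subset construction.

    delta:  dict mapping (nfa_state, symbol) -> set of successor states
    accept: set of accepting NFA states
    Each DFA state is represented as a frozenset of NFA states.
    """
    dfa_start = frozenset({start})
    dfa_delta, dfa_accept = {}, set()
    queue, seen = deque([dfa_start]), {dfa_start}
    while queue:
        current = queue.popleft()
        # A DFA state accepts iff it contains an accepting NFA state.
        if current & accept:
            dfa_accept.add(current)
        for a in alphabet:
            # The successor DFA state is the union of the NFA successors.
            nxt = frozenset(q for s in current for q in delta.get((s, a), set()))
            dfa_delta[(current, a)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return dfa_delta, dfa_start, dfa_accept
```

In the worst case, the number of reachable subsets is exponential in the number of NFA states, which is the source of the difficulty the abstract alludes to.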
Investigating Critical Period Effects in Language Acquisition through Neural Language Models
Humans appear to have a critical period (CP) for language acquisition: Second language (L2) acquisition becomes harder after early childhood, and …
A Distributional Perspective on Word Learning in Neural Language Models
A Practical Method for Generating String Counterfactuals
A Spatio-Temporal Point Process for Fine-Grained Modeling of Reading Behavior
Bigger is not always better: The importance of human-scale language modeling for psycholinguistics
Can Language Models Learn Typologically Implausible Languages?
Controllable Context Sensitivity and the Knob Behind It
Gumbel Counterfactual Generation from Language Models
Incremental Alternative Sampling as a Lens into the Temporal and Representational Resolution of Linguistic Prediction
This study presents a new model of processing difficulty rooted in resource allocation theory, Incremental Alternative Sampling (IAS). Differential …
Information Locality as an Inductive Bias for Neural Language Models
Language Models over Canonical Byte-Pair Encodings
On the challenges and opportunities in generative AI
Pointwise Mutual Information as a Performance Gauge for Retrieval-Augmented Generation
Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo
Syntactic Control of Language Models by Posterior Inference
The Foundations of Tokenization: Statistical and Computational Concerns
The Harmonic Structure of Information Contours
Training Neural Networks as Recognizers of Formal Languages
Unique Hard Attention: A Tale of Two Sides
Variational Best-of-$N$ Alignment
Best-of-N (BoN) is a popular and effective algorithm for aligning language models to human preferences. The algorithm works as follows: at inference …
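For reference, here is a minimal sketch of the vanilla Best-of-N procedure that the paper takes as its starting point; `generate` and `reward` are hypothetical placeholders for a language model sampler and a learned reward model, not APIs from the paper.

```python
def best_of_n(prompt, generate, reward, n=16):
    """Vanilla Best-of-N: draw n i.i.d. samples and keep the best one.

    generate(prompt)           -> one sampled completion (a string)
    reward(prompt, completion) -> a scalar score from a reward model
    """
    samples = [generate(prompt) for _ in range(n)]
    return max(samples, key=lambda s: reward(prompt, s))
```

As the title suggests, the paper's contribution is a variational treatment of this procedure; the sketch above shows only the inference-time baseline.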
On the Representational Capacity of Neural Language Models with Chain-of-Thought Reasoning
The performance of modern language models (LMs) has been improved by chain-of-thought (CoT) reasoning, i.e., the process of generating intermediate …
Towards Explainability in Legal Outcome Prediction Models
Current legal outcome prediction models, a staple of legal NLP, do not explain their reasoning. However, to employ these models in the real world, …
What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages
What can large language models learn? By definition, language models (LMs) are distributions over strings. Therefore, an intuitive way of addressing …
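To spell out the definition invoked here: a language model $p$ is a probability distribution over the set of all finite strings over an alphabet. In standard notation (ours, not quoted from the paper):

```latex
p \colon \Sigma^* \to [0, 1],
\qquad
\sum_{\boldsymbol{y} \in \Sigma^*} p(\boldsymbol{y}) = 1
```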
Teaching
Natural Language Processing
ETH Zürich Fall 2025 This course presents topics in natural language processing with an emphasis on modern techniques, primarily focusing on statistical and deep learning approaches. The course provides an overview of the primary areas of research in language processing as well as a detailed exploration of the models and techniques used both in research and in commercial natural language systems.
Machine Learning and Computational Complexity
ETH Zürich Fall 2025 This Bachelor’s seminar investigates the computational foundations of modern machine learning. We examine how model complexity, data availability, and algorithmic design interact to determine what learning systems can and cannot achieve, highlighting classic results from computational learning theory.
Advanced Formal Language Theory
ETH Zürich Spring 2025 This course explores the connection between automata and formal logic. More precisely, it covers the algebraic characterization of the regular languages definable in many different logical theories, the complexity theory of Boolean circuits, and the connection between the two.
Philosophy of Language and Computation II
ETH Zürich Spring 2025 This graduate class, partly taught like a seminar, is designed to help you understand the philosophical underpinnings of modern work in natural language processing (NLP), most of which is centered around statistical machine learning applied to natural language data.
Understanding Context-Free Parsing Algorithms
ETH Zürich Spring 2025 In the first part of the seminar, we study some of the most popular parsing algorithms, which are a fundamental tool both in natural language processing and in programming languages. Each week, a student will present a paper on parsing, including the papers that first described celebrated parsing algorithms such as Earley's and CKY. We will also place particular focus on weighted parsing, which is fundamental in applications to language modeling. In the second part, we will examine advanced NLP topics through the analysis of pivotal (and often controversial) papers that are shaping the field's future direction.
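For a taste of the material, here is a minimal sketch of a CKY recognizer for a grammar in Chomsky normal form; the dictionary-based grammar encoding is an illustrative assumption, not the course's reference implementation.

```python
def cky_recognize(words, unary, binary, start="S"):
    """CKY recognizer for a grammar in Chomsky normal form.

    unary:  dict mapping a terminal word to the set of nonterminals A
            with rules A -> word
    binary: dict mapping a pair (B, C) to the set of nonterminals A
            with rules A -> B C
    Returns True iff the grammar derives `words` from `start`.
    """
    n = len(words)
    # chart[i][j] holds the nonterminals deriving words[i:j]
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1] = set(unary.get(w, set()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point
                for B in chart[i][k]:
                    for C in chart[k][j]:
                        chart[i][j] |= binary.get((B, C), set())
    return start in chart[0][n]
```

Replacing the sets with weights combined in a semiring turns this recognizer into exactly the kind of weighted parser the seminar emphasizes.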
Thesis Projects
Please send in your inquiry using this form.
If you are a BSc or MSc student at ETH Zurich interested in writing your thesis with us, we would be delighted to hear from you! Unfortunately, we do not have the capacity to consider students from outside ETH for thesis projects. To obtain a better understanding of what currently interests us, we invite you to check our most recent publications. However, feel free to express interest in any topic you think our group might be well suited to advise you on.
For Bachelor's theses or semester projects specifically, we typically assign you one of our published papers to replicate, so it would be ideal if you indicated 3-4 of our publications that interest you.
It helps us a great deal in finding a matching project if you can state concrete topics that you would like to work on. We look forward to receiving your inquiry!
Joining Our Lab
Thank you very much for your interest in joining our group. Unfortunately, we are no longer accepting PhD students!
If you are interested in working with us as a Master's student, please see here. If co-advising is an option you would like to pursue, note that Ryan has previously co-advised Master's students on NLP topics with Mrinmaya Sachan and others.
Contact Us
- contact-rycolab@lists.inf.ethz.ch
- Andreasstrasse 5, 8050 Zürich, Switzerland
- Institute for Machine Learning
- Department of Computer Science, ETH Zürich