Publications – Google Research
Publications
Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

1 - 15 of 10501 publications
DORA 2025 State of AI-assisted Software Development Report
Derek DeBellis
Matt Beane
Edward Fraser
Ben Good
Eirini Kalliamvakou
Gene Kim
Daniella Villalba
DORA, Google (2025)
In 2025, the central question for technology leaders is no longer if they should adopt AI, but how to realize its value. DORA’s research includes more than 100 hours of qualitative data and survey responses from nearly 5,000 technology professionals from around the world. The research reveals a critical truth: AI’s primary role in software development is that of an amplifier. It magnifies the strengths of high-performing organizations and the dysfunctions of struggling ones.
Mind the GAP: Geometry Aware Passthrough Mitigates Cybersickness
Trishia Chemaly
Mohit Goyal
Sakar Khattar
Bjorn Vlaskamp
Aveek Purohit
Konstantine Tsotsos
2025
Virtual Reality headsets isolate users from the real world by restricting their perception to the virtual world. Video See-Through (VST) headsets address this by utilizing world-facing cameras to create Augmented Reality experiences. However, directly displaying camera feeds can cause visual discomfort and cybersickness due to inaccurate perception of scale and exaggerated motion parallax. This paper presents initial findings on the potential of geometry-aware passthrough systems to mitigate cybersickness through enhanced depth perception. We introduce a promising protocol for quantitatively measuring the cybersickness experienced by users of VST headsets. Using this protocol, we conduct a user study comparing direct passthrough and geometry-aware passthrough systems. To the best of our knowledge, our study is the first to reveal reduced nausea, disorientation, and total cybersickness scores with geometry-aware passthrough. It also uncovers several potential avenues for further mitigating visually induced discomfort.
Binamix -- A Python Library for Generating Binaural Audio Datasets
Dan Barry
Davoud Shariat Panah
Alessandro Ragano
Andrew Hines
AES 158th Audio Engineering Society Convention (2025)
The increasing demand for spatial audio in applications such as virtual reality, immersive media, and spatial audio research necessitates robust solutions to generate binaural audio datasets for use in testing and validation. Binamix is an open-source Python library designed to facilitate programmatic binaural mixing using the extensive SADIE II Database, which provides Head Related Impulse Response (HRIR) and Binaural Room Impulse Response (BRIR) data for 20 subjects. The Binamix library provides a flexible and repeatable framework for creating large-scale spatial audio datasets, making it an invaluable resource for codec evaluation, audio quality metric development, and machine learning model training. A range of pre-built example scripts, utility functions, and visualization plots further streamline the process of custom pipeline creation. This paper presents an overview of the library’s capabilities, including binaural rendering, impulse response interpolation, and multi-track mixing for various speaker layouts. The tools utilize a modified Delaunay triangulation technique to achieve accurate HRIR/BRIR interpolation where desired angles are not present in the data. By supporting a wide range of parameters such as azimuth, elevation, subject Impulse Responses (IRs), speaker layouts, mixing controls, and more, the library enables researchers to create large binaural datasets for any downstream purpose. Binamix empowers researchers and developers to advance spatial audio applications with reproducible methodologies by offering an open-source solution for binaural rendering and dataset generation. We release the library under the Apache 2.0 License at https://github.com/QxLabIreland/Binamix/
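The Delaunay-based interpolation described in the abstract comes down to blending the impulse responses measured at a triangle's vertices, weighted by the barycentric coordinates of the query direction. A minimal pure-Python sketch of that blending step, using toy directions and made-up 4-tap impulse responses rather than Binamix's actual API or SADIE II data:

```python
def barycentric_weights(p, a, b, c):
    # Barycentric coordinates of direction p inside triangle (a, b, c);
    # each point is an (azimuth, elevation) pair in degrees.
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    denom = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / denom
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / denom
    return wa, wb, 1.0 - wa - wb

def interpolate_hrir(p, tri_points, tri_hrirs):
    # Blend the three vertex HRIRs, weighted by barycentric coordinates.
    wa, wb, wc = barycentric_weights(p, *tri_points)
    return [wa * ha + wb * hb + wc * hc for ha, hb, hc in zip(*tri_hrirs)]

# Toy measured directions and made-up impulse responses (not SADIE II data).
pts = [(0.0, 0.0), (30.0, 0.0), (15.0, 30.0)]
hrirs = [[1.0, 0.5, 0.2, 0.0],
         [0.0, 1.0, 0.4, 0.1],
         [0.2, 0.2, 1.0, 0.3]]

print(interpolate_hrir((0.0, 0.0), pts, hrirs))   # a vertex reproduces its own HRIR
print(interpolate_hrir((15.0, 10.0), pts, hrirs)) # an interior direction blends all three
```

In a full pipeline the triangle itself would come from a Delaunay triangulation of all measured directions, and the blend would run per-ear and per-sample.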
A Recipe for Improving Remote Sensing Zero Shot Generalization
Aviad Barzilai
Yotam Gigi
Vered Silverman
Yehonathan Refael
Bolous Jaber
Amr Helmy
3rd ML4RS Workshop at ICLR 2025
Foundation models have had a significant impact across various AI applications, enabling use cases that were previously impossible. Vision-language models (VLMs), in particular, have outperformed other techniques in many tasks. In remote sensing (RS), foundation models have shown improvements across various applications. However, unlike in other fields, the use of VLMs with large-scale remote sensing image-text datasets remains limited.
In this work, we first introduce two novel image-caption datasets for training remote sensing foundation models. The first dataset pairs aerial and satellite imagery, aligned with Google Maps data, with high-quality captions generated using Gemini. The second utilizes public web images and their corresponding alt-text, filtered to the remote sensing domain, resulting in a highly diverse dataset.
We show that using these datasets to pre-train MaMMUT, a VLM architecture, results in state-of-the-art generalization performance in zero-shot classification and cross-modal retrieval on well-known public benchmarks. Second, we leverage this newly pre-trained VLM to generate inference attention maps for a novel class query (i.e., a class unseen during training). We then propose an iterative self-supervised fine-tuning approach in which samples aligned with these attention maps are iteratively pseudo-labeled and used for model training.
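The iterative pseudo-labeling loop in the last paragraph can be sketched generically: score each unlabeled sample, promote those above a confidence threshold into the training set, and repeat. The `score_fn` below is a hypothetical stand-in for the attention-map alignment scoring, and the sample names and scores are toy data:

```python
def pseudo_label_rounds(pool, score_fn, threshold=0.9, max_rounds=3):
    # Each round: score the remaining pool, promote high-confidence samples
    # into the (pseudo-labeled) training set, and rescore what is left.
    train = []
    for _ in range(max_rounds):
        scored = [(score_fn(s, train), s) for s in pool]
        keep = [s for conf, s in scored if conf >= threshold]
        if not keep:
            break
        train += keep
        pool = [s for conf, s in scored if conf < threshold]
    return train, pool

samples = [("img_a", 0.95), ("img_b", 0.88), ("img_c", 0.70)]
# Hypothetical confidence: a base score plus a bonus as the train set grows,
# mimicking a model that improves after each fine-tuning round.
score = lambda s, train: s[1] + 0.05 * len(train)

train, remaining = pseudo_label_rounds(samples, score)
print([name for name, _ in train], [name for name, _ in remaining])
```

In the paper's setting the scores would come from the pre-trained VLM's attention maps for the novel class query, and each round would include actual fine-tuning.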
An Empirical Study of Time of Day Breakpoints in Traffic Light Plans
Eliav Buchnik
Tom Kalvari
Jack Haddad
Dan Karliner
Danny Veikherman
Shai Ferster
Ori Rottenstreich
2025
Fixed-time strategy is a common approach in traffic signal control in which signal plans are simple and periodic, enjoying easy implementation without detection mechanisms. A traffic light is associated with several daily plans, each applied over several consecutive hours. Time-of-day breakpoints (TODs) are the times of day at which the plan changes. TODs are often selected based on traffic, aiming to divide the day into groups of consecutive hours with similar traffic characteristics within each group. We present a methodology for studying time-of-day breakpoints in practice, and use it to estimate and analyze time-of-day breakpoints in the city of Rio de Janeiro, Brazil based on traffic properties derived from traffic trajectories. Our study examines over 900 of the city's intersections. We consider properties such as the number of daily plans and the times at which plans start. We also provide traffic-aware insights on the potential for improvement in the selection of TODs and identify key intersections where adjusting TODs could reduce average delay times. We identify potential improvements in over 8% of the examined intersections. These findings provide valuable insights for traffic engineers seeking to optimize signal timing.
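One simple way to formalize the TOD selection problem described above is to partition the 24 hourly traffic values into k contiguous segments that minimize within-segment squared error, solvable exactly by dynamic programming. This is an illustrative formalization on toy data, not necessarily the paper's exact method:

```python
def sse(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)

def best_breakpoints(hourly, k):
    # Split the hourly series into k contiguous segments minimizing total
    # within-segment squared error; returns the starting hour of each segment.
    n = len(hourly)
    cost = [[0.0] * (n + 1) for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n + 1):
            cost[i][j] = sse(hourly[i:j])   # cost of one segment covering hours i..j-1
    INF = float("inf")
    dp = [[INF] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    dp[0][0] = 0.0
    for g in range(1, k + 1):
        for j in range(1, n + 1):
            for i in range(g - 1, j):
                c = dp[g - 1][i] + cost[i][j]
                if c < dp[g][j]:
                    dp[g][j], back[g][j] = c, i
    starts, j = [], n
    for g in range(k, 0, -1):   # walk back through the optimal cuts
        j = back[g][j]
        starts.append(j)
    return sorted(starts)

# Toy day: low overnight traffic, a busy daytime plateau, low evening traffic.
hourly = [10] * 7 + [50] * 12 + [10] * 5
print(best_breakpoints(hourly, 3))   # [0, 7, 19]: plans start at 00:00, 07:00, 19:00
```

Real deployments would replace the single hourly scalar with richer trajectory-derived traffic features, but the segmentation structure is the same.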
SMaCk: Efficient Instruction Cache Attacks via Self-Modifying Code Conflicts
Seonghun Son
Berk Gulmezoglu
ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) (2025)
Self-modifying code (SMC) allows programs to alter their own instructions, optimizing performance and functionality on x86 processors. Despite its benefits, SMC introduces unique microarchitectural behaviors that can be exploited for malicious purposes. In this paper, we explore the security implications of SMC by examining how specific x86 instructions affecting instruction cache lines lead to measurable timing discrepancies between cache hits and misses. These discrepancies facilitate refined cache attacks, making them less noisy and more effective. We introduce novel attack techniques that leverage these timing variations to enhance existing methods such as Prime+Probe and Flush+Reload. Our advanced techniques allow adversaries to mount more precise attacks on cryptographic keys and create covert channels akin to Spectre across various x86 platforms. Finally, we propose a dynamic detection methodology utilizing hardware performance counters to mitigate these enhanced threats.
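The hit/miss timing discrepancy such cache attacks rely on is typically exploited by calibrating a latency threshold and classifying each timed probe against it. A toy sketch of that classification step, with made-up cycle counts standing in for real timed memory accesses:

```python
def calibrate_threshold(hit_latencies, miss_latencies):
    # Midpoint between the mean cache-hit and mean cache-miss latency (cycles).
    mean_hit = sum(hit_latencies) / len(hit_latencies)
    mean_miss = sum(miss_latencies) / len(miss_latencies)
    return (mean_hit + mean_miss) / 2

def classify(latency, threshold):
    # A probe faster than the threshold indicates the line was cached.
    return "hit" if latency < threshold else "miss"

hits = [32, 30, 34, 31]         # toy timed reloads of cached lines
misses = [210, 190, 205, 220]   # toy accesses that went to DRAM
th = calibrate_threshold(hits, misses)
print(th, classify(45, th), classify(180, th))
```

The paper's contribution is making these two latency distributions better separated (less noisy) via SMC-induced instruction-cache effects, which makes this simple thresholding more reliable.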
"It is important to consult" a linguist: Verb-Argument Constructions in ChatGPT and human experts' medical and financial advice
Chris Stewart
Alistair Windsor
J. Elliott Casal
PLOS One (2025)
This paper adopts a Usage-Based Construction Grammar perspective to compare human- and AI-generated language, focusing on Verb-Argument Constructions (VACs) as a lens for analysis. Specifically, we examine solicited advice texts in two domains—Finance and Medicine—produced by humans and by ChatGPT across different GPT models (3.5, 4, and 4o) and interfaces (3.5 Web vs. 3.5 API). Our findings reveal broad consistency in the frequency and distribution of the most common VACs across human- and AI-generated texts, though ChatGPT exhibits a slightly higher reliance on the most frequent constructions. A closer examination of the verbs occupying these constructions uncovers significant differences in the meanings conveyed, with newer models drifting away from human-like language production on macro-level measures (e.g., length) while moving toward human-like verb-VAC patterns. These results underscore the potential of VACs as a powerful tool for analyzing AI-generated language and tracking its evolution over time.
YETI (YET to Intervene) Proactive Interventions by Multimodal AI Agents in Augmented Reality Tasks
Saptarashmi Bandyopadhyay
Vikas Bahirwani
Lavisha Aggarwal
Bhanu Guda
Lin Li
Andrea Colaco
2025
Multimodal AI Agents are AI models that can interactively and cooperatively assist human users with day-to-day tasks. Augmented Reality (AR) head-worn devices can uniquely improve the user experience of solving procedural day-to-day tasks by providing AI Agents with egocentric multimodal (audio and video) observational capabilities. Such AR capabilities can help AI Agents see and hear the actions users take, paralleling the multimodal capabilities of human users. Existing AI Agents, whether Large Language Models (LLMs) or Multimodal Vision-Language Models (VLMs), are reactive in nature: they cannot take an action without first reading or listening to the human user's prompt. Proactivity in AI Agents, on the other hand, can help the human user detect and correct mistakes in agent-observed tasks, encourage users when they do tasks correctly, or simply engage in conversation with the user, akin to a human teaching or assisting a user. Our proposed YET to Intervene (YETI) multimodal Agent focuses on the research question of identifying circumstances in which the Agent should intervene proactively. This allows the Agent to understand when it can intervene in a conversation with a human user to help the user correct mistakes on tasks, like cooking, using Augmented Reality. Our YETI Agent learns scene-understanding signals based on interpretable notions of Structural Similarity (SSIM) between consecutive video frames. We also define an alignment signal by which the AI Agent can learn to identify whether the video frames corresponding to the user's actions are consistent with the expected actions of the task. These signals are used by our AI Agent to determine when it should proactively intervene. We compare our results on the instances of proactive intervention in the HoloAssist multimodal benchmark, in which an expert agent guides a user agent to complete procedural tasks.
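The frame-to-frame SSIM signal mentioned above can be illustrated with a single-window SSIM in pure Python. Production systems (and presumably YETI) compute SSIM over local windows, e.g. via scikit-image; one global window on toy flattened frames is enough to show how the score drops when the scene changes:

```python
def global_ssim(x, y, L=255.0):
    # Single-window SSIM between two flattened grayscale frames, using the
    # standard constants C1 = (0.01 L)^2 and C2 = (0.03 L)^2.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    c1, c2 = (0.01 * L) ** 2, (0.03 * L) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx * mx + my * my + c1) * (vx + vy + c2))

frame_t  = [10, 40, 90, 120, 200, 30]   # toy flattened frame at time t
frame_t1 = [10, 40, 90, 120, 200, 30]   # unchanged scene
frame_t2 = [200, 10, 30, 90, 120, 250]  # large scene change
print(global_ssim(frame_t, frame_t1))         # 1.0: no intervention cue
print(global_ssim(frame_t, frame_t2) < 1.0)   # True: drop signals a scene change
```

An agent could then treat a sustained SSIM drop across consecutive frames as one input feature when deciding whether to intervene.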
Probing non-equilibrium topological order on a quantum processor
Melissa Will
Tyler Cochran
Bernhard Jobst
Norhan Eassa
Michael Knap
Adam Gammon-Smith
Frank Pollmann
Nature, 645 (2025), 348–353
Out-of-equilibrium phases in many-body systems constitute a new paradigm in quantum matter—they exhibit dynamical properties that may otherwise be forbidden by equilibrium thermodynamics. Among these non-equilibrium phases are periodically driven (Floquet) systems, which are generically difficult to simulate classically because of their high entanglement. Here we realize a Floquet topologically ordered state on an array of superconducting qubits. We image the characteristic dynamics of its chiral edge modes and characterize its emergent anyonic excitations. Devising an interferometric algorithm allows us to introduce and measure a bulk topological invariant to probe the dynamical transmutation of anyons for system sizes up to 58 qubits. Our work demonstrates that quantum processors can provide key insights into the thus-far largely unexplored landscape of highly entangled non-equilibrium phases of matter.
Calibration Properties of Time-Series Foundation Models: An Empirical Analysis
Coen Adler
Samar Abdi
Yuxin Chang
Padhraic Smyth
2025
Recent development of foundation models for time series data has generated considerable interest in using such models across a variety of applications. Although they achieve state-of-the-art predictive performance, their ability to produce well-calibrated probabilistic distributions is critical for practical applications and remains relatively underexplored. In this paper, we investigate the calibration-related properties of five recent time series foundation models and two competitive baselines. We perform systematic evaluations and identify significant variation in calibration performance across models.
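A standard way to probe the calibration of probabilistic forecasts is to compare the nominal coverage of a prediction interval with its empirical coverage on held-out data. A minimal sketch with toy intervals and outcomes; the paper's evaluation protocol is likely richer than this single-level check:

```python
def empirical_coverage(intervals, actuals):
    # Fraction of actual values that land inside their predicted intervals.
    hits = sum(lo <= a <= hi for (lo, hi), a in zip(intervals, actuals))
    return hits / len(actuals)

# Toy 80%-nominal prediction intervals from some forecaster, with outcomes.
intervals = [(0, 10), (5, 15), (20, 30), (0, 8), (12, 18)]
actuals   = [4, 16, 25, 7, 13]

cov = empirical_coverage(intervals, actuals)
gap = abs(cov - 0.80)   # calibration gap at the 80% nominal level
print(cov, gap)
```

Repeating this at several nominal levels (10%, 20%, ..., 90%) and averaging the gaps gives a simple scalar calibration error for comparing models.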
Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting
Zilong Wang
Steven Zheng
Swaroop Mishra
Yuwei Zhang
Anush Mattapalli
Ankur Taly
Jingbo Shang
ICLR 2025
Retrieval augmented generation (RAG) has attracted a lot of attention across both academia and industry due to its ability to insert timely and accurate evidence into the generation of large language models. However, the retrieved evidence significantly lengthens the input prompt, which can harm the understanding quality of large language models and slow them down in actual usage scenarios. To address these issues, we propose Speculative RAG, which leverages a smaller LLM to conduct retrieval augmented generation on behalf of a larger LLM. The smaller LLM can digest a few pieces of evidence and rapidly generate multiple drafts in parallel; these drafts are then verified by the larger LLM to guarantee quality. We achieve higher speed as well as better quality in the RAG results.
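The draft-then-verify control flow can be sketched with stub models: partition the evidence into subsets, have the small model write one draft per subset, and have the large model only score the drafts. The `drafter` and `verifier` below are hypothetical stand-ins; a real system would call a small and a large LLM:

```python
def speculative_rag(question, evidence, drafter, verifier, n_subsets=3):
    # Draft-then-verify: the small model writes one draft per evidence
    # subset (parallelizable); the large model only scores drafts.
    subsets = [evidence[i::n_subsets] for i in range(n_subsets)]
    drafts = [drafter(question, subset) for subset in subsets]   # cheap small-LLM calls
    scored = [(verifier(question, d), d) for d in drafts]        # one scoring pass each
    return max(scored, key=lambda pair: pair[0])[1]

# Hypothetical stand-ins for the two models.
drafter = lambda q, ev: f"{q}: " + "; ".join(ev)
verifier = lambda q, d: len(d)   # toy score: prefer the most detailed draft

evidence = ["doc1", "doc2", "doc3", "doc4", "doc5"]
print(speculative_rag("why is the sky blue", evidence, drafter, verifier))
```

The speedup comes from the large model never reading the full evidence set: it sees only short drafts, so its prompt stays small while drafting runs in parallel on the cheap model.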
ESAM++: Efficient Online 3D Perception on the Edge
Qin Liu
Lavisha Aggarwal
Vikas Bahirwani
Lin Li
Aleksander Holynski
Saptarashmi Bandyopadhyay
Zhengyang Shen
Marc Niethammer
Ehsan Adeli
Andrea Colaco
2025
Online 3D scene perception in real time is critical for robotics, AR/VR, and autonomous systems, particularly in edge computing scenarios where computational resources are limited. Recent state-of-the-art methods like EmbodiedSAM (ESAM) demonstrate the promise of online 3D perception by leveraging a 2D visual foundation model (VFM) with efficient 3D query lifting and merging. However, ESAM depends on a computationally expensive sparse 3D U-Net for point cloud feature extraction, which we identify as the primary efficiency bottleneck. In this paper, we propose a lightweight and scalable alternative for online 3D scene perception tailored to edge devices. Our method introduces a 3D Sparse Feature Pyramid Network (SFPN) that efficiently captures multi-scale geometric features from streaming 3D point clouds while significantly reducing computational overhead and model size. We evaluate our approach on four challenging segmentation benchmarks—ScanNet, ScanNet200, SceneNN, and 3RScan—demonstrating that our model achieves competitive accuracy with up to 3× faster inference and a 3× smaller model size compared to ESAM, enabling practical deployment in real-world edge scenarios. Code and models will be released.
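The feature-pyramid pattern behind an SFPN (pool features to coarser scales bottom-up, then fuse coarse context back into finer scales top-down) can be shown on a toy 1-D feature map. This is a sketch of the generic FPN idea only, not the paper's sparse 3-D implementation:

```python
def avg_pool2(xs):
    # Halve the resolution by averaging adjacent pairs.
    return [(xs[i] + xs[i + 1]) / 2 for i in range(0, len(xs) - 1, 2)]

def upsample2(xs):
    # Double the resolution by repeating each value.
    return [x for v in xs for x in (v, v)]

def feature_pyramid(signal, levels=3):
    # Bottom-up: repeatedly pool to coarser scales.
    bottom_up = [signal]
    for _ in range(levels - 1):
        bottom_up.append(avg_pool2(bottom_up[-1]))
    # Top-down: upsample the coarser map and fuse (add) into the finer one.
    fused = bottom_up[-1]
    out = [fused]
    for finer in reversed(bottom_up[:-1]):
        fused = [f + u for f, u in zip(finer, upsample2(fused))]
        out.append(fused)
    return out  # coarse-to-fine, each level fused with coarser context

pyr = feature_pyramid([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
print([len(level) for level in pyr])  # [2, 4, 8]
```

A sparse 3-D version replaces the dense pooling/upsampling with sparse-voxel operators and learned convolutions, but the multi-scale extract-then-fuse structure is the same.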
Reasoning-SQL: Reinforcement Learning with Partial Rewards for Reasoning-Enhanced Text-to-SQL
Mohammadreza Pourreza
Shayan Talaei
Hailong Li
Azalia Mirhoseini
Amin Saberi
Conference on Language Modeling (COLM) (2025) (to appear)
Text-to-SQL is a challenging task involving multiple reasoning-intensive subtasks, including natural language understanding, database schema comprehension, and precise SQL query formulation. Existing approaches often rely on handcrafted reasoning paths with inductive biases that can limit their overall effectiveness. Motivated by the recent success of reasoning-enhanced models such as DeepSeek R1 and OpenAI o1, which effectively leverage reward-driven self-exploration to enhance reasoning capabilities and generalization, we propose a novel set of partial rewards tailored specifically for the Text-to-SQL task. Our reward set includes schema-linking, AI feedback, n-gram similarity, and syntax check, explicitly designed to address the reward sparsity issue prevalent in reinforcement learning (RL). Leveraging group relative policy optimization (GRPO), our approach explicitly encourages large language models (LLMs) to develop the intrinsic reasoning skills necessary for accurate SQL query generation. With models of different sizes, we demonstrate that RL-only training with our proposed rewards consistently achieves higher accuracy and superior generalization compared to supervised fine-tuning (SFT). Remarkably, our RL-trained 14B-parameter model significantly outperforms larger proprietary models, e.g., o3-mini by 4% and Gemini-1.5-Pro-002 by 3%, on the BIRD benchmark. These results highlight the efficacy of our proposed RL training framework with partial rewards for enhancing both accuracy and reasoning capabilities in Text-to-SQL tasks.
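Two of the four partial rewards named above can be sketched concretely: a syntax-check reward (here approximated by asking SQLite's parser to plan the query against a scratch schema) and an n-gram similarity reward (here bigram Jaccard overlap with the gold query). These are simplified stand-ins, not the paper's exact reward definitions, and the schema is a made-up example:

```python
import sqlite3

def syntax_reward(sql, schema="CREATE TABLE users (id INT, name TEXT);"):
    # 1.0 if the query parses and plans against a scratch schema, else 0.0.
    con = sqlite3.connect(":memory:")
    con.executescript(schema)
    try:
        con.execute("EXPLAIN " + sql)
        return 1.0
    except sqlite3.Error:
        return 0.0
    finally:
        con.close()

def ngram_reward(sql, gold, n=2):
    # Jaccard overlap of token bigrams between the candidate and gold SQL.
    def grams(s):
        toks = s.lower().split()
        return set(zip(toks, toks[1:]))
    a, b = grams(sql), grams(gold)
    return len(a & b) / len(a | b) if a | b else 1.0

gold = "SELECT name FROM users WHERE id = 3"
print(syntax_reward("SELECT name FROM users WHERE id = 3"))  # 1.0
print(syntax_reward("SELEC name FRM users"))                 # 0.0
print(ngram_reward("SELECT name FROM users", gold))          # partial credit
```

Summing several such dense partial rewards gives the GRPO trainer a gradient signal even when the final execution-accuracy reward is zero, which is exactly the sparsity problem the paper targets.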
Circadian rhythm of heart rate and activity: a cross-sectional study
Maryam Khalid
Logan Schneider
Aravind Natarajan
Conor Heneghan
Karla Gleichauf
Chronobiology International (2025)
Background: Circadian rhythms are commonly observed in a number of physiological processes. Consumer wearable devices have made it possible to obtain continuous time series data from a large number of individuals. We study circadian rhythms from measurements of heart rate, movement, and sleep, from a cohort of nearly 20,000 participants over the course of 30 days.
Methods: Participation was restricted to Fitbit users of age 21 years or older residing in the United States or Canada. Participants were enrolled through a recruitment banner shown on the Fitbit App. The advertisement was shown to 531,359 Fitbit users, and 23,239 enrolled in the program. Of these, we obtained heart rate data from 19,350 participants. We obtain the underlying circadian rhythm from the heart rate time series by modeling the circadian rhythm as a sum over the first two Fourier harmonics. The first Fourier harmonic accounts for the 24-hour rhythmicity, while the second harmonic accounts for non-sinusoidal perturbations.
Findings: We observe a circadian rhythm in both heart rate and acceleration. From the diurnal modulation, we obtain the following circadian parameters: (i) amplitude of modulation, (ii) bathyphase, (iii) acrophase, (iv) non-sinusoidal fraction, and (v) fraction of the day when the heart rate is greater than the mean. The amplitude, bathyphase, and acrophase depend on sex and decrease with age. The waketime, on average, follows the bathyphase by 2.4 hours. In most individuals, the circadian rhythm of heart rate lags the circadian rhythm of activity.
Interpretation: Circadian metrics for heart rate and activity can be reliably obtained from commercially available wearable devices. Distributions of circadian metrics can be valuable tools for individual-level interpretation.
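The two-harmonic Fourier model in the Methods section can be illustrated on synthetic data: for an evenly sampled series, the harmonic coefficients are direct projections, and the amplitude and acrophase fall out of the first harmonic. This is an illustrative sketch on fabricated heart rate values, not the paper's fitting pipeline:

```python
from math import cos, sin, pi, atan2, sqrt

def fourier_harmonics(samples, period=24.0, k_max=2):
    # Project a series, evenly sampled over one period starting at t = 0,
    # onto its first k_max Fourier harmonics; returns (mean, [(a_k, b_k), ...]).
    n = len(samples)
    mean = sum(samples) / n
    coeffs = []
    for k in range(1, k_max + 1):
        w = 2 * pi * k / period
        a = 2 / n * sum((x - mean) * cos(w * period * i / n) for i, x in enumerate(samples))
        b = 2 / n * sum((x - mean) * sin(w * period * i / n) for i, x in enumerate(samples))
        coeffs.append((a, b))
    return mean, coeffs

# Synthetic heart rate: mean 60 bpm with a 5 bpm 24-hour rhythm peaking at 16:00.
hr = [60 + 5 * cos(2 * pi / 24 * (24 * i / 48 - 16)) for i in range(48)]

mean, ((a1, b1), _) = fourier_harmonics(hr)
amplitude = sqrt(a1 ** 2 + b1 ** 2)                 # ~5 bpm modulation
acrophase = (atan2(b1, a1) * 24 / (2 * pi)) % 24    # ~16 h: time of the daily peak
print(round(mean, 3), round(amplitude, 3), round(acrophase, 3))
```

The bathyphase (time of the daily minimum) follows the same recipe with the phase shifted by half a period; the second harmonic captures how non-sinusoidal the rhythm is.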
Towards Conversational AI for Disease Management
Khaled Saab
David Stutz
Kavita Kulkarni
Sara Mahdavi
Joelle Barral
James Manyika
Ryutaro Tanno
Adam Rodman
arXiv (2025)
While large language models (LLMs) have shown promise in diagnostic dialogue, their capabilities for effective management reasoning - including disease progression, therapeutic response, and safe medication prescription - remain under-explored. We advance the previously demonstrated diagnostic capabilities of the Articulate Medical Intelligence Explorer (AMIE) through a new LLM-based agentic system optimised for clinical management and dialogue, incorporating reasoning over the evolution of disease and multiple patient visit encounters, response to therapy, and professional competence in medication prescription. To ground its reasoning in authoritative clinical knowledge, AMIE leverages Gemini's long-context capabilities, combining in-context retrieval with structured reasoning to align its output with relevant and up-to-date clinical practice guidelines and drug formularies. In a randomized, blinded virtual Objective Structured Clinical Examination (OSCE) study, AMIE was compared to 21 primary care physicians (PCPs) across 100 multi-visit case scenarios designed to reflect UK NICE Guidance and BMJ Best Practice guidelines. AMIE was non-inferior to PCPs in management reasoning as assessed by specialist physicians and scored better in both preciseness of treatments and investigations, and in its alignment with and grounding of management plans in clinical guidelines. To benchmark medication reasoning, we developed RxQA, a multiple-choice question benchmark derived from two national drug formularies (US, UK) and validated by board-certified pharmacists. While AMIE and PCPs both benefited from the ability to access external drug information, AMIE outperformed PCPs on higher difficulty questions. While further research would be needed before real-world translation, AMIE's strong performance across evaluations marks a significant step towards conversational AI as a tool in disease management.