What's New
Applications for the 2026 Data Science Summer Institute are now open! Apply below by January 30, 2026.
Undergraduate Posting
Graduate Posting
Monthly Newsletter
Don't miss any DSI news. Subscribe to be on the pulse of innovation in data science.
Upcoming Seminar
Our seminar series is on a break. Contact DSI-Seminars [at] llnl.gov with questions.
Data Scientist Spotlight
Harleen Kaur
Data Engineer Lead
What is your role here at the Lab? I am currently a Data Engineer Lead on the Data Lifecycle Management team in the Computing Applications, Simulations, and Quality Division.
When did you first start at Livermore? I joined the Lab on October 31, 2022. This Halloween marked my third year at LLNL, and yes, I always try to dress up for the occasion.
What did you study in your path to this career? I studied at the University of California, Santa Barbara, earning a B.A. in economics and a B.S. in statistics & data science. This combination helped me balance domain expertise with the technical skills needed to build and manage complex data solutions.
What project(s) are you currently working on? I lead a team of software and data engineers, developing data workflows, architecture, and analytic solutions for a wide range of stakeholders. My focus is on the full data lifecycle: designing robust workflows, ensuring experimental data is archived and accessible, and building dashboards that turn data into actionable insights. I am also working on data readiness to enable the use of AI in scientific applications.
Recent Research
A language model that thinks before speaking
AI technologies based on large language models (LLMs) have empowered users to automate workflows, improve their written communications, summarize long documents, and much more, in both their personal and professional lives. When it comes to scientific endeavors, however, LLMs still have a long way to go before they can be trusted to provide key insights into experimental design, understand the significance of project results, or begin to appreciate the real-world scientific phenomena involved.
Lawrence Livermore computer scientist Bhavya Kailkhura explains, “Researchers are excited to leverage AI tools like LLMs for solving scientific challenges. However, existing approaches rely heavily on verbally articulated intermediate steps and often fall short in capturing complex, non-verbalized scientific patterns commonly encountered in the scientific applications encountered at Livermore and the DOE broadly. Take protein–protein interactions (PPI) as an example. Understanding PPIs, which are critical to cellular function, demands modeling multi-scale, context-dependent dynamics that words alone cannot sufficiently convey. Relying solely on verbal reasoning can introduce risks such as hallucination errors and omission of critical information, leading to incomplete or inaccurate scientific understanding.” Read more via DSI News.
Opportunities
Open Data Initiative
Careers & Internships