Samuel Clarke
I am a PhD student in Stanford University's Computer Science Department, advised by Professor Jiajun Wu. Previously, I was a Graduate Research Assistant at Carnegie Mellon's Robotics Institute, co-advised by Professors Chris Atkeson and Oliver Kroemer. My recent research focuses on learning for manipulation, especially how robots can use sound during manipulation.
Research
Hearing Anything Anywhere
We collect a dataset of real room impulse responses (RIRs) from four rooms, then introduce a method for novel-view acoustic synthesis based on differentiable audio rendering. Our method uses physics-based biases to achieve practical sample efficiency, requiring RIRs measured at only roughly 12 distinct locations per room.
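As a rough illustration of the idea behind differentiable audio rendering (a toy sketch, not the model from the paper): parameterize an impulse response with a few physically meaningful quantities, render it, and fit those parameters to measurements by gradient descent. The parameterization below, exponentially decaying noise with a learnable gain and decay rate, is a made-up stand-in.

```python
import torch

# Toy RIR model: exponentially decaying noise with a learnable gain and
# decay rate (far simpler than the paper's actual parameterization).
class ToyRIR(torch.nn.Module):
    def __init__(self, n_samples=4000, sr=16000):
        super().__init__()
        self.t = torch.arange(n_samples) / sr       # time axis in seconds
        self.noise = torch.randn(n_samples)         # fixed noise carrier
        self.log_decay = torch.nn.Parameter(torch.tensor(1.0))
        self.log_gain = torch.nn.Parameter(torch.tensor(0.0))

    def forward(self):
        decay = torch.exp(self.log_decay)           # keep decay positive
        gain = torch.exp(self.log_gain)
        return gain * self.noise * torch.exp(-decay * self.t)

model = ToyRIR()
target = torch.randn(4000) * torch.exp(-8.0 * model.t)  # stand-in "measured" RIR
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for step in range(500):
    opt.zero_grad()
    loss = torch.mean((model() - target) ** 2)      # match rendered RIR to measurement
    loss.backward()                                  # gradients flow through the renderer
    opt.step()
```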
SoundCam: A Dataset for Finding Humans Using Room Acoustics
We record thousands of room impulse responses and music clips in different real rooms with humans standing in different positions. Learning-based models can use the minute differences these humans cause in the room's acoustics to track, identify, or detect them. Our data can be used to develop more robust and sample-efficient methods, with applications in home assistants, security, and robotics.
RoboCook: Long-Horizon Elasto-Plastic Object Manipulation with Diverse Tools
We introduce a framework for accomplishing long-horizon tasks in soft-body manipulation and show that it can learn to make dumplings with a variety of tools, with very little training data per tool. We also show that our framework can learn to use tools for other soft-body manipulation tasks, such as shaping dough into target shapes, while autonomously selecting a tool for each step of the task.
RealImpact: A Dataset of Impact Sound Fields for Real Objects
We collect 150,000 annotated recordings of impacts of 50 everyday objects, recorded from 600 distinct microphone locations. We show how our data can be used to tune and validate acoustic simulations, or applied directly in downstream audio and audiovisual tasks.
DiffImpact: Differentiable Rendering and Identification of Impact Sounds
Differentiable physics-based models provide a useful bias for learning from impact sounds, enabling both forward and inverse problems on impact audio. We show that we can infer models from data in the wild, then use these models to perform source separation better than generic learning-based alternatives.
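For intuition on the physics-based bias: impact sounds are commonly modeled as a sum of exponentially damped sinusoidal modes, and a differentiable renderer lets gradients tune those modal parameters from recordings. Below is a minimal NumPy sketch of modal synthesis with made-up mode values; the model in the paper is considerably richer.

```python
import numpy as np

def modal_impact(freqs_hz, dampings, amps, sr=44100, dur=1.0):
    """Render an impact as a sum of damped sinusoids:
    sum_i a_i * exp(-d_i * t) * sin(2*pi*f_i * t)."""
    t = np.arange(int(sr * dur)) / sr
    sound = np.zeros_like(t)
    for f, d, a in zip(freqs_hz, dampings, amps):
        sound += a * np.exp(-d * t) * np.sin(2 * np.pi * f * t)
    return sound

# Illustrative modes for a small struck object (made-up values).
clip = modal_impact(freqs_hz=[512, 1340, 2710],
                    dampings=[18.0, 25.0, 40.0],
                    amps=[1.0, 0.6, 0.3])
```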
Robot Learning for Manipulation of Granular Materials Using Vision and Sound
Deep learning-based, data-driven models can both predict the effects of a scooping operation on a granular material using vision and learn to use audio as feedback when scooping and pouring granular materials.
Learning Audio Feedback for Estimating Amount and Flow of Granular Material
Deep learning-based data-driven models can accurately predict the amount of granular materials a robot pours or shakes, based only on audio recordings. With machine learning, we can use recordings from a $3 microphone to outperform the measurement resolution of a $3,000 wrist force-torque sensor.
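A minimal sketch of what such an audio-to-amount regressor could look like, assuming a small CNN over log-mel spectrograms; the architecture, shapes, and random inputs below are illustrative stand-ins, not the trained model from the paper.

```python
import torch

# Hypothetical regressor: predict poured mass from a spectrogram.
class MassFromAudio(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Conv2d(1, 16, 3, stride=2), torch.nn.ReLU(),
            torch.nn.Conv2d(16, 32, 3, stride=2), torch.nn.ReLU(),
            torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten(),
            torch.nn.Linear(32, 1),                 # predicted amount (e.g., grams)
        )

    def forward(self, spec):                        # spec: (batch, 1, mels, frames)
        return self.net(spec)

model = MassFromAudio()
spec = torch.randn(8, 1, 64, 128)                   # stand-in batch of spectrograms
mass_pred = model(spec)                             # (8, 1) predicted amounts
```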
NaturalMotion: Exploring Gesture Controls for Visualizing Time-Evolving Graphs
We took the Matrix Cube, a tool for visualizing time-evolving graphs, and developed a new 3D interface controlled by hand gestures. Gestures were captured with a Leap Motion device.
LatentGesture: Active User Authentication through Background Touch Analysis
With data collected from a user's brief interactions with common UI elements, such as checkboxes and sliders, machine learning models can uniquely identify the user. Such a system could authenticate a mobile device's user continuously and seamlessly, without many of the vulnerabilities common to traditional authentication methods.
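As a sketch of this style of behavioral authentication, assuming simple per-interaction features (e.g., touch pressure, duration, swipe velocity) and synthetic stand-in data, a standard classifier can learn to separate users:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for touch-interaction features; each user gets a
# distinct offset to mimic an individual interaction "style".
rng = np.random.default_rng(0)
n_users, n_samples, n_features = 5, 200, 8
X = rng.normal(size=(n_users * n_samples, n_features))
y = np.repeat(np.arange(n_users), n_samples)
X += 0.5 * y[:, None]

clf = RandomForestClassifier(n_estimators=100).fit(X[::2], y[::2])
print("held-out accuracy:", clf.score(X[1::2], y[1::2]))
```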
Projects
Deep Scoop
This project was the precursor to my paper on predicting the effects of scooping. I attempted to learn a scooping policy with deep reinforcement learning, posing the task of scooping a goal mass as a contextual bandit problem. I adapted popular techniques such as actor-critic methods and the cross-entropy method, as sketched below. What I learned from these results was very helpful in refining my approach for the later publication.
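For reference, a minimal sketch of the cross-entropy method on a one-step, bandit-style problem, with a toy reward standing in for a real scooping rollout (all quantities here are illustrative):

```python
import numpy as np

def reward(action, goal_mass=50.0):
    """Toy stand-in for a scooping rollout: deeper scoop -> more mass."""
    scooped = 80.0 * action
    return -abs(scooped - goal_mass)

mu, sigma = 0.5, 0.3                                 # action = scoop depth in [0, 1]
for it in range(20):
    actions = np.clip(np.random.normal(mu, sigma, size=64), 0.0, 1.0)
    rewards = np.array([reward(a) for a in actions])
    elites = actions[np.argsort(rewards)[-8:]]       # keep the top 8 actions
    mu, sigma = elites.mean(), elites.std() + 1e-3   # refit the sampling distribution
print(f"best scoop depth ~ {mu:.3f}")
```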
CPR+ Mask
We designed a mask to walk an untrained rescuer through performing standard-of-care CPR on a cardiac arrest victim in real time. The mask is equipped with sensors for monitoring the state of the victim and the quality of the CPR, and it uses a speaker and LEDs to give instructions and cues to the rescuer.
Experience
Research
- Synthesized datasets and developed learning-based methods for tracking object motion using sound.
Machine Learning Intern
- Developed deep learning-based methods to extract feedback and detect anomalies from audio, with applications in industrial automation.
Graduate Research Assistant
- Developed a deep learning-based audio feedback framework for estimating the mass of granular material a robot pours or scoops (see Research)
- Developed a deep learning-based, data-driven framework for predicting the effects of robotic scooping of granular materials (see Research)
- Designed and manufactured numerous parts for lab projects using SolidWorks
- Designed and trained numerous machine learning models using TensorFlow
Mechanical Engineering Intern
- Led multi-team design and implementation of an automated vision-based scanning device to ensure the secure exit of hard drives from data centers
- Developed APIs for automation machines to query desired data from sources on production networks with Go, Python, and C++ communicating through JSON
Software Engineering Intern
- Built Arduino-controlled motorized linear slide to automate Android device tests
- Developed Go adapter for communicating with Arduino devices over serial
- Developed tools for compiling Arduino code within company codebase
- Wrote automated communications tests for Android devices, using Go and Android Java
Propulsion Data Science Intern
- Collaborated on the development of an automated system to detect anomalies in telemetry sensor data from rocket engine tests, targeting a 12x reduction in human review time
- Ported legacy academic code modeling the Merlin 1D rocket engine from Fortran to Python to optimize the performance of a new iteration of the engine
Undergraduate Researcher
- Developed gesture-based 3D data visualization in an Oculus Rift virtual reality environment, with demo developed in the Unity game engine (see Research)
- Developed method for behavioral authentication on Android featured in popular press (Engadget, Gizmodo, Yahoo, many more) (see Research)
Engineering Practicum Intern
- Developed backend RPC infrastructure for serving feature search requests to the search bar in Google Maps Engine viewer
- Mechanical 20% project with Glass: studied the repeatability of 4-axis automated manufacturing robots by designing tests and analyzing results in MATLAB
Freshman Engineering Practicum Intern
- Developed automated user interface tests for Google Play on Android
- Developed App Engine project in Python to organize and monitor test results
- 20% project with Android hardware team: Designed and prototyped testing tool and façade case for unannounced products in Creo Parametric, performed mechanical tests
Contractor
- Prototyped HBase infrastructure and an application to efficiently move billions of records from SQL Server to Hadoop and HBase, as a feasibility study
- Worked with C#, Pig, Java, and Thrift
Intern
- Performed data mining and analysis using Hadoop, Pig, Tableau, and Excel
- Developed an in-house click-fraud detection algorithm that rivaled proprietary options
Education
Stanford University
Carnegie Mellon University
GPA: 4.17/4.33
Georgia Institute of Technology
GPA: 4.0/4.0