This project generates ARKit blendshape-driven facial expressions from audio input in ⚡real-time⚡, powering ultra-realistic 3D avatars created with LAM.
To enable ARKit-driven animation of the LAM model, we manually adapted the ARKit blendshapes to align with FLAME's facial topology. The LAM-A2E network follows an encoder-decoder architecture, as shown below. We adopt the state-of-the-art pre-trained speech model Wav2Vec as the audio encoder. Features extracted from the raw audio waveform are combined with style features and fed into the decoder, which outputs stylized blendshape coefficients.
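As an illustration of this data flow, here is a minimal PyTorch sketch: a pre-trained Wav2Vec encoder extracts per-frame audio features, which are concatenated with a style embedding and decoded into blendshape coefficients. The module names, dimensions, and style-conditioning scheme are assumptions for readability, not the released LAM-A2E implementation.

```python
# Minimal sketch of the encoder-decoder idea described above (illustrative only).
import torch
import torch.nn as nn
from transformers import Wav2Vec2Model

class AudioToExpressionSketch(nn.Module):
    def __init__(self, num_blendshapes=52, style_dim=64, hidden_dim=256):
        super().__init__()
        # Pre-trained speech encoder (frozen here for simplicity).
        self.audio_encoder = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base-960h")
        self.audio_encoder.requires_grad_(False)
        audio_dim = self.audio_encoder.config.hidden_size  # 768 for the base model
        # Decoder: fuses audio features with a style embedding and regresses
        # one blendshape coefficient vector per audio frame.
        self.decoder = nn.Sequential(
            nn.Linear(audio_dim + style_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_blendshapes),
        )

    def forward(self, waveform, style):
        # waveform: (B, num_samples) raw 16 kHz audio; style: (B, style_dim)
        feats = self.audio_encoder(waveform).last_hidden_state      # (B, T, audio_dim)
        style = style.unsqueeze(1).expand(-1, feats.shape[1], -1)   # broadcast over time
        return self.decoder(torch.cat([feats, style], dim=-1))      # (B, T, num_blendshapes)

# Example: a 2-second clip at 16 kHz with a random style vector.
model = AudioToExpressionSketch()
coeffs = model(torch.randn(1, 32000), torch.randn(1, 64))  # stylized blendshape coefficients
```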
LAM-A2E_Demo.mp4
[May 21, 2025] We have released the Avatar Export Feature, enabling users to generate facial expressions from audio for any LAM-generated 3D digital human.
[April 21, 2025] We have released the ModelScope Space!
[April 21, 2025] We have released the WebGL Interactive Chatting Avatar SDK on OpenAvatarChat (including LLM, ASR, TTS, Avatar), with which you can freely chat with our generated 3D Digital Humans! 🔥
- Release Hugging Face space.
- Release ModelScope space.
- Release the LAM-A2E model based on FLAME expressions.
- Release the Interactive Chatting Avatar SDK with OpenAvatarChat, including LLM, ASR, TTS, and LAM Avatars.
git clone git@github.com:aigc3d/LAM_Audio2Expression.git
cd LAM_Audio2Expression
# Create conda environment (currently only supports Python 3.10)
conda create -n lam_a2e python=3.10
# Activate the conda environment
conda activate lam_a2e
# Install with CUDA 12.1
sh ./scripts/install/install_cu121.sh
# Or install with CUDA 11.8
sh ./scripts/install/install_cu118.sh
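After the install script finishes, a quick sanity check (not part of the repository) confirms that PyTorch was installed with the CUDA build you selected:

```python
# Prints the installed torch version, the CUDA version it was built with,
# and whether a GPU is currently visible to this environment.
import torch
print(torch.__version__, torch.version.cuda, torch.cuda.is_available())
```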
# HuggingFace download
# Download Assets and Model Weights
huggingface-cli download 3DAIGC/LAM_audio2exp --local-dir ./
tar -xzvf LAM_audio2exp_assets.tar && rm -f LAM_audio2exp_assets.tar
tar -xzvf LAM_audio2exp_streaming.tar && rm -f LAM_audio2exp_streaming.tar
# Or OSS download (in case the Hugging Face download fails)
# Download Assets
wget https://virutalbuy-public.oss-cn-hangzhou.aliyuncs.com/share/aigc3d/data/LAM/LAM_audio2exp_assets.tar
tar -xzvf LAM_audio2exp_assets.tar && rm -f LAM_audio2exp_assets.tar
# Download Model Weights
wget https://virutalbuy-public.oss-cn-hangzhou.aliyuncs.com/share/aigc3d/data/LAM/LAM_audio2exp_streaming.tar
tar -xzvf LAM_audio2exp_streaming.tar && rm -f LAM_audio2exp_streaming.tar
Or download from ModelScope:
git clone https://www.modelscope.cn/Damo_XR_Lab/LAM_audio2exp.git ./modelscope_download
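Alternatively, the Hugging Face repository can be fetched programmatically; the sketch below uses the huggingface_hub package (the repo id matches the CLI command above, and the downloaded archives still need to be extracted as shown):

```python
# Programmatic alternative to `huggingface-cli download`.
from huggingface_hub import snapshot_download

snapshot_download(repo_id="3DAIGC/LAM_audio2exp", local_dir="./")
```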
Using the Gradio Interface:
We provide a simple Gradio demo with a WebGL renderer; upload an audio clip and get rendered results within seconds.
Gradio_Demo.mp4
python app_lam_audio2exp.py
# example: python inference.py --config-file configs/lam_audio2exp_config_streaming.py --options save_path=exp/audio2exp weight=pretrained_models/lam_audio2exp_streaming.tar audio_input=./assets/sample_audio/BarackObama_english.wav
python inference.py --config-file ${CONFIG_PATH} --options save_path=${SAVE_PATH} weight=${CHECKPOINT_PATH} audio_input=${AUDIO_INPUT}
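To run several clips in one go, the command template above can be wrapped in a small driver script. The sketch below is one possible way to do it; the folder layout and output naming are illustrative, and only inference.py and its options come from the command above.

```python
# Batch inference sketch: runs inference.py once per .wav file in a folder.
import subprocess
from pathlib import Path

AUDIO_DIR = Path("./assets/sample_audio")             # illustrative input folder
CONFIG = "configs/lam_audio2exp_config_streaming.py"
WEIGHT = "pretrained_models/lam_audio2exp_streaming.tar"

for wav in sorted(AUDIO_DIR.glob("*.wav")):
    subprocess.run(
        [
            "python", "inference.py",
            "--config-file", CONFIG,
            "--options",
            f"save_path=exp/audio2exp/{wav.stem}",     # one output folder per clip
            f"weight={WEIGHT}",
            f"audio_input={wav}",
        ],
        check=True,
    )
```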
This work is built on many amazing research works and open-source projects:
Thanks for their excellent work and great contributions.
We also welcome you to follow our other interesting works:
@inproceedings{he2025LAM,
  title={LAM: Large Avatar Model for One-shot Animatable Gaussian Head},
  author={Yisheng He and Xiaodong Gu and Xiaodan Ye and Chao Xu and Zhengyi Zhao and Yuan Dong and Weihao Yuan and Zilong Dong and Liefeng Bo},
  booktitle={arXiv preprint arXiv:2502.17796},
  year={2025}
}