Welcome to Deli Zhao's Homepage

Deli Zhao | 赵德丽

Alibaba DAMO Academy

E-mail: zhaodeli AT gmail.com

Publication: Google Scholar | DBLP

About Me

Dr. Jingren Zhou

Alibaba Tongyi

Composer

VideoComposer

I2VGen-XL

Prof. Xiaoou Tang

Prof. Michael Jordan

Dr. Yujun Shen

News

[11/2023]: We released our new video generation model I2VGen-XL!
[09/2023]: 5 papers accepted by NeurIPS. My team's acceptance rate was 55.6% (official rate: 26.1%).
[09/2023]: Keynote on "Towards On-Chain AI" at Shanghai Blockchain Week. The slides are available here.
[07/2023]: 12 papers accepted by ICCV. My team's acceptance rate was 70.6% (official rate: 26.8%).
[06/2023]: We released VideoComposer, the Alibaba big model on video generation and editing.
[06/2023]: Keynote and round-table discussion on AIGC and design at U Design Week.
[04/2023]: 3 papers accepted by ICML. My team's acceptance rate was 75% (official rate: 27.9%).
[04/2023]: Talk on Composer at The University of Hong Kong.
[04/2023]: Talk on Composer at The Hong Kong University of Science and Technology.
[04/2023]: Keynote on AIGC at Hong Kong Web3 Festival.
[02/2023]: 10 papers accepted by CVPR.
[02/2023]: We released Composer, the Alibaba big model on image generation and editing.

Research

Generative Foundation Models

Text-to-image: Composer, Cones, AnyDoor
Text-to-video: VideoComposer, VideoFusion, FaceComposer
Image-to-video: I2VGen-XL
Diffusion model: Dimensionality-Varying Diffusion Process, Eliminating Lipschitz Singularities
Generative Adversarial Network: Latently Invertible Autoencoder (LIA), IDInvert, LowRankGAN, ReSeFa, LinkGAN

Multi-Modal Learning

Image: RLIP, RA-CLIP, RLEG
Video: MomentDiff
Unified framework: ViM, U-Tuning

Deep-Learning Theory

Deep neural networks: Rank Diminishing in DNNs (joint work with Michael Jordan), Neural Dependencies (joint work with Michael Jordan)
Generative models: Noise injection in GANs, Uncertainty Principles of Encoding GANs

Projects on AIGC

Text-to-image: Alibaba Tongyi Wanxiang (通义万相)
Text-to-video: Video generation, Open-source models on Hugging Face, Open-source models on ModelScope

My Book (in progress)
