InstructEdit: Improving Automatic Masks for Diffusion-based Image Editing With User Instructions.
Qian Wang, Biao Zhang, Michael Birsak, Peter Wonka
This codebase is adapted from the Grounded Segment Anything repository.
Pipeline
Our proposed framework has three components: a language processor, a segmenter, and an image editor (a sketch of the overall flow follows this list).

- Language processor: processes the user instruction with a large language model. The goal of this step is to parse the instruction and output prompts for the segmenter and captions for the image editor. We adopt ChatGPT and, optionally, BLIP2 for this step.
- Segmenter: consumes the segmentation prompt provided by the language processor. We employ the state-of-the-art segmentation framework Grounded Segment Anything to automatically generate a high-quality mask from this prompt.
- Image editor: uses the captions from the language processor and the mask from the segmenter to compute the edited image. We adopt Stable Diffusion and the mask-guided generation from DiffEdit for this purpose.
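The snippet below is a minimal sketch of how the three components fit together. The callables `language_processor`, `segmenter`, and `image_editor` are placeholders standing in for ChatGPT/BLIP2, Grounded Segment Anything, and Stable Diffusion with DiffEdit-style masking; they are not the actual API of this repository.

```python
def edit_image(image, instruction, language_processor, segmenter, image_editor):
    """Illustrative glue code for the InstructEdit pipeline (placeholder API)."""
    # 1) Language processor: parse the user instruction into a segmentation
    #    prompt plus source/target captions for the diffusion-based editor.
    seg_prompt, source_caption, target_caption = language_processor(instruction, image)

    # 2) Segmenter: generate a high-quality mask for the object referred to
    #    by the segmentation prompt.
    mask = segmenter(image, seg_prompt)

    # 3) Image editor: edit only the masked region, moving the image content
    #    from the source caption toward the target caption.
    return image_editor(image, mask, source_caption, target_caption)
```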
Set up environment
Please set the environment variables manually as follows if you want to build a local GPU environment:
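The variables below follow the setup described in the upstream Grounded Segment Anything repository; the CUDA path is a placeholder and should point to your local CUDA installation.

```bash
export AM_I_DOCKER=False
export BUILD_WITH_CUDA=True
export CUDA_HOME=/path/to/cuda-11.3/
```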
The following optional dependencies are necessary for mask post-processing, saving masks in COCO format, the example notebooks, and exporting the model in ONNX format. jupyter is also required to run the example notebooks.
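Assuming the same setup as the upstream Segment Anything repository, they can be installed with pip:

```bash
pip install opencv-python pycocotools matplotlib onnxruntime onnx jupyter
```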
After setting up the environment, please also specify your OpenAI API key in chatgpt.py.
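For reference, the key assignment in chatgpt.py will look roughly like the snippet below; the exact variable name in that file may differ, and reading the key from the OPENAI_API_KEY environment variable is just one option.

```python
import os
import openai

# Provide your own OpenAI API key; the environment-variable fallback is optional.
openai.api_key = os.environ.get("OPENAI_API_KEY", "sk-...")
```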
Playground
We provide a notebook (grounded_sam_instructedit_demo.ipynb), a Python script (grounded_sam_instructedit_demo.py), and a Gradio app (gradio_intructedit.py) for you to play around with.
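For example, the Gradio demo and the notebook can be started as follows; the scripts may require additional arguments (model configs and checkpoints), so check their argument parsers before running.

```bash
# Launch the interactive Gradio demo
python gradio_intructedit.py

# Open the example notebook
jupyter notebook grounded_sam_instructedit_demo.ipynb
```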
Citation
@misc{wang2023instructedit,
title={InstructEdit: Improving Automatic Masks for Diffusion-based Image Editing With User Instructions},
author={Qian Wang and Biao Zhang and Michael Birsak and Peter Wonka},
year={2023},
eprint={2305.18047},
archivePrefix={arXiv},
primaryClass={cs.CV}
}