SEGA: A Stepwise Evolution Paradigm for Content-Aware Layout Generation with Design Prior
ICCV 2025 (Highlight)
Abstract
In this paper, we study content-aware layout generation, which aims to automatically generate layouts that are harmonious with a given background image. Existing methods usually handle this task with a single-step reasoning framework. Because they lack a feedback-based self-correction mechanism, their failure rates increase significantly when they face complex element layout planning. To address this challenge, we introduce SEGA, a novel Stepwise Evolution Paradigm for Content-Aware Layout GenerAtion. Inspired by the systematic mode of human thinking, SEGA employs a hierarchical reasoning framework with a coarse-to-fine strategy: first, a coarse-level module roughly estimates the layout planning results; then, a refining module performs fine-level reasoning over the coarse planning results. Furthermore, we incorporate layout design principles as prior knowledge into the module to enhance its layout planning ability. Moreover, we present a new large-scale poster dataset, BIG-Poster, with rich meta-information annotation. We conduct extensive experiments and obtain state-of-the-art performance on multiple benchmark datasets.
Approach: SEGA
The training pipeline of SEGA includes two stages:
1) Stage-1: A Coarse-level Estimation (CE) module is trained to predict the layouts from scratch.
2) Stage-2: A Fine-level Refinement (FR) module is initialized with the weights of the CE module and generates the final layout based on the results of Stage-1.
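The two stages above can be read as a simple data flow: the CE module proposes a layout, and the FR module, which starts from a copy of the CE weights, revises that proposal. The following is a minimal sketch of that flow only, not the authors' implementation. The `Box` type, the callable interfaces, `init_refine_weights`, and the toy modules are all hypothetical names introduced for illustration; the real modules are learned models.

```python
import copy
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Box:
    """One layout element in normalized [0, 1] canvas coordinates (hypothetical format)."""
    category: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


Layout = List[Box]
# Hypothetical interfaces: CE maps an image reference to a layout,
# FR maps (image reference, coarse layout) to a refined layout.
CoarseModule = Callable[[str], Layout]
RefineModule = Callable[[str, Layout], Layout]


def init_refine_weights(coarse_weights: Dict[str, list]) -> Dict[str, list]:
    """Stage-2 initialization: FR starts from an independent copy of the CE weights."""
    return copy.deepcopy(coarse_weights)


def generate_layout(image: str, coarse: CoarseModule, refine: RefineModule) -> Layout:
    """Coarse-to-fine inference: estimate a layout, then refine that estimate."""
    coarse_layout = coarse(image)
    return refine(image, coarse_layout)


def toy_coarse(image: str) -> Layout:
    """Stand-in for CE: returns a fixed guess whose first box overflows the canvas."""
    return [Box("text", 0.1, 0.1, 0.95, 0.2), Box("logo", 0.7, 0.8, 0.2, 0.1)]


def toy_refine(image: str, layout: Layout) -> Layout:
    """Stand-in for FR: clips every box so it stays inside the canvas."""
    return [
        Box(b.category, b.x, b.y, min(b.w, 1.0 - b.x), min(b.h, 1.0 - b.y))
        for b in layout
    ]


final_layout = generate_layout("poster.png", toy_coarse, toy_refine)
```

Passing the modules in as callables keeps the pipeline itself independent of how either stage is implemented, so a learned refiner could replace `toy_refine` without changing `generate_layout`.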
Dataset: BigPoster-100K
Our dataset covers a wide range of real-world poster categories, such as commercial, event, and product posters.
The hierarchical structure of each poster is preserved.
Category distribution of layout elements.
Comparison to State-of-the-Art Methods
Quantitative Results
| Method | PKU | | | | | | CGL | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Graphic | | | | Content | | Graphic | | | | Content | |
| | Ali ↓ | Ove ↓ | Und_l ↑ | Und_s ↑ | Read ↓ | Occ ↓ | Ali ↓ | Ove ↓ | Und_l ↑ | Und_s ↑ | Read ↓ | Occ ↓ |
| Non-LLM Based | ||||||||||||
| CGL-GAN (IJCAI, 2022) | - | 0.0380 | - | 0.4800 | 0.0158 | 0.1320 | - | 0.0470 | - | 0.6500 | 0.0213 | 0.1400 |
| LayoutDM (CVPR, 2023) | - | 0.1720 | - | 0.4600 | 0.0201 | 0.1520 | - | 0.0260 | - | 0.7900 | 0.0192 | 0.1270 |
| RALF (CVPR, 2024) | 0.0031 | 0.0095 | 0.9686 | 0.8981 | 0.0138 | 0.1243 | 0.0023 | 0.0059 | 0.9858 | 0.9652 | 0.0180 | 0.1263 |
| LLM Based | ||||||||||||
| PosterLlama (ECCV, 2024) | 0.0036 | 0.0080 | 0.9874 | 0.9497 | 0.0170 | 0.1380 | 0.0022 | 0.0042 | 0.9823 | 0.9463 | 0.0294 | 0.2453 |
| SEGA w/o FR 7B | 0.0037 | 0.0052 | 0.9873 | 0.9471 | 0.0150 | 0.1336 | 0.0023 | 0.0032 | 0.9817 | 0.9522 | 0.0298 | 0.2442 |
| SEGA w/o FR (Ens-2) 7B | 0.0035 | 0.0041 | 0.9892 | 0.9673 | 0.0144 | 0.1305 | 0.0021 | 0.0024 | 0.9879 | 0.9657 | 0.0296 | 0.2438 |
| SEGA 7B | 0.0035 | 0.0033 | 0.9897 | 0.9731 | 0.0142 | 0.1286 | 0.0020 | 0.0017 | 0.9913 | 0.9782 | 0.0294 | 0.2430 |
| GT | 0.0036 | 0.0009 | 0.9950 | 0.9444 | 0.0119 | 0.1185 | 0.0023 | 0.0003 | 0.9937 | 0.9402 | 0.0296 | 0.2390 |
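This page does not define the Ove column. As a rough illustration of how an overlap-style score can be computed, the sketch below takes the mean intersection-over-union over all pairs of boxes in one layout. This is an assumption made for illustration: the exact definition used for the table (for example, which element types are excluded) may differ. The box format and function names are hypothetical.

```python
from itertools import combinations
from typing import List, Tuple

# Hypothetical box format: (left, top, right, bottom) in any consistent unit.
Rect = Tuple[float, float, float, float]


def iou(a: Rect, b: Rect) -> float:
    """Intersection-over-union of two axis-aligned rectangles."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_pairwise_iou(boxes: List[Rect]) -> float:
    """Average IoU over every unordered pair of boxes; 0.0 when there are fewer than two boxes."""
    pairs = list(combinations(boxes, 2))
    if not pairs:
        return 0.0
    return sum(iou(a, b) for a, b in pairs) / len(pairs)


# Two boxes that share a unit square, plus one box that touches neither.
example_boxes = [(0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0), (5.0, 5.0, 6.0, 6.0)]
example_score = mean_pairwise_iou(example_boxes)
```

Under this reading, a lower score means elements cover one another less, which matches the ↓ direction shown for Ove in the table.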
| Method | Rule-based Metrics | | | | | | Aesthetic Scores | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Graphic | | | | Content | | SDL ↑ | SQL ↑ | STV ↑ | SIO ↑ | SMean ↑ | |
| | Ali ↓ | Ove ↓ | Und_l ↑ | Und_s ↑ | Read ↓ | Occ ↓ | | | | | | |
| Non-LLM Based | ||||||||||||
| FlexDM (CVPR, 2023) | 0.0122 | 0.1139 | 0.6889 | 0.5034 | 0.0516 | 0.4850 | 4.563 | 5.126 | 4.873 | 5.239 | 4.950 | |
| LLM Based | ||||||||||||
| PosterLlama (ECCV, 2024) | 0.0099 | 0.0238 | 0.9204 | 0.7378 | 0.0395 | 0.4041 | 5.292 | 5.796 | 5.263 | 5.819 | 5.542 | |
| SEGA w/o FR 7B | 0.0102 | 0.0121 | 0.8206 | 0.6698 | 0.0304 | 0.4002 | 5.553 | 6.332 | 5.693 | 5.448 | 5.756 | |
| SEGA w/o FR (Ens-2) 7B | 0.0100 | 0.0093 | 0.8501 | 0.7202 | 0.0285 | 0.3957 | 5.642 | 6.418 | 5.811 | 5.529 | 5.850 | |
| SEGA 7B | 0.0086 | 0.0040 | 0.9337 | 0.8978 | 0.0282 | 0.3964 | 5.792 | 6.411 | 5.824 | 5.708 | 5.941 | |
| SEGA w/o FR 13B | 0.0102 | 0.0093 | 0.8485 | 0.7315 | 0.0271 | 0.3948 | 5.923 | 6.624 | 6.253 | 5.991 | 6.197 | |
| SEGA w/o FR (Ens-2) 13B | 0.0097 | 0.0075 | 0.8715 | 0.7715 | 0.0260 | 0.3874 | 6.128 | 6.652 | 6.058 | 5.822 | 6.165 | |
| SEGA 13B | 0.0095 | 0.0025 | 0.9541 | 0.9270 | 0.0260 | 0.3907 | 6.149 | 6.745 | 6.348 | 6.038 | 6.320 | |
| GT | 0.0100 | 0.0116 | 0.9643 | 0.8187 | 0.0259 | 0.3797 | 6.882 | 7.543 | 6.863 | 6.025 | - | |
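The SMean column appears to be the arithmetic mean of the four aesthetic scores (SDL, SQL, STV, SIO); this is inferred from the numbers, not stated on the page. The sketch below recomputes the mean for a few rows copied from the table and reports the gap to the listed value. The names `AESTHETIC_ROWS`, `smean_gap`, and the tolerance are illustrative assumptions. Rows whose gap exceeds the tolerance are flagged, not corrected, since the page gives no way to tell which figure is the intended one.

```python
from statistics import mean
from typing import Dict, Tuple

# (SDL, SQL, STV, SIO) and the reported SMean, copied from the table above.
AESTHETIC_ROWS: Dict[str, Tuple[Tuple[float, float, float, float], float]] = {
    "FlexDM": ((4.563, 5.126, 4.873, 5.239), 4.950),
    "PosterLlama": ((5.292, 5.796, 5.263, 5.819), 5.542),
    "SEGA 7B": ((5.792, 6.411, 5.824, 5.708), 5.941),
    "SEGA 13B": ((6.149, 6.745, 6.348, 6.038), 6.320),
}

# Assumed tolerance: roughly one unit in the third decimal place.
TOLERANCE = 1e-3


def smean_gap(scores: Tuple[float, ...], reported: float) -> float:
    """Absolute difference between the recomputed mean and the reported SMean."""
    return abs(mean(scores) - reported)


# Rows where the recomputed mean and the reported SMean disagree beyond the tolerance.
flagged = [
    name
    for name, (scores, reported) in AESTHETIC_ROWS.items()
    if smean_gap(scores, reported) > TOLERANCE
]
```

For the rows included here, all but the SEGA 7B row agree with a plain mean to within rounding; that row's recomputed mean is about 0.007 below the listed 5.941.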
Qualitative Results
BibTeX
@inproceedings{wang2025SEGA,
author = {Wang, Haoran and Zhao, Bo and Wang, Jinghui and Wang, Hanzhang and Yang, Huan and Ji, Wei and Liu, Hao and Xiao, Xinyan},
title = {SEGA: A Stepwise Evolution Paradigm for Content-Aware Layout Generation with Design Prior},
booktitle = {ICCV},
year = {2025},
}