EL-GAN: Edge-Enhanced Generative Adversarial Network for Layout-to-Image Generation

Gao, Lin; Wu, Lei; Meng, Xiangxu

dc.contributor.author	Gao, Lin	en_US
dc.contributor.author	Wu, Lei	en_US
dc.contributor.author	Meng, Xiangxu	en_US
dc.contributor.editor	Umetani, Nobuyuki	en_US
dc.contributor.editor	Wojtan, Chris	en_US
dc.contributor.editor	Vouga, Etienne	en_US
dc.date.accessioned	2022-10-04T06:41:27Z
dc.date.available	2022-10-04T06:41:27Z
dc.date.issued	2022
dc.identifier.issn	1467-8659
dc.identifier.uri	https://doi.org/10.1111/cgf.14687
dc.identifier.uri	https://diglib.eg.org:443/handle/10.1111/cgf14687
dc.description.abstract	Although progress has been made in layout-to-image generation of complex scenes with multiple objects, object-level generation still suffers from distortion and poor recognizability. We argue that this is caused by the lack of feature encodings for edge information during image generation. To address these limitations, we propose a novel edge-enhanced Generative Adversarial Network for layout-to-image generation (termed EL-GAN). The feature encodings of edge information are learned from the multi-level features output by the generator and iteratively optimized along the generator's pipeline. Two new components are included at each generator level to enable multi-scale learning. The first is the edge generation module (EGM), which converts the multi-level features output by the generator into images of different scales and extracts their edge maps. The second is the edge fusion module (EFM), which integrates the feature encodings refined from the edge maps into the subsequent image generation process by modulating the parameters of the normalization layers. Meanwhile, the discriminator is fed with frequency-sensitive image features, which greatly enhances the generation quality of both the high-frequency edge contours and the low-frequency regions of the image. Extensive experiments show that EL-GAN outperforms state-of-the-art methods on the COCO-Stuff and Visual Genome datasets. Our source code is available at https://github.com/Azure616/EL-GAN.	en_US
dc.publisher	The Eurographics Association and John Wiley & Sons Ltd.	en_US
dc.subject	CCS Concepts: Computing methodologies → Scene understanding; Image processing
dc.subject	Computing methodologies → Scene understanding
dc.subject	Image processing
dc.title	EL-GAN: Edge-Enhanced Generative Adversarial Network for Layout-to-Image Generation	en_US
dc.description.seriesinformation	Computer Graphics Forum
dc.description.sectionheaders	Image Synthesis
dc.description.volume	41
dc.description.number	7
dc.identifier.doi	10.1111/cgf.14687
dc.identifier.pages	407-418
dc.identifier.pages	12 pages

Files in this item

Name: v41i7pp407-418.pdf
Size: 1.220Mb
Format: PDF

Name: el-gan_supplement.pdf
Size: 1.485Mb
Format: PDF

This item appears in the following Collection(s)

41-Issue 7
Pacific Graphics 2022 - Symposium Proceedings
