Multi-attribute Pizza Generator (MPG2) - Cross-domain Attribute Control with Conditional StyleGAN

Abstract

Multi-attribute conditional image generation is a challenging problem in computer vision. We propose Multi-attribute Pizza Generator (MPG), a conditional Generative Adversarial Network (GAN) framework for synthesizing images from a trichotomy of attributes: content, view-geometry, and implicit visual style. We design MPG by extending the state-of-the-art StyleGAN2, using a new conditioning technique that guides the intermediate feature maps to learn multi-scale multi-attribute entangled representations of controlling attributes. Because of the complex nature of the multi-attribute image generation problem, we regularize the image generation by predicting the explicit conditioning attributes (ingredients and view). To synthesize a pizza image with view attributes outside the range of natural training images, we design a CGI pizza dataset, PizzaView, using 3D pizza models and employ it to train a view attribute regressor that regularizes the generation process, bridging the real and CGI training datasets. To verify the efficacy of MPG, we test it on Pizza10, a carefully annotated multi-ingredient pizza image dataset. MPG can successfully generate photo-realistic pizza images with desired ingredients and view attributes, beyond the range of those observed in real-world training data.

MPG2 Framework
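
The abstract describes regularizing generation by predicting the explicit conditioning attributes (ingredients and view) from the generated image. The sketch below shows one way such a generator objective could be assembled, and is not the repository's actual implementation. It assumes a non-saturating adversarial term, a multi-label binary cross-entropy term for ingredients, and a mean-squared-error term for view attributes. The function names and the weights `w_ingr` and `w_view` are hypothetical, and the discriminator, ingredient classifier, and view regressor outputs are passed in as plain lists of numbers.

```python
import math


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over multi-label ingredient logits."""
    total = 0.0
    for z, t in zip(logits, targets):
        # Numerically stable form: max(z, 0) - z*t + log(1 + exp(-|z|))
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def mse(pred, target):
    """Mean squared error between predicted and requested view attributes."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(pred)


def generator_loss(d_logit_fake, ingr_logits, ingr_target,
                   view_pred, view_target, w_ingr=1.0, w_view=1.0):
    """Hypothetical generator objective: adversarial + attribute regularizers.

    d_logit_fake: discriminator logit for the generated image
    ingr_logits:  ingredient classifier logits on the generated image
    ingr_target:  0/1 ingredient labels used to condition the generator
    view_pred:    view regressor output on the generated image
    view_target:  view attributes used to condition the generator
    """
    # Non-saturating adversarial term: softplus(-D(G(z))), computed stably.
    adv = max(-d_logit_fake, 0.0) + math.log1p(math.exp(-abs(d_logit_fake)))
    ingr = bce_with_logits(ingr_logits, ingr_target)
    view = mse(view_pred, view_target)
    return adv + w_ingr * ingr + w_view * view


# Usage: three assumed ingredient labels and a single assumed view value.
matched = generator_loss(0.0, [10.0, -10.0, 10.0], [1, 0, 1], [0.5], [0.5])
mismatched = generator_loss(0.0, [-10.0, 10.0, -10.0], [1, 0, 1], [0.9], [0.5])
print(matched < mismatched)
```

When the predicted attributes agree with the conditioning attributes, both regularization terms are near zero and only the adversarial term remains, so the generator is penalized for images whose ingredients or view cannot be recovered.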

Citation

@misc{han2021multiattribute,
      title={Multi-attribute Pizza Generator: Cross-domain Attribute Control with Conditional StyleGAN}, 
      author={Fangda Han and Guoyao Hao and Ricardo Guerrero and Vladimir Pavlovic},
      year={2021},
      eprint={2110.11830},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

License

Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)