Nvidia Unveils Super Stitched Body, PoE GAN

Yusuf Balogun
Yusuf is a law graduate and freelance journalist with a keen interest in tech reporting.

Today, the American multinational technology company Nvidia unveiled PoE GAN, an all-in-one model that generates realistic photos from input text, sketches, and semantic maps.

PoE GAN accepts a wide range of input modalities, including text descriptions, image segmentation maps, sketches, and style references, all of which can be turned into images. Its distinguishing feature is that it can accept any combination of these input modes at the same time.

The name PoE comes from the "product of experts" notion that Hinton proposed in 2002. Each expert is defined as a probability model on the input space.

Each individual input modality represents a constraint that the synthesized image must meet, so the intersection of all constraint sets yields the collection of images that satisfy every requirement.

The distribution over this intersection is defined as the product of the individual conditional probability distributions, under the assumption that each constraint's conditional distribution is Gaussian.

For the product distribution to have high density in a region, each individual distribution must have high density there, which is what it means to satisfy every requirement. The focus of PoE GAN is on how to combine each input.
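The product of Gaussian experts described above can be illustrated with a minimal one-dimensional sketch. It relies on the standard result that a product of Gaussian densities is, up to normalization, another Gaussian whose precision is the sum of the expert precisions. The function name and the two example experts are hypothetical and are not taken from Nvidia's code.

```python
def product_of_gaussians(experts):
    """Combine 1-D Gaussian experts given as (mean, variance) pairs.

    The product of Gaussian densities is, up to normalization, a Gaussian
    whose precision (1 / variance) is the sum of the expert precisions and
    whose mean is the precision-weighted average of the expert means.
    """
    if not experts:
        raise ValueError("need at least one expert")
    precisions = [1.0 / var for _, var in experts]
    total_precision = sum(precisions)
    mean = sum(m * p for (m, _), p in zip(experts, precisions)) / total_precision
    return mean, 1.0 / total_precision


# Hypothetical experts: a vague "text" constraint and a sharper "sketch" one.
text_expert = (0.0, 4.0)    # (mean, variance)
sketch_expert = (2.0, 1.0)
mean, var = product_of_gaussians([text_expert, sketch_expert])
print(mean, var)
```

The combined mean lands closer to the sharper expert, and the combined variance is smaller than either input variance, which matches the idea that the product is only dense where every expert is dense.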

To blend the different types of input, the PoE GAN generator uses a global PoE-Net. Each modal input is encoded as a feature vector, and these vectors are then combined in the global PoE-Net using PoE. The decoder takes the output of the global PoE-Net and is also connected directly to the segmentation and sketch encoders when producing the output image.

The global PoE-Net has the following structure: a latent feature vector z0 is sampled using PoE, and an MLP then processes it to produce the feature vector w.
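As a rough sketch of that flow, the snippet below combines per-modality Gaussian codes dimension by dimension, samples z0 from the result, and passes it through a small MLP to obtain w. Everything here is an assumption made for illustration: the diagonal-Gaussian form of the encoder outputs, the layer sizes, the random weights, and all function names. It is not the actual PoE GAN architecture.

```python
import math
import random


def poe_diag(experts):
    """Per-dimension product of diagonal Gaussians given as (means, variances)."""
    dim = len(experts[0][0])
    means, variances = [], []
    for d in range(dim):
        precisions = [1.0 / var[d] for _, var in experts]
        total = sum(precisions)
        means.append(sum(mu[d] * p for (mu, _), p in zip(experts, precisions)) / total)
        variances.append(1.0 / total)
    return means, variances


def sample_z0(means, variances, rng):
    """Draw one sample from the combined diagonal Gaussian."""
    return [rng.gauss(m, math.sqrt(v)) for m, v in zip(means, variances)]


def mlp(x, layers):
    """Apply dense layers with ReLU between them; layers is a list of (W, b)."""
    for i, (weights, bias) in enumerate(layers):
        x = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x


def random_layer(n_in, n_out, rng):
    """Hypothetical untrained layer, used only to make the sketch runnable."""
    weights = [[rng.gauss(0.0, 1.0 / math.sqrt(n_in)) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
# Hypothetical encoder outputs: each modality predicts a Gaussian over z0.
text_code = ([0.5, -1.0, 0.0, 2.0], [1.0, 1.0, 4.0, 0.5])
style_code = ([1.5, 0.0, 1.0, 2.0], [1.0, 0.25, 4.0, 0.5])
means, variances = poe_diag([text_code, style_code])
z0 = sample_z0(means, variances, rng)
layers = [random_layer(4, 8, rng), random_layer(8, 6, rng)]
w = mlp(z0, layers)
```

Dropping a modality only means passing fewer entries to `poe_diag`, which is one way to picture how a product-of-experts design can handle whichever inputs happen to be present.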

For the discriminator, the authors propose a multi-modal projection discriminator, which extends the projection discriminator to accommodate multiple conditional inputs.

Unlike the standard projection discriminator, which computes a single inner product between the image embedding and the conditional embedding, the multi-modal version computes an inner product for each input mode and sums them to obtain the final loss.
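The difference between the two discriminators can be sketched in a few lines. The embeddings below are made-up vectors, the function names are hypothetical, and the optional `unconditional` term is an assumption borrowed from the usual projection-discriminator formulation rather than something stated in the article.

```python
def dot(a, b):
    """Inner product of two equal-length embeddings."""
    if len(a) != len(b):
        raise ValueError("embeddings must share a dimension")
    return sum(x * y for x, y in zip(a, b))


def projection_score(image_emb, cond_emb, unconditional=0.0):
    """Standard form: a single inner product with one conditional embedding."""
    return unconditional + dot(image_emb, cond_emb)


def multimodal_projection_score(image_emb, cond_embs, unconditional=0.0):
    """Multi-modal form: one inner product per supplied modality, summed.

    cond_embs maps a modality name to its embedding; a modality that was
    not provided is simply left out of the dictionary.
    """
    return unconditional + sum(dot(image_emb, emb) for emb in cond_embs.values())


# Made-up embeddings for illustration only.
image_emb = [1.0, 2.0, -1.0]
conditions = {
    "text": [0.5, 0.0, 1.0],
    "sketch": [0.0, 1.0, 0.0],
}
single = projection_score(image_emb, conditions["text"])
combined = multimodal_projection_score(image_emb, conditions)
print(single, combined)
```

With a single modality in the dictionary the multi-modal score reduces to the standard one, and with an empty dictionary only the unconditional term remains.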
