NetEase lab proposes a real-time, high-resolution face replay algorithm based on a single image for desktop and mobile phones

Yusuf Balogun
Yusuf is a law graduate and freelance journalist with a keen interest in tech reporting.

It was reported today that NetEase Interactive Entertainment AI Lab has proposed a single-image, real-time, high-resolution face replay (face reenactment) algorithm. It can generate 1440×1440 results on desktop GPUs and 256×256 results on mobile CPUs, both at real-time frame rates. The main idea is to decouple and encode the appearance and motion information of the face, and to learn prior knowledge of both from a large number of videos through self-supervised learning.

Depending on how motion information is encoded, related work falls into two categories: warp-based (also called deformation-based) methods and direct synthesis methods. A deformation-based method represents the motion information as a motion field, whereas a direct synthesis method encodes the appearance and motion information of the face in a low-dimensional latent space and then decodes it to obtain the synthesis result.
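To make the deformation-based idea concrete, here is a minimal NumPy sketch of applying a motion field to a source image by backward warping with bilinear sampling. It is an illustration only, not NetEase's implementation: the function name `warp_image` and the convention that `flow[y, x] = (dx, dy)` points from an output pixel back to its source location are assumptions made for this example.

```python
import numpy as np

def warp_image(source, flow):
    """Backward-warp a grayscale image (H, W) with a motion field (H, W, 2).

    Assumed convention: flow[y, x] = (dx, dy) means output pixel (x, y)
    is sampled from source location (x + dx, y + dy). Sampling is
    bilinear, and coordinates are clamped to the image border.
    """
    h, w = source.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")

    # Sampling positions in the source image, clamped to valid range.
    sx = np.clip(xs + flow[..., 0], 0, w - 1)
    sy = np.clip(ys + flow[..., 1], 0, h - 1)

    # Integer corners surrounding each sampling position.
    x0 = np.floor(sx).astype(int)
    y0 = np.floor(sy).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)

    # Fractional offsets act as the bilinear weights.
    wx = sx - x0
    wy = sy - y0
    top = source[y0, x0] * (1 - wx) + source[y0, x1] * wx
    bottom = source[y1, x0] * (1 - wx) + source[y1, x1] * wx
    return top * (1 - wy) + bottom * wy
```

Because the output is assembled from pixels that already exist in the source image, a network in this family only has to predict the motion field and fill in what warping cannot recover, rather than synthesize the whole face.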

The proposed method incorporates the core idea of direct synthesis into the deformation-based pipeline and consists primarily of two modules. First, because a deformation-based algorithm does not need to regenerate all of the face information, it lends itself to a network structure that is light enough for real-time applications.

As a result, the scheme uses the deformation-based framework as its foundation and proposes a lightweight U-shaped deformation network. At the same time, it borrows the pose encoding approach of direct synthesis methods: the three-dimensional pose of the driving head is encoded and injected into the network to improve generation quality under large poses.

Second, to improve efficiency even further, the scheme proposes a hierarchical motion field prediction network that estimates pixel motion from the source face to the driving face. Unlike existing single-scale motion field estimation algorithms, it uses feature point images at multiple scales to predict the motion field from coarse to fine, which reduces the algorithm's complexity while preserving accuracy.
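The coarse-to-fine idea can be sketched in a few lines of NumPy. This is a hedged illustration under simplifying assumptions, not the network described in the report: the per-scale residual flows are passed in as plain arrays (in a real system each would be predicted from that scale's features), each level is assumed to be exactly twice the resolution of the previous one, and the names `upsample_flow` and `coarse_to_fine` are hypothetical.

```python
import numpy as np

def upsample_flow(flow, factor=2):
    """Nearest-neighbour upsample of an (H, W, 2) motion field.

    Displacements are measured in pixels, so their magnitudes are
    multiplied by `factor` when the grid becomes finer.
    """
    up = flow.repeat(factor, axis=0).repeat(factor, axis=1)
    return up * factor

def coarse_to_fine(residuals):
    """Combine per-scale residual flows, coarsest level first.

    `residuals` is a list of (H_k, W_k, 2) arrays in which every level
    has twice the resolution of the one before it. The coarse estimate
    is upsampled and then corrected by the next level's residual.
    """
    flow = residuals[0]
    for residual in residuals[1:]:
        flow = upsample_flow(flow) + residual
    return flow

# Toy example with three levels: 2x2, 4x4 and 8x8.
levels = [
    np.ones((2, 2, 2)),
    np.zeros((4, 4, 2)),
    np.full((8, 8, 2), 0.5),
]
final_flow = coarse_to_fine(levels)  # shape (8, 8, 2)
```

Large motions are captured cheaply at the coarsest level, so the finer levels only need small corrections, which is the usual reason this kind of pyramid costs less than estimating the full-resolution field in one step.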

In the training phase, a pair consisting of a source image and a driving image is used as input. First, a 3DMM (3D Morphable Model) algorithm is used to fit the shape, expression, and head pose parameters of the face in the photo, and then the corresponding feature point image is computed.
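As a rough illustration of how fitted 3DMM parameters can be turned into a feature point image, the sketch below uses the standard linear 3DMM form (mean shape plus identity and expression offsets), a weak-perspective projection, and a one-pixel-per-landmark rasterization. The report does not specify the projection model or how the feature points are drawn, so those choices, the toy bases, and the function names `landmarks_3dmm`, `project` and `rasterize` are all assumptions.

```python
import numpy as np

def landmarks_3dmm(mean, id_basis, exp_basis, alpha, beta):
    """Linear 3DMM evaluated at N landmark vertices.

    mean: (N, 3); id_basis: (N, 3, Ki); exp_basis: (N, 3, Ke);
    alpha: (Ki,) shape coefficients; beta: (Ke,) expression coefficients.
    """
    return mean + id_basis @ alpha + exp_basis @ beta

def project(points, rotation, translation, scale):
    """Weak-perspective projection: rotate, scale, drop z, shift in 2D."""
    rotated = points @ rotation.T
    return scale * rotated[:, :2] + translation

def rasterize(points_2d, size):
    """Mark each landmark as a single pixel in a (size, size) image."""
    image = np.zeros((size, size), dtype=np.float32)
    cols = np.clip(np.round(points_2d[:, 0]).astype(int), 0, size - 1)
    rows = np.clip(np.round(points_2d[:, 1]).astype(int), 0, size - 1)
    image[rows, cols] = 1.0
    return image

# Toy model with three landmarks and a single expression component
# that moves the third landmark along +y.
mean = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
id_basis = np.zeros((3, 3, 2))
exp_basis = np.zeros((3, 3, 1))
exp_basis[2, 1, 0] = 1.0

points = landmarks_3dmm(mean, id_basis, exp_basis,
                        alpha=np.zeros(2), beta=np.array([1.0]))
points_2d = project(points, rotation=np.eye(3),
                    translation=np.array([4.0, 4.0]), scale=2.0)
feature_image = rasterize(points_2d, size=16)
```

Taking the shape coefficients from the source face and the expression and pose from the driving face would give a feature point image that describes the target motion without carrying the driver's identity, which is one plausible reason for routing the motion through 3DMM parameters.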

