NetEase Office Building China | Image credit: Frank An/Unsplash

It was reported today that NetEase Interactive Entertainment AI Lab has proposed a real-time, high-resolution face reenactment ("face replay") algorithm that works from a single image. It can generate 1440×1440 results on desktop GPUs and 256×256 results on mobile CPUs, both at real-time frame rates. The main idea is to decouple and encode the appearance and motion information of the face, and to learn the prior knowledge for both from a large number of videos through self-supervised learning.

Depending on how motion information is encoded, related work falls into two categories: warping-based (deformation-based) methods and direct synthesis methods. Deformation-based methods represent motion explicitly as a motion field, whereas direct synthesis methods encode the appearance and motion of the face in a low-dimensional latent space and then decode it to obtain the synthesized result.
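As a rough illustration of what "motion as a motion field" means, the sketch below backward-warps an image with a dense per-pixel displacement field using bilinear sampling. It is a generic NumPy example, not NetEase's implementation; the function name `warp_with_motion_field` and the `(dx, dy)` layout of `flow` are assumptions made for this illustration.

```python
import numpy as np

def warp_with_motion_field(source, flow):
    """Backward-warp `source` (H, W) or (H, W, C) with `flow` (H, W, 2).

    Assumed convention: flow[y, x] = (dx, dy) means output pixel (x, y)
    is sampled from source location (x + dx, y + dy).
    """
    h, w = source.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Sampling coordinates, clamped to the image border.
    sx = np.clip(xs + flow[..., 0], 0, w - 1)
    sy = np.clip(ys + flow[..., 1], 0, h - 1)
    # Integer corners around each sampling location.
    x0 = np.floor(sx).astype(int)
    y0 = np.floor(sy).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    # Fractional offsets are the bilinear weights.
    wx = sx - x0
    wy = sy - y0
    if source.ndim == 3:
        wx = wx[..., None]
        wy = wy[..., None]
    top = source[y0, x0] * (1 - wx) + source[y0, x1] * wx
    bottom = source[y1, x0] * (1 - wx) + source[y1, x1] * wx
    return top * (1 - wy) + bottom * wy
```

A zero field returns the source unchanged, and a constant field shifts the whole image. In a face reenactment model the field would be predicted by a network rather than supplied by hand.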

The new method incorporates the core idea of direct synthesis into a deformation-based pipeline, which consists primarily of two modules. First, because a deformation-based algorithm does not need to regenerate all of the face information, it lends itself to a network structure that can support real-time applications.

The scheme therefore builds on the deformation-based framework and proposes a lightweight U-shaped deformation network. At the same time, it borrows the pose encoding used by direct synthesis methods: the three-dimensional head pose of the driving face is encoded and injected into the network, which improves generation quality under large poses.

Second, to improve efficiency further, the scheme proposes a hierarchical motion field prediction network that estimates pixel motion from the source face to the driving face. Unlike existing single-scale motion field estimation algorithms, it predicts the motion field from coarse to fine based on feature point images at multiple scales, which reduces the algorithm’s complexity while preserving accuracy.
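The coarse-to-fine idea can be sketched generically: start from a low-resolution motion field, upsample it, and add a finer correction at each scale. The code below is a minimal sketch of that pattern only. The article does not describe the actual network, so the per-scale corrections here are plain arrays standing in for network outputs, and the helper names `upsample_flow` and `coarse_to_fine` are hypothetical.

```python
import numpy as np

def upsample_flow(flow, factor=2):
    """Nearest-neighbour upsample of a (H, W, 2) motion field.

    Displacements are measured in pixels, so they are multiplied by
    the same factor as the resolution.
    """
    up = np.repeat(np.repeat(flow, factor, axis=0), factor, axis=1)
    return up * factor

def coarse_to_fine(residuals):
    """Combine per-scale residual fields, ordered coarsest first.

    Each entry stands in for what a prediction network would output at
    that scale (an assumption made for this illustration).
    """
    flow = residuals[0]
    for res in residuals[1:]:
        factor = res.shape[0] // flow.shape[0]
        flow = upsample_flow(flow, factor) + res
    return flow
```

Most of the motion is estimated on small feature maps, so only small corrections remain to be computed at full resolution. That is one plausible reading of how the hierarchy reduces cost.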

In the training phase, a pair consisting of a source image and a driving image is used as input. First, a 3DMM (3D morphable model) fitting algorithm estimates the shape, expression, and head pose parameters of the face in the images, and the corresponding feature point image is then computed from those parameters.
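To make the last step concrete, the sketch below shows one way fitted 3DMM parameters could be turned into a feature point image: a linear shape-plus-expression model produces 3D landmarks, the head pose rotates them, and the projected points are drawn onto a blank canvas. Everything specific here is an assumption for illustration, including the orthographic projection, the single-pixel dots, the Euler-angle order, and the function names. The article does not say how the lab computes this image.

```python
import numpy as np

def rotation_matrix(yaw, pitch, roll):
    """Head rotation from Euler angles in radians (assumed order: Rz @ Rx @ Ry)."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rx = np.array([[1, 0, 0], [0, cp, -sp], [0, sp, cp]])
    Rz = np.array([[cr, -sr, 0], [sr, cr, 0], [0, 0, 1]])
    return Rz @ Rx @ Ry

def landmarks_3d(mean, id_basis, exp_basis, alpha, beta):
    """Linear 3DMM: mean (N, 3) plus shape and expression offsets.

    id_basis is (N, 3, Ki) and exp_basis is (N, 3, Ke); alpha and beta
    are the fitted shape and expression coefficient vectors.
    """
    return mean + id_basis @ alpha + exp_basis @ beta

def landmark_image(points3d, pose, size=64, scale=20.0):
    """Rotate landmarks by the head pose, project, and draw them as dots."""
    rotated = points3d @ rotation_matrix(*pose).T
    # Orthographic projection: drop z, then scale and centre on the canvas.
    xy = rotated[:, :2] * scale + size / 2
    cols = np.clip(np.round(xy[:, 0]).astype(int), 0, size - 1)
    rows = np.clip(np.round(xy[:, 1]).astype(int), 0, size - 1)
    img = np.zeros((size, size), dtype=np.float32)
    img[rows, cols] = 1.0
    return img
```

Because the landmarks depend only on shape, expression, and pose, the same routine could render a feature point image for either the source face or the driving face.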

Via Jiqi Zhixin
