Apple Granted a Patent for Next-Gen Realistic Avatar Generation using Machine Learning-based Blood Flow Tracking
Today, the U.S. Patent and Trademark Office officially granted Apple a patent that relates to next-generation realistic avatar generation using machine learning-based blood flow tracking. The best example of this was presented during the introduction of Apple Vision Pro by Mike Rockwell, VP, Technology Development Group.
Rockwell stated: "For digital communications like FaceTime, Vision Pro goes beyond conveying just your eyes and creates an authentic representation of you. This was one of the most difficult challenges we faced in building Vision Pro. There's no video conferencing camera looking at you, and even if there were, you're wearing something over your eyes. Using our most advanced machine learning techniques, we created a novel solution.
"After a quick enrollment process using the front sensors on Vision Pro, the system uses an advanced encoder-decoder neural network to create your digital Persona. This network was trained on a diverse group of thousands of individuals. It delivers a natural representation, which dynamically matches your facial and hand movement. With your Persona, you can communicate with over a billion FaceTime-capable devices. When viewed by someone in another Vision Pro, your Persona has volume and depth not possible in traditional video."
Machine Learning-Based Blood Flow Tracking
Apple's granted patent pertains to systems, methods, and computer-readable media that use a machine learning-based blood flow tracking technique for generating an avatar. To generate a photorealistic avatar, blood flow can be mimicked based on the facial expressions a subject may make. That is, blood moves around a face differently when a person talks, makes different facial expressions, or performs any other movement that deforms the face. As the blood moves, the coloration of the subject's face may change due to the change in blood flow (e.g., where the subject's blood is concentrated under the skin). The process may include a training phase and an application phase.
The first phase involves training a texture autoencoder based on blood flow image data captured using a photogrammetry system. Many images are captured of a subject or subjects making different expressions, so that ground truth data can be obtained relating an expression to how blood flow appears in the face. Blood flow may be determined by extracting the lighting component as it is displaced from the albedo map. The albedo map describes the texture of a face under perfectly diffused light, that is, the static version of a subject's skin. Accordingly, the extracted lighting component indicates the offset from the albedo map for a particular expression. As a result, the texture autoencoder may map a subject's expression to a 2D blood flow texture map. In one or more embodiments, the texture autoencoder may consider as input a series of expressions which results in a particular 2D blood flow texture map.
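To make the "offset from the albedo map" idea concrete, here is a minimal sketch in Python with NumPy. It assumes the offset is a plain per-texel subtraction of the albedo texture from a texture captured during an expression. The patent summary does not spell out the exact extraction method, so the function name `blood_flow_offset` and the toy textures are purely illustrative.

```python
import numpy as np

def blood_flow_offset(expression_texture: np.ndarray, albedo: np.ndarray) -> np.ndarray:
    """Return the per-texel color offset of an expression texture from the albedo map.

    Assumption: the offset is a simple subtraction; this is an illustration,
    not the extraction procedure described in the patent.
    """
    if expression_texture.shape != albedo.shape:
        raise ValueError("textures must share the same shape")
    return expression_texture.astype(np.float32) - albedo.astype(np.float32)

# Toy 2x2 RGB textures with channel values in [0, 1].
albedo = np.full((2, 2, 3), 0.5, dtype=np.float32)
smile = albedo.copy()
smile[0, 0, 0] += 0.1  # one texel becomes slightly redder during the expression

offset = blood_flow_offset(smile, albedo)
```

In this toy case the offset is zero everywhere except the red channel of the single texel that changed, which is the kind of expression-dependent signal a texture autoencoder could be trained to predict.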
The second phase involves utilizing the 2D blood texture map to generate an avatar. The avatar may be generated, for example, using a multipass rendering technique in which the 2D blood texture map is rendered as an additional pass during the multipass rendering process.
As another example, the blood flow texture for a particular expression may be overlaid on a 3D mesh for a subject based on the 2D blood texture map.
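As a rough illustration of applying a blood flow texture on top of a base skin texture, the sketch below adds an offset map to a base color map and clamps the result. The additive blend, the `strength` parameter, and the function name `apply_blood_flow_pass` are assumptions made for this example; the patent summary only says the map is rendered as an additional pass or overlaid on a 3D mesh.

```python
import numpy as np

def apply_blood_flow_pass(base_color: np.ndarray,
                          blood_flow_offset: np.ndarray,
                          strength: float = 1.0) -> np.ndarray:
    """Add a blood flow offset texture to a base color texture.

    Assumption: the extra pass is a simple additive blend, clamped to [0, 1].
    """
    blended = base_color.astype(np.float32) + strength * blood_flow_offset.astype(np.float32)
    return np.clip(blended, 0.0, 1.0)

# Toy 1x2 RGB texture: one mid-tone texel and one nearly saturated texel.
base = np.array([[[0.5, 0.5, 0.5], [0.95, 0.5, 0.5]]], dtype=np.float32)
flush = np.zeros_like(base)
flush[..., 0] = 0.1  # add redness to every texel

shaded = apply_blood_flow_pass(base, flush)
```

A real renderer would perform this step on the GPU per fragment; the NumPy version only shows the arithmetic on texture arrays.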
For purposes of this patent, an autoencoder refers to a type of artificial neural network used to classify data in an unsupervised manner. The aim of an autoencoder is to learn a representation for a set of data in an optimized form. A trained autoencoder will have an encoder portion, a decoder portion, and latent variables, which represent the optimized representation of the data.
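The encoder, decoder, and latent variables described above can be sketched with a tiny linear autoencoder in NumPy. This is a generic teaching example, not the architecture from Apple's patent: the data is synthetic, and the layer sizes, learning rate, and the function name `train_linear_autoencoder` are all assumptions chosen for illustration.

```python
import numpy as np

def train_linear_autoencoder(X: np.ndarray, latent_dim: int = 2,
                             lr: float = 0.01, steps: int = 1000, seed: int = 0):
    """Train a linear encoder/decoder pair by gradient descent on reconstruction error.

    Returns (W_enc, W_dec, initial_mse, final_mse).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_enc = rng.normal(scale=0.1, size=(d, latent_dim))  # encoder weights
    W_dec = rng.normal(scale=0.1, size=(latent_dim, d))  # decoder weights

    def mse() -> float:
        return float(np.mean((X @ W_enc @ W_dec - X) ** 2))

    initial_mse = mse()
    for _ in range(steps):
        Z = X @ W_enc                    # latent variables
        err = (Z @ W_dec - X) / n        # gradient of 0.5 * mean squared row error
        grad_dec = Z.T @ err
        grad_enc = X.T @ (err @ W_dec.T)
        W_dec -= lr * grad_dec
        W_enc -= lr * grad_enc
    return W_enc, W_dec, initial_mse, mse()

# Synthetic data: 200 samples of 8 features that really depend on only 2 factors.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 8))

W_enc, W_dec, initial_mse, final_mse = train_linear_autoencoder(X)
latent = X @ W_enc  # compact 2-D representation of each 8-D sample
```

The training uses no labels, only the data itself as the reconstruction target, which is what "unsupervised" means here. The `latent` array plays the role of the optimized representation mentioned in the patent's definition.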
The patent covers natural avatar creation that could be used with future iPhones, iPads, Macs and Apple Vision Pro.
Apple's patent FIG. 2 below shows a flowchart in which mesh and texture autoencoders are trained; FIG. 5 shows a flowchart illustrating a method for rendering an avatar utilizing a blood texture map; FIG. 6 shows a flow diagram illustrating avatar generation.
Apple's patent FIG. 3 below shows a flowchart in which a neural network is trained to provide a mapping between an expression and a blood flow texture.
For more details, review Apple's granted patent 11830182.