Apple advances a Mixed Reality Headset with Hand, Body, Positional and Environmental Tracking Technologies
Today the US Patent & Trademark Office published a patent application from Apple that relates to head-mounted displays (HMDs) used for applications that immerse a user in a virtual reality (VR) or an augmented/mixed reality (MR) environment. The application advances work that Apple began in 2017.
Apple's Patent Background
The objective of immersion in a virtual world is to convince a user's mind to perceive a non-physical world as if it were real. The concept of reality here refers more to the notion of perceptual plausibility rather than representing a real world. In virtual reality (VR), immersion is achieved by displaying computer generated graphics that simulate a visual experience of a real or imagined world. The quality of immersion is subject to several important factors. These include characteristics of the display such as image quality, frame rate, pixel resolution, high dynamic range (HDR), persistence and the screen-door effect (i.e., the visible lines between pixels on the screen). The quality of the immersive experience decreases when the displayed field of view is too narrow or if the various tracking functions are slow and/or inaccurate (leading to disorientation and nausea, otherwise known as simulation sickness). Immersion is also impacted by the camera system performance such as the image quality (noise, dynamic range, resolution, absence of artifacts) and the coherence between the virtual graphics (3D modeling, textures and lighting) and the pass-through images. In mixed reality (MR), virtual elements are composited in real-time into the real world environment seen by the user. Physical interaction between the virtual elements and real world surfaces and objects can be simulated and displayed in real-time.
Tracking of various elements is generally recognized as an essential prerequisite for achieving a high-end VR and MR application experience. Among these elements, positional head tracking, user body tracking and environment tracking play a key role in achieving great immersion.
Positional head tracking (referred to as positional tracking from here on), which aims to estimate the position and orientation of the HMD in an environment, has to be both low latency and accurate. This is because the rendered graphics must closely match the user's head motion to produce great immersion in VR, and because virtual content must be correctly aligned with the real world in MR. Some methods try to solve positional tracking in a room whose size is approximately 5×5 meters or smaller by using a setup external to the HMD. For instance, a stationary infrared (IR) or color (RGB) camera can be positioned to see an IR or RGB light-emitting diode (LED) array located on the surface of the HMD, which would be used to estimate the head position. Other methods are based on flooding and sweeping the room with IR light generated by one or two base stations, synchronized with multiple IR photosensors precisely positioned on the HMD. The head pose can be calculated in real-time at a high frame rate by considering the detection times of the photosensors. Note that both these approaches limit the area within which the user can move in order to maintain tracking: the user has to be visible to the IR or RGB cameras or alternatively within the coverage of the base station IR emitters. Occlusion may cause tracking inaccuracies.
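For a sense of how such a base-station sweep system recovers pose, here is a minimal sketch, assuming a hypothetical 60 Hz sweep and illustrative timings (none of these values or names come from Apple's filing): each photosensor's detection time maps to a sweep angle, and a horizontal plus a vertical sweep yield a ray toward that sensor.

```python
import numpy as np

SWEEP_PERIOD_S = 1.0 / 60.0  # hypothetical: one 180-degree sweep per 60 Hz cycle

def sweep_angle(t_sync: float, t_hit: float) -> float:
    """Angle of the rotating IR plane when a photosensor detects it,
    measured from the sync pulse over a 180-degree sweep."""
    return np.pi * (t_hit - t_sync) / SWEEP_PERIOD_S

def sensor_ray(t_sync: float, t_hit_h: float, t_hit_v: float) -> np.ndarray:
    """Unit direction from the base station toward one HMD photosensor,
    built from the horizontal- and vertical-sweep detection times."""
    az = sweep_angle(t_sync, t_hit_h) - np.pi / 2  # center the sweep on the axis
    el = sweep_angle(t_sync, t_hit_v) - np.pi / 2
    return np.array([np.sin(az) * np.cos(el), np.sin(el), np.cos(az) * np.cos(el)])

# Rays to four or more photosensors at known positions on the HMD would let a
# perspective-n-point solver recover the full head pose in real time.
print(sensor_ray(0.0, 0.0042, 0.0091))
```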
User body tracking estimates the position and orientation of the user's body (in particular, but not limited to, hands and fingers) relative to the HMD. It can provide, in both VR and MR, a means of user input (e.g., hand gestures) enabling interaction with virtual elements. While some positional tracking methods can be used for hand tracking as well (e.g., an IR camera with an array of LEDs on hand-held controllers), other methods take advantage of a smaller analysis space, typically within one meter of the HMD, to increase the robustness of the hand and finger tracking algorithms. For instance, close-range Time-of-Flight (ToF) cameras can be integrated with or in the HMD. These cameras can yield a depth map of the hands from which a skeletal model of the hands can be constructed. Another approach uses an IR LED flood light together with cameras to segment out and estimate 3D points on the hands and fingers.
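The close-range advantage is easy to picture: anything within arm's reach of the headset is almost certainly the user. A minimal sketch of that depth segmentation, assuming a one-meter working range and a zero-for-no-return pixel convention (both illustrative, not from the patent):

```python
import numpy as np

def segment_hands(depth_m: np.ndarray, max_range_m: float = 1.0) -> np.ndarray:
    """Keep only pixels within arm's reach of the HMD; ToF pixels with no
    return are assumed to be reported as 0 and are discarded."""
    valid = depth_m > 0
    return valid & (depth_m < max_range_m)

# Toy depth map: a 'hand' at 0.4 m in front of a wall at 2.5 m.
depth = np.full((120, 160), 2.5, dtype=np.float32)
depth[40:80, 60:100] = 0.4
mask = segment_hands(depth)
ys, xs = np.nonzero(mask)
print(f"hand pixels: {mask.sum()}, centroid: ({ys.mean():.1f}, {xs.mean():.1f})")
```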
Environment tracking is meant to be very general and involves recognizing and tracking objects in the environment. The notion of objects ranges from a simple flat surface to more complex shapes including moving objects such as humans, translucent objects and light sources. Environment tracking estimates the position and shape of surfaces and objects in the vicinity of the HMD. Virtual elements can then interact with the detected (estimated) objects. An occlusion mask can be extracted from the tracking information to avoid situations where real objects may inadvertently be hidden by a virtual element that should be located further away or behind the object. In practice, computer vision methods are used to recover features (corners, edges, etc.) and scene depths, which are then used to learn and recognize object descriptions.
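The occlusion mask itself reduces to a per-pixel depth comparison: hide the virtual pixel wherever a real surface sits in front of it. A minimal sketch of that idea (the array values are invented for illustration):

```python
import numpy as np

def occlusion_mask(real_depth: np.ndarray, virtual_depth: np.ndarray) -> np.ndarray:
    """True where a real surface is closer than the virtual element,
    i.e. where the virtual pixel must be hidden rather than drawn."""
    return real_depth < virtual_depth

# A real object at 1.0 m partially in front of a virtual panel at 1.5 m.
real = np.full((4, 4), 3.0)
real[1:3, 1:3] = 1.0
virtual = np.full((4, 4), 1.5)
print(occlusion_mask(real, virtual))
```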
The use of external components for tracking purposes typically imposes a limit on the user's freedom to move in space and often adds calibration steps before the HMD can be used.
Accordingly, there is a need for an HMD that integrates all of the required tracking components in a compact user-friendly product enabling mobility for the application.
Apple's Invention:
Head-Mounted Display for Virtual Reality and Mixed Reality with Inside-Out Positional, User Body and Environmental Tracking
Apple's invention covers a wearable head-mounted display (HMD) that integrates all of the required tracking components, allowing for a more compact, user-friendly device.
The Head-Mounted Display (HMD) device is to be used for applications that immerse a user in a virtual reality (VR) or an augmented/mixed reality (MR) environment, comprising:

- a pair of RGB camera sensors and associated lenses with infrared (IR) cut-off filters;
- a pair of mono camera sensors with near infrared (NIR) bandpass filters and associated lenses;
- an inertial measurement unit (IMU);
- a time-of-flight (ToF) camera sensor with an associated IR emitter;
- a speckle pattern projector;
- a display; and
- at least one processing unit operatively connected to the pair of RGB camera sensors, the pair of mono camera sensors, the IMU, the ToF camera sensor and associated IR emitter, the speckle projector and the display via at least one communication link, the at least one processing unit generating graphic content using data streams from the pair of RGB camera sensors, the pair of mono camera sensors, the IMU and the ToF camera sensor and displaying the graphic content through the display.

Apple's description is covered in patent FIG. 1 below.
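Read as a system architecture, the claim describes one processing unit fusing four sensor streams into displayed graphics. A toy sketch of that data flow, with all names and stand-in bodies ours rather than Apple's:

```python
from dataclasses import dataclass
from typing import Callable, Tuple

@dataclass
class Frame:
    """One synchronized capture from the sensor suite recited in the claim."""
    rgb_pair: Tuple      # stereo RGB images, IR cut-off filtered (pass-through view)
    nir_pair: Tuple      # stereo mono images, NIR bandpass filtered (tracking)
    imu: Tuple           # inertial sample: (accel xyz, gyro xyz)
    tof_depth: object    # depth map from the ToF camera and its IR emitter

def processing_unit(frame: Frame, display: Callable[[str], None]) -> None:
    """Stand-in for the claimed processing unit: consume all four data
    streams, generate graphic content, and push it to the display."""
    pose = f"pose from {len(frame.nir_pair)} NIR views + IMU"
    hands = f"hands from ToF depth ({frame.tof_depth})"
    display(f"composited MR frame: {pose}; {hands}; RGB pass-through blended")

processing_unit(Frame(("L", "R"), ("L", "R"), ((0, 0, 9.8), (0, 0, 0)), "dense"), print)
```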
Apple's patent FIG. 2 above is a schematic top view of an exemplary embodiment of the optics, display and cameras used to achieve both virtual and mixed reality.
Apple further notes that the pair of RGB camera sensors and the pair of mono camera sensors are combined into a pair of RGB/IR cameras with associated lenses, the pair of RGB/IR cameras using a Bayer format with a R-G-IR-B pattern instead of the standard R-G-G-B pattern.
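To see what the modified pattern implies for processing, here is a sketch that splits a 2×2-tiled R-G-IR-B mosaic into subsampled channel planes. The exact tile layout is our assumption, since the filing as quoted only names the pattern:

```python
import numpy as np

def split_rgbir(mosaic: np.ndarray):
    """Split a 2x2-tiled R-G-IR-B mosaic into subsampled channel planes.
    Assumed tile layout (top-left origin):  R  G
                                            IR B
    i.e. the second green of a standard R-G-G-B Bayer tile is replaced by IR."""
    r  = mosaic[0::2, 0::2]
    g  = mosaic[0::2, 1::2]
    ir = mosaic[1::2, 0::2]
    b  = mosaic[1::2, 1::2]
    return r, g, ir, b

mosaic = np.arange(16, dtype=np.uint16).reshape(4, 4)
r, g, ir, b = split_rgbir(mosaic)
print(ir)  # the IR plane feeds tracking; r/g/b are demosaiced for pass-through
```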
Another aspect of the invention covers an HMD device with embedded tracking that includes performing positional tracking and user body tracking, and may also include performing environment tracking.
The step of performing positional tracking includes: detecting rotation- and scale-invariant 2D image features in the pass-through stereo view images and the stereo images; estimating the depth of each detected feature using stereoscopic matching, yielding a cloud of 3D points; and tracking the cloud of 3D points in real time to infer head position changes.

The step of performing positional tracking may further include using the inertial measurements to temporarily compute positional changes when the pass-through stereo view images and the stereo images do not provide enough information.
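A compact sketch of the first two steps, using ORB features (one common rotation- and scale-invariant choice; the patent does not name a specific detector) and depth from disparity on a rectified stereo pair with assumed intrinsics:

```python
import cv2
import numpy as np

FOCAL_PX, BASELINE_M = 700.0, 0.064  # illustrative rectified-rig parameters

def feature_cloud(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Detect rotation/scale-invariant features, match them across the
    stereo pair, and lift each match to a 3D point via its disparity."""
    orb = cv2.ORB_create(nfeatures=1000)
    kl, dl = orb.detectAndCompute(left, None)
    kr, dr = orb.detectAndCompute(right, None)
    if dl is None or dr is None:
        return np.empty((0, 3))
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(dl, dr)
    h, w = left.shape
    pts = []
    for m in matches:
        (xl, yl), (xr, _) = kl[m.queryIdx].pt, kr[m.trainIdx].pt
        disparity = xl - xr
        if disparity > 1.0:  # discard far or mismatched points
            z = FOCAL_PX * BASELINE_M / disparity
            pts.append(((xl - w / 2) * z / FOCAL_PX, (yl - h / 2) * z / FOCAL_PX, z))
    return np.array(pts) if pts else np.empty((0, 3))

# Tracking this cloud frame-to-frame (e.g. a rigid-body fit between clouds)
# gives the head-pose change; IMU integration bridges feature-poor frames.
left = np.random.randint(0, 255, (480, 640), np.uint8)
print(feature_cloud(left, np.roll(left, -8, axis=1)).shape)
```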
The step of performing user body tracking includes: performing body segmentation on the dense depth map; extracting a body mesh from the dense depth map and the body segmentation; extracting a skeletal model from the body mesh; and recognizing predefined gestures by tracking the user's body motion and matching the skeletal model and body motion to gesture models.
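The final matching step can be pictured as nearest-template classification over features derived from the skeletal model. A toy sketch, in which the templates, the fingertip-distance features and the tolerance are all invented for illustration:

```python
from typing import Optional
import numpy as np

# Invented templates: fingertip-to-palm distances (m) per finger for two poses.
GESTURES = {
    "open_hand": np.array([0.09, 0.10, 0.11, 0.10, 0.08]),
    "pinch":     np.array([0.03, 0.03, 0.11, 0.10, 0.08]),
}

def recognize(fingertip_dists: np.ndarray, tol: float = 0.02) -> Optional[str]:
    """Match a skeletal-model feature vector to the nearest gesture template,
    accepting it only if the residual is within tolerance."""
    best, err = None, np.inf
    for name, template in GESTURES.items():
        e = float(np.linalg.norm(fingertip_dists - template))
        if e < err:
            best, err = name, e
    return best if err < tol else None

print(recognize(np.array([0.031, 0.029, 0.108, 0.100, 0.081])))  # -> pinch
```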
The step of performing environment tracking includes: generating a motion model using the pass-through stereo view images, the stereo images and the positional tracking; detecting keypoints; extracting features local to the keypoints using robust feature descriptors; and estimating surface descriptors by fusing the dense depth map with the extracted features.
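A sketch of the keypoint, descriptor and depth-fusion steps, again with ORB standing in for the unnamed "robust feature descriptors" and a synthetic depth map for illustration:

```python
import cv2
import numpy as np

def surface_descriptors(gray: np.ndarray, dense_depth: np.ndarray) -> list:
    """Detect keypoints, compute local appearance descriptors, and attach the
    dense-depth sample at each keypoint so features describe 3D surfaces."""
    kps, desc = cv2.ORB_create(nfeatures=500).detectAndCompute(gray, None)
    if desc is None:
        return []
    return [{"pt": kp.pt,
             "descriptor": d,
             "depth_m": float(dense_depth[int(kp.pt[1]), int(kp.pt[0])])}
            for kp, d in zip(kps, desc)]

gray = np.random.randint(0, 255, (240, 320), np.uint8)
depth = np.full((240, 320), 2.0, np.float32)  # toy dense depth map
print(len(surface_descriptors(gray, depth)), "surface-tagged features")
```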
In accordance with an aspect of the disclosure, there is also provided a method for immersing a user in a virtual reality (VR) or an augmented/mixed reality (MR) environment, the method comprising the steps implemented by the HMD device.
Apple's patent FIG. 2B below is a schematic view of an exemplary embodiment of the optics, in a close-up view, illustrating how light rays of the display focus on the retina of the eye of the user; FIG. 3 is a flow diagram of the visual sensory generation process of the HMD along with an exemplary embodiment for each capability.
Apple's patent FIG. 4A below shows the front view of a first exemplary embodiment of the HMD device, with two RGB cameras optimized for pass-through purposes (MR) and two IR cameras that provide visual data for tracking; FIG. 4B shows the front view of a second exemplary embodiment of the HMD device, with two RGB/IR cameras that achieve both MR and positional tracking.
Apple's patent FIG. 11 above is a schematic representation of the speckle projector.
Apple's patent FIG. 9 below is a flow diagram of an exemplary process to achieve environment tracking; FIG. 12A is a schematic representation of an exemplary embodiment of the time-multiplexing setup.
The foundation of this invention was derived from a company called Vrvana that Apple acquired back in November 2017. Patently Apple first covered the original invention back in March 2019. Today's published patent application advances that invention.
Apple's patent application that was published today by the U.S. Patent Office was originally filed back in Q3 2018. Considering that this is a patent application, the timing of such a product to market is unknown at this time.