Seven Major Spatial Audio Patents from Apple Published Today Dive Deep into the Science of Motion, Head Tracking & More
Apple is prioritizing Spatial Audio for AirPods Pro, AirPods Max and its future audio systems, including a Mixed Reality Headset and more. Spatial audio with dynamic head tracking gives users a theater‑like experience for movies and shows, with sound that surrounds them. Using built-in gyroscopes and accelerometers, AirPods Max and your iPhone, iPad, Mac, or Apple TV track the subtle motion of your head, anchoring sounds to your device. Today the US Patent & Trademark Office published a series of seven major Apple patent applications relating to Spatial Audio that detail in depth the science of head motion detection, user posture detection and much more.
Spatial audio creates a three-dimensional (3D) virtual auditory space that allows a user wearing a headset to pinpoint where a sound source is located in the 3D virtual auditory space, while watching a movie, playing a video game or interacting with augmented reality (AR) content on a source device (e.g., a computer screen). Existing spatial audio platforms include a head pose tracker that uses a video camera to track the head pose of a user. If the source device is a mobile device (e.g., smartphone, tablet computer), then the source device and the headset are free to move relative to each other, which may adversely impact the user's perception of the 3D spatial audio.
Head Motion Prediction for Spatial Audio Applications
Apple's first Spatial Audio patent application in this report is titled "Head Motion Prediction for Spatial Audio Applications." The patent covers a method that comprises: obtaining, using one or more processors, motion data from a source device and a headset; obtaining, using the one or more processors, transmission delays; estimating, using the one or more processors, relative motion from the relative source device and headset motion data; calculating, using the one or more processors, a first derivative of the relative motion data; calculating, using the one or more processors, a second derivative of the filtered relative motion data; forward predicting, using the one or more processors, the estimated relative motion over the time delays using the first derivative and second derivative of relative motion; determining, using a head tracker, a head pose of the user based on the forward predicted relative motion data; and rendering, using the one or more processors and the head pose, spatial audio for playback on the headset.
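The forward-prediction step in that claim can be pictured as a second-order extrapolation. The Python below is a minimal sketch under stated assumptions, not Apple's implementation: it treats relative motion as a single scalar (for example, relative yaw in radians) sampled at a fixed interval, estimates the first and second derivatives with backward finite differences, and extrapolates across the delay with a Taylor expansion. The function name, the sample values and the 40 ms delay are hypothetical, and the filtering the claim mentions is left out.

```python
def forward_predict(samples, dt, delay):
    """Extrapolate the latest sample `delay` seconds ahead.

    samples: recent relative-motion values, oldest first; only the last
             three are used (assumed scalar, e.g. relative yaw in radians)
    dt:      sampling interval in seconds
    delay:   transmission delay to predict across, in seconds
    """
    if len(samples) < 3:
        raise ValueError("need at least three samples")
    x0, x1, x2 = samples[-3:]
    # First derivative (rate) from a backward finite difference.
    first = (x2 - x1) / dt
    # Second derivative (acceleration) from a second-order difference.
    second = (x2 - 2.0 * x1 + x0) / (dt * dt)
    # Second-order Taylor extrapolation over the delay.
    return x2 + first * delay + 0.5 * second * delay * delay


# Hypothetical usage: yaw rising 0.1 rad per 10 ms sample, 40 ms delay.
predicted_yaw = forward_predict([0.0, 0.1, 0.2], dt=0.01, delay=0.04)
```

With a steady rate the second-derivative term vanishes and the prediction is a straight-line extrapolation; the acceleration term only matters when the head or device is speeding up or slowing down.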
In an embodiment, rendering, using the one or more processors and the head pose, spatial audio for playback on the headset comprises: rendering audio channels in an ambience bed of a three-dimensional virtual auditory space, such that a center channel of the audio channels is aligned with a boresight vector that originates from a headset reference frame and terminates at a source device reference frame, wherein the center channel is aligned with boresight vector by rotating a reference frame for the ambience bed to align the center channel with boresight vector, and wherein the boresight indicates a relative position between the source device and the headset.
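To make the boresight alignment concrete, here is a simplified Python sketch. It is an illustration only, assuming a horizontal-plane (yaw-only) rotation, a headset frame with +x forward and +y to the left, and a bed whose center channel sits at 0 degrees azimuth. The function name, the channel layout and the axis conventions are hypothetical and are not taken from the patent, which describes a full reference-frame rotation.

```python
import math


def align_bed_to_boresight(channel_azimuths_deg, boresight):
    """Rotate channel azimuths so the center channel points along the boresight.

    channel_azimuths_deg: dict of channel name -> azimuth in the bed frame,
                          degrees, 0 = center, positive = counterclockwise
                          (assumed convention)
    boresight:            (x, y, z) vector from headset to source device in
                          the headset frame, +x forward, +y left (assumed)
    """
    x, y, _ = boresight  # elevation is ignored in this yaw-only sketch
    if x == 0.0 and y == 0.0:
        raise ValueError("boresight has no horizontal component")
    # Yaw of the boresight in the horizontal plane of the headset frame.
    yaw = math.degrees(math.atan2(y, x))
    # Rotating the whole bed by that yaw keeps the channel spacing intact.
    # Results are wrapped to the range [-180, 180).
    return {
        name: (azimuth + yaw + 180.0) % 360.0 - 180.0
        for name, azimuth in channel_azimuths_deg.items()
    }


# Hypothetical usage: the device sits 45 degrees to the listener's left.
bed = {"C": 0.0, "L": 30.0, "R": -30.0}
aligned = align_bed_to_boresight(bed, (1.0, 1.0, 0.0))
```

Because every channel is rotated by the same angle, the left and right channels keep their spacing around the center channel, which now follows the source device.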
Apple's patent FIG. 1 below illustrates an example user posture change event; FIG. 2 illustrates a centered and inertially stabilized 3D virtual auditory space.
More specifically, patent FIG. 1 above illustrates an example user posture change event, according to an embodiment. In the example shown, the user (#101) is viewing the visual portion of AV content displayed on the source device (#103) while sitting on a couch (#102). The user is wearing a headset (#104) that is wirelessly coupled to the source device. The headset includes stereo loudspeakers that output rendered spatial audio (e.g., binaural rendered audio) content generated by the source device. A user posture change event occurs when the user stands up from a seated position on the couch and begins walking with the source device in hand while still viewing the content displayed on it.
When the source device is moved or rotated, its motion sensors detect the motion. The outputs of the IMU are processed into rotation and acceleration data in an inertial reference frame. In an embodiment, the source device outputs AV content, including but not limited to augmented reality (AR), virtual reality (VR) and immersive video content. The source device also includes an audio rendering engine (e.g., a binaural rendering engine) that simulates the main audio cues humans use to localize sounds, including interaural time differences, interaural level differences, and spectral filtering done by the outer ears.
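As a rough illustration of one of those cues, the interaural time difference can be approximated with Woodworth's classic spherical-head formula. The sketch below is not drawn from the patent; the head radius and speed of sound are typical textbook values chosen as assumptions, and the function name is hypothetical.

```python
import math


def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Approximate interaural time difference, in seconds, for a distant source.

    azimuth_deg: source angle from straight ahead, within -90 to 90 degrees;
                 the sign of the result indicates which side the source is on
    """
    if not -90.0 <= azimuth_deg <= 90.0:
        raise ValueError("this sketch covers frontal azimuths only")
    theta = math.radians(azimuth_deg)
    # Woodworth's model: the extra path to the far ear around a rigid
    # sphere is r * (theta + sin(theta)).
    return head_radius_m / speed_of_sound * (theta + math.sin(theta))


# Hypothetical usage: a source directly to one side of the listener.
itd_side = woodworth_itd(90.0)
```

Under these assumptions a source directly to one side arrives at the far ear a little over half a millisecond late, and a source straight ahead produces no time difference at all.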
To ensure that the spatial audio remains "centered," an estimated relative motion (relative position and attitude) is determined using motion data from two inertial measurement units.
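The attitude half of that relative-motion estimate can be sketched with unit quaternions. The Python below is a minimal illustration, assuming each IMU reports its attitude as a (w, x, y, z) quaternion that maps its own body frame into a shared inertial frame; the function names are hypothetical, and relative position, sensor noise and drift are not handled.

```python
import math


def quat_conjugate(q):
    """Conjugate of a quaternion; for a unit quaternion this is its inverse."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def quat_multiply(a, b):
    """Hamilton product a * b of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def relative_attitude(q_source, q_headset):
    """Headset attitude expressed in the source device's frame."""
    return quat_multiply(quat_conjugate(q_source), q_headset)


def yaw_quat(degrees):
    """Unit quaternion for a rotation of `degrees` about the vertical (z) axis."""
    half = math.radians(degrees) / 2.0
    return (math.cos(half), 0.0, 0.0, math.sin(half))


# Hypothetical usage: device yawed 30 degrees, head yawed 50 degrees.
q_rel = relative_attitude(yaw_quat(30.0), yaw_quat(50.0))
```

In this example the result matches a 20 degree yaw, which is the head's rotation as seen from the device. If both rotate together, as when the user turns while holding the device, the relative attitude stays the same and the sound field stays centered.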
To review the details in the first spatial audio patent, see Apple's patent application 20210397249 here.
Apple's Other Six Spatial Audio Patents
Spatial Audio Patent #2 (20210397250): "User Posture Change Detection for Head Pose Tracking in Spatial Audio Applications."
Spatial Audio Patent #3 (20210396779): "User Posture Transition Detection and Classification." Apple's patent FIG. 1 below is a block diagram of a system for user posture detection and classification; FIG. 2 illustrates the biomechanics of a sit-to-stand posture transition.
Spatial Audio Patent #4 (20210400418): "Head to Headset Rotation Transform Estimation for Head Pose Tracking in Spatial Audio Applications." Apple's patent FIG. 3 below illustrates the geometry for estimating a head to headset rotation transform; FIG. 9 illustrates the geometry for a relative motion model used in head tracking.
Spatial Audio Patent #5 (20210400414): "Head Tracking Correlated Motion Detection for Spatial Audio Applications." Apple's patent FIG. 3 below is a block diagram of a system that uses correlated motion to select a motion tracking state; FIG. 9 illustrates various reference frames and notation for relative pose tracking.
Spatial Audio Patent #6 (20210400419): "Head Dimension Estimation for Spatial Audio Applications." Apple's patent FIG. 1 below illustrates a headset IMU undergoing rotational motion; FIG. 3 is a flow diagram of a process of estimating head dimension.
Spatial Audio Patent #7 (20210398545): "Binaural Room Impulse Response for Spatial Audio Reproduction." Apple's patent FIG. 1 below illustrates a system and method for rendering spatial audio; FIGS. 2A and 2B show an example of a sound source and reflection.