
Apple Won a Major Eye Tracking Patent Yesterday for Gaze Endpoint Determination


When Apple introduced Apple Vision Pro last week, one of the amazing aspects of this advanced headset was its eye-tracking capabilities. It blew product reviewer Marques Brownlee away. In one of his latest videos he described his impressions of the new Apple Vision Pro. Here's one of his comments on eye tracking:

"Once you get it going, the most impressive thing about this headset, the most impressive thing is the eye tracking. I'm not even kidding, this eye tracking is sick. So basically the eye tracking in this headset, as it looks at your eyes and keeps track of where your eyes move around, is the closest thing that I've experienced to, like, magic. I normally don't call tech things sort of magical or surreal like this, but this was, even for a pre-release product, kind of unbelievable how well it does. Anytime you use your eyes around the UI, it would immediately highlight and select exactly what you're looking at, no matter how small the target was or what you're looking at."

Yesterday, the U.S. Patent and Trademark Office officially granted Apple a major eye tracking patent that relates to a method and an apparatus for gaze endpoint determination, in particular for determining a gaze endpoint of a subject on a three-dimensional object in space.

Method and an Apparatus for Determining a Gaze Point on a Three-Dimensional Object

Apple's granted patent covers a system for determining the gaze endpoint of a subject, the system comprising:

- an eye tracking unit adapted to determine the gaze direction of one or more eyes of the subject;
- a head tracking unit adapted to determine the position, comprising location and orientation, of the head and/or the eye tracking unit with respect to a reference coordinate system;
- a 3D scene structure representation unit that represents a real-world scene and the objects contained in it through their 3D position and/or 3D structure, using coordinates in the reference coordinate system, to thereby provide a 3D structure representation of the scene;
- a calculating unit for calculating the gaze endpoint based on the gaze direction, the eye tracker position and the 3D scene structure representation, and/or for determining the object in the 3D scene the subject is gazing at based on the gaze direction, the eye tracker position and the 3D scene structure representation.

By combining a 3D representation, an eye tracker and a head tracker, the system can determine not only a gaze point on a 2D plane but also the object the subject is gazing at and/or the gaze endpoint in 3D.

According to one embodiment the system comprises a module for calculating the gaze endpoint on an object of the 3D structure representation of the scene, wherein said gaze endpoint is calculated based on the intersection of the gaze direction with an object in the 3D structure scene representation.

Intersecting the gaze direction with the 3D representation gives a geometrical way to calculate where the gaze "hits" the 3D structure, and therefore delivers the real gaze endpoint on a 3D object in the scene.
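As a rough illustration of this geometric approach, the sketch below intersects a gaze ray with a sphere standing in for one object in the 3D scene representation. The function name and the spherical object model are illustrative assumptions, not from the patent; real scenes would use meshes or point clouds.

```python
import math

def ray_sphere_intersection(origin, direction, center, radius):
    """Return the nearest point where a gaze ray hits a sphere, or None.

    The sphere is a stand-in for one object in the 3D scene representation.
    """
    # Vector from ray origin (the eye) to the sphere center.
    oc = tuple(c - o for o, c in zip(origin, center))
    d_len = math.sqrt(sum(d * d for d in direction))
    d = tuple(x / d_len for x in direction)          # normalize gaze direction
    t_ca = sum(a * b for a, b in zip(oc, d))         # projection of oc onto the ray
    if t_ca < 0:
        return None                                  # object is behind the eye
    dist2 = sum(x * x for x in oc) - t_ca * t_ca     # squared ray-to-center distance
    if dist2 > radius * radius:
        return None                                  # ray misses the sphere
    t_hit = t_ca - math.sqrt(radius * radius - dist2)
    return tuple(o + t_hit * x for o, x in zip(origin, d))

# Eye at the origin gazing down +z toward a sphere at (0, 0, 5) with radius 1.
print(ray_sphere_intersection((0, 0, 0), (0, 0, 1), (0, 0, 5), 1.0))  # (0.0, 0.0, 4.0)
```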

According to one embodiment the system comprises a module for calculating the gaze endpoint based on the intersection of the gaze directions of the two eyes of the subject, and/or a module for determining the object the subject is gazing at based on the calculated gaze endpoint and the 3D position and/or 3D structure of the objects of the real world scene.

By using vergence, that is, the intersection of the gaze directions of the subject's two eyes, the gaze endpoint can be determined. This gaze endpoint can then be used to determine the object the user is gazing at.
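One common way to realize this, sketched below under the assumption that each eye's gaze is modeled as a ray, is to take the midpoint of the closest approach between the two rays, since measured gaze rays rarely intersect exactly. All names here are illustrative, not from the patent.

```python
def vergence_gaze_endpoint(p_left, d_left, p_right, d_right):
    """Estimate the 3D gaze endpoint as the point of closest approach
    between the left- and right-eye gaze rays.
    """
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    sub = lambda u, v: tuple(a - b for a, b in zip(u, v))
    w0 = sub(p_left, p_right)
    a, b, c = dot(d_left, d_left), dot(d_left, d_right), dot(d_right, d_right)
    d, e = dot(d_left, w0), dot(d_right, w0)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        return None                       # rays are (near) parallel: no vergence
    t = (b * e - c * d) / denom           # parameter along the left-eye ray
    s = (a * e - b * d) / denom           # parameter along the right-eye ray
    q1 = tuple(p + t * x for p, x in zip(p_left, d_left))
    q2 = tuple(p + s * x for p, x in zip(p_right, d_right))
    return tuple((u + v) / 2 for u, v in zip(q1, q2))   # midpoint of closest approach

# Eyes 6 cm apart, both converging on a point 0.5 m straight ahead.
endpoint = vergence_gaze_endpoint((-0.03, 0, 0), (0.03, 0, 0.5),
                                  (0.03, 0, 0), (-0.03, 0, 0.5))
print(endpoint)  # ≈ (0.0, 0.0, 0.5)
```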

According to one embodiment, the object being gazed at is determined by choosing the object whose 3D position and/or structure is closest to the calculated gaze endpoint.
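Under the simplifying assumption that each object is represented by a single 3D position, this closest-object rule reduces to a nearest-neighbor lookup; the names and positions below are illustrative.

```python
import math

def nearest_object(gaze_endpoint, objects):
    """Pick the object whose stored 3D position lies closest to the gaze
    endpoint. `objects` maps names to positions; a real system would measure
    distance to each object's surface, not to a single point.
    """
    return min(objects, key=lambda name: math.dist(objects[name], gaze_endpoint))

objects = {"lamp": (0.0, 0.0, 0.55), "book": (0.4, -0.1, 0.6), "cup": (-0.3, 0.2, 0.9)}
print(nearest_object((0.0, 0.0, 0.5), objects))  # lamp
```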

According to one embodiment, the eye tracking unit that determines the gaze direction of the one or more eyes of the subject is adapted to determine a probability distribution of that gaze direction, and the calculating unit determines, for one or more objects, the probability of each object being gazed at based on the resulting probability distribution of gaze endpoints.
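One simple way to approximate this, assuming the endpoint distribution is represented by samples drawn from it, is to assign each sampled endpoint to its nearest object and report the relative frequencies. This Monte Carlo treatment and all names here are illustrative assumptions, not the patent's method.

```python
import math
from collections import Counter

def object_gaze_probabilities(endpoint_samples, objects):
    """Turn a distribution of gaze endpoints (given here as samples) into a
    per-object probability of being gazed at, by nearest-object assignment.
    """
    counts = Counter(
        min(objects, key=lambda name: math.dist(objects[name], sample))
        for sample in endpoint_samples
    )
    n = len(endpoint_samples)
    return {name: counts.get(name, 0) / n for name in objects}

objects = {"lamp": (0.0, 0.0, 0.5), "book": (0.4, 0.0, 0.5)}
samples = [(0.02, 0.0, 0.5), (0.05, 0.01, 0.48), (0.38, 0.0, 0.52), (0.01, -0.02, 0.5)]
print(object_gaze_probabilities(samples, objects))  # {'lamp': 0.75, 'book': 0.25}
```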

According to one embodiment the system further comprises: a scene camera adapted to acquire one or more images of the scene from an arbitrary viewpoint; a module for mapping a 3D gaze endpoint onto the image plane of the scene image taken by the scene camera.

In this way not only is the 3D gaze endpoint on the 3D structure determined; the corresponding location in any scene image taken by a scene camera can be determined as well. This allows the gaze point to be found in a scene image taken by a camera from an arbitrary point of view, in other words from an arbitrary location.

According to one embodiment the position of the scene camera is known or determined by some position determination or object tracking mechanism and the mapping is performed by performing a projection of the 3D gaze endpoint onto an image of said scene camera.

This is a way of deriving from the 3D gaze endpoint the corresponding point in a scene image taken by a camera at an arbitrary location.
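A minimal sketch of such a projection, assuming a simple pinhole camera model, is shown below. The intrinsics (focal length in pixels, principal point) and all names are illustrative; the patent does not specify a camera model.

```python
def project_to_image(point_world, cam_pos, cam_rot, focal_px, cx, cy):
    """Project a 3D gaze endpoint onto a scene camera's image plane using a
    pinhole model. `cam_rot` is a 3x3 rotation whose rows are the camera
    axes expressed in the reference coordinate system.
    """
    # Transform the point from the reference frame into the camera frame.
    rel = tuple(p - c for p, c in zip(point_world, cam_pos))
    x, y, z = (sum(r * v for r, v in zip(row, rel)) for row in cam_rot)
    if z <= 0:
        return None                       # endpoint is behind the camera
    # Perspective division and conversion to pixel coordinates.
    u = cx + focal_px * x / z
    v = cy + focal_px * y / z
    return (u, v)

identity = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
# Gaze endpoint 2 m in front of a camera at the origin, slightly off-axis.
print(project_to_image((0.2, -0.1, 2.0), (0, 0, 0), identity, 800, 640, 360))
# ≈ (720.0, 320.0)
```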

According to one embodiment the system further comprises: a module for generating a scene image as seen from an arbitrary viewpoint based on the 3D structure representation; and a module for mapping a 3D gaze endpoint onto the image plane of the image generated by said scene image generating module, wherein the mapping is performed by projecting the 3D gaze endpoint onto that image plane.

In this manner an arbitrary scene image can be generated not by taking an image with a scene camera but by rendering it from the 3D structure representation. In this scene image, the gaze endpoint or the object being gazed at can then be indicated or visualized, either by projecting the gaze endpoint onto the scene image or by, e.g., highlighting the object of the 3D structure that has been determined as the one being gazed at.

According to one embodiment said eye tracker is a head-mounted eye tracker; and/or said scene camera is a head-mounted scene camera.

Head-mounted eye trackers and head-mounted scene cameras are convenient implementations of these devices. Moreover, if the eye tracker is head-mounted, then the head tracker automatically also delivers the position/orientation of the eye tracker; the same is true for the scene camera. Using the position (location and orientation) of the head as determined by the head tracker, a gaze direction measured by the head-mounted eye tracker in its own coordinate system can be converted into a corresponding gaze direction in the reference coordinate system of the head tracker.

The position delivered by the head tracker automatically also delivers the position of the eye tracker through the given setup in which the eye tracker is fixed to the head and has a defined spatial relationship with the head, e.g. by the mounting frame through which it is mounted on the head.
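The transform described above can be sketched as a rigid-body change of coordinates: rotate the measured gaze direction by the head pose, and offset the ray origin by the eye tracker's fixed mount position. All names and the mounting offset below are illustrative assumptions.

```python
import math

def gaze_to_reference_frame(gaze_dir_et, head_rot, head_pos, et_offset):
    """Express a head-mounted eye tracker's gaze ray in the head tracker's
    reference coordinate system.

    `head_rot` (3x3, columns = head axes in the reference frame) and
    `head_pos` come from the head tracker; `et_offset` is the eye tracker's
    fixed position in the head frame, known from the mounting frame.
    """
    rotate = lambda R, v: tuple(sum(R[i][j] * v[j] for j in range(3)) for i in range(3))
    # Ray origin: the eye tracker's fixed mount point, moved with the head.
    origin = tuple(p + o for p, o in zip(head_pos, rotate(head_rot, et_offset)))
    # Ray direction: the measured gaze direction, rotated by the head pose.
    direction = rotate(head_rot, gaze_dir_et)
    return origin, direction

# Head rotated 90 degrees about the vertical (y) axis, standing at (1, 1.7, 0).
c, s = math.cos(math.pi / 2), math.sin(math.pi / 2)
R = ((c, 0, s), (0, 1, 0), (-s, 0, c))
origin, direction = gaze_to_reference_frame((0, 0, 1), R, (1.0, 1.7, 0.0), (0, 0, 0.1))
print(origin)     # eye tracker sits 0.1 m ahead of the head origin, now along +x
print(direction)  # "straight ahead" in the head frame maps to +x in the reference frame
```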

According to one embodiment said 3D scene structure representation unit comprises a 3D scene structure detection unit that is adapted to determine the 3D structure and position of objects of the scene, or their geometric surface structure, in the reference coordinate system to obtain a 3D structure representation of the real-world scene.

Apple's patent FIG. 2 below schematically illustrates a gaze endpoint determination system.


Apple's patent FIG. 3 above schematically illustrates a gaze endpoint determination system according to a further embodiment of the invention. 

Further to FIG. 2: The 3D model/reference model is created "offline" using the 3D Structure Detector before the actual gaze measurement; this is illustrated as step a) in the upper part of FIG. 2. The 3D Structure Detector is not needed afterwards: the ET (eye tracker) and HT (head tracker) combination is then sufficient to determine the 3D gaze endpoint on the 3D structure which was determined in step a). This is illustrated in the upper part of step b) in FIG. 2, which shows the determination of the gaze endpoint on the 3D structure.

Then the mapping of the gaze endpoint onto the scene image taken by a scene camera can be performed. For that purpose any 3D projection method which maps the 3D structure to a 2D scene image using the position and parameters of the camera can be used. In this way the location where the gaze hits the 3D structure can be mapped onto the corresponding location at a scene image taken by a scene camera. This mapping process is schematically illustrated in the lower part of step b) in FIG. 2 which shows the mapping process (e.g. performed by using a 3D projection) of the 3D structure to a scene image.

For finer details, review Apple's granted patent 11676302. Apple's lead inventor is listed as Tom Sengelaub: Senior Engineering Manager - Computer Vision.
