Apple Working on Hot 3D Eye-Tracking Interface for Gaming & iPhone!
On February 10, 2012, the US Patent & Trademark Office published a patent application from Apple that reveals a hot 3D eye-tracking based interface for gaming, digital photography and videography, biometrics and surveillance applications, while also serving as an OS feature option for iOS devices and Apple's iMac. The technology may work in conjunction with Apple's previous work on 3D, which touched on head tracking and unique ambient light technologies. Apple wowed us earlier this year with a patent demonstrating that they're paving the way for a new 3D GUI. Today's patent adds fuel to that fire. Apple's 3D GUI patent trend is really picking up momentum, and you have to wonder when they'll release this beast into the wild!
The Drawbacks of Today's Device-Based 3D Environments
It's no secret that video games now use various properties of motion and position collected from, e.g., compasses, accelerometers, gyrometers, and Global Positioning System (GPS) units in hand-held devices or control instruments to improve the experience of play in simulated, i.e., virtual, three dimensional (3D) environments. In fact, software to extract so-called "six axis" positional information from such control instruments is well-understood, and is used in many video games today. The first three of the six axes describe the "yaw-pitch-roll" of the device in three dimensional space. In mathematics, the tangent, normal, and binormal unit vectors for a particle moving along a continuous, differentiable curve in three dimensional space are often called T, N, and B vectors, or, collectively, the "Frenet frame," and are defined as follows: T is the unit vector tangent to the curve, pointing in the direction of motion; N is the derivative of T with respect to the arclength parameter of the curve, divided by its length; and B is the cross product of T and N. The "yaw-pitch-roll" of the device may also be represented as the angular deltas between successive Frenet frames of a device as it moves through space. The other three axes of the six axes describe the "X-Y-Z" position of the device in relative three dimensional space, which may also be used in further simulating interaction with a virtual 3D environment.
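For readers who want to see those definitions in action, here is a minimal numerical sketch (our illustration, not Apple's code) that estimates the Frenet frame of a sampled curve exactly as defined above: T from the first derivative, N from the derivative of T, and B as their cross product:

```python
import numpy as np

def frenet_frame(r, t, h=1e-5):
    """Numerically estimate the Frenet frame (T, N, B) of a curve r(t).

    r: callable mapping a scalar parameter to a 3D position (numpy array).
    Returns the unit tangent T, unit normal N, and binormal B = T x N.
    """
    def unit_tangent(s):
        # Central-difference first derivative, normalized to unit length
        d = (r(s + h) - r(s - h)) / (2 * h)
        return d / np.linalg.norm(d)

    T = unit_tangent(t)
    # N is the derivative of the unit tangent, divided by its length
    dT = (unit_tangent(t + h) - unit_tangent(t - h)) / (2 * h)
    N = dT / np.linalg.norm(dT)
    B = np.cross(T, N)
    return T, N, B

# Example: a helix r(t) = (cos t, sin t, t)
helix = lambda t: np.array([np.cos(t), np.sin(t), t])
T, N, B = frenet_frame(helix, 0.5)
```

The three returned vectors are mutually orthogonal unit vectors; tracking how this frame changes between successive samples yields the yaw-pitch-roll deltas the patent describes.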
Face detection software is also well-understood in the art and is applied in many practical applications today including: digital photography, digital videography, video gaming, biometrics, surveillance, and even energy conservation. Popular face detection algorithms include the Viola-Jones object detection framework and the Schneiderman & Kanade method. Face detection software may be used in conjunction with a device having a front-facing camera to determine when there is a human user present in front of the device, as well as to track the movement of such a user in front of the device.
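Neither algorithm is detailed in the filing, but the integral-image table at the heart of the Viola-Jones framework is easy to sketch; the helper names below are our own:

```python
import numpy as np

def integral_image(img):
    """Summed-area table: ii[y, x] = sum of img[0:y, 0:x].

    Viola-Jones uses this so any rectangular (Haar-like) feature can be
    evaluated with just four array lookups, regardless of its size.
    """
    return np.pad(img.cumsum(axis=0).cumsum(axis=1), ((1, 0), (1, 0)))

def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the w-by-h rectangle with top-left corner (x, y)."""
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

img = np.arange(16, dtype=np.int64).reshape(4, 4)
ii = integral_image(img)
total = rect_sum(ii, 0, 0, 4, 4)  # whole image: 0 + 1 + ... + 15 = 120
```

Constant-time rectangle sums are what make it practical to scan thousands of candidate face windows per frame on a mobile device.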
However, current systems do not take into account the location and position of the device on which the virtual 3D environment is being rendered in addition to the location and position of the user of the device, as well as the physical and lighting properties of the user's environment in order to render a more interesting and visually pleasing interactive virtual 3D environment on the device's display.
Thus, there is need for techniques for continuously tracking the movement of an electronic device having a display, as well as the lighting conditions in the environment of a user of such an electronic device and the movement of the user of such an electronic device--and especially the position of the user of the device's eyes. With information regarding lighting conditions in the user's environment, the position of the user's eyes, and a continuous 3D frame-of-reference for the display of the electronic device, more realistic virtual 3D depictions of the objects on the device's display may be created and interacted with by the user.
Apple's In-Depth 3D GUI Solution
Apple's invention is about the use of various position sensors, e.g., a compass, a Micro-Electro-Mechanical Systems (MEMS) accelerometer, a GPS module, and a MEMS gyrometer, to infer a 3D frame of reference (which may be a non-inertial frame of reference) for a personal electronic device, e.g., a hand-held device such as a mobile phone. Use of these position sensors could provide a true Frenet frame for the device, i.e., X- and Y- vectors for the display, and also a Z-vector that points perpendicularly to the display. In fact, with various inertial clues from an accelerometer, gyrometer, and other instruments that report their states in real time, it is possible to track the Frenet frame of the device in real time, thus providing a continuous 3D frame of reference for the hand-held device. Once the continuous frame of reference of the device is known, the techniques that will be disclosed herein could then either infer the position of the user's eyes, or calculate the position of the user's eyes directly by using a front-facing camera. With the position of the user's eyes and a continuous 3D frame-of-reference for the display, more realistic virtual 3D depictions of the objects on the device's display may be created and interacted with.
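The patent doesn't say how the sensor fusion is performed, but one plausible sketch (our assumption, not Apple's disclosed method) is to integrate each gyrometer sample into a running rotation matrix whose columns are the device's X/Y/Z display axes, i.e., its Frenet frame:

```python
import numpy as np

def integrate_gyro(R, omega, dt):
    """Update a device orientation matrix R from one gyro reading.

    R:     3x3 rotation matrix whose columns are the device's X/Y/Z axes
           (the display's frame) expressed in world coordinates.
    omega: angular velocity (rad/s) about the device's own axes.
    dt:    sample interval in seconds.
    """
    angle = np.linalg.norm(omega) * dt
    if angle < 1e-12:
        return R
    axis = omega / np.linalg.norm(omega)
    # Rodrigues' rotation formula for the small incremental rotation
    K = np.array([[0, -axis[2], axis[1]],
                  [axis[2], 0, -axis[0]],
                  [-axis[1], axis[0], 0]])
    dR = np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)
    return R @ dR

# Example: 100 samples of a 90-degree-per-second roll about Z for one second
R = np.eye(3)
for _ in range(100):
    R = integrate_gyro(R, np.array([0.0, 0.0, np.pi / 2]), 0.01)
```

In practice gyro integration drifts, which is why the patent lists a compass, accelerometer, and GPS alongside the gyrometer: the other sensors provide absolute references to correct the accumulated error.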
To accomplish a more realistic virtual 3D depiction of the objects on the device's display, objects may be rendered on the display as if they were in a real 3D "place" in the device's operating system environment. In some embodiments, the positions of objects on the display could be calculated by ray tracing their virtual coordinates, i.e., their coordinates in the virtual 3D world of objects, back to the eyes of the device's user and intersecting the coordinates of the objects with the real plane of the device's display. In other embodiments, virtual 3D user interface (UI) effects, referred to herein as "2½D" effects, may be applied to 2D objects on the device's display in response to the movement of the device, the movement of the user, or the lighting conditions in the user's environment in order to cause the 2D objects to "appear" to be virtually three dimensional to the user.
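As an illustration of the ray-tracing idea (a simplified sketch under our own assumptions: the display lies in the plane z = 0 and the virtual world sits at negative z, "behind" the glass), the screen position of a virtual object is just the point where the eye-to-object ray crosses the display plane:

```python
import numpy as np

def project_to_display(eye, obj):
    """Intersect the ray from the user's eye to a virtual object with the
    display plane z = 0 and return the (x, y) screen coordinates where
    the object should be drawn.
    """
    eye, obj = np.asarray(eye, float), np.asarray(obj, float)
    t = eye[2] / (eye[2] - obj[2])  # ray parameter at the z = 0 crossing
    hit = eye + t * (obj - eye)
    return hit[:2]

# Eye 40 cm in front of the screen, object 10 cm "inside" the display:
p_center = project_to_display([0, 0, 40], [5, 0, -10])
p_shift = project_to_display([10, 0, 40], [5, 0, -10])
```

Moving the eye from x = 0 to x = 10 shifts the object's on-screen position, which is exactly the motion-parallax cue that makes the depiction read as genuinely three dimensional.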
3D UI Effects Achievable Using this Technique
It is possible, for instance, using a 21/2D depiction of a user interface environment to place realistic moving shines or moving shadows on the graphical user interface objects, e.g., icons, displayed on the device in response to the movement of the device, the movement of the user, or the lighting conditions in the user's environment.
It is also possible to create a "virtual 3D operating system environment" and allow the user of a device to "look around" a graphical user interface object located in the virtual 3D operating system environment in order to see its "sides." If the frame of reference is magnified to allow the user to focus on a particular graphical user interface object, it is also possible for the user to rotate the object to "see behind" it as well, via particular positional changes of the device or the user, as well as user interaction with the device's display.
It is also possible to render the virtual 3D operating system environment as having a recessed "bento box" form factor inside the display. Such a form factor would be advantageous for modular interfaces. As the user rotates the device, he or she could look into each "cubby hole" of the bento box independently. It would also then be possible, via the use of a front-facing camera, to have visual "spotlight" effects follow the user's gaze, i.e., by having the spotlight effect "shine" on the place in the display that the user is currently looking into. It is also possible to control the position of a spotlight effect based solely on a determined 3D frame of reference for the device. For example, the spotlight effect could be configured to shine into the cubby hole whose distorted normal vector points closest in direction to the current position of the device's user.
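A minimal sketch of that last idea (our own, hypothetical cubby layout and helper names) simply scores each cubby's normal vector against the direction to the user and picks the best match:

```python
import numpy as np

def spotlight_target(cubby_normals, user_pos):
    """Pick which 'cubby hole' a spotlight effect should shine into: the
    one whose outward unit normal points most nearly at the user.

    cubby_normals: list of unit normal vectors, one per cubby.
    user_pos: the user's position relative to the display's center.
    """
    to_user = np.asarray(user_pos, float)
    to_user = to_user / np.linalg.norm(to_user)
    scores = [np.dot(n, to_user) for n in cubby_normals]
    return int(np.argmax(scores))

# Three cubbies tilted left, straight out, and right of the screen;
# a user sitting to the right should light up the right-hand cubby.
normals = [np.array([-0.5, 0.0, 0.866]),  # tilted left
           np.array([0.0, 0.0, 1.0]),     # facing straight out
           np.array([0.5, 0.0, 0.866])]   # tilted right
choice = spotlight_target(normals, user_pos=[30, 0, 50])
```

The dot product is largest when the two directions align, so this reproduces the patent's "shine into the cubby hole facing the user" behavior with a single argmax.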
Interaction with a Virtual 3D World inside the Display
To interact via touch with the virtual 3D display, Apple states that the techniques disclosed in the patent make it possible to ray trace the location of the touch point on the device's display into the virtual 3D operating system environment and intersect the region of the touch point with whatever object or objects it hits. Motion of the objects caused by touch interaction with the virtual 3D operating system environment could occur similarly to how it would in a 2D mode, but the techniques disclosed herein would make it possible to simulate collision effects and other physical manifestations of reality within the virtual 3D operating system environment. Further, it is possible to better account for issues such as touchscreen parallax, i.e., the misregistration between the touch point and the intended touch location being displayed, when the Frenet frame of the device is known.
How to Prevent the GPU from Constantly Re-Rendering
Apple states that to prevent the over-use of the Graphics Processing Unit (GPU) and excessive battery drain on the device, the techniques disclosed in their patent employ the use of a particularized gesture to turn on the "virtual 3D operating system environment" mode, as well as positional quiescence to turn the mode off. In one embodiment, the gesture is the so-called "princess wave," i.e., the wave-motion rotation of the device about one of its axes. For example, the "virtual 3D operating system environment" mode could be turned on when more than three waves of 10-20 degrees along one axis occur within the span of one second.
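The patent gives thresholds but no algorithm; a hedged sketch of a detector built on those numbers (the sample format and helper name are our own assumptions) counts direction reversals of the roll angle whose amplitude falls in the 10-20 degree band:

```python
def detect_princess_wave(samples, min_deg=10.0, max_deg=20.0,
                         window_s=1.0, min_waves=3):
    """Detect a 'princess wave' from (time_s, roll_deg) samples.

    A 'wave' is one swing between successive direction reversals whose
    amplitude falls within [min_deg, max_deg]; the mode triggers when
    more than `min_waves` such swings land inside a `window_s` window.
    """
    waves = []                    # completion times of qualifying swings
    last_extreme = samples[0][1]  # roll angle at the previous reversal
    direction = 0
    for (t0, a0), (t1, a1) in zip(samples, samples[1:]):
        d = 1 if a1 > a0 else (-1 if a1 < a0 else direction)
        if direction and d != direction:
            # The motion reversed at (t0, a0): one swing just completed.
            swing = abs(a0 - last_extreme)
            if min_deg <= swing <= max_deg:
                waves.append(t0)
            last_extreme = a0
        direction = d
    return any(sum(1 for w in waves if t - window_s <= w <= t) > min_waves
               for t in waves)

# Roll oscillating +/- 7.5 degrees (15-degree swings), ten times a second
wave = [(i * 0.05, (0, 7.5, 0, -7.5)[i % 4]) for i in range(11)]
still = [(i * 0.05, 0.2) for i in range(11)]
```

A deliberately narrow amplitude band like this is what keeps ordinary handling of the phone from toggling the mode by accident.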
In one embodiment, when the "virtual 3D operating system environment" mode turns on, the display of the UI "unfreezes" and turns into a 3D depiction of the operating system environment (preferably similar to the 2D depiction, along with shading and textures indicative of 3D object appearance). When the mode is turned off, the display could slowly transition back to a standard orientation and freeze back into the 2D or 2½D depiction of the user interface environment. Positional quiescence, e.g., holding the device relatively still for two to three seconds, could be one potential cue to the device to freeze back to the 2D or 2½D operating system environment mode and restore the display of objects to their more traditional 2D representations.
Desktop Machines as Well
On desktop machines, the Frenet frame of the device doesn't change, and the position of the user with respect to the device's display would likely change very little, but the position of the user's eyes could change significantly. A front-facing camera, in conjunction with face detection software, would allow the position of the user's eyes to be computed. Using field-of-view information for the camera, it would also be possible to estimate the distance of the user's head from the display, e.g., by measuring the head's size or by measuring the detected user's pupil-to-pupil distance and assuming a canonical measurement for the human head, according to ergonomic guidelines. Using this data, it would then be possible to depict a realistic 2½D or 3D operating system environment mode, e.g., through putting shines on windows, title bars, and other UI objects, as well as having them move in response to the motion of the user's eyes or the changing position of the user's head. Further, it would also be possible to use the position of the user's head and eyes to allow the user to "look under" a window after the user shifts his or her head to the side and/or moves his or her head towards the display.
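The pupil-to-pupil estimate follows directly from the pinhole camera model; the canonical ~63 mm interpupillary distance below is a commonly cited ergonomic average, not a figure from the patent:

```python
def head_distance_mm(pixel_ipd, focal_px, real_ipd_mm=63.0):
    """Estimate the user's distance from the camera via the pinhole model.

    An object of real size S at distance D images to S * f / D pixels,
    so D = S * f / (measured pixels).

    real_ipd_mm: assumed canonical adult interpupillary distance (~63 mm).
    focal_px:    camera focal length expressed in pixels, derivable from
                 the camera's field of view and sensor resolution.
    """
    return real_ipd_mm * focal_px / pixel_ipd

# A face whose pupils are 100 px apart, seen through a camera with a
# 1000 px focal length, is about 630 mm from the display.
d = head_distance_mm(pixel_ipd=100, focal_px=1000)
```

One measurement per frame is enough to drive the head-distance-dependent shines and parallax effects the paragraph describes.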
Apple's Key Patent Graphics
Apple's patent FIG. 2 illustrates an exemplary 3D UI technique that may be employed by a personal electronic device operating in a 2½D operating system environment mode; patent FIG. 6 illustrates the effects of device movement on a personal electronic device presenting a virtual 3D depiction of a graphical user interface object.
Apple's patent FIG. 7 illustrates the effects of user movement on a personal electronic device presenting a virtual 3D depiction of a graphical user interface object; patent FIG. 8 illustrates a recessed, "bento box" form factor inside the virtual 3D display of a personal electronic device.
Apple's patent FIG. 9 illustrates a point of contact with the touchscreen of a personal electronic device ray traced into a virtual 3D operating system environment; patent FIG. 10 illustrates an exemplary gesture for activating the display of a personal electronic device to operate in a virtual 3D operating system environment mode.
It's Like Reaching into a 3D Virtual Environment
In another embodiment, a shadow or other indicator (902) of the user's fingertip may be displayed in the appropriate place in the 2D rendering of the virtual 3D operating system environment depicted on the display. Information about the position of the user's fingertip could be obtained from contact information reported from the touchscreen or from near-field sensing techniques. In this way, the user of the device could actually feel like he or she is "reaching into" the virtual 3D environment.
Using near-field sensing techniques, a finger's position in the "real world" may be translated into the finger's position in the virtual 3D operating system environment by reinterpreting the distance of the finger from the device's display as a distance of the finger from the relevant graphical user interface object in the virtual 3D operating system environment, even when the relevant graphical user interface object is at some "distance" into the virtual 3D world within the display's "window."
For example, if a user's finger were sensed to be one centimeter from the device's display, the relevant indication of the location of the user's touch point in the virtual 3D operating system environment, e.g., dashed circle 906 in FIG. 9, may cast a shadow or display some other visual indicator "in front of" the relevant graphical user interface object, thus providing an indication to the user that they are not yet interacting with the relevant graphical user interface object, but if the user were to move their finger closer to the display's "window," i.e., touch surface display, they may be able to interact with the desired graphical user interface object.
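A toy version of that depth reinterpretation (the ranges and names here are purely illustrative assumptions) could be:

```python
def finger_depth_to_virtual(finger_mm, max_hover_mm=30.0, object_depth=1.0):
    """Map a hovering finger's real distance above the glass (from
    near-field sensing) onto its 'distance' from a GUI object sitting
    object_depth units into the virtual 3D world behind the display.

    Returns 0.0 when the finger touches the glass (interaction begins)
    and object_depth when the finger is at the edge of sensing range.
    """
    frac = min(max(finger_mm / max_hover_mm, 0.0), 1.0)
    return frac * object_depth

# A finger 10 mm above the screen is rendered a third of the way
# "in front of" an object one unit deep in the virtual world.
gap = finger_depth_to_virtual(10.0)
```

The returned gap could drive the size and offset of the shadow indicator, so that the shadow visibly converges on the object as the finger approaches the glass.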
The use of a visual indicator of the user's desired touch location, e.g., the "shadow offset," is not only an enhanced 3D UI effect. Rather, because knowledge of the 3D frame of reference of the device allows for better accounting of touchscreen parallax problems, i.e., the misregistration between the touch point and the intended touch location being displayed, the techniques described herein with respect to FIG. 9 may also provide the user of the device with a better representation of what interactions he or she would be experiencing if the graphical user interface objects were "real life" objects.
Take a Peek at What 3D Could Look Like on a Future iPad
Considering that the last segment of our patent report quotes Apple stating that "it's like reaching into a 3D environment," we thought it would be fun to check out what a 3D environment could actually look like on an iPad. If you've never seen this type of demonstration before, then you're in for a treat.
If you advance the video to about the 1:00 minute mark, you'll be able to see three distinct segments unfold covering 3D targets, a 3D iPad Homepage concept and a floating 3D cube. While this is all very cool - I'm sure that Apple's version of a true 3D interface will simply be superior. When Apple is ready, we're likely to see them introduce a few great 3D apps to kick it off on the iPad, and these apps are likely to be beneficial, highly functional and less gimmicky. I'm equally sure that Apple will carefully craft a medical-community-centric video for a future Apple Event demonstrating how doctors are able to take advantage of their new 3D interface capabilities for such things as viewing brain scans on the go, and much more, to make the point that 3D isn't only about marketing flash. Yet as a general peek at 3D on an iPad, the video below is pretty interesting. It's just enough to get our imaginations roaring on what could be.
Minimum: ARM's Cortex A8 with v7 Architecture
Apple's documentation points to ARM's Cortex A8 with the v7-A architecture as being able to provide a versatile and robust programmable control device that may be utilized for carrying out the disclosed techniques. I'm sure that the Cortex A8 is the minimum and will only get better under the A9. Which processor Apple will be using for their 2012 iPad and iPhone updates is unknown at this time.
Patent Credits
Apple's patent application was originally filed in Q3 2010 by inventors Mark Zimmer, Geoff Stahl, David Hayward and Frank Doepke. Also see related patents in our Future Interface Archive.
Notice: Patently Apple presents a detailed summary of patent applications with associated graphics for journalistic news purposes as each such patent application is revealed by the U.S. Patent & Trade Office. Readers are cautioned that the full text of any patent application should be read in its entirety for full and accurate details. Revelations found in patent applications shouldn't be interpreted as rumor or fast-tracked according to rumor timetables. Apple's patent applications have provided the Mac community with a clear heads-up on some of Apple's greatest product trends including the iPod, iPhone, iPad, iOS cameras, LED displays, iCloud services for iTunes and more. About Comments: Patently Apple reserves the right to post, dismiss or edit comments.
Here are a Few Great Sites covering our Original Report
MacSurfer, Designer 3d Glasses, Scoople Apple Channel, Reddit, T3 UK, Real Clear Technology, StockTwits, Twitter, Facebook, Apple Investor News, Google Reader, Macnews, Aberto ate de Madrugada Portugal, iPhone World Canada, MarketWatch, BGR, iPhoneItalia Italy, MacDailyNews, Hardware Zone Singapore, Spaziomela Italy, MobiFrance, Melamorsicata Italy, Mac Gazette, Gizmodo, Clubic French, Cult of Mac, Melablog Italy, Techmeme, PC Magazine, Branchez-Vous Techno Montreal (French Canada), Conecti Mexico, 3D TV, Apfelnews Germany, Kill Screen, Mobile Magazine, ITProPortal, iPhones.co Israel, International Society for Presence Research, Redmond Pie, and more.
Note: The sites that we link to above offer you an avenue to make your comments about this report in other languages. These great community sites also provide our guests with varying takes on Apple's latest invention. Whether they're pro or con, you may find them to be interesting, fun or feisty. If you have the time, join in!
If anyone could pull off a classy and functional 3D environment for the iPad and/or television, it's Apple. After reading this report and the one posted in January, I'm convinced that Apple has the right approaches. If Apple's next architecture will be able to drive graphics at 20x what they are today, this could be the time that Apple brings this to market. It may take another year, but I think it's going to happen.
Posted by: Joe | February 09, 2012 at 09:17 AM