Apple has won a patent for a MacBook with a Built-In Advanced Event Camera that could capture sophisticated Hand Gestures
Today the U.S. Patent and Trademark Office officially granted Apple a patent that relates to an advanced event camera system primarily designed to be integrated into a future MacBook.
Apple notes in their patent background that hand movements and other gestures may involve relatively quick movements. Accordingly, gesture recognition systems that use images from traditional frame-based cameras may lack accuracy and efficiency or otherwise be limited by the relatively slow frame rates that such cameras use. Event cameras capture events as they occur for each pixel and may provide the capability to capture data about user movements significantly faster than many frame-based cameras.
However, gesture recognition systems that use event camera data may face challenges in attempting to identify the gestures of a user because, in a given physical environment, there may be many events that are unrelated to a gesture that is being tracked. The need to analyze and interpret potentially vast amounts of event data from a physical environment may significantly reduce the ability of such event camera-based gesture recognition systems to quickly, efficiently, and accurately identify gestures.
Apple's granted patent covers devices, systems, and methods for event camera-based gesture recognition using a subset of event camera data.
In some implementations, a gesture is identified based on a subset of event camera data. In some implementations, the subset of event camera data is identified by a frame-based camera having a field of view (FOV) that overlaps the event camera FOV. In some implementations, the event camera FOV and the frame-based camera FOV are temporally or spatially correlated.
In one implementation, at an electronic device having a processor, event camera data is generated by light (e.g., infrared (IR) light) reflected from a physical environment and received at an event camera.
In some implementations, frame-based camera data is generated by light (e.g., visible light) reflected from the physical environment and received at a frame-based camera.
In some implementations, the frame-based camera data is used to identify a region of interest (e.g., a bounding box) for the event camera to analyze.
For example, the frame-based camera data may be used to remove the background from consideration in the event camera data analysis. In some implementations, identifying the subset of the event camera data includes selecting only event camera data corresponding to a portion of a user (e.g., hands) in the physical environment. In some implementations, the event camera is tuned to IR light to reduce noise.
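The idea of using frame-based camera data to narrow the event stream can be illustrated with a minimal sketch. Note that the event layout, field names, and `filter_to_roi` helper below are illustrative assumptions, not Apple's implementation:

```python
# Hypothetical sketch: restrict event-camera data to a region of interest
# (bounding box) identified from frame-based camera data, e.g., around a
# detected hand. Data layout and names are assumptions.

from dataclasses import dataclass

@dataclass
class Event:
    x: int         # pixel column
    y: int         # pixel row
    t: float       # timestamp in seconds
    polarity: int  # +1 brightness increase, -1 decrease

def filter_to_roi(events, roi):
    """Keep only events inside the bounding box (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = roi
    return [e for e in events if x0 <= e.x <= x1 and y0 <= e.y <= y1]

events = [Event(10, 10, 0.000, 1), Event(200, 50, 0.001, -1), Event(15, 12, 0.002, 1)]
hand_roi = (0, 0, 100, 100)  # bounding box from the frame-based camera (assumed)
print(len(filter_to_roi(events, hand_roi)))  # 2
```

Discarding events outside the box up front is what lets the downstream gesture analysis skip background activity entirely.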
Various implementations include devices, systems, and methods that identify a path (e.g., of a hand) by tracking a grouping of blocks of event camera events. Each block of event camera events may be a region having a predetermined number of events that occur at a given time. The grouping of blocks may be performed based on a grouping radius, a distance, or time, e.g., blocks within a given 3D distance of another block occurring at an instant in time are included. Tracking a grouping of blocks as it recurs at different points in time may be easier and more accurate than trying to track individual events moving over time, e.g., correlating events associated with the tip of the thumb at different points in time.
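The "block" notion above can be sketched as binning events into fixed spatial tiles per time slice and keeping only tiles that accumulate enough events. The tile size and event threshold here are illustrative assumptions:

```python
# Hypothetical sketch of "blocking": partition the sensor into fixed tiles
# and, for one time slice, keep only tiles that accumulate at least
# MIN_EVENTS events. Tile size and threshold are assumed values.

from collections import Counter

BLOCK = 16       # tile edge length in pixels (assumed)
MIN_EVENTS = 3   # events required for a tile to count as a block (assumed)

def blocks_at_time(events):
    """events: iterable of (x, y) pixel coordinates within one time slice.
    Returns the set of tile coordinates that qualify as blocks."""
    counts = Counter((x // BLOCK, y // BLOCK) for x, y in events)
    return {tile for tile, n in counts.items() if n >= MIN_EVENTS}

slice_events = [(5, 5), (6, 5), (7, 6), (40, 40)]
print(blocks_at_time(slice_events))  # {(0, 0)}: only that tile has >= 3 events
```

Working at the block level rather than the raw-event level is what makes the subsequent tracking step tractable.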
Other implementations include devices, systems, and methods that obtain event camera data corresponding to light (e.g., IR or first wavelength range) reflected from a physical environment.
In some implementations, blocks of events that are associated with multiple times are identified based on blocking criteria. For example, each block may be a region having a predetermined number of events that occur at a given time or time period of predetermined length.
In some implementations, an entity (e.g., a hand) is identified at each of the multiple times. In some implementations, a path is determined by tracking a position of the entity at the multiple times.
In some implementations, the entity at each of the multiple times includes a subset of the blocks of events associated with a respective time. In some implementations, the subset of blocks is identified based on grouping criteria.
In some implementations, tracking a position of the entity at each of the multiple times as the entity moves over time provides a path such as the path of a hand of a person as the hand moves over time. In some implementations, frame-based camera data corresponding to light reflected from the physical environment is received and the event camera data is identified based on the frame-based camera data.
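The grouping-and-tracking steps described above can be sketched as grouping nearby blocks at each time step and following the group's centroid over time. The grouping radius, the greedy single-linkage grouping, and the block coordinates are illustrative assumptions:

```python
# Hypothetical sketch: group neighboring blocks at each time step and track
# the centroid of the group over time, yielding a path (e.g., of a hand).
# Grouping radius and sample coordinates are assumed values.

import math

GROUP_RADIUS = 2.0  # max block-to-block distance within one group (assumed)

def group_blocks(blocks):
    """Greedy single-linkage grouping: any block within GROUP_RADIUS of a
    block already in the group is pulled into the group."""
    blocks = list(blocks)
    group = [blocks.pop(0)]
    changed = True
    while changed and blocks:
        changed = False
        for b in blocks[:]:
            if any(math.dist(b, g) <= GROUP_RADIUS for g in group):
                group.append(b)
                blocks.remove(b)
                changed = True
    return group

def centroid(points):
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))

# Blocks observed at three successive time steps; the distant outlier at
# the first step falls outside the grouping radius and is excluded.
frames = [
    [(0, 0), (1, 0), (10, 10)],
    [(2, 1), (3, 1)],
    [(4, 2), (5, 2)],
]
path = [centroid(group_blocks(f)) for f in frames]
print(path)  # [(0.5, 0.0), (2.5, 1.0), (4.5, 2.0)]
```

The resulting sequence of centroids is the kind of path a gesture recognizer could then classify, e.g., as a swipe.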
In other implementations, a MacBook is the electronic device where processing of the event camera data occurs; in some of these, that same MacBook also houses the event camera itself.
While emphasis in the granted patent is placed on the event camera relating to a future MacBook, Apple claims that the invention could apply to other devices such as the Apple Watch, iPhone, iPad, HMDs, gaming devices, and home automation devices.
To review the full details of this invention, check out granted patent 12289559. Apple lists the lead inventor as Sai Harsha Jandhyala, Electrical Engineering Manager, Camera Electronics & Sensing.