Apple Invents Advanced 3D Modeling Technology for Mystery App
On July 28, 2011, the US Patent & Trademark Office published a patent application from Apple that reveals various aspects of an advanced 2D-to-3D modeling application. The invention relates to light source detection from synthesized faces: analyzing an image to detect the type and relative location of the light source that illuminated an object at the instant the image was captured. Where it gets a little creepy, however, is that Apple states that such software is used by "information gathering agencies." Whether Apple is considering this for a future iteration of Final Cut Pro X, an all-new 3D modeling application or a private project for a government agency is unknown at this time – but using a photo of Tom Hanks' face just might tip Apple's hand.
Background
According to Apple, information gathering agencies could use a variety of techniques for identifying the type and location of a light source that illuminates the subject of a photograph. One such technique may include taking multiple photographs of the subject from different angles, then measuring the length of the shadows cast by the subject in each photograph. Typically, analyzing the shadow-length measurements is based on an assumption about the type of illuminating source: directional light or point light. Iteratively testing various locations where a point source or a directional source may have been placed may eventually lead to a light source location that best reproduces the shadow-length measurements.
Apple's Abstract
Apple's patent abstract states that the patent covers methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining a location relative to an object and a type of a light source that illuminated the object when the image was captured. A method performed by a process executing on a computer system includes identifying an object of interest in a digital image. The method further includes projecting at least a portion of the digital image corresponding to the object of interest onto a three dimensional (3D) model that includes a polygon-mesh corresponding to the object's shape. The method further includes determining one or more properties of a light source that illuminated the object in the digital image at an instant that the image was captured based at least in part on a characteristic of one or more polygons in the 3D model onto which the digital image portion was projected.
Light Source Detection from Synthesized Objects
Apple's patent FIG. 1 below shows an input-output diagram of a system for determining location and type of one or more light sources that illuminated an object when a two dimensional (2D) image of the object was captured. FIG. 2 shows a flowchart of an example process for determining a light source to reproduce a projection of a color map corresponding to a face in a 2D image onto a 3D model of the face.
Apple's patent FIGS. 4A and 4B, shown below, illustrate aspects of a process for determining a light source that illuminates a 3D model of a face.
Apple's patent application 20110182520 was originally filed in Q1 2010 by inventor Robert Free.
For Technophiles Only: Apple's Full Summary
Unless 3D modeling is your field of expertise, the following information will likely bore you to tears. On the other hand, if you're a technophile working with advanced 3D modeling you may find Apple's lengthy and detailed summary to be of great interest to you.
First Aspect of Apple's Patent
This specification describes technologies relating to detection of one or more light sources that illuminated an object in a two-dimensional image when the image was taken. The methods and systems disclosed in this specification enable determining a light source vector of directional light and a position of point light.
One aspect of the subject matter described in Apple's patent application could be implemented in methods performed by a process executing on a computer system. A method includes identifying an object of interest in a digital image. The method further includes projecting at least a portion of the digital image corresponding to the object of interest onto a three dimensional (3D) model that includes a polygon-mesh corresponding to the object's shape. The method also includes determining one or more properties of a light source that illuminated the object in the digital image at an instant that the image was captured based at least in part on a characteristic of one or more polygons in the 3D model onto which the digital image portion was projected.
Implementations could include any, all, or none of the following features. The determining of the one or more properties of the light source could include determining a type of the light source, a location of the light source or both. The method could include selecting a subset of polygons onto which the digital image was projected, the selection based on a predetermined criterion. The predetermined criterion could include a predetermined hue. The selecting could include identifying regions of the selected subset of polygons corresponding to portions of the digital image having the predetermined hue. The predetermined criterion could include a predetermined luminosity spatial frequency. The selecting could include identifying regions of the selected subset of polygons corresponding to portions of the digital image having a spatial-change in luminosity less than the predetermined luminosity spatial frequency. The method could include calculating a median luminosity over portions of the image corresponding to the selected subset of polygons; assigning a weight between zero and one to each polygon from the selected subset of polygons onto which the digital image was projected based on a luminosity relative to the calculated median of an associated portion of the digital image; and responsive to the weight of a polygon being larger than zero, identifying the polygon as a specular polygon.
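The median-based weighting described above can be sketched in a few lines. This is an illustrative reading of the summary, not Apple's code: the `polygons` structure and the linear scaling of above-median luminosities into (0, 1] are assumptions.

```python
import statistics

def select_specular_polygons(polygons):
    """Weight polygons by luminosity relative to the median, per the
    patent summary. `polygons` is a hypothetical list of dicts with a
    'luminosity' key in [0.0, 1.0]; all names are illustrative."""
    median_lum = statistics.median(p["luminosity"] for p in polygons)
    max_lum = max(p["luminosity"] for p in polygons)
    specular = []
    for p in polygons:
        lum = p["luminosity"]
        if lum <= median_lum:
            p["weight"] = 0.0  # at or below the median: not specular
        else:
            # scale luminosities in (median, max] into (0, 1]
            p["weight"] = (lum - median_lum) / (max_lum - median_lum)
        if p["weight"] > 0.0:
            specular.append(p)  # weight > 0 marks a specular polygon
    return specular
```

Polygons at or below the median get weight zero and are excluded; everything brighter is flagged specular, with the brightest polygon weighted one.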
In some implementations, the assigning could include associating a weight of zero to polygons corresponding to respective portions of the digital image having luminosity equal to or smaller than the calculated median luminosity, and associating a weight larger than zero to polygons corresponding to portions having luminosity larger than the calculated median luminosity. Alternatively, the assigning could include associating a weight of zero to polygons corresponding to portions of the digital image having luminosity equal to or smaller than a predetermined luminosity threshold, which could be larger than the calculated median luminosity, and associating a weight of one to polygons corresponding to portions having luminosity larger than that threshold.
In some implementations, a characteristic of a polygon could include the direction of the normal to the polygon's face. Determining the light source could include back-tracing rays that travel from a viewing location relative to the 3D model to the identified specular polygons, reflect off the respective faces of those polygons in accordance with the directions of their normals, and travel in the direction of the light source. The viewing location corresponds to the location, relative to the object, of the camera that captured the image. When the back-traced rays reflecting off a certain group of the identified specular polygons travel parallel to a certain direction relative to the 3D model, the method includes determining that the light source could be of directional type and could be placed along that direction. When the back-traced rays reflecting off a particular group of the identified specular polygons intersect at a particular point relative to the 3D model, the method includes determining that the light source could be of point type and could be placed at that point. The digital image could be a two-dimensional (2D) digital image, and the identified object of interest could be a human face.
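The back-tracing step lends itself to a compact sketch: reflect each view ray about the specular polygon's normal, then check whether the reflected rays are mutually parallel (directional source) or converge on a point (point source). This is a simplified reconstruction under assumed inputs (polygon centers and unit normals), not the patent's implementation.

```python
import numpy as np

def classify_light_source(camera_pos, specular_points, normals, tol=1e-3):
    """Back-trace view rays off specular polygons and decide whether
    the reflected rays are parallel (directional light) or converge
    at a point (point light). All names are illustrative.
    specular_points: (N, 3) polygon centers; normals: (N, 3) unit normals."""
    rays = []
    for p, n in zip(specular_points, normals):
        v = p - camera_pos
        v = v / np.linalg.norm(v)      # incoming view ray, camera -> polygon
        r = v - 2.0 * np.dot(v, n) * n # mirror reflection about the normal
        rays.append((p, r))
    dirs = np.array([r for _, r in rays])
    if np.allclose(dirs, dirs[0], atol=tol):
        return ("directional", dirs[0])  # parallel rays: directional source
    # Otherwise find the least-squares intersection point of the rays:
    # minimize the summed squared distance from x to each reflected ray.
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for p, d in rays:
        proj = np.eye(3) - np.outer(d, d)  # projector orthogonal to ray d
        A += proj
        b += proj @ p
    point = np.linalg.solve(A, b)
    return ("point", point)
```

For a directional source, each specular highlight sits where the surface normal bisects the view and light directions, so reflecting the view ray about that normal recovers the light direction exactly.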
Second Aspect of Apple's Invention
In a second aspect, a computer storage medium is encoded with a computer program whose instructions, when executed by a data processing apparatus, cause the apparatus to perform operations including projecting onto a provided 3D model a color map corresponding to a light source that illuminated an object in an image at the instant the image was captured. The 3D model includes a polygon-mesh corresponding to the object's shape. The operations further include selecting specular regions of the projected color map; the specular regions are indicative of specular reflection of light from the light source. The operations further include back-tracing rays that travel from a viewing location relative to the 3D model to the selected specular regions, then reflect off those regions and travel in the direction of the light source. The viewing location corresponds to the location, relative to the object, of the camera that captured the image. The operations further include determining a location of the light source relative to the 3D model based on the back-traced rays reflecting off the selected specular regions.
Implementations could include any, all, or none of the following features. When the back-traced rays reflecting off a certain group of the selected specular regions travel parallel to a certain direction relative to the 3D model, the apparatus determines that the light source could be of directional type and could be placed along that direction. When the back-traced rays reflecting off a particular group of the selected specular regions intersect at a particular point relative to the 3D model, the apparatus determines that the light source could be of point type and could be placed at that point. The selecting of specular regions could include determining regions of interest of the projected color map based on a predetermined criterion and calculating a median of luminosity across the regions of interest. The luminosity represents a dimension in a hue-saturation-luminosity (HSL) color space corresponding to the brightness of a color along a lightness-darkness axis. A weight between zero and one is then assigned to each region of interest based on its luminosity relative to the calculated median: a weight larger than zero denotes a region selected as a specular region, and a weight of zero denotes a region not selected as specular.
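The luminosity the patent refers to is the L dimension of HSL: the midpoint of the strongest and weakest color channels. A minimal helper, which matches the lightness value returned as the second element of Python's standard `colorsys.rgb_to_hls`:

```python
import colorsys

def luminosity(r, g, b):
    """HSL lightness of an RGB color with channels in [0.0, 1.0]:
    the midpoint of the max and min channel. Equivalent to
    colorsys.rgb_to_hls(r, g, b)[1] from the standard library."""
    return (max(r, g, b) + min(r, g, b)) / 2.0
```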
In some implementations, the assigning could include associating a weight of zero to regions having luminosity equal to or smaller than the calculated median luminosity, and associating a weight larger than zero to regions having luminosity larger than the calculated median luminosity. The operations could include correlating the back-traced rays reflected off the selected specular regions at least in part based on the respective weights of those regions. Alternatively, the assigning could include associating a weight of zero to regions having luminosity equal to or smaller than a predetermined luminosity threshold, which could be larger than the calculated median luminosity, and associating a weight of one to regions having luminosity larger than that threshold. The predetermined criterion could include a predetermined hue, in which case determining regions of interest could include identifying regions of the projected color map having that hue. The predetermined criterion could also include a predetermined luminosity spatial frequency, in which case determining regions of interest could include identifying regions of the projected color map having a spatial change in luminosity less than that frequency. A region of the projected color map could include one of: a contiguous area of pixels included in a polygon; all pixels included in a polygon; a contiguous group of neighboring polygons included in the polygon-mesh; a planar surface corresponding to a linear surface fit of the contiguous group of neighboring polygons; or a spline surface corresponding to a non-linear surface fit of the contiguous group of neighboring polygons.
In some implementations, the operations could include finding the viewing location based on information that could include one or more of: the dimensions of the projected color map, the dimensions of the object in the image, and the location and photometric parameters of the camera that captured the image of the object. The back-tracing of rays could further include reflecting the back-traced rays in accordance with planar-surface reflection off polygons included in portions of the polygon-mesh corresponding to the selected specular regions, or in accordance with curved-surface reflection off spline surfaces generated over neighboring polygons included in those portions.
Third Aspect of Apple's Invention
In a third aspect, a computer system includes a data processor communicatively coupled with a data storage device and with a user device. The data processor is configured to receive, from the user device, a two dimensional (2D) digital image of a face. The system further includes providing, from the data storage device, a three-dimensional (3D) model that includes a polygon-mesh corresponding to the face's shape. The system further includes projecting onto the provided 3D model a color map corresponding to a light source that illuminated the face in the 2D digital image at the instant the image was captured. The system further includes selecting specular regions of the projected color map; the specular regions are indicative of specular reflection of light from the light source. The system further includes back-tracing rays that travel from a viewing location relative to the 3D model to the selected specular regions, then reflect off those regions and travel in the direction of the light source. The viewing location corresponds to the location, relative to the face, of the camera that captured the 2D digital image. When back-traced rays reflecting off a certain group of the selected specular regions travel parallel to a certain direction relative to the 3D model, the system determines that the light source is of directional type and is placed along that direction. When back-traced rays reflecting off a particular group of the selected specular regions intersect at a particular point relative to the 3D model, the system determines that the light source is of point type and is placed at that point.
Implementations could include any, all, or none of the following features. The selecting of specular regions could include determining regions of interest of the projected color map based on a predetermined criterion; calculating a median of luminosity across the regions of interest. The luminosity represents a dimension in a hue-saturation-luminosity color space (HSL) corresponding to a brightness of a color along a lightness-darkness axis; and assigning a weight between zero and one to each of the regions of interest based on an associated luminosity relative to the calculated median. A weight larger than zero denotes a region selected as a specular region, and a weight of zero denotes a region not selected as specular region.
In some implementations, the providing could include: if the 3D model of the face is stored on a data storage device communicatively coupled with the computer system, retrieving the stored 3D model of the face from the storage device; otherwise, retrieving a 3D model of a generic face from the storage device and modifying its shape to match biometric characteristics of the face in the 2D digital image to obtain the 3D model of the face.
In some implementations, the 3D model could further include a neutral color map that covers each polygon of the polygon-mesh. The neutral color map corresponds to the 3D model illuminated by diffuse light, such that each hue of the neutral color map has a luminosity of 50%. The system could be further configured to synthesize a luminosity map that covers each polygon of the 3D model based on the determined light source; to overlay the synthesized luminosity map on the neutral color map, reproducing a full-coverage color map corresponding to the illumination conditions of the face when the 2D digital image was captured; and to examine dark and bright patterns of the full-coverage color map over portions of the 3D model of the face. The examined portions correspond to areas of the face outside the field of view of the 2D digital image.
In some implementations, the system is configured to correct the 2D digital image for illumination artifacts based on the determined type and location of the light source, and to perform face recognition for the face in the 2D digital image corrected for illumination artifacts. The correcting could include illuminating the 3D model of the face using the determined light source to obtain a synthetically illuminated color map of the face, determining a luminosity-difference between the synthetically illuminated color map of the face and the projected color map of the face, subtracting the determined luminosity-difference from the projected color map of the face to obtain a correction of the projected color map of the face; and mapping the correction of the color map from the 3D model of the face to the 2D digital image of the face to obtain a 2D digital image corrected for illumination artifacts.
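One plausible reading of the correction step, sketched with per-polygon luminosity arrays: treat the synthetically lit map's departure from the 50%-luminosity neutral map (described earlier in the summary) as the lighting artifact, and subtract it from the projected map. The array names and the final clipping are assumptions, not from the filing.

```python
import numpy as np

def correct_illumination(projected_lum, synthetic_lum, neutral_lum=0.5):
    """Remove lighting artifacts from a projected luminosity map.
    A hedged sketch of the summary's correction step; arrays hold
    per-polygon (or per-pixel) luminosities in [0.0, 1.0]."""
    # How much the detected light brightened or darkened each region,
    # relative to flat diffuse lighting at 50% luminosity.
    difference = synthetic_lum - neutral_lum
    corrected = projected_lum - difference  # subtract the lighting artifact
    return np.clip(corrected, 0.0, 1.0)
```

When the synthetic illumination exactly reproduces the captured lighting, the corrected map collapses back to the flat 50% neutral luminosity, i.e., a lighting-neutral texture.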
Particular implementations of the subject matter described in this specification could be implemented to realize one or more of the following potential advantages. For example, once a location of an illumination point source has been identified using the technologies disclosed in this specification, an orientation of a subject in an image could be determined based on the identified position of the illumination point source.
Further, a user could correlate a directional light source's (e.g., the Sun's) altitude with the time when an image was captured to determine the latitude of the image location. Conversely, the time of day could be determined knowing the altitude of the Sun and the latitude where the image was captured. In addition, if both time and latitude are known, one could determine the orientation of a subject in the image.
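The latitude claim follows from textbook solar geometry rather than anything specific in the filing: at local solar noon, the Sun's altitude is 90° minus the absolute difference between latitude and solar declination, so a noon altitude recovered from shadow geometry yields latitude up to a north/south ambiguity.

```python
def latitude_from_noon_altitude(sun_altitude_deg, solar_declination_deg):
    """Solve 90 - |lat - decl| = altitude for latitude (northern
    solution). A textbook simplification, not from Apple's filing:
    assumes the photo was taken at local solar noon and ignores
    atmospheric refraction."""
    # The southern solution would be decl - (90 - altitude).
    return solar_declination_deg + (90.0 - sun_altitude_deg)
```

For example, a 50° noon Sun at the equinox (declination 0°) puts the camera near latitude 40°.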
Furthermore, the methods and systems disclosed in this specification could be used to examine dark and bright patterns of a color map over portions of a 3D model of an object. The examined portions could correspond to areas of the object outside of a field of view in a 2D photograph of the object, while the examined patterns correspond to illumination conditions when the 2D photograph was taken.
The disclosed techniques could also be used to remove lighting artifacts (specular reflections, shadows, and the like) from a three-dimensional (3D) synthesized model of an object based on the position and the nature (type) of detected light sources. A color map of the 3D model corrected for illumination artifacts could be mapped back onto a two-dimensional (2D) image of the object to obtain a lighting-neutral texture for improved object correlation, matching and recognition.
Additionally, once the light sources illuminating one or more objects are identified, these could in turn be used as hints for recognizing shadows and highlights on other unrecognized objects, specifically providing hints regarding the unknown objects' shape, size and location.
Further, the identified light sources may also be used to provide hints for tracking the movement and orientation of recognized objects in subsequent frames of sequential images (e.g., videos)--reducing recognition time and improving accuracy. Furthermore, if positions of still objects are known and positions of light sources have been identified, one could determine the orientation and location of the capturing camera from sequential frames of a video.
Notice: Patently Apple presents a detailed summary of patent applications with associated graphics for journalistic news purposes as each such patent application is revealed by the U.S. Patent & Trademark Office. Readers are cautioned that the full text of any patent application should be read in its entirety for full and accurate details. Revelations found in patent applications shouldn't be interpreted as rumor or fast-tracked according to rumor timetables. Apple's patent applications have provided the Mac community with a clear heads-up on some of Apple's greatest product trends including the iPod, iPhone, iPad, iOS cameras, LED displays, iCloud services for iTunes and more.
About Comments: Patently Apple reserves the right to post, dismiss or edit comments.
Robert Free was one of the Senior Face Recognition R&D Engineers at Apple. He left to open his own company:
http://www.linkedin.com/pub/dir/Robert/Free
Posted by: Tom | July 28, 2011 at 03:00 PM