The U.S. Patent Office has published a Tesla patent application focused on AI processors that predict 3-D features for autonomous driving
Until governments invest in "smart roads" with built-in sensors that help autonomous vehicles drive safely in all weather conditions, Tesla and others are turning to Nvidia and AI to vastly improve the accuracy of autonomous driving.
On September 12th, the U.S. Patent Office published a Tesla patent application titled "Predicting Three-Dimensional Features for Autonomous Driving." Tesla has been working on this patent since 2019.
A processor (AI) coupled to memory is configured to receive image data based on an image captured by a camera of a vehicle. The image data is used as the basis of an input to a machine learning model trained to predict a three-dimensional trajectory of a machine learning feature. The three-dimensional trajectory of the machine learning feature is then provided for automatically controlling the vehicle.
In some embodiments, a three-dimensional representation of a feature, such as the lane line shown in FIG. 5, is created from a group of time series elements and serves as the ground truth. This ground truth is then associated with a subset of the time series elements, such as a single image frame of the group of captured image data.
For example, the first image of a group of images is associated with the ground truth for a lane line represented in three-dimensional space. Although the ground truth is determined from the whole group of images, only the selected first frame and the ground truth are used to create training data.
As an example, training data is created for predicting a three-dimensional representation of a vehicle lane using only a single image. In some embodiments, any element or a group of elements of a group of time series elements is associated with the ground truth and used to create training data. For example, the ground truth may be applied to an entire video sequence for creating training data.
As another example, an intermediate element or the last element of a group of time series elements is associated with the ground truth and used to create training data.
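The pairing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Tesla's implementation: the names Frame, TrainingExample, and make_training_examples and the select options are hypothetical, and the ground truth is assumed to have been computed already from the whole group of frames.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# A 3-D point along a feature such as a lane line (units and axes are assumed).
Point3D = Tuple[float, float, float]


@dataclass
class Frame:
    """One element of the time series; image_id stands in for the pixel data."""
    timestamp: float
    image_id: str


@dataclass
class TrainingExample:
    """A selected frame paired with ground truth derived from the whole group."""
    frame: Frame
    ground_truth: List[Point3D]


def make_training_examples(
    frames: Sequence[Frame],
    ground_truth: Sequence[Point3D],
    select: str = "first",
) -> List[TrainingExample]:
    """Associate the group's ground truth with the first, an intermediate,
    the last, or every frame of the group (hypothetical options)."""
    if not frames:
        raise ValueError("need at least one frame")
    if select == "first":
        chosen = [frames[0]]
    elif select == "intermediate":
        chosen = [frames[len(frames) // 2]]
    elif select == "last":
        chosen = [frames[-1]]
    elif select == "all":
        chosen = list(frames)
    else:
        raise ValueError(f"unknown selection: {select}")
    return [TrainingExample(frame=f, ground_truth=list(ground_truth)) for f in chosen]
```

With select="first", a model trained on the resulting examples would see a single image as input and the full three-dimensional lane line as the target, which matches the single-image case described above.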
In various embodiments, the selected image and ground truth may apply to different features such as lane lines, path prediction for vehicles including neighboring vehicles, depth distances of objects, traffic control signs, etc. For example, a series of images of a vehicle in an adjacent lane is used to predict that vehicle's path.
Given the time series of images and the actual path taken by the adjacent vehicle, a single image of the group can be paired with that path as training data for predicting the vehicle's path. The information can also be used to predict whether an adjacent vehicle will cut into the path of the autonomous vehicle. For example, the path prediction can predict whether an adjacent vehicle will merge in front of an autonomous vehicle. The autonomous vehicle can be controlled to minimize the likelihood of a collision.
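One way to picture the cut-in check is as a geometric test on the predicted path. The sketch below is an assumption-laden illustration rather than the method in the patent: the function name first_cut_in_point, the ego-centred coordinate convention, and the lane_half_width and max_ahead defaults are all hypothetical.

```python
from typing import Optional, Sequence, Tuple

# Assumed convention: (x forward, y to the left, z up), in metres,
# measured from the autonomous (ego) vehicle.
Point3D = Tuple[float, float, float]


def first_cut_in_point(
    predicted_path: Sequence[Point3D],
    lane_half_width: float = 1.8,  # assumed half-width of the ego lane
    max_ahead: float = 60.0,       # assumed look-ahead distance
) -> Optional[Point3D]:
    """Return the first predicted point that lies inside the ego lane
    ahead of the ego vehicle, or None if the path never enters it."""
    for x, y, z in predicted_path:
        if 0.0 < x <= max_ahead and abs(y) <= lane_half_width:
            return (x, y, z)
    return None


# A vehicle starting one lane to the left and drifting toward the ego lane.
merging = [(5.0, 3.5, 0.0), (15.0, 2.8, 0.0), (25.0, 1.5, 0.0), (35.0, 0.2, 0.0)]
print(first_cut_in_point(merging))
```

A real system would reason about lane geometry, timing, and prediction uncertainty; the straight corridor here only shows how a predicted three-dimensional path can be turned into a merge decision.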
For example, the autonomous vehicle can slow down, adjust its speed and/or steering, initiate a warning to the adjacent vehicle and/or its own occupants, and/or change lanes to prevent a collision. In various embodiments, the ability to accurately infer path predictions, including vehicle path predictions, significantly improves the safety of the autonomous vehicle.
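The responses listed above could be selected from a predicted cut-in along the following lines. This is a toy sketch, not the control logic in the patent: the Response enum, the choose_response function, and the 4-second and 2-second time-gap thresholds are invented purely for illustration.

```python
from enum import Enum
from typing import Optional


class Response(Enum):
    NONE = "none"
    WARN = "warn"
    SLOW_DOWN = "slow_down"
    CHANGE_LANE = "change_lane"


def choose_response(
    distance_to_cut_in_m: Optional[float],
    ego_speed_mps: float,
    adjacent_lane_free: bool,
) -> Response:
    """Pick a response from the time gap to a predicted cut-in point.
    The thresholds are arbitrary illustrative values."""
    if distance_to_cut_in_m is None or ego_speed_mps <= 0.0:
        return Response.NONE  # no predicted cut-in, or the ego vehicle is stopped
    time_gap_s = distance_to_cut_in_m / ego_speed_mps
    if time_gap_s > 4.0:
        return Response.WARN  # plenty of time: warn only
    if time_gap_s > 2.0:
        return Response.SLOW_DOWN  # moderate gap: reduce speed
    # Short gap: change lanes if possible, otherwise slow down.
    return Response.CHANGE_LANE if adjacent_lane_free else Response.SLOW_DOWN
```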
For full details, review Tesla's patent application 20240304003.