Today the US Patent & Trademark Office published a patent application from Apple that relates to Siri on HomePod determining whether a user is addressing it or speaking to another person in the room, by estimating the user's head pose from room reverberation.
Apple notes that in some circumstances it may not be clear to Siri, operating on an electronic device like HomePod, whether the user intends to interact with it or is instead speaking to another person or to another device like an iPhone.
For example, the user may speak an utterance in a room that has a HomePod present along with multiple people. In this situation, Siri may be unable to determine from the utterance alone if it should respond to the user through HomePod. Accordingly, it would be advantageous for Siri to better determine the user's head pose and whether the user is facing HomePod during an utterance to determine if a response is required.
Apple's invention includes a method, performed on an electronic device with one or more processors and memory, that includes receiving audio input, determining a direct-to-reverberant energy ratio based on the audio input, and determining a head pose of a user based on the direct-to-reverberant energy ratio.
Determining the head pose of the user in this manner allows Siri to determine if the user is facing HomePod, and thus intending to address Siri, or if the user is facing away from the electronic device and thus addressing someone else or another device.
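To make the term concrete, here is a minimal Python sketch of the textbook definition of a direct-to-reverberant energy ratio (DRR), computed from a room impulse response. This is an illustration only: the patent estimates the ratio from the received audio input, and the estimator it uses is not described here. The function name, the 2.5 ms direct-path window, and the synthetic impulse responses are all assumptions made for the example.

```python
import math


def direct_to_reverberant_ratio_db(impulse_response, sample_rate, direct_window_ms=2.5):
    """Return the DRR in dB for a room impulse response (a list of floats).

    Hypothetical helper: the window length and the peak-picking rule are
    illustrative assumptions, not values taken from Apple's application.
    """
    # Treat the largest absolute sample as the direct-path arrival.
    peak = max(range(len(impulse_response)), key=lambda i: abs(impulse_response[i]))
    half = int(sample_rate * direct_window_ms / 1000)
    start = max(0, peak - half)
    end = min(len(impulse_response), peak + half + 1)

    # Energy inside the window counts as direct sound; everything after it
    # counts as reverberation.
    direct = sum(x * x for x in impulse_response[start:end])
    reverberant = sum(x * x for x in impulse_response[end:])
    if direct == 0 or reverberant == 0:
        raise ValueError("impulse response needs both direct and reverberant energy")
    return 10 * math.log10(direct / reverberant)


# Synthetic example: same reverberant tail, stronger or weaker direct path.
tail = [0.1] * 10
facing = [0.0, 0.0, 1.0, 0.0] + tail   # strong direct sound
away = [0.0, 0.0, 0.2, 0.0] + tail     # weak direct sound
drr_facing = direct_to_reverberant_ratio_db(facing, sample_rate=1000, direct_window_ms=1)
drr_away = direct_to_reverberant_ratio_db(away, sample_rate=1000, direct_window_ms=1)
```

Because speech is directional, a talker facing the microphone generally delivers more direct energy relative to the room's reverberation, so the ratio comes out higher. In the synthetic example, `drr_facing` is larger than `drr_away`.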
This allows Siri to quickly and efficiently determine which requests from the user it should respond to, and which can be ignored. Combining these methods with other known methods of triggering a digital assistant can improve the user experience and the efficiency of both Siri and HomePod.
Apple's patent FIG. 8 below illustrates a block diagram of a portion of digital assistant 800 for determining a head pose of a user; FIGS. 9A-C illustrate examples of Siri determining the head pose of the user in relation to HomePod and whether or not to respond to the user. In the case of FIG. 9C, Siri will pause if it is unsure whether the question was directed at HomePod and wait for the user to clarify by saying the trigger phrase 'Hey Siri.'
In one example, as shown in FIG. 9A above, the user provides the utterance "What's the weather like in Palo Alto?" which Siri receives as part of the audio input. Siri may then analyze the audio input to determine whether the direct-to-reverberant energy ratio exceeds the threshold of -60 dB in the 1.5 kHz to 3.5 kHz sub-band. If it does, Siri determines that the user is addressing HomePod with the utterance and responds accordingly.
For the full details of Apple's invention, see patent application 20210281965. One of the listed inventors is Sreeneel Maddika, a Siri Speech Engineer.
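The decision described for FIGS. 9A-C can be sketched as a simple threshold check. This is a hedged illustration, not Apple's implementation: the -60 dB threshold comes from the example above, but the `decide` function, the `Action` names, and especially the 3 dB uncertainty margin used to model the "unsure" case of FIG. 9C are hypothetical. The sketch also assumes the ratio has already been measured in the 1.5 kHz to 3.5 kHz sub-band.

```python
from enum import Enum


class Action(Enum):
    RESPOND = "respond"
    WAIT_FOR_TRIGGER = "wait for 'Hey Siri'"
    IGNORE = "ignore"


def decide(drr_db, threshold_db=-60.0, uncertainty_margin_db=3.0):
    """Map a sub-band DRR measurement (in dB) to an assistant action.

    threshold_db follows the example in the article; uncertainty_margin_db
    is an assumed value chosen only to illustrate the "unsure" case.
    """
    if drr_db > threshold_db + uncertainty_margin_db:
        return Action.RESPOND           # user is likely facing the device
    if drr_db < threshold_db - uncertainty_margin_db:
        return Action.IGNORE            # user is likely facing away
    return Action.WAIT_FOR_TRIGGER      # too close to call: wait for the trigger phrase


clearly_facing = decide(-40.0)
clearly_away = decide(-80.0)
borderline = decide(-60.0)
```

With these assumed values, a measurement well above the threshold leads to a response, one well below it is ignored, and one near the threshold leaves the assistant waiting for the trigger phrase.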