Apple to Advance iPhone with both Text-to-Speech & Speech-to-Text Converters
According to a new Apple patent application published this morning by the US Patent and Trademark Office, future iPhones are likely to give end users effective new ways of communicating both in noisy environments like a restaurant and in quiet settings such as an office meeting, without making a sound. The system involves new text-to-speech and speech-to-text converters, as well as a means of sending prerecorded notifications to the caller if you're unable to speak when answering your phone. I think that many will appreciate these new features, and I only hope that Apple can get them to market in good time.
Problem One: Communicating in Noisy Environments
A smartphone user may sometimes have to make or answer a phone call in a noisy environment. Noise could interfere with a phone conversation to a degree that the conversation is no longer intelligible to either conversing party. A user in the noisy environment may try to scream into the phone over the noise, but the screaming and the noise may render the voice signal unintelligible at the other end.
For example, a user may be talking on the phone in a busy restaurant. The user may not be able to shout loud enough into the phone to cover the noise in the restaurant. The user may not even be able to hear when the other end is talking. The noise may render the conversation unintelligible and may lead to a termination of the telephone conversation.
Problem Two: Unable to Communicate During a Meeting
In another scenario, it may be inconvenient for a user to talk on a phone. For example, users may be in a meeting and may not want to draw attention to themselves by speaking into the phone. The users may try to whisper into the phone, but the whispering may render the conversation unintelligible. The users may choose to send a text message to the other party, but the other party may be on a landline where texting is unavailable, or may not have a texting plan. It could be frustrating to conduct a telephone conversation when the environment is noisy or the circumstances make it inappropriate for a user to speak.
Apple's Solutions
Converting Speech to Text & Prerecorded Notifications
One embodiment of the invention is directed to an iPhone which establishes an audio connection with a far-end user via a communication network. The communication device receives text input from a near-end user, and converts the text input into speech signals. The speech signals are transmitted to the far-end user using the established audio connection while muting audio input to its audio receiving component.
In one embodiment, the communication device detects the noise level at the near end. When the noise level is above a threshold, the communication device could automatically activate, or prompt the near-end user to activate, text-to-speech conversion at any point of a communication such as a phone call. Alternatively, the communication device may play back a pre-recorded message to inform the far-end user of the near-end user's inability to speak due to the excessive noise at the near end.
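To make the threshold idea concrete, here is a minimal Python sketch. It is not taken from the patent: the RMS-in-dBFS measure, the -20 dB threshold, and the names noise_level_db and choose_action are all assumptions made for illustration.

```python
import math
from typing import Sequence

# Hypothetical threshold in dB relative to full scale (not from the patent).
NOISE_THRESHOLD_DB = -20.0


def noise_level_db(samples: Sequence[float]) -> float:
    """Return the RMS level of samples (each in [-1.0, 1.0]) in dBFS."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")


def choose_action(
    samples: Sequence[float],
    threshold_db: float = NOISE_THRESHOLD_DB,
    auto_activate: bool = False,
) -> str:
    """Decide what the phone does based on the near-end noise level."""
    if noise_level_db(samples) <= threshold_db:
        return "normal_call"
    # Above the threshold: either switch on text-to-speech automatically
    # or prompt the near-end user to switch it on.
    return "activate_text_to_speech" if auto_activate else "prompt_user"
```

A real handset would measure noise over a sliding window of microphone frames rather than a single list of samples; the single list just keeps the sketch short.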
In another embodiment, the near-end user can activate text-to-speech conversion whenever necessary regardless of the detected noise level. The near-end user could enter a text message, which is converted into speech signals for transmission via the established audio connection to the far-end user.
In yet another embodiment, the communication device could also perform speech-to-text conversion to convert the far-end user's speech into text for display on the communication device. This feature could be used when the far-end communication device cannot, or is not enabled to, send or receive text messages. The speech-to-text conversion and the text-to-speech conversion could be activated at the same time, or could be activated independent of each other. The far-end communication device communicates with the near-end communication device in audio signals, regardless of whether the speech-to-text conversion or the text-to-speech conversion is activated.
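The independence of the two conversions can be pictured as two separate switches on the near-end phone. The Python sketch below is only an illustration under that reading: the ConversionState class and its method names are hypothetical, and showing a transcript in place of speaker output is an assumption, since the summary does not say whether the speaker stays on.

```python
from dataclasses import dataclass


@dataclass
class ConversionState:
    """Two independent switches on the near-end phone (hypothetical model)."""

    text_to_speech: bool = False
    speech_to_text: bool = False

    def outgoing_source(self) -> str:
        # What feeds the audio connection toward the far end.
        return "synthesized_speech" if self.text_to_speech else "microphone"

    def incoming_presentation(self) -> str:
        # How the far-end user's speech reaches the near-end user.
        # Assumption: the transcript replaces speaker output.
        return "transcript" if self.speech_to_text else "speaker"

    def link_format(self) -> str:
        # The far end always exchanges audio, whatever is switched on.
        return "audio"
```

Whatever combination is chosen, link_format stays "audio", which mirrors the point that the far-end device never needs texting support.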
The Proposed Communications System
Apple's patent FIG. 1 is a diagram illustrating a communication environment in which a near-end communication device (e.g., a near-end phone 100) is engaged in, or about to be engaged in, a communication (e.g., phone call) with a far-end communication device (e.g., a far-end phone 98) via a communication network (e.g., wireless network 120). The term "communication device" broadly refers to various real-time communication devices, e.g., landline telephone system (POTS) end stations, voice-over-IP end stations, cellular handsets, smart phones, computing devices, etc.
In one embodiment, the microphone (113) could be used to monitor the noise level in the environment surrounding the near-end phone 100. In an alternative embodiment, a separate microphone could be used to monitor the environmental noise. A noise meter (152) may be shown on the display screen to indicate the detected noise level. The noise meter may be shown when a phone call is made or received, when the noise level reaches the vicinity of a pre-determined threshold, or as long as the near-end phone is powered on. The noise level may be indicated by the noise meter by colors, numeral values, height or length of a bar indicator, etc.
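As a rough illustration of how a noise meter could map a level to a bar length and a color, consider the Python sketch below. The dB range, the 6 dB "vicinity" margin, the color names, and the function name noise_meter are assumptions made for this example, not details from the patent.

```python
from typing import Tuple


def noise_meter(
    level_db: float,
    threshold_db: float = -20.0,  # hypothetical threshold
    floor_db: float = -60.0,      # hypothetical bottom of the meter
    width: int = 10,
) -> Tuple[str, str]:
    """Return (bar, color) for a noise level given in dBFS."""
    # Map floor_db..0 dBFS onto 0..1 and clamp anything outside that range.
    fraction = (level_db - floor_db) / (0.0 - floor_db)
    fraction = min(max(fraction, 0.0), 1.0)
    filled = round(fraction * width)
    bar = "#" * filled + "-" * (width - filled)

    if level_db > threshold_db:
        color = "red"      # above the threshold
    elif level_db > threshold_db - 6.0:
        color = "yellow"   # in the vicinity of the threshold
    else:
        color = "green"
    return bar, color
```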
About Apple's patent FIG. 4: in response to the detection of the relative or particular noise level at the near end, the near-end phone displays a number of options for the user to choose from. The options may include: text-to-speech, two-way text, play (pre-recorded) message, and voicemail. The user may select one of these options using a physical button or a virtual button. In one embodiment, the near-end phone also displays the noise meter on the display screen to provide a visual indication of the environmental noise level at the near end.
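The four options could be modeled as a simple enumeration. This Python sketch is an assumption-laden illustration: the NoiseOption enum and the conversions_for helper are hypothetical names, and the mapping to conversions follows the descriptions of the text-to-speech and two-way text options in this report.

```python
from enum import Enum
from typing import Tuple


class NoiseOption(Enum):
    TEXT_TO_SPEECH = "text-to-speech"
    TWO_WAY_TEXT = "two-way text"
    PLAY_MESSAGE = "play (pre-recorded) message"
    VOICEMAIL = "voicemail"


def conversions_for(option: NoiseOption) -> Tuple[bool, bool]:
    """Return (text_to_speech_active, speech_to_text_active) for an option."""
    if option is NoiseOption.TEXT_TO_SPEECH:
        return (True, False)
    if option is NoiseOption.TWO_WAY_TEXT:
        return (True, True)
    # Playing a message or sending the call to voicemail needs no conversion.
    return (False, False)
```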
About Apple's patent FIG. 5: if the near-end user selects the text-to-speech option, the display may show "TEXT TO SPEECH" to indicate that the text-to-speech conversion has been activated. The near-end user may use a physical keyboard or a virtual keyboard to input text messages. The display also shows an outgoing message area that displays the text entered by the near-end user. As the near-end user inputs the text, the text-to-speech converter automatically converts the text into speech. The near-end phone transmits the converted speech signal to the far-end user, utilizing the audio connection that has already been established between the near-end user and the far-end user.
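The text-to-speech flow can be sketched in a few lines of Python. Everything here is hypothetical: the TextToSpeechCall class, its attribute names, and the fake_synthesize stand-in, which simply encodes the text as bytes in place of a real speech engine. The microphone muting reflects the embodiment that mutes audio input while synthesized speech is transmitted.

```python
from typing import Callable, List


class TextToSpeechCall:
    """Hypothetical model of a call with text-to-speech activated."""

    def __init__(self, synthesize: Callable[[str], bytes]) -> None:
        self.synthesize = synthesize
        self.microphone_muted = False
        self.outgoing_message_area: List[str] = []  # text shown on screen
        self.sent_audio: List[bytes] = []           # audio sent to the far end

    def activate(self) -> None:
        # Mute the microphone so ambient noise is not sent with the speech.
        self.microphone_muted = True

    def send_text(self, text: str) -> None:
        if not self.microphone_muted:
            raise RuntimeError("activate text-to-speech before sending text")
        self.outgoing_message_area.append(text)
        # Reuse the already established audio connection for the speech.
        self.sent_audio.append(self.synthesize(text))


def fake_synthesize(text: str) -> bytes:
    """Stand-in for a real speech synthesizer (assumption for the sketch)."""
    return text.encode("utf-8")
```

In use, a caller would run activate() once and then send_text() for each typed message; the far end only ever receives audio.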
About Apple's patent FIG. 6: if the user selects the two-way text option, the display may show "TWO-WAY TEXT" to indicate that both the text-to-speech and speech-to-text conversions have been activated. The near-end user may use a physical keyboard or a virtual keyboard to input text messages. The display shows an incoming message area 612 for displaying the text converted from the far-end user's speech, and an outgoing message area 613 for displaying the text entered by the near-end user. The established audio connection carries two-way voice signals between the near-end and the far-end users. The conversions from text to speech and from speech to text are performed by the near-end phone. The far-end user could speak to the far-end phone in the same way as in a normal telephone conversation that does not involve text messages.
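A two-way text session can be pictured as two message areas fed by two converters that both run on the near-end phone. The Python sketch below is illustrative only: TwoWayTextSession, toy_synthesize, and toy_recognize are hypothetical names, and the toy converters merely encode and decode UTF-8 in place of real speech engines.

```python
from typing import Callable, List


class TwoWayTextSession:
    """Hypothetical model of the two-way text screen."""

    def __init__(
        self,
        synthesize: Callable[[str], bytes],
        recognize: Callable[[bytes], str],
    ) -> None:
        self.synthesize = synthesize
        self.recognize = recognize
        self.incoming_area: List[str] = []  # text converted from far-end speech
        self.outgoing_area: List[str] = []  # text typed by the near-end user

    def on_far_end_audio(self, audio: bytes) -> None:
        # Speech-to-text happens on the near-end phone.
        self.incoming_area.append(self.recognize(audio))

    def on_near_end_text(self, text: str) -> bytes:
        # Text-to-speech also happens on the near-end phone; the returned
        # audio is what travels over the established audio connection.
        self.outgoing_area.append(text)
        return self.synthesize(text)


def toy_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


def toy_recognize(audio: bytes) -> str:
    return audio.decode("utf-8")
```

Both directions stay audio on the wire, so the far-end phone needs no changes at all.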
Apple credits Baptiste Paquier, Aram Lindahl and Phillip Tamchina as the inventors of patent application 20110111805, originally filed in Q4 2009.
Related Patent: Phone Hold Mechanism
A second iPhone patent application published today is related to the one noted above and covers a new incoming call hold mechanism. According to Apple, the iPhone will hold an incoming call for a user when the user is temporarily unavailable to pick up the call. In response to an incoming call signal and an indication from the user to hold the call, the iPhone will answer the call and play back a pre-recorded message to the caller while holding the call.
The call could also be held until the user picks up the call. If the user is on another call when the incoming call arrives, the iPhone answers the incoming call with a pre-recorded message and holds the incoming call, as well as concurrently maintaining uninterrupted communication on the in-progress call. The user could also enter an estimated hold time, which is announced to the caller.
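The hold behavior can be sketched as a small state model in Python. This is a hypothetical illustration, not Apple's design: the HoldingPhone class, the wording of the message, and the rule that the in-progress call must end before a held call is picked up are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class HoldingPhone:
    """Hypothetical model of the incoming call hold mechanism."""

    active_call: Optional[str] = None
    held_calls: List[str] = field(default_factory=list)

    def hold_incoming(
        self, caller: str, estimated_minutes: Optional[int] = None
    ) -> str:
        """Answer and hold an incoming call; return the message played."""
        message = "Please hold. I will pick up shortly."
        if estimated_minutes is not None:
            message += f" Estimated hold time: {estimated_minutes} minutes."
        # The in-progress call, if any, is left untouched.
        self.held_calls.append(caller)
        return message

    def end_active_call(self) -> None:
        self.active_call = None

    def pick_up(self, caller: str) -> None:
        # Assumption: only one call can be active at a time.
        if self.active_call is not None:
            raise RuntimeError("finish the in-progress call first")
        self.held_calls.remove(caller)
        self.active_call = caller
```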
Apple credits Craig Pietrowx as the sole inventor of patent application 20110111735.
Notice: Patently Apple presents only a brief summary of patents with associated graphic(s) for journalistic news purposes as each such patent application is revealed by the U.S. Patent & Trademark Office. Readers are cautioned that the full text of any patent application should be read in its entirety for further details. Patents shouldn't be digested as rumors or fast-tracked according to rumor time tables. Apple patents represent true research that could lead to future products and should be understood in that light. About Comments: Patently Apple reserves the right to post, dismiss or edit comments.
very very cool, useful information
Posted by: dred | May 17, 2011 at 07:26 PM
FINALLY, I was wondering why nobody introduced software like this before. Wake up companies... this shouldn't have taken so long!!
Posted by: Jörg | May 13, 2011 at 12:24 PM
@ Michel, what a great idea about providing subtitles. I have French relatives and I don't speak much French. Would I appreciate subtitles - yes of course. Maybe someone at Apple will take up that challenge. Cheers - Michel!
Posted by: Jack Purcher | May 13, 2011 at 11:13 AM
This is very cool. Speaking many languages (English being my 3rd language), it is sometimes difficult to hear/understand people (or be understood) in noisy locations. I think the next step is to provide this feature on video conferences (re: offshore/remote teams) and provide subtitles and automated translation. Sounds like a Star Trek episode but I can definitely see business value in this.
Posted by: Michel R. | May 13, 2011 at 10:55 AM