Google Duplex: On the Leading Edge of Conversational AI

What are the implications of the advances made by Google Duplex?

Google is working to make smartphones, well, smarter, with Google Duplex. The feature is an artificial intelligence (AI)-powered voice agent that works as an add-on to Google Assistant. Smartphone users can, for instance, ask Assistant to book a restaurant table, and it’ll make the reservation, using Duplex to speak with whoever answers the phone at the restaurant.

While its current function may seem mundane, Duplex has significant implications. When Google Duplex makes a phone call, its voice sounds remarkably similar to a human’s, even injecting pauses and fillers like “um.”

“When people talk to each other, they use more complex sentences than when talking to computers,” says Yossi Matias, engineering vice president at Google. “The Google Duplex system is capable of carrying out sophisticated conversations and it completes the majority of its tasks fully autonomously, without human involvement.”

Google Duplex marks the next advancement in natural-sounding, fully autonomous AI assistants. According to one report, Duplex is available on Google’s own Pixel handsets and on every iPhone running iOS 10 or above. Users of other Android handsets can also use the voice agent if their OS is version 5.0 or above. The feature now allows users in 43 states to make restaurant reservations via Google Assistant.

A new threshold for AI-human interaction

With Duplex, Google has established a new threshold for AI-human interaction. The technology is able to understand complicated sentences, pauses, interruptions, and fast speech. This is “a major accomplishment, particularly from a linguistic perspective, and an affirmation of the importance of conversational UI,” says Dennis R. Mortensen, CEO of a company that uses AI for scheduling.

The natural-sounding human voice capability also enables Duplex to deliver other benefits. For example, the voice agent gives users with disabilities, such as impaired speech, the opportunity to make a smooth reservation at their favorite restaurant.

Duplex can also make the call, check which slots are available, and send a confirmation email. In addition to helping smartphone users, it can help businesses that don’t have their own online booking system increase reservations. This can have a considerable impact, as Google’s research reveals that 60 percent of small businesses that depend on customer bookings aren’t using an online reservation system.

Handling ethical concerns

Duplex is capable of managing reservations and appointments without any human intervention, and while that excites many AI experts, some are concerned about the ethical implications of the technology. Chief among those concerns is that Duplex could be seen as deceiving call recipients into thinking they are talking to a real person.

“For the person receiving the call, they are working for a business that doesn’t want to allow automated booking in some way,” says Chris Butler, chief product architect at IPsoft. “A great question would be whether businesses should be given the option of opting out of this type of interface. Sure, it would be potentially customer hostile but that is their choice on how to run their business.”

Google has responded to critics by explaining that it has designed Duplex with disclosure built in. Scott Huffman, a vice president of engineering for Google Assistant, told Bloomberg that one way the agent could reveal itself is by saying something such as “I’m the Google Assistant and I’m calling for a client.”

In one recent test of the software, Gizmodo confirmed that the AI voice agent does, in fact, identify itself as an automated assistant. “When one person asked if they were talking to a person or a computer, Duplex responded by identifying itself as an automated assistant,” the report noted.

Mining conversational AI for data insights

Developments like Duplex aren’t just changing the game for how people use devices and connect with businesses. They’re also changing the way vendors store and analyze data. Conversational data is quickly becoming the key driver for companies that want to get to know their audience and offer a better user experience. The most advanced tech companies are integrating conversational data with external systems and applying semantic analysis to meet the growing expectations for human-like conversations.

Microsoft, for instance, is using technology from its 2018 Semantic Machines acquisition to make its voice agents more natural and capable. Its latest social chatbot, XiaoIce, is the only other conversational AI that has voice-sense built in. The overarching goal is to use the power of machine learning to enable users to access, interact, and engage with information in a much more natural manner and with minimal effort.

“Microsoft has driven research and breakthroughs in the fundamental building blocks of conversational AI, such as speech recognition and natural language understanding, for more than two decades,” says David Ku, chief technology officer of artificial intelligence and research at Microsoft. “Combining Semantic Machines’ technology with Microsoft’s own AI advances, we aim to deliver powerful, natural and more productive user experiences that will take conversational computing to a new level.”

Analyzing the conversational data generated by robust voice agents will reap big rewards for companies. This is because when people communicate in a natural, more conversational tone, they’re providing information about more than just the words they’re speaking. Their feelings, views, preferences, and inclinations are all part of the conversation. But conversational data must be interpreted within its native context before an organization uses it for actionable insight.

Integrating local processing units with external systems is key to improving customer satisfaction and increasing personalization and business agility. Of course, as enterprises begin to reap the new benefits of conversational data, it will be crucial, as with all emerging elements of the Data Age, for them to develop the processes and IT infrastructure needed to share these data assets across their organizations.


About the Author:

John Paulsen
John Paulsen is a "Data for Good" advocate with nearly 20 years in the data storage industry. He's helped launch many industry firsts, including HAMR technology, 10K-rpm and 15K-rpm hard drives, drives designed specifically for video and for gaming, Serial ATA drives, fluid dynamic HDD motors, 60TB SSDs, and MACH.2 multi-actuator technology.