How This Company Is Making Customer Service Calls Less Painful
Cogito’s algorithms read nonverbal cues better than you can. Welcome to the age of A.I.-driven empathy.
Joshua Feast has a distinct conversational tic: a nervous laugh that escapes when he's worried something he says might come across as self-serious or highfalutin. Like, for instance, when he mentions that he was the first New Zealander to be selected as a Fulbright Scholar in entrepreneurship, or tells you he once spent two weeks in a coma after contracting malaria in Indonesia, or says he was among the highest-scoring secondary-school students on a national math exam.
Feast grew up in the suburbs of New Zealand's capital, Wellington, where it's considered unwise to act like a big hairy deal. Kiwis enjoy nothing more than cutting a "tall poppy" down to size. So Feast, a slender, bespectacled 40-year-old, is studiously careful at all times to deflect credit and minimize his importance. "I'm not very comfortable talking about myself sometimes," he says, with appealing humility that stops just the right distance short of false modesty.
You can't help analyzing someone's personal affect after talking to Feast, because that's his business. Feast is co-founder and CEO of Cogito Corporation, a Boston-based software startup that uses artificial intelligence to measure and improve the quality of certain key conversations, such as sales and customer-service calls, in real time. Or, as Feast puts it, Cogito--which is Latin for "I think"--"helps people be more charming in conversation."
"Charm" might sound like a hard thing to quantify, and it is. But Cogito's other co-founder, MIT professor of media arts and sciences Alex "Sandy" Pentland, has spent the past 20 years doing just that. Through experiments in his Human Dynamics Lab, Pentland has shown how unconscious, nonverbal "honest signals"--he wrote a book with that title in 2008--including tempo, emphasis, and mimicry, influence the outcome of interactions like salary negotiations, team meetings, and romantic courtships. "Language is culturally something that's relatively new to humans, but before language we were already social beings," says Pentland, a polymath and futurist whose expertise extends to evolutionary psychology, artificial intelligence, and data science. "We're all sort of brainwashed to think it's about the words." Thanks to Pentland's work, we now know the mechanisms of human connection can, in fact, be boiled down to a set of mathematical equations. Those equations are what Cogito is built on.
Cogito occupies two floors of a decidedly unstylish office building in downtown Boston. Amid overflowing cubicles and faded yellow walls, I watched over the shoulder of product marketing manager Eli Orkin as he handled a Cogito-assisted simulated customer-service call from one of his colleagues, Channah Rubin. As Rubin attempted to tell a story about the problems she was having obtaining a new credit card, Orkin repeatedly cut her off, until a discreet slide-in notification appeared on his computer screen: "Frequent overlaps." Orkin overcorrected, letting Rubin ramble on, until he got a second prompt reading "Slow to respond." A long, somewhat condescending monologue earned him a "Speaking a lot" warning.
During the call, a color-coded meter in the corner of Orkin's screen offered a running gauge of how well it was going, shading to yellow and orange when he responded too abruptly or slowly and back to green when normal give-and-take was restored. "Conversations are like a dance," Feast explains. "You can be in sync or out of sync." Had Rubin become truly upset, Orkin would have seen an "Empathy" prompt, a cue to say something reassuring. But just acting upset wouldn't do it. No one at Cogito can fool the software, which analyzes hundreds of signals to tell if distress is real.
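To make the three prompts from the demo concrete, here is a minimal sketch of how turn-taking cues could be derived from timestamped speaker turns. It is an illustration only, not Cogito's method: the real system is described as analyzing hundreds of signals with deep learning. The `Turn` type, the `conversation_cues` function, and every threshold below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    speaker: str  # "agent" or "customer"
    start: float  # seconds from the start of the call
    end: float


def conversation_cues(turns, max_overlaps=2, max_latency=2.0, max_talk_share=0.7):
    """Return prompt strings based on simple turn-taking statistics.

    All thresholds are made-up values for illustration.
    """
    turns = sorted(turns, key=lambda t: t.start)

    overlaps = 0      # times the agent started before the customer finished
    latencies = []    # silent gaps before the agent answered, in seconds
    for prev, cur in zip(turns, turns[1:]):
        if prev.speaker == "customer" and cur.speaker == "agent":
            gap = cur.start - prev.end
            if gap < 0:
                overlaps += 1
            else:
                latencies.append(gap)

    # Total speaking time per party.
    talk = {"agent": 0.0, "customer": 0.0}
    for t in turns:
        talk[t.speaker] += t.end - t.start
    total = talk["agent"] + talk["customer"]

    cues = []
    if overlaps > max_overlaps:
        cues.append("Frequent overlaps")
    if latencies and sum(latencies) / len(latencies) > max_latency:
        cues.append("Slow to respond")
    if total > 0 and talk["agent"] / total > max_talk_share:
        cues.append("Speaking a lot")
    return cues
```

An agent who cuts in on the customer three times would trigger "Frequent overlaps" in this sketch, and one who waits four seconds before answering would trigger "Slow to respond". Telling real distress from acted distress, as the article describes, is far beyond counting rules like these.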
Cogito, which has 75 employees, counts among its clients three of the five largest U.S. health insurance firms, two of the five largest disability insurers, and some of the biggest credit card companies. Cogito also has a mental health care product, an app called Companion, that's used by nurses and social workers in private and Veterans Health Administration hospitals to flag patients showing signs of PTSD and depression. "Our dream would be to take advantage of this for a lot of important conversations," says Feast. "It could be negotiations, it could be meetings, it could be improving dating experiences." Anything, he says, where there's a need to "be more emotionally intelligent in real time."
A quintessentially Homo sapiens trait is empathy--the ability to intuit and be moved by others' emotions--and it can be crucial for persuading, consoling, or seducing. Yet the data is clear: We humans aren't the empathy prodigies we think we are. In the modern workplace, with its high attention demands, packed schedules, and long hours, we're pretty bad at it.
In cognitive psychology, assessing others' thoughts and feelings through nonverbal cues is called person perception. Some people do this easily. Others find it impossible. Most of us muddle along somewhere in the middle.
And when it comes to assessing our own person-perception skills, most of us are rank amateurs. "If I were to ask a whole group of people, 'On a scale of one to five, how good do you think you are at recognizing social signals in others?' nearly all would rate themselves a four or a five," says Feast. He's sympathetic to the deluded. "One of the big problems we have is we don't get much feedback on it in our daily lives. If you think I'm being kind of rude right now, you're probably not going to tell me." In the absence of feedback, it's hard to improve--and easy to think you don't need to.
That's where Cogito's promise lies. If it lays bare our limitations, it's only by showing how, for the first time, we have the tools to transcend them. In Feast's view, the pessimists who see A.I. making humans obsolete--those who worry over research that shows A.I. may put millions out of work over the next several years--are overlooking how much smarter and more productive and creative we can be with its help. How much more human it can make us, to put it bluntly. "In some ways, we think of ourselves as a cyborg company," Feast says, "helping humans be their best selves."
In New Zealand, there's a saying: "You can make anything from No. 8 wire"--sheep farmers use it for fences. "No. 8 wire mentality" is Kiwi shorthand for a can-do, scrappy spirit.
Feast's family embodied No. 8. His grandfather founded a large construction business, and his father and uncle ran a number of enterprises in real estate and development. Feast too felt the pull of building, albeit in a digital realm. He earned an undergraduate degree in computer engineering, and one of his first jobs was working as a tech consultant for Accenture. Among its clients was New Zealand's Department of Child, Youth, and Family, which faced two huge challenges. First, even the best social workers often lasted only three to five years before burning out, a phenomenon known as compassion fatigue. Second, below-average social workers had trouble improving, because their work was hard to quantify.
Feast was still ruminating on those lessons when, in 2005, he won that Fulbright Scholarship in entrepreneurship, which allowed him to study at MIT. One of Feast's courses there was a seminar consisting of guest lectures curated by Pentland. For the past few years, the shaggy-haired, hyper-energetic Pentland had been focusing his investigations on those unconsciously transmitted honest signals. Pentland and his team were interested in how two strangers adopt each other's physical postures and intonations as they grow comfortable together, and how a speaker using a steady pattern of emphasis leaves an audience with the sense that he or she is well informed.
Among Pentland's plaudits is this: He's considered a pioneer of wearable computing. (One of his grad students created Google Glass.) To conduct their inquiries, Pentland's team built a wearable device they called a "sociometer," a shoulder-mounted pack, roughly the size of an iPhone, whose sensors gathered data about speech and movements during interactions. In experiments, they demonstrated that those honest signals (the term, borrowed from evolutionary biology, refers to behaviors that are hard to fake) could be used to accurately predict the outcome of salary negotiations, group decisions, and speed dating.
As Feast learned about his professor's research, he realized that software built using Pentland's findings could help social workers see if they were building trust with their clients, or determine which psychiatric patients need emergency counseling. "I thought the potential was unbelievable if we could take it to market and make it a thing people could use," he says.
Pentland has helped found more than 20 startups based on his research, and he loved the idea. He'd grown up close with his grandmother, who ran Michigan's first inpatient institution for children with psychiatric problems. "I actually learned to read in a mental institution," he says. He signed on and helped recruit one of his former PhD students, Ali Azarbayejani, to be CTO.
But all they had were the theoretical underpinnings of a business--nothing close to a prototype, much less a product. It was what venture capitalists call a science project. "You've got all this academic research that can't move forward because the chasm between experimental results and a fundable technology is too wide, too risky," says Feast. Yet they were in luck. VCs hate throwing money down a hole without knowing how deep it is. But one institution didn't mind.
The Defense Advanced Research Projects Agency--Darpa--is an arm of the U.S. Defense Department that incubates emerging technologies. "It's like a venture capital firm without a profit motive," says Russell Shilling, a former Darpa program manager who now develops educational tech at a nonprofit called Digital Promise. "You're trying to think about what might be possible in 10 to 20 years and then create a program to build it in three to four." In 2010, Shilling was tasked with scouting out technologies that could be used to flag soldiers returning from Afghanistan and Iraq who might be showing early signs of PTSD. Feast's idea was a perfect fit, and Pentland's reputation helped their application secure an easy approval.
The technical challenge they had embarked on was indeed daunting, requiring models for turning speech, with all its nuances and inflections, into neatly labeled data that can be fed into machine-learning algorithms, which would then try to extract behavioral patterns from it. Looking for hints of emotional states in raw audio is an order of magnitude more difficult than speech recognition: A word has a beginning and an end. Clues to its meaning can be derived from the words around it. But signs of a speaker's depression might be scattered throughout a long conversation.
Cogito hit all of the mileposts in its Darpa proposal, and in 2012, it had Feast's dream product. Called Companion, it's an app nurses, psychologists, and social workers can use to monitor the psychological states of their patients, who record and upload audio diaries. It's been a useful tool. "This has enormous promise in changing the way we do mental health care as well as medical care," says David Ahern, director of behavioral informatics and e-health at Boston's Brigham and Women's Hospital. Ahern has been overseeing a three-year study involving more than 200 patients who suffer from physical and psychiatric disorders. Research has shown that medical patients who develop emotional health problems cost more to treat and respond less well to treatment; as a result, says Ahern, "it behooves the medical system to do a better job in detection and treatment of behavioral health because it drives outcomes and drives the costs." While his study's results aren't yet in, feedback from patients and clinicians has been positive. Some practitioners believe Companion has helped avert suicides.
But Feast quickly came to see it was a product with limited revenue potential. In the convoluted world of health care, insurers often pay more readily for a treatment, even an expensive and ineffective one, than for a preventive service. "Making a business purely around making people more well isn't always a good proposition," Feast says, delicately. There were much bigger markets to crack.
Companies have their own version of the person-perception problem. A big brand might have tens of thousands of employees handling customers' calls and complaints. Each of those contacts is an opportunity for the customer to form an impression, positive or negative. Yet each is handled by a customer-relations representative who--like all of us--probably overrates his or her skill at basic things like listening, demonstrating empathy, and establishing trust. Geeta Wilson, vice president of consumer experience at the health insurer Humana, says she lives in fear, as do many executives, of this "discrepancy between how we think of ourselves internally and how our customers think of us." She adds, "You don't want to be in a place where you're rating yourself better than your customer is."
But unlike us overconfident individuals, companies such as Humana put a lot of money and effort into closing that gap. They ask callers to stay on the line and take automated surveys. They send email questionnaires and conduct focus groups to calculate metrics like CSAT (customer satisfaction), NPS (Net Promoter Score--the willingness to recommend a brand to others), and VOC (voice of the customer, derived from surveys and focus groups). Calls to a contact center are recorded, and random samples of each employee's interactions are reviewed for quality.
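For readers unfamiliar with the metric, Net Promoter Score is conventionally computed from answers to a single 0-to-10 "how likely are you to recommend us?" question: the percentage of promoters (9 or 10) minus the percentage of detractors (0 through 6). The sketch below shows that standard calculation. The article does not describe how Humana runs its surveys, so the function name and sample ratings are purely illustrative.

```python
def net_promoter_score(ratings):
    """Standard NPS: percent promoters (9-10) minus percent detractors (0-6).

    `ratings` is a list of integer survey answers on a 0-10 scale.
    The result ranges from -100 to 100.
    """
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)


# Hypothetical survey answers: 5 promoters, 2 passives, 3 detractors.
sample = [10, 9, 8, 7, 6, 3, 10, 9, 2, 10]
print(net_promoter_score(sample))  # 20.0
```

Because passives (7 and 8) count toward the total but toward neither group, a handful of responses can swing the score sharply.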
These methods have shortcomings. Surveys and reviews are slow, and small sample sizes skew findings. "Every human assessment, it's always going to have some margin of error," says Wilson. "Even if you're really good at that as a company, you'll always be off."
In 2016, Humana tried something new. Wilson, who had heard about Cogito from Christopher Kay, Humana's chief innovation officer, offered 200 of her call-center associates as guinea pigs in a test of a new product. In a six-month pilot study, these associates took all of their calls using Cogito's real-time conversation-analysis tool. The results were hard to ignore: Customers whose calls were handled by those using the Cogito app reported a 28 percent higher NPS. Issue resolution improved by 6 percent, while average call time and escalations--when callers demand to speak with managers--both went down. Humana is now in the process of rolling out Cogito to thousands more of its customer-relations associates, and it's running a second pilot, this one of Cogito's application tailored for use in sales.
Call-center agents have a lot in common with social workers: They don't last either. The average call center's turnover runs from 30 to 45 percent annually. Agents also have compassion fatigue, but theirs builds up over hours, not years. Thanks to online self-service tools and chatbots, most easy queries never make it to a call center. The questions that do come in are often difficult and fraught with emotion.
What customers want, above all, is to feel that they're in good hands, says Douglas Kim, Cogito's chief revenue and customer-success officer. The sense that someone knows what he or she is talking about or cares about what you have to say is exactly the sort of thing conveyed primarily through nonverbal signals, Pentland's research has shown. But maintaining the behaviors that send those signals, such as answering questions without hesitation, gets increasingly hard as cognitive fatigue creeps in over the course of a shift; to excel, workers need something stronger than a cup of coffee. "I use an analogy," says Kim. "It's like when you drive a newer car and you have lane assist, blind spot monitoring, and collision avoidance. Those things aren't driving for you, but they are enhancing your awareness of the situation."
The technical challenges in making a tool for real-time call analysis introduced new layers of difficulty. The platform now had to make sense of the interplay of two voices, and do it quickly enough to allow for useful interventions. While such interventions are purposely minimal--Cogito is wary of increasing agents' cognitive load with additional distractions--the processing behind them is wildly complex. When Cogito's deep-learning algorithms listen in on a call, Feast says, "we're basically simulating having a bunch of people listen to that call and decide whether the customer is satisfied." A human intuitively knows the difference between two people interrupting each other in an argument and two people finishing each other's sentences because they're simpatico. Teaching software to make that distinction was a complicated task.
That difficulty is what attracted Scott Maxwell, managing partner of Boston-based venture capital firm OpenView Venture Partners. In a meta twist, OpenView uses software to perform a role once played by people: To scout under-the-radar startups worthy of investment, it has a tool that scours the internet for signs of rapid growth. Like pretty much every VC these days, Maxwell has been looking at many A.I. companies. But two things convinced him to lead Cogito's $20 million B round in 2016: the roots of its work, which stretched back 20 years and made him confident no one could easily replicate it, and its sales traction. "A lot of these companies have great promise, but not that many of them have converted that promise into great customer value," Maxwell says.
That Cogito helps companies keep customers happier--and, maybe soon, wins them more sales--makes it valuable. But Feast is proudest of what it does for those on what he calls "the frontlines"--call-center agents. Rather than being judged on a few calls arbitrarily selected from hundreds, they now have a score, rooted in science, attached to each one. Supervisors can slice and dice the data to provide coaching. This avenue toward improved performance gives workers, who typically earn $12 to $16 an hour, a sense of control and job security they've historically lacked, says Feast.
That sets Cogito apart from other work-force-management tools, which often treat employees as would-be malingerers to be policed. "It's really important for technologists to come up with something that's a win-win-win," he says. "It's easy to come up with something that's creepy. Really easy."
It is, which is why many wonder if systems like Cogito will displace human workers from fields in which they now seem indispensable. The Pew Research Center recently canvassed 1,400 technologists and futurists on how technology will transform the work force. A sizable contingent predicted there will be vastly fewer or no jobs for humans within a few decades. "Their basic thought is robotics and A.I. are going to weave their way into more and more skills that used to be human skills," says Lee Rainie, Pew's director of internet, science, and technology research. "They think this is a broad, systemic, global phenomenon that will potentially touch on every imaginable business."
Pentland, whose record as a prognosticator is as good as anyone's, believes A.I. will make people more employable, not less. He likens Cogito to the assistant a skilled negotiator brings along when making high-stakes deals. While the negotiator focuses on the terms of the deal, the assistant watches participants' body language and passes notes.
"Human-machine systems often beat the A.I.," he says. "The best chess players are not machines--they're humans and machines together. The humans do the strategy, the machines do the tactics."
Far from shrinking, employment in call centers stands at an all-time high of 5 million. Talkdesk founder Tiago Paiva predicts that figure will continue to rise as companies increasingly seek to excel at customer service. "If agents can solve problems faster, we might get to a point where you don't need as many agents," says Paiva, whose company makes call-center software. "But it's not going to be significant, because the customer is calling so much more now."
Then again, just as we humans overestimate our empathy, we may also overrate machines' rationality. "A.I. is really good at solving well-posed problems," Feast says. "But it's just a statistical equation that looks at bits of data. That's all it is. It doesn't have a model of the world. Humans are fantastic at asking questions and posing problems. Most things in human existence are a collaboration."
Murray Campbell agrees. He's a distinguished IBM engineer who was on the team that built Deep Blue, which, in 1997, became the first computer to defeat a world champion chess player, Garry Kasparov, in match play. A.I. systems have come a long way since, Campbell says, but "they still have gaps--blind spots where people are better. For now, and I believe for decades to come, these gaps will have to be filled by people." The kind of deep learning that powers Cogito is great at very rapid judgments--identifying faces in a photo or translating text--but it's a long way from mastering reasoning, intuition, or strategy. For the foreseeable future, Campbell says, "I think people will just start to perform their jobs at a higher level," thanks to A.I. tools.
In other words: We'll all be cyborgs.
Charm school tips from a computer
Joshua Feast believes artificial intelligence applications like Cogito will soon be available to aid in all the important conversations in your life--job interviews, first-date flirting, buying a house. Until then, though, take the advice of Cogito's algorithms for some insights you can use right now.
Lead with consistency
One of the "honest signals" that shapes almost any interaction is emphasis--the amount of energy with which a speaker delivers his or her message. In experiments, Sandy Pentland has shown that speakers who maintain a consistent level of emphasis are seen as steadfast in their motivation and single-minded in their focus. Such speakers are viewed by others as strong leaders. Variable emphasis is a sign of indecision or wishy-washiness.
Persuade with mimicry
When two people are in conversational harmony, they unconsciously mimic each other's words, intonations, and gestures, a result of so-called mirror neurons that fire in response to others' behaviors. Pentland's research has shown that the effect works in the other direction, too: A speaker who subtly mimics his or her conversation partner is rated as more interesting, honest, and persuasive.
Soothe with recognition
Cogito uses readings of vocal-cord tension to detect when a caller is becoming excessively agitated. When this happens, the agent sees an empathy cue and is trained to respond by simply acknowledging the caller's feelings. Just saying "You sound a little frustrated" can be enough to ratchet down the tension, says Feast. "Even if you say, 'No, I'm feeling fine,' you'll immediately give me a ton of social credit for recognizing and caring."