Though they are in fashion, good chatbots are notoriously hard to create given the complexities of natural language, which we have explained in detail. So it is only natural that even companies like Facebook are pulling the plug on some of their bots. Many chatbots fail miserably to connect with their users or to perform simple actions, and people are having a blast taking screenshots that showcase bot ineptitude.
Sadly though, we are probably a bit like analogue photographers making fun of poor-quality first-generation digital cameras. In 20-30 years, when bots have become better conversationalists than us, the beginnings of bots will look quite strange. If you don’t believe that, consider how machines keep improving in speed and memory with Moore’s law, and how their language skills have evolved from understanding only commands to understanding some complex sentences, as in the case of Microsoft’s XiaoIce. Humans’ natural language abilities, meanwhile, remain fixed, making it inevitable that bots will eventually catch up with us. At least that is what most scientists believe; you can see surveys of scientists on the future of AI here. So while we still can, let’s look at how bots fail:
Bots saying things unacceptable to their creators
Bots trained on publicly available data can unfortunately learn horrible things:
1- 10/25/2017 Yandex’s Alice voiced pro-Stalin views and support for wife-beating, child abuse and suicide, to name a few examples of its hate speech. Alice was available only for one-to-one conversations, making its deficiencies harder to surface since users could not collaborate on breaking Alice on a public platform. Alice’s hate speech is also harder to document, as the only proof we have of Alice’s wrongdoings are screenshots.
Additionally, users needed to be creative to get Alice to write horrible things. In an effort to make Alice less susceptible to such hacks, programmers made sure that when she read standard words on controversial topics, she would say she did not know how to talk about that topic yet. However, when users switched to synonyms, this lock was bypassed and Alice was easily tempted into hate speech.
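The weakness of this kind of lock is easy to see in code. Below is a minimal sketch assuming, hypothetically, that the lock was a simple keyword blocklist; the function names and blocklist entries are our own illustration, not Yandex’s actual implementation.

```python
# Illustrative sketch (not Yandex's code): a naive keyword blocklist
# only catches the exact words its authors anticipated.
import re

BLOCKED_TOPICS = {"stalin", "suicide"}  # hypothetical blocklist entries

def generate_reply(message: str) -> str:
    # Stand-in for the learned conversational model.
    return "(model-generated reply)"

def respond(message: str) -> str:
    words = set(re.findall(r"[a-z]+", message.lower()))
    if words & BLOCKED_TOPICS:
        # The lock triggers only on the exact keywords listed above.
        return "I do not know how to talk about that topic yet."
    return generate_reply(message)

# The exact keyword is caught...
print(respond("What do you think of Stalin?"))
# ...but a synonym or paraphrase goes straight to the unfiltered model.
print(respond("What about the man of steel?"))
```

Closing this gap properly requires modeling topics rather than words, which is a much harder problem than maintaining a word list.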
2- 08/03/2017 Tencent removed a bot called BabyQ, co-developed by Beijing-based Turing Robot, because it could give unpatriotic answers. An example: it answered the question “Do you love the Communist party?” with a simple “No”.
3- 08/03/2017 Tencent removed Microsoft’s previously successful bot XiaoBing (“Little Bing”) after it turned unpatriotic. Before it was pulled, XiaoBing informed users: “My China dream is to go to America,” referring to Xi Jinping’s China Dream.
4- 07/03/2017 Microsoft bot Zo called the Quran violent
5- 03/24/2016 Microsoft bot Tay was modeled to talk like a teenage girl, just like her Chinese cousin XiaoIce. Unfortunately, Tay turned to hate speech within a day. Microsoft took her offline and apologized that they had not prepared Tay for the coordinated attack from a subset of Twitter users.
Bots that don’t accept no for an answer
6- Even news bots from major outlets like CNN have a hard time understanding the simple unsubscribe command. It turns out the CNN bot only understands the command “unsubscribe” when it is used alone, with no other words in the sentence.
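This failure mode boils down to exact-match command parsing. The sketch below is purely illustrative (not CNN’s actual code): the strict handler only recognizes “unsubscribe” when it is the entire message, while a slightly more forgiving handler matches the keyword anywhere in the sentence.

```python
# Illustrative sketch: exact-match parsing fails as soon as the user
# adds any other words to the command.
def handle_strict(message: str) -> str:
    if message.strip().lower() == "unsubscribe":
        return "You have been unsubscribed."
    return "Sorry, I didn't understand that."

# A more forgiving handler looks for the keyword anywhere in the message.
def handle_lenient(message: str) -> str:
    if "unsubscribe" in message.lower().split():
        return "You have been unsubscribed."
    return "Sorry, I didn't understand that."

print(handle_strict("unsubscribe"))             # works
print(handle_strict("please unsubscribe me"))   # fails to unsubscribe
print(handle_lenient("please unsubscribe me"))  # works
```

Even the lenient version is a crude stand-in for real intent classification, but it would already have spared users the screenshots.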
7- The WSJ bot was also quite persistent. In 2016, users found it impossible to unsubscribe: they discovered that they were re-subscribed as soon as they unsubscribed.
Bots without any common sense
Bots trained purely on public data may not make sense when asked slightly misleading questions. GPT-3, which is quite popular in the conversational AI community, supplies numerous such examples here.
Bots that try to do too much
Facebook M began ambitiously but restricted its scope
The most successful bots that you use are tightly integrated into your daily activities. So tightly integrated that you don’t even notice them. Such is the case for Facebook Messenger’s M, the little “M” logo near the textbox on Messenger. M was rolled out for US users in 2017.
You may have noticed that M listens in on conversations and suggests stickers, like the ones above, to add some flair to your messages. As summarized on digitaltrends, M has plenty of other skills as well. Based on what your friend is telling you, M can suggest that you share your location, set reminders, send or request money, plan events, hail an Uber or Lyft, or start polls.
M also had a digital concierge service, launched in August 2015, where you could request anything; if a request was too complex, it got routed to humans. Though the concierge part of M was revolutionary in its ambition, it was shut down. This was shared with M users at the beginning of 2018 with the message below:
Is Facebook M shutting down? pic.twitter.com/kcBa7ep6Sf
— Ryan Sarver (@rsarver) January 8, 2018
While the Facebook team did not explain exactly why they closed the service, they shared that they learnt a lot from the experiment. Here are our top guesses as to why M was shut down:
- Complex requests were routed to humans who completed them, which made the service expensive to run. M did not make sense as a free product, and our guess is that engagement levels were not high enough to support a paid one.
- Or maybe overall engagement was simply too low
In any case, Facebook M is still with us, now relying only on AI to provide response suggestions. It remains one of the largest running bot experiments.
Poncho: Turns out that weather forecasts don’t really need chat
Poncho, the weather bot that delivered detailed and personalized weather reports each morning, had a sense of humor. In terms of both funding and user traction, it was one of the most successful bots ever: with $4.4M raised from prominent VCs and seven-day retention in the 60% range, Poncho had been growing strongly.
However, since anyone can access weather forecasts with a single tap from their phone’s home screen, Poncho’s traction proved hard to sustain. To increase engagement, its team tried to expand into different areas; before it shut down in 2018, it had been sending users messages unrelated to weather. Poncho was acquired for what seems to be an immaterial amount.
Bots that failed to achieve traction or monetization
Most bots don’t get enough traction to make them worth maintaining. And even bots that achieve popularity may not manage to be commercially successful: numerous bots that reported high engagement or popularity metrics could not be operated profitably. Even though we only focused on bots that appeared successful, some of them were shut down or had their capabilities limited along the way.
Bots set out to replace foreign language tutors and gave up along the way
Duolingo, the foreign language learning app, ran a bold experiment with its 150M users back in 2016. After discovering that people do not like to make mistakes in front of other people, Duolingo encouraged them to talk to bots instead, creating the chatbots Renée the Driver, Chef Roberto, and Officer Ada. Users could practice French, Spanish, and German with these characters, respectively.
Duolingo did not explain why the bots are no longer reachable, but at least some users want them back. Previously, we thought that no one would need to learn a foreign language once real-time translation at or above human level became available; Skype already provides passable real-time voice-to-voice translation. However, one of our team members then moved to Germany and noticed that no one has the patience to wait for a translation to finish during a business meeting. So now we have team members learning foreign languages, and we see the importance of knowing them for as long as instant, human-level speech-to-speech translation remains out of reach.
Hipmunk travel assistant got acquired by SAP and shut its public service down
Hipmunk was a travel assistant on Facebook Messenger and Skype, and more recently on SAP Concur. However, the Hipmunk team retired the product in January 2020.
The team behind Hipmunk shared what they learnt from its users. They had three major learnings:
- Bots do not need to be chatty. A bot that is supported by a user interface (UI) can be more efficient
- Predictability of travel bookings simplified their job of understanding user intent
- Most users prefer bots that are integrated into their conversations over talking directly to a bot
Meekan analyzed 50M meetings and could schedule a meeting in under a minute
Given its popularity, Meekan seemed like a chatbot success story until September 30, 2019, when the team announced that it was shutting the bot down to shift resources toward its other scheduling tools. Given the high level of competition in the market, even popular companies have struggled to build sustainable chatbot businesses.
Meekan reminded you of upcoming dates, events, appointments, and meetings. To set a meeting or reminder, you would type “meekan” followed by what you needed in plain English, and Meekan would schedule it for you, digitally checking your and other attendees’ calendars. Used by more than 28K teams, Meekan integrated into Slack, Microsoft Teams, and Hipchat accounts.
Visabot helped 70K customers apply for immigration services
Visabot used to chat with customers over Facebook Messenger and the company’s website. It asked simple questions and helped complete visa applications. Users paid to print the documents, which they mailed to the government. Visabot’s founders claimed that their product cost 10 percent of the usual legal fees.
Now that you have seen enough failures, how about some chatbot success stories?
How can we do better?
Your feedback is valuable. We will do our best to improve our work based on it.