- It can feel like AI chatbots are everywhere, from dating apps to customer service.
- "Botshit" describes inaccurate or fabricated info produced by chatbots that humans use for tasks.
- Researchers who coined the term said the risks of botshit must be mitigated as more firms use AI.
When Jake Moffatt's grandmother died in 2022, he booked a flight to her funeral on Air Canada, hoping to use its bereavement travel discount.
Air Canada's customer service chatbot told Moffatt he could claim the discount after the flight. But the company later denied his request, saying the claim had to be filed before the flight.
In February, Canada's Civil Resolution Tribunal — an online platform to resolve disputes — ruled that Air Canada's chatbot had misled Moffatt and ordered the airline to compensate him for the discount.
The chatbot's false guidance to Moffatt is an example of "botshit": incorrect or fabricated information produced by chatbots that humans then use to complete tasks.
Researchers Ian P. McCarthy, Timothy R. Hannigan, and André Spicer coined the term in a paper published in January and a July 17 Harvard Business Review article.
Botshit is one example of how AI might worsen companies' customer service. As businesses adopt generative AI, the researchers said, employees must become more critical in vetting chatbot-generated responses.
AI is here to stay, but chatbots keep spewing botshit
The use of generative AI in the workplace has nearly doubled in the past six months, according to a survey of 31,000 global workers conducted between February and March by the research firm Edelman Data and Intelligence. The results were published on May 8 by Microsoft and LinkedIn.
What's more, 79% of business leaders said their companies must adopt AI to remain competitive.
According to economist Dan Davies, companies employ technology like AI and chatbots to streamline decision-making and optimize efficiency. Over the past decade, chatbots have proliferated as a customer service feature for businesses.
However, that can also lead to situations where no single employee is accountable when an algorithm goes awry.
For example, researchers in 2023 found that about 75% of ChatGPT's responses to drug-related questions were inaccurate or incomplete. When asked, ChatGPT also generated fake citations to support some of its inaccurate responses.
In January, a UK parcel company removed its new AI customer service chatbot after it swore at a customer.
And when Google rolled out its AI chatbot Gemini earlier this year, it produced historically inaccurate images of people of color. The company paused and then relaunched the chatbot's image-generation tool after public backlash.
In a February memo to employees, Google CEO Sundar Pichai said the chatbot's responses were "unacceptable" and the company had "got it wrong" when trying to use new AI.
McCarthy, Hannigan, and Spicer wrote in the July 17 article that businesses that carelessly use AI-generated information jeopardize their customer experience and reputation, and even risk legal liability.
"Managers and organizations are beginning to see an increasing array of new risks based on expectations and professional standards around the accuracy of information," the researchers wrote.
Still, they wrote that they believe AI provides opportunities for useful application "as long as the related epistemic risks are also understood and mitigated."
Black boxing and the risks of using AI
The biggest challenge associated with using AI chatbots for customer service is "black boxing," in which it becomes difficult to discern why an AI technology operates a certain way, according to the researchers.
McCarthy, Hannigan, and Spicer wrote that AI customer service chatbots can be improved through more rigorous and specific guidelines, guardrails, and restrictions on the available range of vocabulary and response topics.
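The kind of guardrail the researchers describe can be as simple as a topic filter wrapped around the model. Below is a minimal sketch in Python; the topic list, forbidden phrases, and `call_model` function are illustrative assumptions, not any vendor's actual product.

```python
# A minimal sketch of a customer-service guardrail: restrict the bot to
# approved topics and fall back to a human when a query, or the model's
# draft answer, falls outside them. All names here are hypothetical.

ALLOWED_TOPICS = {
    "baggage": ["bag", "baggage", "luggage"],
    "booking": ["book", "reservation", "ticket"],
}

# Claims the bot must never make, checked after generation.
FORBIDDEN_PHRASES = ["guarantee", "refund after the flight"]

def classify_topic(message: str) -> str | None:
    """Return the first allowed topic whose keywords appear in the message."""
    text = message.lower()
    for topic, keywords in ALLOWED_TOPICS.items():
        if any(word in text for word in keywords):
            return topic
    return None

def answer(message: str, call_model) -> str:
    topic = classify_topic(message)
    if topic is None:
        return "I can't help with that. Connecting you to a human agent."
    draft = call_model(message, topic=topic)
    # Post-generation check: block drafts containing disallowed claims.
    if any(phrase in draft.lower() for phrase in FORBIDDEN_PHRASES):
        return "I can't confirm that. Connecting you to a human agent."
    return draft
```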
However, the researchers argued that customer service is the least risky use of AI for businesses.
Tasks like healthcare safety procedures, complicated financial budgeting, and legal judgments are the cases where accuracy is most essential yet hardest to verify in real time, according to the researchers.
In 2023, a New York law firm was fined $5,000 after lawyers submitted a court brief containing false references produced by ChatGPT.
While general-purpose chatbots like ChatGPT are more susceptible to botshit, practice-specific chatbots that use retrieval-augmented generation, a technique that grounds a model's answers in retrieved source documents, are more promising, according to the researchers.
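At its core, retrieval-augmented generation means fetching relevant passages from a trusted document set and instructing the model to answer only from them. The sketch below illustrates the idea with a toy keyword-overlap retriever; real systems use vector embeddings, and `generate` here stands in for an LLM call.

```python
# A minimal sketch of retrieval-augmented generation, assuming a small
# in-memory policy-document store. The documents and scoring function
# are illustrative, not taken from any real airline or product.

POLICY_DOCS = [
    "Bereavement fares must be requested before travel begins.",
    "Checked baggage allowance is two bags of up to 23 kg each.",
]

def score(query: str, doc: str) -> int:
    """Toy relevance score: count of words shared by query and document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k most relevant policy documents for the query."""
    return sorted(POLICY_DOCS, key=lambda d: score(query, d), reverse=True)[:k]

def answer(query: str, generate) -> str:
    context = "\n".join(retrieve(query))
    # Constraining the model to the retrieved policy text is what makes
    # the response checkable against a known source.
    prompt = f"Answer using only this policy text:\n{context}\n\nQuestion: {query}"
    return generate(prompt)
```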
The researchers said that rigorous checking and calibration of the AI's output over time, including expert fact-checking of its responses, can mitigate risks.
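That kind of ongoing calibration can start with something as simple as sampling a share of replies for expert review and tracking the error rate over time. The sketch below assumes a 5% sampling rate and an in-memory review queue; both are arbitrary choices for illustration.

```python
# A minimal sketch of sampled human fact-checking of chatbot output.
import random

REVIEW_RATE = 0.05  # fraction of replies routed to human fact-checkers
review_queue: list[tuple[str, str]] = []
stats = {"reviewed": 0, "flagged": 0}

def log_reply(question: str, reply: str) -> None:
    """Randomly route a sample of replies into the human review queue."""
    if random.random() < REVIEW_RATE:
        review_queue.append((question, reply))

def record_review(is_accurate: bool) -> None:
    """Record an expert verdict on one sampled reply."""
    stats["reviewed"] += 1
    if not is_accurate:
        stats["flagged"] += 1

def error_rate() -> float:
    """Observed error rate among reviewed replies, tracked over time."""
    return stats["flagged"] / stats["reviewed"] if stats["reviewed"] else 0.0
```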
"Chatbots and other tools which draw on generative AI have great potential to significantly improve many work processes," the researchers wrote. "Like any important new technology, they also come with risks. With careful management, however, these risks can be contained while benefits are exploited."