Anyone tempted by AI-powered chatbots like ChatGPT and Bard (wow, they can write essays and recipes!) eventually encounters what are known as hallucinations, the tendency of artificial intelligence to fabricate information.
The chatbots, guessing what to say based on information gleaned from all over the internet, can’t help but get things wrong. And when they fail, say by producing a cake recipe with wildly inaccurate flour measurements, it can be a real buzzkill.
But as mainstream tech tools continue to integrate AI, it’s crucial to come to grips with how to make the technology serve us. After testing dozens of AI products over the past two months, I’ve concluded that most of us are using the technology in a suboptimal way, largely because the tech companies gave us bad directions.
The chatbots are least helpful when we ask them questions and then hope that the answers they come up with on their own are true, which, unfortunately, is how they are designed to be used. But when the chatbots are instructed to use information from trusted sources, such as credible websites and research papers, they can perform useful tasks with a high degree of accuracy.
“If you give them the right information, they can do interesting things with it,” says Sam Heutmaker, founder of Context, an AI start-up. “But by itself, 70 percent of what you get isn’t going to be accurate.”
With the simple adjustment of directing the chatbots to work with specific data, they generated understandable answers and helpful advice. Over the past few months, that has turned me from a cantankerous AI skeptic into an enthusiastic power user. When I went on a trip with an itinerary that ChatGPT mapped out, things went well because the recommendations came from my favorite travel websites.
Directing the chatbots to specific high-quality resources, such as websites of reputable media outlets and academic publications, can also reduce the production and spread of disinformation. Let me share some of the approaches I used to get help with cooking, research, and travel planning.
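If you are curious what that adjustment looks like under the hood, the same grounding trick can be reproduced in a few lines of code: paste trusted text into the prompt and tell the model to answer only from it. Below is a minimal sketch, assuming the openai Python package (version 1.0 or later) and an OPENAI_API_KEY environment variable; the model name, system prompt, and question are all illustrative stand-ins.

```python
# A minimal sketch of "grounded" prompting: instead of letting the model
# answer from memory, we paste trusted source text into the prompt and
# instruct it to answer only from that text.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# In practice, this would be text copied from a source you trust,
# such as a recipe or an article from a site you already rely on.
source_text = """(paste the text of a trusted article or recipe here)"""

response = client.chat.completions.create(
    model="gpt-4",  # any capable chat model works here
    messages=[
        {
            "role": "system",
            "content": (
                "Answer using ONLY the source text provided by the user. "
                "If the answer is not in the source text, say you don't know."
            ),
        },
        {
            "role": "user",
            "content": f"Source text:\n{source_text}\n\n"
                       "Question: What does the source recommend, and why?",
        },
    ],
)

print(response.choices[0].message.content)
```

The instruction to admit ignorance matters: it gives the model a graceful exit instead of an incentive to hallucinate.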
Meal planning
Chatbots like ChatGPT and Bard can write recipes that look good in theory but don’t work in practice. In an experiment conducted by The New York Times Food Desk in November, an early AI model created recipes for a Thanksgiving menu with an extremely dry turkey and a dense cake.
I also encountered disappointing results with AI-generated fish recipes. But that changed when I experimented with ChatGPT plugins, which are essentially third-party apps that work with the chatbot. (Only subscribers who pay $20 per month for access to GPT-4, the latest version of the chatbot, can use plugins, which can be activated in the settings menu.)
In ChatGPT’s plugins menu, I selected Tasty Recipes, which pulls data from Tasty, BuzzFeed’s well-known food site. I then asked the chatbot to come up with a meal plan using recipes from the site, including fish dishes, ground pork, and vegetable sides. The bot presented an inspiring meal plan, including lemongrass pork banh mi, grilled tofu tacos, and all-in-the-fridge pasta; each meal suggestion included a link to a recipe on Tasty.
For recipes from other publications, I used Link Reader, a plugin that allowed me to paste a web link to generate meal plans using recipes from other credible sites like Serious Eats. The chatbot pulled data from the sites to make meal plans and told me to visit the websites to read the recipes. That took extra work, but it beat a meal plan the AI invented on its own.
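For readers who want to see the moving parts, a plugin like Link Reader boils down to three steps: fetch the page, reduce it to plain text, and hand that text to the model along with your request. Here is a rough sketch of that flow, assuming the requests, beautifulsoup4, and openai packages; the URL is a placeholder, and a real recipe site would deserve more careful parsing than this.

```python
# A rough sketch of what a link-reading plugin does behind the scenes:
# fetch a page you trust, strip the HTML down to text, and pass that
# text to the model together with your request.
import requests
from bs4 import BeautifulSoup
from openai import OpenAI

url = "https://www.example.com/some-trusted-recipe"  # placeholder URL
html = requests.get(url, timeout=10).text
page_text = BeautifulSoup(html, "html.parser").get_text(separator="\n")

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "system",
            "content": "Base your answer only on the page text the user provides.",
        },
        {
            "role": "user",
            "content": f"Page text:\n{page_text[:8000]}\n\n"
                       "Build a weeknight meal plan from the recipes on this "
                       "page, and point me back to the page for full instructions.",
        },
    ],
)
print(response.choices[0].message.content)
```

Truncating the page text (the `[:8000]` slice) is a crude way to stay within the model’s context limit; a real tool would split the page into chunks instead.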
Research
While researching an article about a popular video game series, I turned to ChatGPT and Bard to jog my memory about past games by asking them to recap the plots. They messed up important details about the games’ stories and characters.
After testing many other AI tools, I concluded that, for research, it was crucial to focus on data from reliable sources and to quickly check that data for accuracy. I finally found a tool that does just that: Humata.AI, a free web app that has become popular among academic researchers and lawyers.
The app lets you upload a document, such as a PDF, and from there a chatbot answers your questions about the material alongside a copy of the document, highlighting relevant parts.
In one test, I uploaded a research paper I found on PubMed, a government-run search engine for scientific literature. The tool produced a relevant summary of the lengthy document in minutes, a process that would have taken me hours, and I glanced at the highlights to make sure the summary was accurate.
Cyrus Khajvandi, a founder of Humata, which is based in Austin, Texas, developed the app when he was a researcher at Stanford and needed help reading dense scientific papers, he said. The problem with chatbots like ChatGPT, he said, is that they rely on outdated models of the web, so the data may lack relevant context.
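Humata doesn’t publish its internals, but the general recipe behind document Q&A tools like it is well established: split the document into chunks, compute an embedding for each chunk, retrieve the chunks most similar to the question, and answer from those alone (the retrieved chunks are also the natural candidates for highlighting). The sketch below is my guess at that general recipe, not Humata’s actual implementation; it assumes the pypdf, numpy, and openai packages, and the file name and model names are illustrative.

```python
# One plausible way a document-Q&A tool could work under the hood:
# chunk a PDF, embed each chunk, and answer questions from the most
# relevant chunks. This is a sketch of the general technique, not any
# particular product's implementation.
import numpy as np
from pypdf import PdfReader
from openai import OpenAI

client = OpenAI()

def embed(texts):
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

# 1. Extract and chunk the PDF text (one chunk per page, for simplicity),
#    skipping pages that yield no text.
reader = PdfReader("research_paper.pdf")  # placeholder file name
pages = [page.extract_text() or "" for page in reader.pages]
chunks = [t for t in pages if t.strip()]

# 2. Embed every chunk once, up front.
chunk_vectors = embed(chunks)

# 3. For each question, find the most similar chunks by cosine similarity...
question = "What were the study's main findings?"
q_vec = embed([question])[0]
scores = chunk_vectors @ q_vec / (
    np.linalg.norm(chunk_vectors, axis=1) * np.linalg.norm(q_vec)
)
top = np.argsort(scores)[-3:]  # indices of the three most relevant pages

# 4. ...and ask the model to answer from those chunks only.
context = "\n\n".join(chunks[i] for i in top)
answer = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "Answer only from the excerpts provided."},
        {"role": "user", "content": f"Excerpts:\n{context}\n\nQuestion: {question}"},
    ],
)
print(answer.choices[0].message.content)
```

Because the answer is built from specific retrieved passages, checking it is as simple as rereading those passages, which is exactly what the highlighting in an app like Humata encourages you to do.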
Trip planning
When a Times travel writer recently asked ChatGPT to put together an itinerary for Milan, the bot directed her, among other missteps, to a central part of the city that was deserted because it was an Italian holiday.
I had better luck when I asked for a vacation itinerary for me, my wife, and our dogs in Mendocino County, California. As with meal planning, I asked ChatGPT to incorporate suggestions from some of my favorite travel sites, such as Vox-owned Thrillist and the travel section of The Times.
Within minutes, the chatbot generated an itinerary of dog-friendly restaurants and activities, including a farm with wine and cheese pairings and a train to a popular hiking trail. This saved me several hours of planning, and most importantly, the dogs had a great time.
Bottom line
Google and OpenAI, which works closely with Microsoft, say they are working to reduce hallucinations in their chatbots, but we can already reap the benefits of AI by taking control of the data the bots rely on to come up with answers.
The main benefit of training machines with huge datasets is that they can now use language to simulate human reasoning, says Nathan Benaich, a venture capitalist who invests in AI companies. The important step for us, he says, is to pair that capability with high-quality information.