There are several chatbots available, but which is best, and for what task? We tested Google’s Bard, Microsoft’s Bing, and OpenAI’s ChatGPT on a variety of questions, including typical requests like holiday advice, gaming suggestions, and mortgage calculations.
This is obviously not a comprehensive list of these systems’ capabilities (AI language models are, in part, characterized by their undiscovered capacities, a trait known as “capability overhang” in the AI world), but it does give you an indication of their respective advantages and disadvantages.
You can (and should) read through our questions, analyses, and conclusions below, but we’ll cut right to the chase: ChatGPT has the best verbal dexterity, Bing is the best at retrieving information from the web, and Bard is doing its best. (It is actually pretty surprising how limited Google’s chatbot is in comparison to the other two.)
Before we get started, though, a few programming notes. First, on ChatGPT, we used OpenAI’s most recent model, GPT-4. This is the same AI model that drives Bing, yet the responses the two platforms provide are very different. Most significantly, Bing has additional capabilities: it can generate images, browse the web, and provide sources for its answers (which is a super important attribute for certain queries). However, as we were wrapping up this story, OpenAI announced plug-ins for ChatGPT that will let the chatbot also obtain real-time data from the internet. That will greatly expand the system’s capabilities and make it much more similar to Bing. We were unable to test this feature since it is currently only accessible to a small number of people.
It’s also critical to keep in mind that AI language models are, to put it mildly, fuzzy. They are not deterministic systems like regular software; they generate responses based on statistical regularities in their training data. This means you won’t always get the same response if you ask them the same question. It also means that the way you phrase a question can influence the response, so we followed up on several of these questions to get better answers.
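For readers who want to see why the same question can yield different answers, here is a minimal Python sketch of weighted random sampling, the general idea behind how a language model picks its next word. It is a toy illustration, not how any of these chatbots is actually implemented: the function name `sample_next_word`, the `temperature` parameter, and the word scores are all hypothetical values we made up for the example.

```python
import math
import random

def sample_next_word(scores, temperature=1.0, rng=random):
    """Pick one word at random, weighted by softmax(score / temperature)."""
    words = list(scores)
    scaled = [scores[w] / temperature for w in words]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(words, weights=weights, k=1)[0]

# Hypothetical scores a model might give candidate next words
# (invented for illustration, not taken from any real chatbot).
toy_scores = {"Paris": 2.0, "Rome": 1.5, "Lisbon": 1.0}

# Asking the "same question" five times can give different picks.
for _ in range(5):
    print(sample_next_word(toy_scores))
```

Running the loop several times will usually print a different mix of cities, because each pick is a weighted random draw. In this sketch, lowering `temperature` toward zero makes the top-scoring word win almost every time, while raising it spreads the picks more evenly.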
The clear loser is ChatGPT (GPT-4), which is not surprising given that its training data mostly ends in 2021 and Elden Ring was released the following year. The instruction to “stop her counterattacks” is the exact opposite of what you should do, and the entire list reads like it was written by a student who didn’t do the assigned reading for English class, which, in essence, is what it is. None of these answers impresses me, but this one strikes me as particularly insulting.