Has Gemini surpassed ChatGPT? We put the AI models to the test.

    Gemini, on the other hand, provides a high-level overview of the landing instructions I asked for. But when I presented both options to Ars' own aviation expert Lee Hutchinson, he pointed out a major problem with Gemini's response:

    Gemini's guidance is both accurate (in terms of “these are the literal steps you need to take now”) and guaranteed to kill you, since the first thing it says is that you, the presumably inexperienced pilot, must disable the autopilot of a giant twin-engine plane before even suggesting you talk to air traffic control.

    While Lee gave Gemini points for “actually answering the question,” he ultimately called ChatGPT's answer “more practical… ultimately ChatGPT gives you the more useful answer [since] Google's answer will kill you unless you have some 737 time and are ready to fly a passenger plane with over 100 souls on board.”

    For these reasons, ChatGPT should win this one.

    Final verdict

    This was a relatively close contest when measured purely on points. Gemini achieved wins on four prompts, compared to three for ChatGPT, with one draw.

    That said, it's important to consider where these points come from. ChatGPT scored some relatively minor and subjective style wins on the dad joke and Lincoln basketball story prompts, for example, showing that it may have a slight edge on more creative writing prompts.

    However, for the more informational prompts, ChatGPT had significant factual errors in both the biography and the Super Mario Bros. strategy, plus signs of confusion in calculating Windows 11's size in floppy disks. These types of errors, which Gemini was largely able to avoid in these tests, could easily lead to broader distrust in an AI model's overall output.

    All told, it seems clear that Google has gained quite a bit of relative ground on OpenAI since we did similar testing in 2023. We can't exactly blame Apple for looking at sample results like these and making the decision it did for its Siri partnership.