OpenAI released a much-anticipated update to the technology that powers ChatGPT on Tuesday, the latest salvo in an increasingly heated and fast-moving competition among tech powers to dominate artificial intelligence.
GPT-4, as the new version of OpenAI’s technology is called, boasts impressive performance improvements, highlighted by top-level scores the company said it achieved on a variety of standardized tests including the bar exam. But GPT-4 continues to suffer from some of the key limitations that have raised concerns about A.I., particularly its tendency to “hallucinate,” or invent facts and present them as truth.
OpenAI cautioned people to take “great care” when using GPT-4, saying its limitations create significant safety challenges.
“It is still flawed, still limited, and it still seems more impressive on first use than it does after you spend more time with it,” OpenAI CEO Sam Altman tweeted. In the lead-up to the announcement, Altman had set the bar low by suggesting people would be disappointed and telling his Twitter followers that “we really appreciate feedback on its shortcomings.”
OpenAI is at the forefront of a wave of excitement around so-called generative A.I. — a flavor of artificial intelligence that uses large language models, trained on vast quantities of data, to produce human-sounding responses to questions. In January, OpenAI struck a $10 billion partnership with Microsoft, which has begun incorporating the technology throughout its products, including the Bing search engine. Alphabet-owned Google has stepped up its A.I. activities in response to the business threat posed by OpenAI and Microsoft.
In a blog post on Tuesday, OpenAI described the distinction between GPT-3.5, the previous version of the technology, and GPT-4 as subtle when users are having a “casual conversation” with the technology. “The difference comes out when the complexity of the task reaches a sufficient threshold—GPT-4 is more reliable, creative, and able to handle much more nuanced instructions than GPT-3.5,” the post read.
This latest model was trained on Microsoft Azure and is multimodal, meaning it can accept image and text inputs and generate text. According to OpenAI, GPT-4 scores in the 88th percentile and above on the LSAT, SAT Math and SAT Reading and Writing exams, and it achieved a score that put it in the top 10% of test takers on a simulated bar exam (whereas GPT-3.5 ranked in the bottom 10%).
According to OpenAI, it is also harder for users to get GPT-4 to evade the guardrails that limit the ways it can be used.
But the recent A.I. hype and rapid growth have given some observers pause about whether responsible safety measures are being taken as companies rush to compete. Google touted its responsible A.I. efforts as it announced it was adding generative A.I. features to Google Docs, Sheets, and its other work tools. Meanwhile Microsoft, which backs OpenAI, laid off one of its responsible A.I. teams, Platformer reported Monday. While Microsoft still has an office governing A.I. initiatives, that office's role is to set high-level principles, frameworks, and processes, and it isn’t focused on safety and ethical checks.
While GPT-4 has been hotly anticipated within the tech industry, the product that OpenAI unveiled on Tuesday did not include some speculated-upon features, including the ability to create videos from text inputs. Some observers also criticized OpenAI’s lack of specific technical details about GPT-4, including the number of parameters in its large language model.
OpenAI has expanded access to business customers through its API service, and this version is part of that strategy. GPT-4 will be integrated into the products of several companies that OpenAI has partnerships with, including Stripe, Khan Academy, and Duolingo. Duolingo will offer a subscription tier that provides two A.I.-powered features. GPT-4 is initially being made available to a limited set of users through OpenAI’s $20-a-month subscription service. Users can also join a waitlist to get access to the product.
OpenAI is hosting a demonstration on YouTube on Tuesday. And for those who want to help make improvements, OpenAI is open-sourcing its framework for automated evaluation of A.I. model performance.