TECH NEWS – Elon Musk takes aim at OpenAI’s dominance with the release of the latest Grok 2 AI model.
Elon Musk’s AI company, xAI, has finally released its latest-generation AI model, Grok 2. Its founder’s ownership stakes in Tesla and X have helped xAI generate demand for its products and gain access to the expensive computing resources needed to train AI models. Today’s announcement follows Musk’s comments earlier this year, in which he promised that an updated model would arrive soon.
In addition to Grok, the world’s leading AI software products include Claude from Amazon-backed Anthropic, ChatGPT from Microsoft-backed OpenAI, Llama from Facebook’s parent company Meta, and Google’s Gemini. All of these offer AI features for general consumer and enterprise use cases, and the Grok 2 release covers both.
Do Elon Musk’s Grok 2 and Grok 2 mini have a significant advantage over OpenAI’s GPT-4 and Anthropic’s Claude?
xAI’s latest Grok release includes an early preview of Grok 2 and a smaller Grok 2 mini model. Both will be available to users on Musk’s X social media platform. Grok 2 was tested on the AI benchmark run by UC Berkeley’s Large Model Systems Organization (LMSYS) and was found to nearly match the performance of OpenAI’s GPT-4o.
According to LMSYS, Grok 2 ranked second in math and coding and third in its ability to respond to tough prompts, giving it third place overall. It sits behind ChatGPT-4o and Google’s Gemini 1.5 Pro.
According to xAI’s own data, Grok 2 is ahead of GPT-4 Turbo and slightly behind GPT-4o.
However, even based on xAI’s data, OpenAI’s ChatGPT-4o is the king of AI performance, thanks to an LMSYS Elo rating of 1,314. xAI’s early Grok 2 version, on the other hand, received a rating of 1,281, while Gemini 1.5 Pro scored 1,297.
On chatbot performance, Grok 2 trails Gemini 1.5 Pro in “win rate,” which measures the percentage of a model’s responses rated as better than its opponent’s. Grok 2’s win rate against Google’s product is 48%. xAI’s data shows no comparable figure for OpenAI’s ChatGPT-4o model, which allows users to upload images and ask the AI to generate responses based on them.
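For readers wondering how a rating gap relates to a win rate, here is a minimal Python sketch. It assumes the classical Elo expected-score formula with the usual 400-point scale; LMSYS fits its leaderboard scores with its own statistical procedure, and xAI may compute win rate differently, so this is an approximation rather than either organization’s method. The function name is hypothetical.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the classical Elo formula.

    Assumption: the standard 400-point logistic scale. A tie counts as
    half a win, so this is only an approximation of a raw win rate.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Ratings quoted in the article for the early Grok 2 and Gemini 1.5 Pro.
GROK_2_EARLY = 1281
GEMINI_1_5_PRO = 1297

if __name__ == "__main__":
    expected = elo_expected_score(GROK_2_EARLY, GEMINI_1_5_PRO)
    print(f"Grok 2 vs Gemini 1.5 Pro: {expected:.1%}")
```

Under these assumptions, a 16-point deficit works out to an expected score of a little under 48%, which is in line with the 48% win rate quoted above.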
What about factuality?
Factuality is the other key area where xAI claims to have improved Grok 2’s performance. Early AI models were criticized for their lack of factuality, and the company’s in-house “AI Tutors” gave the Grok 2 and Grok 2 mini models factuality ratings of 62.9% and 59.6% respectively, a significant improvement over the 50% rate of the previous iteration.
xAI claims Grok 2 has advanced capabilities in text and vision understanding, adding that the model uses data available on X. Like other AI products, Grok 2 mini is aimed at general consumer use. It supports functions such as writing, coding, and generating text responses.
xAI says Grok 2 and Grok 2 mini will be available to developers through its enterprise API by the end of this month. The API offers “multi-region inference deployments for low-latency access around the world,” as well as mandatory multi-factor authentication, data analytics for billing, traffic analysis, and integration with in-house business systems.
Source: X
Woah, another exciting update from Chatbot Arena❤️🔥
The results for @xAI’s sus-column-r (Grok 2 early version) are now public!
With over 12,000 community votes, sus-column-r has secured the #3 spot on the overall leaderboard, even matching GPT-4o! It excels in Coding (#2),… https://t.co/gqSWSwYN0z pic.twitter.com/j9UYDBYNt4
— lmsys.org (@lmsysorg) August 14, 2024