Elon Musk's AI company, xAI, unveiled its latest flagship model, Grok 3, on February 17, 2025. The release brings enhanced capabilities, a family of specialized models, and new tools such as DeepSearch, and xAI is positioning it as a direct competitor to other leading AI systems.
Grok 3 was developed using a massive data center in Memphis with approximately 200,000 GPUs[1]. The model was trained with ten times more computing power than its predecessor, Grok 2, and on an expanded training dataset that includes legal documents[2]. Grok 3 is not a single model but a family of AI models, each designed for a specific purpose:
Grok 3: The main model with enhanced capabilities
Grok 3 Mini: Offers faster responses with a slight trade-off in accuracy
Grok 3 Reasoning and Grok 3 Mini Reasoning: Specialized for advanced problem-solving tasks[3]
This range of models lets xAI cover different use cases, from quick responses to complex reasoning tasks.
Grok 3 has posted strong results on math, science, and coding benchmarks, with a Grok 3 model scoring highest in each comparison listed below. Here is a breakdown of the benchmark results:
Reasoning + Test-Time Compute:
Math (AIME '24): Grok-3 Reasoning Beta (93), Grok-3 mini Reasoning (96), o3-mini-high (87), o1 (83), DeepSeek-R1 (80), Gemini-2 Flash Thinking (73)
Science (GPQA): Grok-3 Reasoning Beta (85), Grok-3 mini Reasoning (84), o3-mini-high (80), o1 (78), DeepSeek-R1 (71), Gemini-2 Flash Thinking (74)
Coding (LCB Oct-Feb): Grok-3 Reasoning Beta (79), Grok-3 mini Reasoning (80), o3-mini-high (74), o1 (73), DeepSeek-R1 (65), Gemini-2 Flash Thinking (46)
Standard Benchmarks:
Math (AIME '24): Grok-3 (52), Grok-3 mini (40), Gemini-2 Pro (36), DeepSeek-V3 (39), Claude 3.5 Sonnet (16), GPT-4o (9)[1][2]
Science (GPQA): Grok-3 (75), Grok-3 mini (65), Gemini-2 Pro (65), DeepSeek-V3 (59), Claude 3.5 Sonnet (65), GPT-4o (50)[1][2]
Coding (LCB Oct-Feb): Grok-3 (57), Grok-3 mini (41), Gemini-2 Pro (36), DeepSeek-V3 (40), Claude 3.5 Sonnet (36), GPT-4o (34)[1][2]
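To make the size of the lead easier to read off, the reasoning scores listed above can be tabulated and compared. The following is a minimal Python sketch, not anything published by xAI: the dictionary simply copies the reasoning figures from this article, and the helper name leader_and_margin is a hypothetical one chosen for illustration.

```python
# Reasoning benchmark scores copied from the list above.
# The data layout and function name are illustrative assumptions.
reasoning_scores = {
    "Math (AIME '24)": {
        "Grok-3 Reasoning Beta": 93,
        "Grok-3 mini Reasoning": 96,
        "o3-mini-high": 87,
        "o1": 83,
        "DeepSeek-R1": 80,
        "Gemini-2 Flash Thinking": 73,
    },
    "Science (GPQA)": {
        "Grok-3 Reasoning Beta": 85,
        "Grok-3 mini Reasoning": 84,
        "o3-mini-high": 80,
        "o1": 78,
        "DeepSeek-R1": 71,
        "Gemini-2 Flash Thinking": 74,
    },
    "Coding (LCB Oct-Feb)": {
        "Grok-3 Reasoning Beta": 79,
        "Grok-3 mini Reasoning": 80,
        "o3-mini-high": 74,
        "o1": 73,
        "DeepSeek-R1": 65,
        "Gemini-2 Flash Thinking": 46,
    },
}


def leader_and_margin(scores):
    """Return (top model, top score, lead over the best non-Grok model)."""
    top_model, top_score = max(scores.items(), key=lambda item: item[1])
    best_other = max(
        score for model, score in scores.items() if not model.startswith("Grok")
    )
    return top_model, top_score, top_score - best_other


for benchmark, scores in reasoning_scores.items():
    model, score, margin = leader_and_margin(scores)
    print(f"{benchmark}: {model} leads with {score}, "
          f"{margin} points ahead of the best non-Grok model")
```

On these figures, the top Grok model leads the best competing model by 9 points in math, 5 in science, and 6 in coding.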
Additionally, an early version of Grok-3, codenamed "Chocolate," became the first AI model to break a score of 1400 Elo in the LMSYS Chatbot Arena, ranking first across all categories[3]. On the AIME 2025 mathematics competition problems, Grok-3 Reasoning Beta and Grok-3 mini Reasoning took the top two positions, outperforming the other reasoning models tested[2].
Grok 3 introduces several innovative tools and features that enhance its functionality and user experience. DeepSearch, a new AI-powered research tool, scans the internet and X (formerly Twitter) to analyze information and provide concise summaries in response to user queries[1][2]. This feature aims to rival similar offerings from competitors like OpenAI's deep research tools[3].
Elon Musk announced that Grok 3 will soon gain voice interaction capabilities, with the voice mode expected to launch within a week of the initial release[4][5]. Additionally, xAI plans to make Grok 3 models available through its business API in the coming weeks, along with DeepSearch functionality[4][3]. This API access will allow developers and businesses to integrate Grok 3's advanced reasoning and research capabilities into their own applications and services[6][7].
Grok 3 is initially available to X Premium+ subscribers for $50 per month[1][2]. xAI is also introducing a new SuperGrok plan at $30 monthly or $300 annually, offering additional DeepSearch queries, enhanced reasoning capabilities, and unlimited image generation[3][4]. The Grok app will soon feature voice interaction, with enterprise API access planned in the coming weeks[5]. Meanwhile, xAI plans to open-source Grok 2 once Grok 3 is fully stable, continuing its practice of releasing previous versions for public use[5][6].
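For readers weighing the two SuperGrok billing options, the arithmetic can be worked out in a few lines. This is only a sketch based on the $30 monthly and $300 annual prices quoted above; the variable names are illustrative, and taxes or regional pricing are not considered.

```python
# SuperGrok prices as quoted in this article (USD).
MONTHLY_USD = 30
ANNUAL_USD = 300

# Cost of a full year when paying month by month.
yearly_if_monthly = MONTHLY_USD * 12

# Savings from choosing the annual plan instead.
savings = yearly_if_monthly - ANNUAL_USD
savings_pct = savings / yearly_if_monthly * 100

# Effective per-month price on the annual plan.
effective_monthly = ANNUAL_USD / 12

print(f"Paying monthly for a year: ${yearly_if_monthly}")
print(f"Annual plan saves ${savings} ({savings_pct:.1f}%)")
print(f"Effective monthly price on the annual plan: ${effective_monthly:.2f}")
```

In other words, the annual plan costs the same as ten months of monthly billing.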