OpenAI Unveils Its Biggest AI Model to Date, Garnering Varied Reactions

# **GPT-4.5: An Expensive Upgrade with Minimal Improvements**

## **Introduction**
OpenAI’s newest AI model, GPT-4.5, launched amid considerable anticipation, yet initial assessments indicate that it may not deliver the breakthrough many expected. Although it is considerably pricier than its predecessor, GPT-4o, the performance gains are only slight. Critics contend that GPT-4.5 highlights the diminishing returns of scaling conventional large language models (LLMs), prompting debate about the future direction of AI development.

## **Performance vs. Cost: A Poor Trade-off**
GPT-4.5 is OpenAI’s most sophisticated traditional AI model to date, but its performance gains come at a steep price. The model costs **30 times more for input tokens and 15 times more for output tokens** than GPT-4o. According to OpenAI’s own evaluations, GPT-4.5 shows little significant improvement over its predecessor across various tasks, leading some specialists to question the value of investing in ever-larger models.

An unnamed AI professional told *Ars Technica*, *“GPT-4.5 is a lemon!”* AI researcher Gary Marcus shared that view, labeling the release a *“nothing burger.”* Even former OpenAI researcher Andrej Karpathy conceded that while GPT-4.5 surpasses GPT-4o, the improvements are subtle and hard to measure.

## **Coding Performance: A Notable Deficiency**
A prominent drawback of GPT-4.5 is its lackluster performance on programming tasks. An independent evaluation by Aider developer Paul Gauthier using the **Aider Polyglot Coding benchmark** placed GPT-4.5 **10th overall**, behind models like Claude 3.7 Sonnet and OpenAI’s own reasoning models, o1 and o3.

Moreover, GPT-4.5 has a **knowledge cutoff in October 2023**, which means it is unaware of the latest changes to programming languages and frameworks. This shortfall renders it a less dependable resource for software developers compared to alternative AI models.

## **GPT-4.5 vs. OpenAI’s Simulated Reasoning Models**
OpenAI’s benchmark data underscore GPT-4.5’s limitations in comparison to its **simulated reasoning models**, o1 and o3. For instance:

- **AIME Math Competition Scores:**
  - GPT-4.5: **36.7%**
  - o3-mini: **87.3%**

- **Cost Analysis (per million tokens):**
  - GPT-4.5: **$75 (input), $150 (output)**
  - o1 Pro: **$15 (input), $60 (output)**
  - o3-mini: **$1.10 (input), $4.40 (output)**

These statistics indicate that OpenAI’s simulated reasoning models deliver **superior results at a significantly lower cost**, making GPT-4.5 an undesirable option for numerous use cases.
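To make the pricing gap concrete, the short Python sketch below works through the arithmetic using only the figures quoted in this article. The GPT-4o prices are not stated here; they are back-calculated from the 30x and 15x multipliers and should be read as implied values, not confirmed ones. The function name `workload_cost` and the sample workload of one million input and one million output tokens are hypothetical, chosen purely for illustration.

```python
# Prices in USD per million tokens, as quoted in this article: (input, output).
PRICES = {
    "GPT-4.5": (75.00, 150.00),
    "o3-mini": (1.10, 4.40),
}

# Multipliers quoted in the article: GPT-4.5 vs. GPT-4o (input, output).
GPT45_VS_GPT4O = (30, 15)


def implied_gpt4o_prices():
    """Back out GPT-4o prices from the quoted multipliers (an inference, not a quoted figure)."""
    price_in, price_out = PRICES["GPT-4.5"]
    mult_in, mult_out = GPT45_VS_GPT4O
    return price_in / mult_in, price_out / mult_out


def workload_cost(model, input_tokens, output_tokens):
    """Return the USD cost of a workload for one of the models in PRICES."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 1M input tokens and 1M output tokens.
    tokens_in, tokens_out = 1_000_000, 1_000_000
    for model in PRICES:
        cost = workload_cost(model, tokens_in, tokens_out)
        print(f"{model}: ${cost:.2f}")

    gpt4o_in, gpt4o_out = implied_gpt4o_prices()
    print(f"Implied GPT-4o prices: ${gpt4o_in:.2f} input, ${gpt4o_out:.2f} output")
```

Actual costs depend on the ratio of input to output tokens in a given application, so the gap between models will vary with the workload.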

## **A Model Focused on “Vibes” Rather than Reasoning**
OpenAI CEO **Sam Altman** sought to manage expectations by characterizing GPT-4.5 as a model that is *“high on vibes, low on reasoning.”* He admitted that while GPT-4.5 seems more engaging in conversational contexts, it does not perform well in analytical tasks or benchmark evaluations.

Altman also disclosed that OpenAI **lacked sufficient GPUs** to launch GPT-4.5 widely, further underscoring how compute-intensive the model is.

## **The End of Traditional LLM Scaling?**
Given the underwhelming outcomes from GPT-4.5, OpenAI appears to be pivoting away from conventional LLMs. Altman has indicated that **GPT-5 will deviate from this trajectory**, opting instead to merge **non-reasoning LLMs with simulated reasoning models** such as o3.

This transition implies that OpenAI acknowledges the **diminishing returns** of simply increasing model size and training data. The company is investigating **innovative architectures**, including inference-time reasoning and **diffusion-based AI models**, which could provide more effective and scalable alternatives.

## **Competition in the AI Landscape**
As GPT-4.5 struggles to justify its cost, rivals like **Anthropic’s Claude 3.7 Sonnet** are making strides. Claude 3.7 Sonnet has shown **greater efficacy** with a more optimized architecture, indicating that OpenAI may face growing competition from other AI firms.

## **Conclusion: An Expensive Venture with Limited Prospects**
GPT-4.5 is now accessible to **ChatGPT Pro subscribers**, with plans to extend availability to **Plus, Team, Enterprise, and Education users** in the near future. However, OpenAI has not committed to sustaining the model in the long run, likely due to its high