OpenAI's New Endeavor: Reports of Google AI Chip Adoption

Hello everyone, this is Tak@, a system integrator. I imagine many of you are encountering AI more and more often in your daily work.

Right now, a significant shift is happening in the world of chips—the very "brains" of AI.

This time, I want to discuss the surprising news that OpenAI has started using Google's AI chips and share my perspective on how this could change our businesses and the future.

Hopes for Reduced Inference Costs

There seem to be several key reasons behind this move. First, OpenAI wants to lower its inference costs.

AI model inference refers to the process where a trained AI makes predictions or decisions based on new information.

For example, when ChatGPT answers a question, it's performing inference. As more users engage with AI, the volume of inference processing becomes immense, driving up the costs of the computational resources required.
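To make this concrete, here is a minimal sketch of a single inference request, assuming the official openai Python client; the model name and prompt are just placeholders. Every call like this consumes compute on the provider's side, and at ChatGPT's scale such calls add up to an enormous infrastructure bill.

```python
# Minimal sketch of one inference request (assumes the openai Python client
# is installed and OPENAI_API_KEY is set; the model name is illustrative).
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{"role": "user", "content": "Summarize this article in one sentence."}],
)
print(response.choices[0].message.content)
```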

OpenAI appears to be looking to reduce these inference costs by leasing TPUs through Google Cloud.

A Shift from Microsoft Dependence

Another significant reason is OpenAI's apparent intention to diversify its supply sources, moving away from its almost complete reliance on Microsoft's data centers.

Since 2019, OpenAI has partnered with Microsoft, conducting large-scale model training and inference on Microsoft Azure. Microsoft has invested heavily in OpenAI and integrated OpenAI's technologies into its own products, forging a very strong bond between the two companies.

However, the sudden surge in ChatGPT's popularity led to insufficient computational power, causing delays in new feature rollouts and other issues. Consequently, OpenAI reevaluated its agreement with Microsoft, gaining the ability to use other cloud services starting January 2025.

This is a crucial step for OpenAI to reduce its dependence on Microsoft alone and aim for more stable service delivery.

Diversifying Supply Sources

This kind of diversification is considered fundamental for businesses to operate sustainably. By procuring AI chips from multiple companies, organizations can mitigate risks associated with issues at a single supplier and potentially secure better terms for chips.

In addition to Google's TPUs, OpenAI is also actively utilizing Nvidia's GPUs and AMD's AI chips.

For Google, this partnership is an excellent opportunity to broadly offer its internally developed TPUs to external parties. Historically, TPUs were primarily used within Google, but now they're being adopted by OpenAI's competitors like Apple and Anthropic.

What are Google TPUs? Features and History

Google TPUs are AI chips custom-built by Google to accelerate machine learning, especially neural network processing. They began internal use at Google in 2015, and by 2018, other companies gained access to TPUs through cloud services.

Chips Designed Specifically for AI

Unlike Graphics Processing Units (GPUs), TPUs are specifically optimized for AI computations.

For instance, in AI inference tasks like generating images or summarizing text, efficiently handling numerous low-precision calculations is crucial. TPUs excel at these types of calculations while minimizing power consumption.

Google has consistently prioritized efficiency in TPU development. The initial TPU v1 was reportedly 30 to 80 times more power-efficient than conventional CPUs and GPUs.
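As a rough illustration of the kind of work TPUs are built for, here is a minimal JAX sketch of a low-precision (bfloat16) matrix multiplication compiled through XLA; the shapes and values are arbitrary, and on a machine without a TPU the same code simply runs on CPU or GPU.

```python
# Minimal JAX sketch: a bfloat16 matrix multiply, the kind of low-precision
# operation TPUs are designed to run efficiently. Shapes are arbitrary.
import jax
import jax.numpy as jnp

@jax.jit  # compiled via XLA, which targets TPU, GPU, or CPU
def matmul_bf16(a, b):
    return jnp.dot(a, b)

key = jax.random.PRNGKey(0)
a = jax.random.normal(key, (1024, 1024), dtype=jnp.bfloat16)
b = jax.random.normal(key, (1024, 1024), dtype=jnp.bfloat16)

print(jax.devices())            # lists TPU devices when run on a TPU VM
print(matmul_bf16(a, b).dtype)  # bfloat16
```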

TPU Evolution and Performance Improvements

TPUs have undergone numerous enhancements since their initial release:

  • First Generation (v1): Introduced in 2015, primarily focused on 8-bit matrix multiplication.
  • Second Generation (v2): Announced in 2017, significantly increased memory bandwidth and added floating-point computation capabilities, enabling their use for AI model training.
  • Third Generation (v3): Launched in 2018, offered double the performance of v2 and introduced liquid cooling for higher-density computation.
  • Fourth Generation (v4): Revealed in 2021, providing more than double the performance of v3.
  • Fifth Generation (v5e, v5p): Announced in 2023, boasting even faster performance than v4 and reportedly comparable to Nvidia's H100.
  • Sixth Generation (Trillium): The latest TPU unveiled in 2024, said to offer 4.7 times the performance of v5e.
  • Seventh Generation (Ironwood): Expected in 2025, with a design particularly focused on inference efficiency.

Jeff Dean, Chief Scientist of Google's AI research division, stated that the new Ironwood TPU system excels at both inference and training, noting that inference demand is now much higher than before.

This is because more users are engaging with advanced, large-scale models, requiring not just simple outputs but increasingly complex reasoning and decision-making.

Why TPUs Were Chosen

OpenAI's choice of Google Cloud isn't just about compute capacity; it's also due to Google's strong AI-specific infrastructure.

TPUs are said to be more efficient and performant than GPUs for training and inference of large language models.

Operating massive AI models like OpenAI's demands not only computational power but also efficiency, power consumption, and flexible scalability. TPUs embody these qualities, offering distinct technical advantages that set them apart from other cloud services.

Furthermore, Google Cloud's active support for AI startups is a significant factor.

Many rapidly growing AI companies, such as Anthropic, have chosen Google Cloud. Google provides resources, collaborative research, and venture support, making it an attractive partner for OpenAI.

Conflicting Reports and OpenAI's Statement

News that OpenAI began using Google's TPUs garnered widespread attention, but there were some discrepancies in the reporting.

Reuters and Semianalysis Information

Reuters reported that OpenAI had started leasing Google TPUs for ChatGPT operations. However, discussions on Reddit pointed to SemiAnalysis, which claimed that OpenAI was leasing chips from CoreWeave, not TPUs.

Some argue that SemiAnalysis's information is credible because it has internal sources within both OpenAI and Google.

Regarding this discrepancy, some pointed out that Reuters' article repeatedly and explicitly specified TPUs, leaving little room for ambiguity, while others criticized Reuters for a track record of spreading "misinformation."

OpenAI's Official Stance

In response to these reports, OpenAI issued an official statement. An OpenAI spokesperson stated that while the company was conducting "early tests" with Google's TPUs, it currently had "no plans to use these chips at scale."

This statement suggests that OpenAI is exploring many hardware options and has a strong desire for in-house development.

Fully adopting new hardware typically takes time and requires compatibility with different systems and software, making it a complex undertaking.

The Reality of Experimental Use

In essence, while OpenAI is testing Google's TPUs, a complete transition is not imminent.

Some media outlets have reported that OpenAI is being given access only to lower-performance TPUs, while Google keeps its latest chips, designed for its own Gemini models, for internal use.

Investors and market experts initially viewed the TPU adoption as a sign that OpenAI was seeking alternatives to Nvidia. However, OpenAI's recent statement indicates that its ties with existing chip providers, Nvidia and AMD, remain strong.

As demand for AI computation continues to soar, OpenAI appears to be gradually expanding its experimental use of current GPUs and TPUs, but a full-scale switch to TPU systems is not on the table.

Current State of the AI Chip Market

The advancement of AI is significantly impacting the semiconductor industry. Demand for chips to power AI in data centers and cloud environments is rapidly increasing.

Nvidia and AMD's Presence

Currently, Nvidia's GPUs dominate the AI chip market. Specifically, Nvidia's Hopper series (H100/H200) and the recently announced Blackwell series (B200/B300) provide the high computational power essential for AI model training.

Meanwhile, AMD continues to compete in the market with its MI300 series (MI300X/MI325X) and other chips.

OpenAI plans to continue using Nvidia's GPUs and AMD's AI accelerators. These products have proven performance, and supply agreements with both companies are already in place.

The Importance of Inference Processing

AI model training is the process of teaching AI knowledge, while inference is the process of using that learned knowledge to make judgments. As Google's Jeff Dean notes, as more users utilize advanced AI models, the demand for inference processing is exponentially increasing.

Specifically, the growing need for "reasoning," which involves logical thinking and judgment beyond simply providing answers, further underscores the importance of inference.

However, a challenge remains: inference throughput is limited by memory bandwidth. It is also often pointed out that providing memory and I/O bandwidth consumes far more power than delivering raw compute (FLOPS).
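A back-of-the-envelope calculation shows why. The numbers below are assumptions for illustration, not vendor figures: in the simplest case of autoregressive decoding (batch size 1), every generated token has to stream the model weights from memory, so per-chip throughput is roughly bandwidth divided by bytes per token.

```python
# Illustrative arithmetic only; all numbers are assumptions, not vendor specs.
model_params = 70e9        # hypothetical 70B-parameter model
bytes_per_param = 2        # 16-bit weights
hbm_bandwidth = 3.0e12     # assumed ~3 TB/s of memory bandwidth

bytes_per_token = model_params * bytes_per_param      # ~140 GB read per token
tokens_per_second = hbm_bandwidth / bytes_per_token   # ~21 tokens/s
print(f"~{tokens_per_second:.0f} tokens/s per chip, bandwidth-bound")
```

Batching and other optimizations change the picture, but the point stands: for inference, memory bandwidth can matter as much as raw FLOPS.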

The Push for In-House Chip Development

OpenAI is also moving towards developing its own AI chips for the future. The chip design is expected to be finalized and move into production this year.

This move is likely aimed at enabling OpenAI to harness AI's power more efficiently and for specific purposes by developing its own chips. Google also previously developed and used TPUs internally but is now expanding their availability externally.

The Larger Trend in the Semiconductor Industry

The dynamics of AI chips are part of a broader transformation occurring across the entire semiconductor industry.

Rise of Application-Specific Chips

While the semiconductor industry has historically prioritized "mass production" and "scale," it is now rapidly shifting its focus towards "application-specific specialization," "handling advanced small-volume production," and "rapid, collaborative development."

This is due to simultaneous changes in various end-product markets, including AI accelerators, autonomous vehicles, new power semiconductor technologies for electrification, and data center densification.

For example, many custom ASICs (application-specific integrated circuits), such as Google's TPUs and Meta's MTIA, are developed in collaboration with external design partners such as Broadcom and Marvell.

These "application-specific high-performance chips" offer a different value proposition than Nvidia's GPUs, balancing computational efficiency and cost, and development based on small-volume production is becoming the norm.

AI Driving Semiconductor Market Growth

The advancement of artificial intelligence is a powerful driver of growth in the semiconductor industry. The global semiconductor market is projected to reach $642 billion in 2024 and $1 trillion within the next decade.

In particular, demand for more efficient data processing and high-capacity memory is rising, and this momentum will accelerate as the use of AI, IoT, and cloud computing expands.

By 2028 or 2029, over 80% of data center processors are expected to be AI accelerators or incorporate AI capabilities. This represents a market size of approximately $150 billion.

The Importance of Collaboration

To adapt to these changes, individual companies face limitations. In the semiconductor industry, it's becoming essential to collaborate beyond organizational boundaries and build systems that can flexibly respond to change.

For instance, High Bandwidth Memory (HBM), crucial for AI processing, is so technically complex that it is difficult for a memory manufacturer to develop alone. Instead, foundries such as TSMC and multiple memory manufacturers have collaborated for more than two years to develop it.

This kind of collaboration is what enables the market introduction of new technologies like HBM3E and HBM4.

Our Future and AI Chips

OpenAI's recent moves regarding Google's chips carry significant implications for the future of the AI industry and how our lives will change.

Towards the Era of AI Agents

Google's Jeff Dean has spoken about the current focus on "agents" powered by AI. This concept involves instructing AI agents to perform complex tasks at a high level, carrying them out on our behalf.

For example, imagine a world where an AI agent plans your trip, automatically handling everything from flight bookings to connecting flights.

As such agents become widespread, standardized mechanisms for AI agents to communicate with each other will become crucial.

In the future, a single request from us to an AI agent might result in many agents collaborating to produce the final outcome.

Diversifying AI Infrastructure Choices

OpenAI's collaboration with Google Cloud signifies its strategy to combine multiple cloud services for flexibility and stability.

This allows them to operate AI services that are resilient to change, enhancing performance while mitigating risks.

This shift will likely prompt businesses to change their cloud selection criteria from "large means safe" to "what best suits our AI objectives." For companies that place AI at the core of their business, AI-specialized clouds like Google Cloud could become the new standard.

The evolution of AI technology is relentless, and the chips required for it are constantly changing.

OpenAI and Google's recent actions, while part of the power struggle in the AI market, offer crucial clues for us, both businesses and individuals, to consider "what to choose next."

I hope to continue leveraging technology and creativity to help you "turn your ideas into reality."

Follow me!

Photo by: Dima Solomin