


Zyphra, in collaboration with AMD and IBM, has introduced ZAYA1, the first large-scale Mixture-of-Experts (MoE) foundation model developed entirely on the AMD platform. Trained on AMD Instinct MI300X GPUs with AMD Pensando Pollara networking and the ROCm software stack, the model demonstrates that the AMD platform is a production-ready, high-performance alternative for frontier model training.
San Francisco, November 24, 2025 – Zyphra today announced a significant milestone in AI infrastructure and model development. The company's technical report details how large-scale model training can be performed on AMD GPUs and networking.
The report introduces ZAYA1 as the first large-scale MoE model trained on the integrated AMD platform (AMD Instinct™ GPUs, AMD Pensando™ networking, and the ROCm software stack). With 8.3 billion total parameters, ZAYA1 activates only 760 million parameters per token, delivering performance comparable to leading models such as Alibaba's Qwen3-4B and Google's Gemma3-12B. Moreover, it outperforms models such as Meta's Llama-3-8B and OLMoE on reasoning, mathematics, and coding benchmarks.
Krithik Puthalath, CEO of Zyphra, stated, "Efficiency has always been a core principle for Zyphra. This principle guides us in designing model architectures, developing training and inference algorithms, and selecting the best price-performance hardware. ZAYA1 reflects this philosophy, and we are excited to be the first company capable of large-scale training on the AMD platform."
Mixture-of-Experts (MoE) models have become the fundamental architecture of modern frontier AI systems. These models route each input through dynamically activated expert networks, offering greater efficiency, scalability, and reasoning performance than traditional dense architectures. Today's leading frontier models, such as GPT-5, Claude-4.5, DeepSeek-V3, and Kimi2, are expanding their capabilities using MoE designs. ZAYA1 represents the first large-scale pre-training of an MoE model on the AMD platform, demonstrating that the AMD AI ecosystem is ready for frontier-class AI development.
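To make the "dynamically activated experts" idea concrete, here is a minimal NumPy sketch of generic top-k MoE routing: a router scores each token against every expert, only the top-k experts run, and their outputs are mixed by softmax gate weights. All names and shapes are illustrative assumptions, not ZAYA1's actual architecture or API.

```python
import numpy as np

rng = np.random.default_rng(0)

def top_k_moe(x, expert_weights, router_weights, k=2):
    """Generic top-k MoE layer (illustrative, not ZAYA1's design).

    x: (tokens, d) activations
    expert_weights: (n_experts, d, d) one linear map per expert
    router_weights: (d, n_experts) router projection
    """
    logits = x @ router_weights                        # (tokens, n_experts)
    top = np.argsort(logits, axis=-1)[:, -k:]          # top-k expert indices per token
    sel = np.take_along_axis(logits, top, axis=-1)     # their logits
    gates = np.exp(sel - sel.max(-1, keepdims=True))   # softmax over selected experts only
    gates /= gates.sum(-1, keepdims=True)
    out = np.zeros_like(x)
    for t in range(x.shape[0]):                        # only k experts run per token
        for j in range(k):
            e = top[t, j]
            out[t] += gates[t, j] * (x[t] @ expert_weights[e])
    return out

tokens, d, n_experts = 4, 8, 6
x = rng.standard_normal((tokens, d))
W_experts = rng.standard_normal((n_experts, d, d))
W_router = rng.standard_normal((d, n_experts))
y = top_k_moe(x, W_experts, W_router, k=2)
print(y.shape)  # (4, 8)
```

The efficiency claim falls out of the loop structure: compute per token scales with k, while capacity scales with n_experts, which is why active parameters (760M for ZAYA1) can be far smaller than total parameters (8.3B).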
Zyphra specifically designed ZAYA1 for AMD silicon, incorporating innovations like advanced routing architecture, Compressed Convolutional Attention (CCA), and lightweight residual scaling to provide higher training efficiency and more effective inference.
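Of the techniques named above, the report does not spell out the mechanisms here; as one hedged illustration, "residual scaling" generally means damping each residual branch by a small scalar so deep networks train stably. The sketch below shows that generic idea only; the alpha parameter and block structure are assumptions, not ZAYA1's actual scheme.

```python
import numpy as np

def scaled_residual_block(x, sublayer, alpha=0.1):
    """Residual connection with a scalar on the branch: y = x + alpha * f(x).

    A generic sketch of residual scaling; ZAYA1's 'lightweight residual
    scaling' is described in Zyphra's technical report and may differ.
    """
    return x + alpha * sublayer(x)

x = np.ones(4)
y = scaled_residual_block(x, lambda v: 2.0 * v, alpha=0.1)
print(y)  # [1.2 1.2 1.2 1.2]
```

Keeping alpha small limits how much any one layer perturbs the residual stream, which is one common way such scaling improves training stability at depth.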
Philip Guido, AMD's Vice President of Commercial Business, stated, "Zyphra's collaboration with AMD and IBM demonstrates that an open platform based on AMD Instinct GPUs and the AMD Pensando network can provide revolutionary performance and efficiency for large-scale AI. This milestone emphasizes how innovative AMD hardware and software solutions support the next wave of frontier AI development."
Zyphra collaborated closely with AMD and IBM to achieve this milestone, designing a large-scale training cluster built on AMD Instinct GPUs and AMD Pensando networking. The previously announced joint engineering work pairs AMD Instinct MI300X GPUs with a robust IBM Cloud infrastructure and storage architecture.
Alan Peacock, General Manager of IBM Cloud, said, "As AI unlocks innovation for businesses, foundation models are critical to delivering accelerated development, efficiency, and productivity." He added, "We are proud to provide IBM's scalable AI infrastructure for the ZAYA1 large-scale model, and we are excited to continue our collaboration with AMD on AI model development for our mutual clients."
This partnership showcases how Zyphra, by bringing advanced AI research and an optimized software stack together with the AMD platform, can deliver the performance needed to develop a reliable frontier-scale AI model on IBM Cloud infrastructure.