Google builds world's fastest machine learning training supercomputer that breaks AI performance records
Google said that it achieved these results with ML model implementations in TensorFlow, JAX, and Lingvo. Four of the eight models were trained from scratch in under 30 seconds.
Highlights
- The supercomputer Google used for the MLPerf training round is four times larger than the Cloud TPU v3 Pod.
- It had set three records in the previous competition.
- The system includes 4096 TPU v3 chips and hundreds of CPU host machines, all connected via an ultra-fast, ultra-large-scale custom interconnect.
New Delhi: Google said it has built the world's fastest machine learning (ML) training supercomputer, which broke AI performance records in six out of eight industry-leading MLPerf benchmarks.
“The latest results from the industry-standard MLPerf benchmark competition demonstrate that Google has built the world’s fastest ML training supercomputer. Using this supercomputer, as well as our latest Tensor Processing Unit (TPU) chip, Google set performance records in six out of eight MLPerf benchmarks,” a Google blog said.
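The article says the models were implemented in TensorFlow, JAX, and Lingvo. Purely as an illustrative sketch (this is a toy example, not Google's actual submission code), a minimal JAX training step on a linear model looks like this:

```python
import jax
import jax.numpy as jnp

# Toy mean-squared-error loss for a linear model; hypothetical example only.
def loss_fn(params, x, y):
    pred = x @ params["w"] + params["b"]
    return jnp.mean((pred - y) ** 2)

# jax.jit compiles the step (on TPU, to XLA code for the accelerator).
@jax.jit
def train_step(params, x, y, lr=0.1):
    grads = jax.grad(loss_fn)(params, x, y)
    # Plain SGD update applied to every leaf of the parameter tree.
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

# Synthetic data: y = x . [1, -2, 0.5] + 0.3
key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (32, 3))
y = x @ jnp.array([1.0, -2.0, 0.5]) + 0.3

params = {"w": jnp.zeros(3), "b": jnp.array(0.0)}
for _ in range(100):
    params = train_step(params, x, y)
```

Google's production runs would use far larger models and data-parallel training across TPU pods, but the `grad`-then-update pattern is the same basic shape.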
The Google blog explains: “…Consider that in 2015, it took more than three weeks to train one of these models on the most advanced hardware accelerator available. Google’s latest TPU supercomputer can train the same model almost five orders of magnitude faster just five years later.”
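The "almost five orders of magnitude" claim is easy to sanity-check from the figures in the article: roughly three weeks in 2015 versus under 30 seconds now.

```python
import math

# Back-of-envelope check using the article's own figures:
# ~3 weeks in 2015 versus ~30 seconds on the new TPU supercomputer.
three_weeks_s = 3 * 7 * 24 * 60 * 60   # 1,814,400 seconds
speedup = three_weeks_s / 30            # ~60,000x
orders = math.log10(speedup)            # ~4.8, i.e. almost five orders of magnitude
```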
MLPerf models are chosen to be representative of cutting-edge machine learning workloads that are common throughout industry and academia. The supercomputer Google used for the MLPerf training round is four times larger than the "Cloud TPU v3 Pod" that set three records in the previous competition.
The system includes 4096 TPU v3 chips and hundreds of CPU host machines, all connected via an ultra-fast, ultra-large-scale custom interconnect. In total, this system delivers over 430 PFLOPs of peak performance.
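Dividing the quoted peak figure across the chips gives a rough per-chip number (an assumption for illustration: it ignores the CPU hosts and assumes the peak is spread evenly over the TPU chips).

```python
# Rough per-chip arithmetic implied by the article's figures.
peak_pflops = 430
chips = 4096
tflops_per_chip = peak_pflops * 1000 / chips   # ~105 TFLOPs per TPU v3 chip
```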
Google said its MLPerf Training v0.7 submissions demonstrate its commitment to advancing machine learning research and engineering at scale, and to delivering those advances to users through open-source software, Google’s products, and Google Cloud.