Google Cloud unveils AI-optimised infrastructure enhancements

About the Author

By Ryan Daws | 29th August 2023

Ryan is a senior editor at TechForge Media with over a decade of experience covering the latest technology and interviewing leading industry figures. He can often be sighted at tech conferences with a strong coffee in one hand and a laptop in the other. If it’s geeky, he’s probably into it. Find him on Twitter (@Gadget_Ry) or Mastodon (@gadgetry@techhub.social)


Google Cloud has announced significant advancements in its AI-optimised infrastructure, including fifth-generation TPUs and A3 VMs based on NVIDIA H100 GPUs.

Traditional approaches to designing and building computing systems are proving inadequate for the surging demands of workloads such as generative AI and large language models (LLMs). Over the last five years, the number of parameters in LLMs has grown tenfold each year, creating a need for AI-optimised infrastructure that is both cost-effective and scalable.
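To put that growth rate in perspective, the short Python sketch below compounds a tenfold annual increase over five years. It is only back-of-the-envelope arithmetic on the two figures quoted above; the function and constant names are illustrative and do not come from Google Cloud.

```python
# Back-of-the-envelope arithmetic: compound a tenfold annual increase.
# Both inputs are the figures quoted in the article; the names are illustrative.
ANNUAL_GROWTH_FACTOR = 10  # "tenfold annually"
YEARS = 5                  # "over the last five years"


def cumulative_growth(factor: int, years: int) -> int:
    """Return the total multiplier after compounding `factor` for `years` years."""
    return factor ** years


total = cumulative_growth(ANNUAL_GROWTH_FACTOR, YEARS)
print(f"Cumulative growth over {YEARS} years: {total:,}x")
```

Taken at face value, a tenfold annual increase sustained for five years implies a 100,000-fold increase overall.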

From conceiving the transformative Transformer architecture that underpins generative AI, to AI-optimised infrastructure tailored for global-scale performance, Google Cloud has stood at the forefront of AI innovation.

Cloud TPU v5e headlines Google Cloud’s latest offerings. Distinguished by its cost-efficiency, versatility, and scalability, the TPU aims to revolutionise medium- and large-scale training and inference. This iteration outpaces its predecessor, Cloud TPU v4, delivering up to 2.5x higher inference performance and up to 2x higher training performance per dollar for LLMs and generative AI models.

Wonkyum Lee, Head of Machine Learning at Gridspace, said:

“Our speed benchmarks are demonstrating a 5X increase in the speed of AI models when training and running on Google Cloud TPU v5e.

We are also seeing a tremendous improvement in the scale of our inference metrics, we can now process 1000 seconds in one real-time second for in-house speech-to-text and emotion prediction models—a 6x improvement.”

Striking a balance between performance, flexibility, and efficiency, Cloud TPU v5e pods support up to 256 interconnected chips, boasting an aggregate bandwidth surpassing 400 Tb/s and 100 petaOps of INT8 performance. Furthermore, its adaptability shines – with eight distinct virtual machine configurations – accommodating an array of LLM and generative AI model sizes.
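For a rough sense of what those pod-level figures mean per chip, the Python sketch below simply divides the quoted totals by the 256 chips in a full pod. This assumes the totals are spread evenly across chips, which is a simplification made here for illustration rather than a published per-chip specification, and it treats "surpassing 400 Tb/s" as a lower bound.

```python
# Rough per-chip figures derived from the pod-level numbers quoted in the article.
# Assumption: totals divide evenly across chips (a simplification, not a spec).
CHIPS_PER_POD = 256        # "up to 256 interconnected chips"
POD_INT8_PETAOPS = 100     # "100 petaOps of INT8 performance"
POD_BANDWIDTH_TBPS = 400   # "surpassing 400 Tb/s", used as a lower bound


def per_chip(total: float, chips: int) -> float:
    """Divide a pod-level total evenly across the given number of chips."""
    return total / chips


# 1 petaOp = 1,000 teraOps
int8_teraops_per_chip = per_chip(POD_INT8_PETAOPS * 1000, CHIPS_PER_POD)
bandwidth_tbps_per_chip = per_chip(POD_BANDWIDTH_TBPS, CHIPS_PER_POD)

print(f"INT8 per chip: about {int8_teraops_per_chip:.1f} teraOps")
print(f"Bandwidth per chip: at least {bandwidth_tbps_per_chip:.2f} Tb/s")
```

Under that even-split assumption, each chip works out to roughly 390 teraOps of INT8 compute and a little over 1.5 Tb/s of the aggregate bandwidth.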

The ease of operation also receives a boost, with Cloud TPUs now available on Google Kubernetes Engine (GKE). This development streamlines AI workload orchestration and management. For those inclined towards managed services, Vertex AI offers training with diverse frameworks and libraries via Cloud TPU VMs.

Google Cloud fortifies its support for leading AI frameworks including JAX, PyTorch, and TensorFlow.

The PyTorch/XLA 2.1 release is on the horizon, featuring Cloud TPU v5e support and model/data parallelism for large-scale model training. Moreover, Multislice technology enters preview, enabling AI models to scale beyond the confines of a single physical TPU pod.
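For readers unfamiliar with the term, data parallelism means splitting each training batch across several workers, computing gradients on each shard independently, and averaging the results. The plain-Python sketch below illustrates that idea on a toy one-parameter model. It does not use PyTorch/XLA or any TPU API, and every name in it is hypothetical; it only shows why equal-sized shards give the same gradient as the full batch.

```python
import math

# Toy illustration of data parallelism in plain Python.
# This is NOT the PyTorch/XLA API; all names here are hypothetical.


def gradient(w: float, xs: list[float], ys: list[float]) -> float:
    """Gradient with respect to w of the mean squared error of y = w * x."""
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def data_parallel_gradient(
    w: float, xs: list[float], ys: list[float], num_workers: int
) -> float:
    """Split the batch into equal shards, compute one gradient per shard
    (as each worker would), then average the per-shard gradients."""
    if len(xs) % num_workers != 0:
        raise ValueError("batch size must be divisible by num_workers")
    shard = len(xs) // num_workers
    grads = [
        gradient(w, xs[i * shard:(i + 1) * shard], ys[i * shard:(i + 1) * shard])
        for i in range(num_workers)
    ]
    return sum(grads) / num_workers


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0 * x for x in xs]  # synthetic data generated with a true weight of 2.0
w = 0.5                     # current (incorrect) weight estimate

full_batch = gradient(w, xs, ys)
sharded = data_parallel_gradient(w, xs, ys, num_workers=4)
print(math.isclose(full_batch, sharded))
```

Because every shard has the same size, the average of the per-shard gradients matches the full-batch gradient, which is what lets the work be spread across many chips without changing the training step.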


Meanwhile, the new A3 VMs are powered by NVIDIA’s H100 Tensor Core GPUs and focus on demanding generative AI workloads and LLMs.

A3 VMs deliver exceptional training capabilities and networking bandwidth. Their implementation in combination with Google Cloud’s infrastructure heralds a breakthrough, achieving 3x faster training and 10x greater networking bandwidth compared to previous iterations.

David Holz, Founder and CEO at Midjourney, commented:

“Midjourney is a leading generative AI service enabling customers to create incredible images with just a few keystrokes. To bring this creative superpower to users we leverage Google Cloud’s latest GPU cloud accelerators, the G2 and A3.

With A3, images created in Turbo mode are now rendered 2x faster than they were on A100s, providing a new creative experience for those who want extremely quick image generation.”

The unveiling of these advancements aims to solidify Google Cloud’s leadership in AI infrastructure, empowering innovators and enterprises to forge the most advanced AI models.

(Image Credit: Google Cloud)



Tags: a3 vm, artificial intelligence, cloud, cloud computing, gke, google cloud, inference, jax, Kubernetes, kubernetes engine, llm, tensor core, tensorflow, tpu v5, tpu v5e
