Google Cloud targets enterprise AI builders with upgraded Vertex AI Training

News
Oct 28, 2025 | 5 mins

The new features are designed for companies training massive AI models, making it easier to manage complex workloads and keep systems running smoothly.

Google Cloud sign at the Google campus in Sunnyvale, California, November 2019
Credit: Michael Vi / Shutterstock

Google Cloud is stepping up its push into enterprise AI with an upgraded version of its Vertex AI Training service, designed to make large-scale model training faster and easier.

The new release gives companies access to large-scale compute clusters through a managed Slurm environment, along with built-in monitoring and management tools to simplify complex training jobs.

With this move, Google Cloud is taking sharper aim at rivals such as AWS, Microsoft Azure, and CoreWeave as more enterprises look to build or customize AI models tailored to their own data and business needs.

The company said the new capabilities are designed for organizations running long, compute-intensive training jobs and aim to simplify workload management while improving reliability and throughput.

“Vertex AI Training delivers choice across the full spectrum of model customization,” Google said in a blog post. “This range extends from cost-effective, lightweight tunings like LoRA for rapid behavioral refinement of models like Gemini, all the way to large-scale training of open-source or custom-built models on clusters for full domain specialization.”

The company added that the new Vertex AI Training features focus on flexible infrastructure, advanced data science tools, and integrated frameworks.

Enterprises can quickly set up managed Slurm environments with automated resiliency and cost optimization through the Dynamic Workload Scheduler. The platform also includes hyperparameter tuning, data optimization, and built-in recipes with frameworks like NVIDIA NeMo to streamline model development.
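In a managed Slurm environment of the kind described above, a long-running, multi-node training job is typically submitted as a batch script. The sketch below is a generic, hypothetical example of such a script; the partition name, resource counts, and `train.py` entry point are illustrative placeholders, not Vertex AI specifics:

```shell
#!/bin/bash
# Hypothetical Slurm batch script for a multi-node training job.
# Partition, resource counts, and train.py are placeholders for illustration.
#SBATCH --job-name=llm-pretrain
#SBATCH --partition=gpu          # placeholder partition name
#SBATCH --nodes=16               # number of machines in the cluster
#SBATCH --gpus-per-node=8        # GPUs requested on each node
#SBATCH --time=72:00:00          # wall-clock limit for a long training run
#SBATCH --output=%x-%j.out       # log file named after job name and job ID

# srun launches one training task per allocated node and handles placement.
srun python train.py --config config.yaml
```

In a managed setup, the scheduling, node provisioning, and failure recovery the announcement describes would sit beneath a script like this rather than being operated by the enterprise's own team.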

Enterprises weigh AI training gains

Building and scaling generative AI models demands enormous resources, and for many enterprises, the process can be slow and complex.

In its post, Google pointed out that developers often spend more time managing infrastructure, including handling job queues, provisioning clusters, and resolving dependencies, than on actual model innovation.

Analysts suggest that the expanded Vertex AI Training could reshape how enterprises approach large-scale model development.

“Google’s new Vertex AI Training strengthens its position in the enterprise AI infrastructure race,” said Tulika Sheel, senior VP at Kadence International. “By offering managed large-scale training with tools like Slurm, Google is bridging the gap between hyperscale clouds and specialized GPU providers like CoreWeave or Lambda. It gives enterprises a more integrated, compliant, and Google-native option for high-performance AI workloads, which could intensify competition across the cloud ecosystem.”

Others pointed out that Google’s decision to embed managed Slurm directly within Vertex AI Training reflects more than a product update. It represents a shift in how Google is positioning its cloud stack for enterprise-scale AI.

“By placing Slurm inside the same platform that handles data prep, experiment tracking, and model deployment, Google eliminates the loose ends that cause real-world delivery delays,” said Sanchit Vir Gogia, chief analyst and CEO at Greyhound Research. “Teams now have a way to launch complex training jobs without breaking their security model or building a second pipeline. That might sound like a technical fix. It isn’t. It’s strategic.”

Not everyone may benefit

While the update broadens the range of model development options available, not every enterprise will benefit equally.

“For most enterprises, training models from scratch remains expensive and resource-heavy,” Sheel said. “Fine-tuning existing foundation models or adopting retrieval-augmented generation methods still delivers faster results and better ROI. Vertex AI Training may appeal more to advanced enterprises seeking custom control, but the broader market will likely stick to fine-tuning rather than full training.”

Gogia said that although the upgrade lowers the setup burden, the foundational questions haven't changed: does the organization have the data, the team, and the governance maturity to make full-model pretraining worthwhile?

“It’s tempting to assume that building your own model means greater control,” Gogia said. “In practice, it often introduces more risk than value. Many firms that try this route run into problems they didn’t expect: misaligned evaluation benchmarks, unclear redaction requirements, and delayed approvals due to compliance ambiguity.”

Shifting patterns in enterprise cloud use

As more organizations consider the balance between customization and cost, the broader impact may extend beyond AI development itself to cloud strategies and spending priorities.

“Making large-scale training easier could drive up demand for GPUs and high-performance compute in the near term,” Sheel said. “However, it may also push enterprises to optimize workloads and budgets more carefully, choosing flexible or hybrid deployments.”

Over time, this could spark more competitive pricing and innovation among cloud providers as enterprises seek efficiency alongside scale. Echoing this view, Gogia said that with Vertex AI Training and managed Slurm, teams can now deploy multi-thousand-GPU clusters in days instead of weeks, enabling them to align compute usage with project timelines and avoid overcommitting resources.

Prasanth Aby Thomas is a freelance technology journalist who specializes in semiconductors, security, AI, and EVs. His work has appeared in DigiTimes Asia and asmag.com, among other publications.

Earlier in his career, Prasanth was a correspondent for Reuters covering the energy sector. Prior to that, he was a correspondent for International Business Times UK covering Asian and European markets and macroeconomic developments.

He holds a Master's degree in international journalism from Bournemouth University, a Master's degree in visual communication from Loyola College, a Bachelor's degree in English from Mahatma Gandhi University, and studied Chinese language at National Taiwan University.
