IT Brief New Zealand - Technology news for CIOs & IT decision-makers

Red Hat & Google Cloud extend partnership for AI innovation

Yesterday

Red Hat and Google Cloud have agreed to extend their partnership to focus on advancing artificial intelligence (AI) for enterprises, specifically with new developments in open and agentic AI solutions.

The collaboration will bring together Red Hat's open source technologies and Google Cloud's infrastructure, along with Google's Gemma family of open AI models. This initiative aims to offer cost-effective AI inference and greater hardware choices for businesses deploying generative AI at scale.

Brian Stevens, Senior Vice President and Chief Technology Officer for AI at Red Hat, said: "With this extended collaboration, Red Hat and Google Cloud are committed to driving groundbreaking AI innovations with our combined expertise and platforms. Bringing the power of vLLM and Red Hat open source technologies to Google Cloud and Google's Gemma equips developers with the resources they need to build more accurate, high-performing AI solutions, powered by optimized inference capabilities."

The latest phase of the alliance will see the companies launch the llm-d open source project, with Google acting as a founding contributor. This project is intended to facilitate scalable and efficient AI inference across diverse computing environments. Red Hat is introducing the project as a response to enterprise challenges, such as the growing complexity of AI ecosystems and the need for distributed computing strategies.

The companies have also announced that support for vLLM, an open source inference server used to speed up generative AI outputs, will be enabled on Google Cloud's Tensor Processing Units (TPUs) and GPU-based virtual machines. Google Cloud's TPUs, already core to Google's own AI infrastructure, will now be accessible to developers using vLLM, improving performance and resource efficiency for fast, accurate inference.
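For a sense of what this looks like from a developer's side: vLLM can expose an OpenAI-compatible HTTP API when launched with `vllm serve`, so a client can request completions using only the Python standard library. This is a minimal sketch, not part of the announcement; the endpoint URL and the Gemma model name below are illustrative placeholders.

```python
import json
from urllib import request

# Assumes a local vLLM server was started separately, e.g.:
#   vllm serve google/gemma-3-4b-it
# The URL and model name are illustrative placeholders.
VLLM_URL = "http://localhost:8000/v1/completions"

def build_completion_request(prompt: str,
                             model: str = "google/gemma-3-4b-it") -> request.Request:
    """Build an OpenAI-style completion request for a vLLM server."""
    body = {
        "model": model,
        "prompt": prompt,
        "max_tokens": 64,
        "temperature": 0.2,
    }
    return request.Request(
        VLLM_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Sending this request (request.urlopen(req)) would return generated text
# whether the backend runs on GPUs or, per the announcement, TPUs.
req = build_completion_request("Summarise what AI inference means.")
```

Because the server presents a standard API, switching the backend between GPU virtual machines and TPUs is an operational choice rather than a client-side code change.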

Red Hat will be among the earliest testers for Google's new open model Gemma 3, and it will provide 'Day 0' support for vLLM on Gemma 3 model distributions. This is part of Red Hat's broader efforts as a commercial contributor to the vLLM project, focusing on more cost-effective and responsive platforms for generative AI applications.

The collaboration also includes the availability of Red Hat AI Inference Server on Google Cloud. This enterprise distribution of vLLM helps companies scale and optimise AI model inference within hybrid cloud environments. The integration with Google Cloud enables enterprises to deploy generative AI models that are ready for production and can deliver cost and responsiveness efficiencies at scale.

Supporting community-driven AI development, Red Hat will join Google as a contributor to the Agent2Agent (A2A) protocol, an application-level protocol designed to enable communication between agents or end-users across different platforms and cloud environments. Through the A2A ecosystem, Red Hat aims to promote new ways to accelerate innovation and enhance the effectiveness of AI workflows through agentic AI.
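At the wire level, A2A is a JSON-RPC 2.0 protocol carried over HTTP. The sketch below builds an A2A-style message envelope using only the standard library; the `message/send` method name and the message shape follow the public A2A specification as generally described, and should be read as illustrative rather than as details from this announcement.

```python
import json
import uuid

def build_a2a_message(text: str) -> str:
    """Build a JSON-RPC 2.0 envelope in the style of A2A's message/send.

    Field names here ("method", "parts", "messageId") reflect the public
    A2A spec as commonly documented; treat them as an illustrative sketch.
    """
    payload = {
        "jsonrpc": "2.0",
        "id": str(uuid.uuid4()),          # request id for correlating the reply
        "method": "message/send",
        "params": {
            "message": {
                "role": "user",
                "parts": [{"kind": "text", "text": text}],
                "messageId": str(uuid.uuid4()),
            }
        },
    }
    return json.dumps(payload)

# A receiving agent on any platform or cloud can parse this envelope,
# which is the cross-vendor interoperability the protocol targets.
wire = build_a2a_message("Book a meeting room for 3pm.")
```

The point of a shared envelope like this is that agents built by different vendors, running in different clouds, can exchange tasks without bespoke integrations.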

Mark Lohmeyer, Vice President and General Manager, AI and Computing Infrastructure, Google Cloud, commented, "The deepening of our collaboration with Red Hat is driven by our shared commitment to foster open innovation and bring the full potential of AI to our customers. As we enter a new age of AI inference, together we are paving the way for organisations to more effectively scale AI inference and enable agentic AI with the necessary cost-efficiency and high performance."

The llm-d project builds upon the established vLLM community, aiming to create a foundation for generative AI inference that can adapt to the demands of large-scale enterprises while facilitating innovation and cost management. The intention is to enable AI workload scalability across different resource types and enhance workload efficiency.

These initiatives highlight the companies' collective effort to offer business users production-ready, scalable, and efficient AI solutions powered by open source technologies and robust infrastructure options.
