Vista Equity Partners and Cambium Capital have launched Vector Core Compute (VC2), an inference cloud that combines Intel CPUs, SambaNova RDUs and NVIDIA GPUs. The service is live from a Los Angeles data centre, and Together.ai is its first commercial customer.
The companies describe it as the first commercially available deployment of a disaggregated inference architecture, which assigns different stages of AI inference to different types of processors instead of relying on a GPU-only setup.
VC2 uses Intel Xeon 6 processors for orchestration and tool execution in agentic AI workloads, SambaNova SN40 RDUs for decode, and NVIDIA Blackwell GPUs for prefill and prompt caching. The goal is to match each stage of inference with a processor designed for that task.
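The stage-to-processor assignment described above can be pictured as a simple routing table. The sketch below is illustrative only: the names `Stage`, `STAGE_TO_PROCESSOR` and `plan_request` are hypothetical and do not correspond to any published VC2 interface, and the order of stages in the sample request is an assumption rather than something the companies have described.

```python
from enum import Enum


class Stage(Enum):
    """Inference stages named in the announcement (enum itself is hypothetical)."""
    ORCHESTRATION = "orchestration"
    TOOL_EXECUTION = "tool_execution"
    PREFILL = "prefill"
    PROMPT_CACHING = "prompt_caching"
    DECODE = "decode"


# Assignment as reported in the article; the lookup table is an illustration,
# not the scheduler the service actually uses.
STAGE_TO_PROCESSOR = {
    Stage.ORCHESTRATION: "Intel Xeon 6 CPU",
    Stage.TOOL_EXECUTION: "Intel Xeon 6 CPU",
    Stage.PREFILL: "NVIDIA Blackwell GPU",
    Stage.PROMPT_CACHING: "NVIDIA Blackwell GPU",
    Stage.DECODE: "SambaNova SN40 RDU",
}


def plan_request(stages):
    """Return (stage name, processor) pairs for the given stages, in order."""
    return [(stage.value, STAGE_TO_PROCESSOR[stage]) for stage in stages]


if __name__ == "__main__":
    # Assumed stage order for one agentic request, for demonstration only.
    sample = [Stage.ORCHESTRATION, Stage.PREFILL, Stage.DECODE, Stage.TOOL_EXECUTION]
    for name, processor in plan_request(sample):
        print(f"{name:>15} -> {processor}")
```

In a GPU-only stack every entry in the table would point to the same processor type; the disaggregated design differs in that each stage can be served by a different one.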
According to the companies, independent measurement by Artificial Analysis found the architecture was at least two to three times faster than a GPU-only stack. The launch also includes a USD 3.5 billion compute commitment to SambaNova, backed by Vista and supported by Intel.
The Los Angeles site is the first operational facility in a broader rollout. Additional sites are in development in Chicago, Seattle and Phoenix. The companies plan deployments across more than 50 US metropolitan areas, converting existing data centres into inference-focused sites without major facility upgrades.
Different approach
The model reflects a broader shift in the AI market from training large models to running them in production. As companies deploy agentic systems for tasks such as code generation, claims handling and customer service, infrastructure suppliers are trying to reduce latency and cost while increasing throughput.
Rather than centralising capacity in a small number of large sites, VC2 is designed as a distributed network that places compute closer to customers. The approach is intended to reduce delays for enterprise users running inference-heavy applications.
Together.ai, which says it serves more than 400 trillion inference tokens a month, is using the new system to expand capacity for customers building agentic AI applications. Vista has also secured early access to the platform for more than 90 portfolio companies serving more than 2.5 million enterprise customers and 750 million users worldwide.
Vista's Robert F. Smith outlined the investment case for the platform in comments accompanying the launch.
"Agentic AI is producing real work at enterprise scale: decisions made, code written, claims processed, customers served. The constraint is no longer the model; it is access to the infrastructure that makes it economically viable to run at scale. Vista believes purpose-built inference infrastructure is a key competitive enabler for enterprise software - distributing workloads like always-on monitoring, high-volume data processing, and complex multi-step orchestration across specialized hardware to reduce cost. Securing early access to this type of specialized inference cloud puts that infrastructure directly in the hands of our portfolio companies doing this work today. As our portfolio companies scale enterprise agentic solutions and expand value to their customers, innovative inference infrastructure ensures they capture more of that value with improved inference economics," said Robert F. Smith, Founder, Chairman and Chief Executive Officer of Vista Equity Partners.
Industry backing
Intel Chief Executive Officer Lip-Bu Tan described the demonstration of a fully disaggregated system as a turning point for customers seeking a different economic model for inference workloads.
"The rapid growth of AI training over the past decade has resulted in an exponential increase in inference and agentic AI workloads, driving the need for a new model to meet customer demand for high-performance and low-cost inference at scale," said Tan.
"Today's demonstration of fully disaggregated inference represents a breakthrough moment for customers seeking a cost-efficient and high-performance compute model to accelerate the deployment of AI workloads into production. Together, Intel and its partners are redefining and advancing the economics of running inference at scale," he added.
SambaNova, whose chips handle the decode stage in the VC2 design, called the launch its largest commercial deployment to date.
"SambaNova was founded in 2017 before the mechanics of generative inference were understood. We built a chip that was purpose-built for AI, and today, the RDU is perfectly suited for the agentic workloads of the enterprise," said Rodrigo Liang, Co-Founder and Chief Executive Officer of SambaNova.
"VC2 is the largest commercial deployment of SambaNova technology in our history, and we're proud to partner with the industry's strongest leaders," he said.
Cambium Capital argued that a single chip type will not be enough for increasingly complex AI workflows.
"Cambium has been investing in advanced compute longer than most firms in venture today. We know one thing for certain: the agentic era will not be served by a single chip optimized for a single task," said Landon Downs, Co-Founder and Managing Partner of Cambium Capital.
"The disaggregated architecture is the only way to give each stage of an agentic workflow the silicon it needs," he said.
Together.ai said the arrangement would help it add inference capacity for customers and for Vista portfolio companies using the service.
"We continue to see exponential demand for inference tokens, now serving over 400T tokens a month of open models for agentic use cases," said Vipul Ved Prakash, Co-Founder and Chief Executive Officer of Together.ai.
"We are excited to collaborate with Vector Core Compute to bring significantly more inference capacity to companies building the next generation of agentic applications, including early access for more than 90 portfolio companies of Vista Equity Partners," he said.