With the emergence of generative AI and Retrieval Augmented Generation (RAG), there has been growing demand for vector-capable databases.
One particular challenge of vector database deployment is that, for some use cases, organizations don’t want to deploy entirely on a cloud Database-as-a-Service (DBaaS) and would rather control the database themselves. That’s the challenge DataStax is looking to solve with its Hyper-Converged Data Platform (HCDP), which is being announced as a preview technology today.
DataStax develops commercial services based on the open source Apache Cassandra database, as well as streaming capabilities based on the open source Apache Pulsar technology. Over the past year, the company has increasingly moved to provide vector database search and RAG capabilities to help enable gen AI deployments. The new Hyper-Converged Data Platform (HCDP) aims to provide organizations with a modern data platform that can be deployed across cloud, on-premises and edge environments.
“HCDP brings together our next generation Cassandra database, streaming capabilities, and open search functionality into a single platform,” Bill McLane, CTO of Cloud at DataStax, told VentureBeat.
What is a Hyper-Converged Data Platform?
The term Hyper-Converged Infrastructure (HCI) is a somewhat overloaded one that can often mean different things to different vendors.
McLane explained that DataStax defines Hyper-Converged as the modern take on infrastructure where the IT infrastructure is completely virtualized and acts as a cloud within an organization.
“With HCI there is virtualized compute, software-defined storage, and virtualized networking,” McLane said.
With HCDP, DataStax provides the components to manage an organization’s data needs. The platform is made up of several pieces, starting with a Hyper-Converged Database (HCD). It also supports streaming for real-time events and microservices with Hyper-Converged Streaming (HCS). The other big piece of HCDP is Hyper-Converged OpenSearch (HCOS).
Not a new database, a new way to deploy and run a data platform for AI
The HCDP offering is different from DataStax’s existing platforms, which include DataStax Enterprise (DSE) and Astra cloud.
McLane explained that DSE was built and designed to run Apache Cassandra in a self-hosted, on-premises environment. HCDP, on the other hand, has a cloud-native architecture. HCDP is built on the same platform as DataStax’s Astra serverless database and features the latest innovations with vector search and data APIs.
While organizations could have chosen to deploy and run DSE in the cloud, it isn’t an optimized approach. The promise of HCDP, however, is not just about running a database in the cloud; it’s about providing organizations with the ability to deploy in multiple modalities.
The new platform is designed to take advantage of modern hyperconverged infrastructure and hardware from vendors such as Nvidia, Dell and Intel. McLane explained that HCDP provides a more flexible approach to segmenting storage and compute, allowing for the deployment of the database not just into the data center, but also on the edge. This optimized infrastructure allows customers to deploy workloads where it makes the most sense for their needs. McLane noted that with HCDP, organizations can choose to run the platform on-premises, in a private cloud or in a hybrid cloud environment.
DataStax aims to provide feature parity across HCDP, its existing Cassandra distribution DataStax Enterprise, and its managed database service Astra DB. McLane said the goal is for applications to have a seamless experience regardless of which platform they use.
Content Courtesy – Venture Beat