Okera, a privately held data governance platform vendor, was acquired by Databricks today. Okera’s technology will be integrated into Databricks’ existing data governance product, Unity Catalog, to provide further AI-powered functionality.
“By bringing on the talented Okera team and leveraging their domain expertise, we’ll accelerate the Unity Catalog roadmap and provide best-in-class governance for the lakehouse,” said Reynold Xin, cofounder and chief architect at Databricks.
The deal’s financial specifics have not been made public.
Okera was founded in San Francisco in 2016 and raised $29.6 million in funding before the acquisition. Its recent focus has been on applying artificial intelligence to data governance and data security.
Databricks, by comparison, has raised $3.5 billion in venture funding to expand its data lakehouse and AI technology. The company recently made waves with its entry into the generative AI market, releasing Dolly, a ChatGPT-style model.
Databricks and Okera were not strangers before the acquisition was announced.
Nong Li, Okera’s co-founder and CEO, is well known for developing Apache Parquet, an open-source standard storage format on which Databricks and the rest of the industry build. Li previously worked at Databricks, where he led the vectorized Parquet and codegen projects that resulted in the 10x performance gain of Apache Spark 2.0.
Data is essential for analytics and machine learning (ML), regardless of the application. The ability to govern that data effectively is crucial for accuracy, security, and compliance.
Customers will be able to use Okera to identify, classify, and administer all of their data, analytics, and AI assets using attribute-based and intent-based access policies, according to Xin. Governance is also about observability, which Okera’s technology may help with.
Okera will support Databricks’ data observability on the lakehouse, allowing enterprises to centrally audit and report sensitive data usage across analytics and AI applications, according to Xin.
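For readers unfamiliar with the terms, the short Python sketch below illustrates the general idea behind attribute-based access policies combined with a central audit trail. It is not Okera's or Unity Catalog's API; the `Policy` and `Governor` classes, the tag names, and the user attributes are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Policy:
    """Hypothetical rule: data carrying `data_tag` requires `required_attribute` on the user."""
    data_tag: str
    required_attribute: str


@dataclass
class Governor:
    """Hypothetical policy engine that also records every access decision."""
    policies: list
    audit_log: list = field(default_factory=list)

    def can_read(self, user, user_attributes, column, column_tags):
        # Allow access only if every policy triggered by one of the
        # column's tags is satisfied by the user's attributes.
        allowed = all(
            policy.required_attribute in user_attributes
            for policy in self.policies
            if policy.data_tag in column_tags
        )
        # Record the decision centrally, whether it was allowed or denied.
        self.audit_log.append(
            {
                "time": datetime.now(timezone.utc).isoformat(),
                "user": user,
                "column": column,
                "tags": sorted(column_tags),
                "allowed": allowed,
            }
        )
        return allowed

    def sensitive_access_report(self):
        # Keep only the audit entries that touched tagged (sensitive) data.
        return [entry for entry in self.audit_log if entry["tags"]]


if __name__ == "__main__":
    governor = Governor([Policy("pii", "privacy_trained")])
    print(governor.can_read("ana", {"privacy_trained"}, "email", {"pii"}))
    print(governor.can_read("bo", {"analyst"}, "email", {"pii"}))
    print(governor.sensitive_access_report())
```

In this sketch the decision depends on attributes attached to the user and tags attached to the data rather than on a per-table list of named users, and every decision lands in one log that can be filtered for sensitive data usage.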
As AI grows more powerful and diverse, the question of how to ensure its safe and ethical use has become increasingly pressing.
Nvidia, one of the industry’s leaders, revealed a new initiative last month called NeMo Guardrails, which is intended to help developers monitor and control the output of generative AI models capable of producing realistic text, images, and audio.
Guardrails and AI governance are also important to Xin and Databricks.