Apache Hudi
hudi.apache.org
An open source data lake platform with database functionality
Data & Analytics
data-lakehouse, open-source, data-lake, incremental-processing, apache, streaming, etl

About
Apache Hudi is an open-source data lakehouse platform built on a high-performance open table format that brings database functionality to data lakes. It replaces slow batch data processing with an incremental processing framework designed for low-latency, minute-level analytics. Hudi integrates with a wide ecosystem including Kafka, Spark, Flink, S3, BigQuery, and many other data tools.
Problem
Traditional batch processing on data lakes is slow and lacks the database-like functionality needed for near-real-time analytics.
For
data engineers and organizations managing large-scale data lakes
How it works
Hudi provides an open table format and an incremental processing framework that layer database capabilities over cloud and on-prem storage, enabling change data capture (CDC) ingestion, streaming, and interactive queries.
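To make the idea of incremental processing concrete, here is a minimal toy sketch in plain Python. It is not Hudi's API or storage layout: the `ToyTable` class, its method names, and the integer commit counter are all hypothetical, chosen only to illustrate two behaviours the description implies, namely writing records by key (upsert) and reading only what changed since an earlier commit.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class ToyTable:
    """Hypothetical in-memory stand-in for a keyed table with a commit timeline."""

    # record key -> (commit number of the last write, row contents)
    rows: Dict[str, Tuple[int, Dict[str, Any]]] = field(default_factory=dict)
    last_commit: int = 0

    def upsert(self, batch: List[Dict[str, Any]], key: str = "id") -> int:
        """Insert new keys and overwrite existing ones as a single commit."""
        self.last_commit += 1
        for row in batch:
            self.rows[row[key]] = (self.last_commit, dict(row))
        return self.last_commit

    def snapshot(self) -> List[Dict[str, Any]]:
        """Return the latest version of every row, ordered by key."""
        return [self.rows[k][1] for k in sorted(self.rows)]

    def incremental(self, since: int) -> List[Dict[str, Any]]:
        """Return only rows written by commits after `since`, ordered by key."""
        return [
            self.rows[k][1]
            for k in sorted(self.rows)
            if self.rows[k][0] > since
        ]


table = ToyTable()
first = table.upsert([{"id": "a", "v": 1}, {"id": "b", "v": 2}])
second = table.upsert([{"id": "b", "v": 20}, {"id": "c", "v": 3}])

# A downstream job that already processed `first` only needs the changed rows.
changed = table.incremental(since=first)
```

In this sketch a downstream consumer remembers the last commit it processed and asks only for later changes, rather than rescanning the whole table as a batch job would.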
Business model
open-source
Status
launched
Company
Apache Software Foundation