Apache Iceberg

iceberg.apache.org

A high-performance open table format for huge analytic datasets

Category: Data & Analytics
Tags: open-source, table-format, data-lake, sql, big-data, analytics, apache

/ About /

Apache Iceberg is an open-source, high-performance table format designed for large-scale analytic workloads on data lakes. It enables multiple query engines such as Spark, Trino, Flink, Presto, Hive, and Impala to safely read and write the same tables concurrently. Key features include full schema evolution, hidden partitioning, time travel, rollback, and data compaction.

/ How it works /

Iceberg places a metadata layer over a table's data files. That layer tracks snapshots, partition specs, and schema history, so any compatible engine can read and write the same table safely and efficiently.
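
To make the idea of a snapshot-tracking metadata layer concrete, here is a minimal toy model in plain Python. It is a sketch under stated assumptions, not Iceberg's actual metadata format or API: the names Snapshot, TableMetadata, commit_append, files_at, and rollback_to are hypothetical, the file names are made up, and partition specs are left out for brevity. It only illustrates how keeping immutable snapshots plus a "current" pointer can support reading an older table version and rolling back.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Snapshot:
    """One immutable table version (hypothetical, simplified)."""
    snapshot_id: int
    schema_id: int
    data_files: Tuple[str, ...]


@dataclass
class TableMetadata:
    """Toy metadata layer: schema history, a snapshot log, a current pointer."""
    schemas: Dict[int, Tuple[str, ...]] = field(default_factory=dict)
    snapshots: List[Snapshot] = field(default_factory=list)
    current_snapshot_id: Optional[int] = None

    def add_schema(self, columns: Tuple[str, ...]) -> int:
        # Older schemas are kept, so earlier snapshots remain readable.
        schema_id = len(self.schemas)
        self.schemas[schema_id] = columns
        return schema_id

    def commit_append(self, schema_id: int, new_files: Tuple[str, ...]) -> int:
        # A commit never edits an existing snapshot; it appends a new one.
        base: Tuple[str, ...] = ()
        if self.current_snapshot_id is not None:
            base = self.files_at(self.current_snapshot_id)
        snap = Snapshot(len(self.snapshots), schema_id, base + new_files)
        self.snapshots.append(snap)
        self.current_snapshot_id = snap.snapshot_id
        return snap.snapshot_id

    def files_at(self, snapshot_id: int) -> Tuple[str, ...]:
        # "Time travel": list the files visible in a chosen snapshot.
        return self.snapshots[snapshot_id].data_files

    def rollback_to(self, snapshot_id: int) -> None:
        # Rollback only moves the pointer; the snapshot log is untouched.
        if not 0 <= snapshot_id < len(self.snapshots):
            raise ValueError(f"unknown snapshot {snapshot_id}")
        self.current_snapshot_id = snapshot_id


table = TableMetadata()
v0 = table.add_schema(("id", "ts"))
s0 = table.commit_append(v0, ("a.parquet",))
v1 = table.add_schema(("id", "ts", "country"))  # schema evolves
s1 = table.commit_append(v1, ("b.parquet",))

older_view = table.files_at(s0)   # read the table as of the first commit
table.rollback_to(s0)             # make the first commit current again
```

In this sketch a reader that pins a snapshot id always sees the same file list, regardless of later commits, which is the property that lets several engines share one table without stepping on each other.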

/ Who it's for /

Data engineers and analysts working with large-scale analytics on data lakes.

/ More info /

Background.

Status: launched
Business model: open-source
Company: The Apache Software Foundation

/ Discovered patterns /

Similar projects.

Delta Lake

Apache Hudi

Dremio
