
Apache Iceberg

A high-performance open table format for huge analytic datasets

Data & Analytics, open-source, table-format, data-lake, sql, big-data, analytics, apache

About

Apache Iceberg is an open-source, high-performance table format designed for large-scale analytic workloads on data lakes. It enables multiple query engines such as Spark, Trino, Flink, Presto, Hive, and Impala to safely read and write the same tables concurrently. Key features include full schema evolution, hidden partitioning, time travel, rollback, and data compaction.

Problem

Managing huge analytic tables across multiple query engines is complex and error-prone, and without a shared table format it comes with no guarantees such as safe schema evolution or ACID transactions.

For

data engineers and analysts working with large-scale analytics on data lakes

How it works

Iceberg defines a high-performance open table format with a metadata layer that tracks snapshots, partition specs, and schema history, allowing any compatible engine to read and write tables safely and efficiently.
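
To make the metadata-layer idea concrete, here is a minimal toy sketch in Python. It is not Iceberg's actual API or file layout: the class names (`Snapshot`, `TableMetadata`), the method names, and the file names are all hypothetical, and partition specs are left out for brevity. It only illustrates how keeping every snapshot and every schema version in the metadata could allow time travel and rollback.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical immutable view of the table at one commit."""
    snapshot_id: int
    schema_id: int
    data_files: Tuple[str, ...]


@dataclass
class TableMetadata:
    """Toy metadata layer: schema history plus a log of snapshots."""
    schemas: Dict[int, Tuple[str, ...]] = field(default_factory=dict)
    snapshots: List[Snapshot] = field(default_factory=list)
    current_snapshot_id: Optional[int] = None

    def add_schema(self, columns: Sequence[str]) -> int:
        # Old schemas are kept, so older snapshots stay readable.
        schema_id = len(self.schemas)
        self.schemas[schema_id] = tuple(columns)
        return schema_id

    def files_at(self, snapshot_id: Optional[int]) -> Tuple[str, ...]:
        # "Time travel": look up the files visible in a given snapshot.
        if snapshot_id is None:
            return ()
        for snap in self.snapshots:
            if snap.snapshot_id == snapshot_id:
                return snap.data_files
        raise KeyError(f"unknown snapshot {snapshot_id}")

    def commit(self, schema_id: int, added_files: Sequence[str]) -> int:
        # A commit never mutates earlier snapshots; it appends a new one.
        visible = self.files_at(self.current_snapshot_id) + tuple(added_files)
        snap = Snapshot(len(self.snapshots) + 1, schema_id, visible)
        self.snapshots.append(snap)
        self.current_snapshot_id = snap.snapshot_id
        return snap.snapshot_id

    def rollback(self, snapshot_id: int) -> None:
        # Rollback only moves the "current" pointer; history is retained.
        self.files_at(snapshot_id)  # raises KeyError if the id is unknown
        self.current_snapshot_id = snapshot_id


table = TableMetadata()
v0 = table.add_schema(["id", "ts"])
s1 = table.commit(v0, ["a.parquet"])
v1 = table.add_schema(["id", "ts", "country"])  # schema evolution
s2 = table.commit(v1, ["b.parquet"])

print(table.files_at(s1))  # files as of the first snapshot
table.rollback(s1)
print(table.files_at(table.current_snapshot_id))
```

In this sketch a reader only ever follows the current-snapshot pointer, which is one way to picture why concurrent engines can see a consistent table. The real format additionally handles atomic pointer swaps, manifests, and partition specs, none of which are modeled here.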

Business model

open-source

Status

launched

Company

The Apache Software Foundation

Similar projects