Prof. Dr. Ingo Claßen
Data Lakehouse

  • Building a Local Data Lake from scratch with MinIO, Iceberg, Spark, StarRocks, Mage, and Docker (link)
  • Build an on-premise Data Lakehouse (link)
  • Why Dremio is a must for Apache Iceberg Data Lakehouses (link)
  • Delta, Hudi, Iceberg — A Benchmark Compilation (link)
  • Data lake Table formats : Apache Iceberg vs Apache Hudi vs Delta lake (link)
  • The History and Evolution of Open Table Formats (link)
  • 5 Brilliant Lakehouse Architectures from Tencent, WeChat, and More (link)
  • Data Pipeline Development with MinIO, Iceberg, Nessie, Polars, StarRocks, Mage, and Docker (link)

apache iceberg

  • home (link)
  • doc (link)
  • blog (link)
  • Getting started with Apache Iceberg (link)
  • Apache Iceberg Crash Course for AWS users: Amazon S3, Athena & AWS Glue - Iceberg (link)
  • Boost Your Cloud Data Applications with DuckDB and Iceberg API (link)
  • How Bilibili Builds OLAP Data Lakehouse with Apache Iceberg (link)
  • PyIceberg 0.4.0 (link)
  • Building a Plotly Dashboard on a Lakehouse using Apache Iceberg & Arrow (link)
  • Apache Hudi vs Delta Lake vs Apache Iceberg - Lakehouse Feature Comparison (link)
  • Apache Iceberg, Git-Like Catalog Versioning and Data Lakehouse Management (link)
  • I spent 4 hours learning Apache Iceberg. Here’s what I found. (link)
  • Apache Iceberg Won the Future — What’s Next for 2025? (link)
  • Apache Iceberg Topics: Stream directly into your data lake (link)
  • Apache Iceberg: Built for Big Data, Not Ready for Small(?) (link)
  • Understanding the Future of Apache Iceberg Catalogs (link)

delta lake

  • Delta Lake RS (link)
  • I spent 5 hours understanding more about the Delta Lake table format (link)

hudi

  • Hudi-rs with DuckDB, Polars, Daft, DataFusion — Single-node Lakehouse (link)