Prof. Dr. Ingo Claßen
Still to read
  • A distributed systems reliability glossary (link)

  • Jepsen Redis 1 (link)

  • Jepsen Redis 2 (link)

  • Jepsen Redis 3 (link)

  • Jepsen Cassandra 1 (link)

  • Jepsen Postgres 1 (link)

  • PostgreSQL Backup Strategies (link)

  • What I Learnt Using Claude Code to Build Production-Ready Apps (link)

  • Anthropic Looked Inside Claude’s Brain. What They Found Changes Everything. (link)

  • I Switched from Claude to Kimi K2.5 for a Week — Here’s What Broke and What Got Better (link)

  • I found at least 9 different date formats in one SQL column — Here’s how I detected them (link)

  • Stop Using Markdown with Claude Code (link)

  • Before You Panic About That Viral AI Article, Read This. (link)

  • Graph database-ball! Exploring the Game with the graph capabilities of LadybugDB, DuckDB and PostgreSQL (link)

  • Day 9: Still Editing Text Manually? These 6 Linux Commands Will Save You Hours (link)

  • 10 Must-Know Java Fundamentals That Even Seniors Forget (link)

  • The 6 Most Common Design Patterns in Java Projects (With Examples) (link)

  • If You Had To Read Only 5 AI Papers, This Should Be It. (link)

  • The Expert Way to Decide the Data Model for Any Data Engineering Problem (link)

  • Forward Proxy vs Reverse Proxy: The Deep Dive Every Engineer Must Understand (link)

  • How PostgreSQL Can Replace 200 WHERE Clauses in Your Codebase (Row Level Security)? (link)

  • If You Don’t Understand DNS, You Don’t Really Understand the Internet (link)

  • Monolith vs Microservices: The Big Mistake Junior Developers Often Make (link)

  • How to Write SQL Queries That Survive Out-of-Order Replays (Before ETL Testing Exposes Bugs) (link)

  • Python f-Strings: 7 Tricks You Didn’t Know (link)

  • Anthropic’s Engineer Said Kill Markdown. Here’s What He Actually Meant. (link)

  • Making JSONB More Queryable with Generated Columns (link)

  • Open Source LLM Platforms in 2026: Ollama, OpenRouter, Groq, NVIDIA NIM — Which One Should You Use? (link)

  • Using Claude Code to Build Production-Ready System (link)

  • 9 Linux Tricks (link)

  • Multi-Agent Systems: When 2 Agents Beat 1 (and When They Don’t) (link)

  • My Practical Approach for Reviewing AI-Generated Code (link)

  • The First 10 Shell Tricks That Make You Look Like a Wizard (link)

  • Understanding Linux Networking Without Reading a 500-Page Book (link)

  • GitHub Copilot Code Review: Guidelines, Best Practices, and How to Integrate It into Your PR Workflow (link)

  • I Compared 4 Python HTTP Libraries. One Shocked Me Completely. (link)

  • I Used the Terminal Wrong for Years (link)

  • The Best RAG Architectures for AI Agents Every Developer Must Know (link)

  • Agentic AI Project: Build a Customer Service Chatbot for a Clinic (link)

  • Google Research: LLM can never achieve consciousness (not even in 100 years) (link)

  • Stop Memorizing Design Patterns: Use This Decision Tree Instead (link)

  • 10 Proven Ways I Instantly Spot Bad AI-Generated Code (link)

  • Best SQL Hacks I Wouldn’t Have Believed If I Hadn’t Used Them Myself (link)

  • Java 26 Is Here — Here’s What Actually Matters (link)

  • Anthropic Is Giving Away 13 Free Courses That Others Charge Thousands For (link)

  • I Tested 5 Python ORMs. One Replaced SQLAlchemy Completely. (link)

  • The YC CEO Ships 10,000 Lines a Day. Here’s His Exact Setup (link)

  • 9 Things Every Python Script Should Have (link)

  • Zero ETL Is the Reality Check Every Data Engineer Needs in 2026 (link)

  • Agentic AI with DuckDB and smolagents (link)

  • Our JetBrains Devs Switched to VSCode. They Hate VSCode. (link)

  • Every Python Concept Explained (link)

  • Exploratory Data Analysis Checklist: What to Look for Every Time (link)

  • oLLM: The Revolutionary Python Library Running Powerful Language Models on Ordinary Computers (link)

  • The RAG Layer Nobody Talks About (link)

  • “OpenRAG” From Documents to Agentic Search in Minutes (from IBM research open source) (link)

  • End-to-End Data Engineering Project (by Free Tools) (link)

  • How to Build a High-Performance, Free ELT Pipeline Locally using DuckDB (link)

  • 10 SQL Time Zone Mistakes Most Teams Discover Only After Numbers Are Escalated by Executives (link)

  • 10 Critical Data Quality Queries Every Data Engineer Should Implement (Before Trust Is Lost) (link)

  • Google Finally Solved The Fragile Text-to-SQL Systems (link)

  • F3: The Future-Proof File Format That Finally Gets It Right (link)

  • Automate schema mappings with LLMs (link)

  • How to Scrape a Website to Markdown: 2026 Guide (link)

  • Django vs FastAPI: I Built the Same App Twice (link)

  • How to Build a Local ELT Pipeline with DuckDB and DBT (link)

  • NumPy + Pandas: The Only Guide You Need (link)

  • AI Code Assistants for Data Engineering: I Tested 6 Tools for SQL and Python (link)

  • Agentic EDA with AI Foundry: Automating Exploratory Analysis (link)

  • The 2026 Data Engineering Roadmap: Building Data Systems for the Agentic AI Era (link)

  • 10 Business-Centric Data Metrics Analysts Ignore (Until They’re Replaced) (link)

  • 10 dbt Macros That Saved Us 100+ Hours (With Copy-Paste Code) (link)

  • The Power of PyTorch and vLLM Together (link)

  • 10 Data Models Every Data Engineer Must Know (link)

  • The First Nmap Scan That Makes You Realize How the Internet Actually Works (link)

  • How to Write SQL Queries That Use Window Frames to Transform Your Data Analysis (link)

  • Building a Simple SQL Parser in Python: From Basics to Hands-On (link)

  • Unlock RAG-Anything’s Power with Ollama on Your Machine (with Docling as Bonus) (link)

  • SQL Performance Mastery: 10 High-Impact Aggregation Strategies for Sub-Second Queries (link)

  • Why Semi-Joins in SQL Are More Powerful Than You Think (And How to Master Them) (link)

  • Handling Imbalanced Data: The Complete Guide Every Data Scientist Must Know (link)

  • Fivetran vs. Airbyte in 2026 | Complete ELT Guide (link)

  • AI Skills Are Exploding in 2026 (link)

  • Why Gradient Boosting Often Beats Deep Learning on Tabular Data (And How to Tune It) (link)

  • Why Exponentials and Logarithms Dominate Statistics and Information Theory: The Fundamental Properties (link)

  • The Journey to Causality: From Dashboards to Causal Inference (link)

  • How to Choose the Right Search Algorithm for your VectorDB? (link)

  • Top 8 Free Python Excel Libraries for Developers in 2025 (link)

  • Data Analyst vs. AI Agent: Who Wins the Job in 2026? (link)

  • How to Write SQL Queries That Detect When a LEFT JOIN Is Silently Dropping Rows (link)

  • The Semantic Layer Revolution: How dbt and Databricks Built the Universal Language of Business Data (link)

  • The Truth About Data Modeling: What You Learn Only After Real Projects (link)

  • I Wasted 6 Months Learning NotebookLM the Hard Way — So You Don’t Have To (link)

  • Phind vs Google: My Go-To Tool for Explaining Complex Code (link)

  • Python + MCP Is the New Automation Superpower (Here’s the Proof) (link)

  • Why COLLATION Rules in SQL Are More Powerful Than You Think (And How to Master Them) (link)

  • We Spent 2 Years Building a Data Mesh. It Was a $4M Disaster. (link)

  • Mastering Hyperparameter Tuning (link)

  • What is Microsoft MarkItDown and Why It Matters? (link)

  • Every Python Built In Function Explained (link)

  • How Web Search Inside AI Chatbots Works (link)

  • Testing Google’s Antigravity for Data Engineering: My End-to-End Experience (link)

  • Insert-Only Design in Modern Data Warehousing: Lessons from Data Vault 2.0, SCD2, and Databricks Performance (link)

  • 10 Hard Window Join Problems in SQL and How to Solve Them (link)

  • The Serverless Illusion: Why Everyone Is Quietly Moving Back to VMs (link)

  • How I Built an AI That Talks to Your Database: A Journey into RAG (link)

  • What Makes dbt So Popular in Modern Data Teams? (link)

  • Uber Eats Data Warehouse Architecture: A Complete Guide to Trip Data Modeling and Analytics Storage (link)

  • UV in Python: The Fastest Package & Project Manager (Complete Guide + Example Project) (link)

  • HTMX Murdered React: Why Nobody Needs JavaScript Frameworks Anymore (link)

  • Why Anti-Joins in SQL Are More Powerful Than You Think (link)

  • Why Window Exclusion Clauses in SQL Are More Powerful Than You Think (link)

  • Python visualization tools to level up from Matplotlib (link)

  • How to Write Python (and Others) Code Your Future Self Will Thank You For! (link)

  • Understanding Palantir’s Ontology: Semantic, Kinetic, and Dynamic Layers Explained (link)

  • Meet TOON — The Fresh Data Format That Could Replace JSON in the AI Era (link)

  • Storing products, prices and orders in PostgreSQL (link)

  • The Convergence Nobody Saw Coming: When APIs, AI, and Cloud Storage Accidentally Built the Future (link)

  • Your Database Is Slow Because You’re Using UUIDs (link)

  • Solving Many-to-Many & Drill-Across with the Unified Star Schema (link)

  • Why Scalar Subqueries in SQL Are More Powerful Than You Think (link)

  • SQL Query Optimization: Modern Techniques and Best Practices (link)

  • Every Industry Faces the Same Data Problems. My Take on Palantir’s Solution at AIPCON 8 (link)

  • Designing the Open Metadata Modeling Platform (link)

  • Self-Describing SQL: Embedding Metadata as YAML Front-Matter in Generated Objects (link)

  • Lessons from Data Vault: Principles Without the Dogma (link)

  • Islands and Gaps with Recursive CTE (link)

  • Graph Analytics for All of Your Data - Oracle (link)

  • Getting Started with Oracle AI Database AI Vector Search (link)

  • How and Why Netflix Built a Real-Time Distributed Graph: Part 1 — Ingesting and Processing Data Streams at Internet Scale (link)

  • From Kimball to Metadata: How Dimensional Thinking Still Shapes Modern Data Architecture (link)

  • Dual SCD2: The Foundation for True History in Data Warehousing (link)

  • Time-Based SQL Questions: Gaps, Overlaps, and Intervals (link)

  • Resampling Imbalanced Datasets for Binary Classification (link)

  • Robust methods to generate synthetic table data (link)

  • A Marketer’s Guide to Calculus (link)

  • Cosine Distance vs Dot Product vs Euclidean in vector similarity search (link)

  • 6 Data Modeling Mistakes That Kill Scalability (and How to Fix Them) (link)

  • Why Recursive CTEs Are More Powerful Than You Think (link)

  • Palantir’s Ontology, Kimball’s Star Schema, and Model-Driven Data Engineering: A Comparative View (link)

  • SQL Complexity Explained: What Your Queries Are Really Doing Behind the Scenes (link)

  • Synthetic Data: What It Is and How to Use It (link)

  • The Only 15 SQL Questions I Ask in Every Junior Data Scientist Interview (link)

  • From Raw Data to Reliable Systems: The Power of Data Modeling in Data Engineering (link)

  • From Chaos to Clarity: Advanced Data Models Every Data Engineer Must Master. (link)

  • How to Compare Two or More Distributions (link)

  • Is Your Training Data Representative? A Guide to Checking with PSI in Python (link)

  • What Fivetran’s acquisition of dbt Labs would mean for the Data Industry (link)

  • Agentic AI: Building Long-Term Memory (link)

  • I built an end-to-end interpretable Machine Learning research pipeline (link)

  • The Anatomy of a Modern LLM (link)

  • Waiting for Postgres 18: Accelerating Disk Reads with Asynchronous I/O (link)

  • Building 17 Agentic AI Patterns and Their Role in Large-Scale AI Systems (link)

  • MCMC & the art of Sampling without Sampling (link)

  • Building a Real-Time Profit & Loss Engine with RisingWave and Streaming SQL (link)

  • What Is The Best Diagramming Software in 2025 (link)

  • excalidraw (link)

  • Type Casting in Python (link)

  • SIMD: The real superpower behind super fast databases (link)

  • How to Start Learning Machine Learning: A Practical Guide (link)

  • Autoencoders for Defect Detection in Images (link)

  • What is an autoencoder? (link)

  • Integrating LLMs and AI Agents into Data Engineering Workflows (link)

  • Memory Management in Python (link)

  • Get Excited About Postgres 18 (link)

  • Optuna: The Hyperparameter Optimization Framework That Saved My Machine Learning Sanity (link)

  • 7 Wonders of Data Science (link)

  • SQL Window Functions Explained Like a Story (link)

  • Why Generative AI Is Forcing Us to Rethink Data Modeling (link)

  • Stop Using Requests — Try This Modern HTTP Library Instead (link)

  • LangExtract (Google, Open Source): Turn Unstructured Text into Structured, Auditable Data (link)

  • Zero Degrees of Separation (link)

  • HTAP: Still the Dream, a Decade Later (link)

  • Demystifying Apache Spark (link)

  • mlflow (link)

  • Radically Simple Data Lineage (link)

  • DocumentDB (link)

  • AWS joins the DocumentDB (link)

  • A decade of database innovation: The Amazon Aurora story (link)

  • 5 SQL Questions That Stump Even Senior Analysts (link)

  • SQLModel (link)

  • Effortless EDA with Sweetviz & YData-Profiling (link)

  • Introducing LangExtract: A Gemini powered information extraction library (link)

  • LangExtract (link)

  • Fun and weirdness with SSDs (link)

  • Vector Search Isn’t the Answer to Everything. (link)

  • What You Should Know About B-Trees on Disk (link)

  • PyCaret (link)

  • STOP Guessing Who Will Leave — How I Would Predict Customer Churn Before It Happens (link)

  • What is a t-test and When to Use It in Pandas? (link)

  • The Complete Beginner’s Guide to Python Modules (link)

  • Kimball Star Schema vs Palantir’s Ontology (link)

  • Database Connections in FastAPI: Best Practices for Efficient and Scalable APIs (link)

  • The Dark Side of @Transactional in Spring Boot—Exposed (link)

  • Customer-Facing Analytics Without Denormalizing Everything (link)

  • Building Agentic Adaptive RAG with LangGraph for Production (link)

  • How CERN Powers Ground-Breaking Physics with TimescaleDB (link)

  • RAG Without Embeddings? Here’s how OpenAI is doing this… (link)

  • Object Detection with Python and HuggingFace Transformers (link)

  • Implementing 12 AI Agent Evaluation Techniques Using LangSmith (link)

  • Why the Server Should Handle the Web Again (link)

  • From Chaos to Clarity: Building Modern Python Projects with UV (link)

  • Experimenting with SQL:2023 Property-Graph Queries in Postgres 18 (link)

  • A Database Schema for Engineering Project Management (link)

  • Featherweight - Lightning Fast Analytics with DuckDB and Postgres (link)

  • Why Your Data Lake Needs BLM, Not LLM (link)

  • SQL Isn’t a Query Language. It’s a Thinking Framework (link)

  • I spent $500 testing Replit/Lovable/Bolt/v0 & Cursor so you don’t have to (link)

  • Billions of Edges Per Second with Postgres (link)

  • pdot: Interactive Directed Graphs of Your Database (link)

  • Exploring Databases Visually (link)

  • pdot: Exploring Databases Visually, Part II (link)

  • pgai (link)

  • How to Build Near Real Time Data Pipelines with Incremental Loading (link)

  • 5 Things You Didn’t Know About LocalStorage (link)

  • Building Real-Time Dashboards with FastAPI and HTMX (link)

  • HTMX Made Me Like the Web Again (link)

  • Build an AI Agent That Turns SQL Databases into Dashboards — No Queries Needed (link)

  • OpenAI: Scaling PostgreSQL to the Next Level (link)

  • DuckDB vs Databricks SQL Warehouse: Can We Save on Compute? (link)

  • Dynamic Data Source Routing in Spring Boot: Master Multi-Tenancy & Read-Write Separation (link)

  • SQLMesh Incremental Modeling with DuckDB: A Hands-On Tutorial (link)

  • Postgres Language Server: Initial Release (link)

  • Building a modern Data Warehouse from scratch (link)

  • A case where SQL joins struggle but MongoDB documents shine (link)

  • Beyond Materialized Views: Using DuckDB for In-Process Columnar Caching (link)

  • Life Altering Postgresql Patterns (link)

  • The Most Comprehensive Explanation of Session, Cookie, Token, and JWT (link)

  • PgBouncer: Don’t Let Connection Chaos Ruin Your Day (link)

  • Can Artificial Intelligence Create Better Tables Than You? (link)

  • Benchmarking PostgreSQL Batch Ingest (link)

  • This Happens Inside Python…When We Call a Function (link)

  • SQLAlchemy 2.0: The Most Powerful ORM for Python Yet (link)

  • Stop Writing Manual Validators! Use Pydantic for Data Validation (link)

  • Is Kimball Still Relevant in the Modern Data Warehouse Era? (link)

  • STOP Using Python Dictionaries Like This! (link)

  • Multi-Tenant Architecture using SpringBoot and PostgreSQL (link)

  • PyGWalker (link)

  • Data Warehouse Basics: How to Handle Changing Data with SCDs (link)

  • Postgres is all you need for vectors (link)

  • Building a Perfect Million Parameter LLM Like ChatGPT in Python (link)

  • I “vibe-coded” over 160,000 lines of code. It IS real. (link)

  • Hard-Earned Lessons from a Year of Building AI Agents (link)

  • How To Train Your PyTorch Models (Much) Faster (link)

  • Prompt Decorators: A Simple Way to Improve AI Responses (link)

  • Postgres query plan visualization tools (link)

  • Vector Search at 10,000 QPS in PostgreSQL with VectorChord (link)

  • Optimizing PostgreSQL Performance: Essential Queries for Monitoring and Maintenance (link)

  • One Line of SQL, All the LiteLLM Embeddings (link)

  • How to Map Column Values in a Pandas DataFrame? (link)

  • Top 6 Core App Dashboard Building Tools (link)

  • How I Learned to Love __init__.py: A Simple Guide (link)

  • Postgres as a Graph Database: (Ab)using pgRouting (link)

  • EdgeDB is now Gel and Postgres is the Future (link)

  • EdgeDB 1.0 (link)

  • Use PASSING with JSON_TABLE() To Make Calculations (link)

  • Delta Lake 4.0: Next-Level Big Data Management (link)

  • From Traditional BI to GenBI: Embracing a Smarter, More Human Approach (link)

  • ClickBench — a Benchmark For Analytical DBMS (link)

  • Modern CI-CD Pipelines of REST API Python Project with UV (link)

  • Real-Time Chat Application with FastAPI and WebSockets (link)

  • 10 Advanced Python Concepts You Should Know To Be a Senior Developer (link)

  • MkDocs (link)

  • Representing graphs in Postgresql (link)

  • Redis with FastAPI for Lightning-Fast Applications (link)

  • Building the Modern PostgreSQL GUI With PopSQL (link)

  • The twelve-factor app (link)

  • 20 Advanced Statistical Approaches Every Data Scientist Should Know (link)

  • How Uber Handles TRILLIONS of Transactions — The Secret (link)

  • Advanced SQL for Data Professionals (link)

  • Handling Slowly Changing Dimensions (SCD) in Modern Data Pipelines: A Complete Guide with SQL Examples (link)

  • RunSQL (link)

  • ChartDB (link)

  • Rethinking the frontend with HTMX (link)

  • documentdb (link)

  • AlloyDB vs PostgreSQL: Unleash Performance, Slash Costs, Simplify Data Stack (link)

  • Don’t Fear Async: A Friendly Guide to Python’s Most Powerful Tool (link)

  • How We Built a Content Recommendation System With Pgai and Pgvectorscale (link)

  • A Visual Exploration of Semantic Text Chunking (link)

  • Jupyter Agent: Revolutionizing Data Analysis with LLMs (link)

  • Combining FastAPI, PostgreSQL, and Leaflet — GIS Tutorial (link)

  • Python Memory Management: Best Practices for Performance (link)

  • From query to plot: Exploring GeoParquet Overture Maps with Ibis, DuckDB, and Lonboard (link)

  • This is How I Use Swagger to Design REST APIs Before Starting the Development (link)