A distributed systems reliability glossary (link)
Jepsen Redis 1 (link)
Jepsen Redis 2 (link)
Jepsen Redis 3 (link)
Jepsen Cassandra 1 (link)
Jepsen Postgres 1 (link)
PostgreSQL Backup Strategies (link)
What I Learnt Using Claude Code to Build Production-Ready Apps (link)
Anthropic Looked Inside Claude’s Brain. What They Found Changes Everything. (link)
I Switched from Claude to Kimi K2.5 for a Week — Here’s What Broke and What Got Better (link)
I found at least 9 different date formats in one SQL column — Here’s how I detected them (link)
Stop Using Markdown with Claude Code (link)
Before You Panic About That Viral AI Article, Read This. (link)
Graph database-ball! Exploring the Game with the graph capabilities of LadybugDB, DuckDB and PostgreSQL (link)
Day 9: Still Editing Text Manually? These 6 Linux Commands Will Save You Hours (link)
10 Must-Know Java Fundamentals That Even Seniors Forget (link)
The 6 Most Common Design Patterns in Java Projects (With Examples) (link)
If You Had To Read Only 5 AI Papers, This Should Be It. (link)
The Expert Way to Decide the Data Model for Any Data Engineering Problem (link)
Forward Proxy vs Reverse Proxy: The Deep Dive Every Engineer Must Understand (link)
How PostgreSQL Can Replace 200 WHERE Clauses in Your Codebase (Row Level Security)? (link)
If You Don’t Understand DNS, You Don’t Really Understand the Internet (link)
Monolith vs Microservices: The Big Mistake Junior Developers Often Make (link)
How to Write SQL Queries That Survive Out-of-Order Replays (Before ETL Testing Exposes Bugs) (link)
Python f-Strings: 7 Tricks You Didn’t Know (link)
Anthropic’s Engineer Said Kill Markdown. Here’s What He Actually Meant. (link)
Making JSONB More Queryable with Generated Columns (link)
Open Source LLM Platforms in 2026: Ollama, OpenRouter, Groq, NVIDIA NIM — Which One Should You Use? (link)
Using Claude Code to Build Production-Ready System (link)
9 Linux Tricks (link)
Multi-Agent Systems: When 2 Agents Beat 1 (and When They Don’t) (link)
My Practical Approach for Reviewing AI-Generated Code (link)
The First 10 Shell Tricks That Make You Look Like a Wizard (link)
Understanding Linux Networking Without Reading a 500-Page Book (link)
GitHub Copilot Code Review: Guidelines, Best Practices, and How to Integrate It into Your PR Workflow (link)
I Compared 4 Python HTTP Libraries. One Shocked Me Completely. (link)
I Used the Terminal Wrong for Years (link)
The Best RAG Architectures for AI Agents Every Developer Must Know (link)
Agentic AI Project: Build a Customer Service Chatbot for a Clinic (link)
Google Research: LLM can never achieve consciousness (not even in 100 years) (link)
Stop Memorizing Design Patterns: Use This Decision Tree Instead (link)
10 Proven Ways I Instantly Spot Bad AI-Generated Code (link)
Best SQL Hacks I Wouldn’t Have Believed If I Hadn’t Used Them Myself (link)
Java 26 Is Here — Here’s What Actually Matters (link)
Anthropic Is Giving Away 13 Free Courses That Others Charge Thousands For (link)
I Tested 5 Python ORMs. One Replaced SQLAlchemy Completely. (link)
The YC CEO Ships 10,000 Lines a Day. Here’s His Exact Setup (link)
9 Things Every Python Script Should Have (link)
Zero ETL Is the Reality Check Every Data Engineer Needs in 2026 (link)
Agentic AI with DuckDB and smolagents (link)
Our JetBrains Devs Switched to VSCode. They Hate VSCode. (link)
Every Python Concept Explained (link)
Exploratory Data Analysis Checklist: What to Look for Every Time (link)
oLLM: The Revolutionary Python Library Running Powerful Language Models on Ordinary Computers (link)
The RAG Layer Nobody Talks About (link)
“OpenRAG” From Documents to Agentic Search in Minutes (from IBM research open source) (link)
End-to-End Data Engineering Project (by Free Tools) (link)
How to Build a High-Performance, Free ELT Pipeline Locally using DuckDB (link)
10 SQL Time Zone Mistakes Most Teams Discover Only After Numbers Are Escalated by Executives (link)
10 Critical Data Quality Queries Every Data Engineer Should Implement (Before Trust Is Lost) (link)
Google Finally Solved The Fragile Text-to-SQL Systems (link)
F3: The Future-Proof File Format That Finally Gets It Right (link)
Automate schema mappings with LLMs (link)
How to Scrape a Website to Markdown: 2026 Guide (link)
Django vs FastAPI: I Built the Same App Twice (link)
How to Build a Local ELT Pipeline with DuckDB and DBT (link)
NumPy + Pandas: The Only Guide You Need (link)
AI Code Assistants for Data Engineering: I Tested 6 Tools for SQL and Python (link)
Agentic EDA with AI Foundry: Automating Exploratory Analysis (link)
The 2026 Data Engineering Roadmap: Building Data Systems for the Agentic AI Era (link)
10 Business-Centric Data Metrics Analysts Ignore (Until They’re Replaced) (link)
10 dbt Macros That Saved Us 100+ Hours (With Copy-Paste Code) (link)
The Power of PyTorch and vLLM Together (link)
10 Data Models Every Data Engineer Must Know (link)
The First Nmap Scan That Makes You Realize How the Internet Actually Works (link)
How to Write SQL Queries That Use Window Frames to Transform Your Data Analysis (link)
Building a Simple SQL Parser in Python: From Basics to Hands-On (link)
Unlock RAG-Anything’s Power with Ollama on Your Machine (with Docling as Bonus) (link)
SQL Performance Mastery: 10 High-Impact Aggregation Strategies for Sub-Second Queries (link)
Why Semi-Joins in SQL Are More Powerful Than You Think (And How to Master Them) (link)
Handling Imbalanced Data: The Complete Guide Every Data Scientist Must Know (link)
Fivetran vs. Airbyte in 2026 | Complete ELT Guide (link)
AI Skills Are Exploding in 2026 (link)
Why Gradient Boosting Often Beats Deep Learning on Tabular Data (And How to Tune It) (link)
Why Exponentials and Logarithms Dominate Statistics and Information Theory: The Fundamental Properties (link)
The Journey to Causality: From Dashboards to Causal Inference (link)
How to Choose the Right Search Algorithm for your VectorDB? (link)
Top 8 Free Python Excel Libraries for Developers in 2025 (link)
Data Analyst vs. AI Agent: Who Wins the Job in 2026? (link)
How to Write SQL Queries That Detect When a LEFT JOIN Is Silently Dropping Rows (link)
The Semantic Layer Revolution: How dbt and Databricks Built the Universal Language of Business Data (link)
The Truth About Data Modeling: What You Learn Only After Real Projects (link)
I Wasted 6 Months Learning NotebookLM the Hard Way — So You Don’t Have To (link)
Phind vs Google: My Go-To Tool for Explaining Complex Code (link)
Python + MCP Is the New Automation Superpower (Here’s the Proof) (link)
Why COLLATION Rules in SQL Are More Powerful Than You Think (And How to Master Them) (link)
We Spent 2 Years Building a Data Mesh. It Was a $4M Disaster. (link)
Mastering Hyperparameter Tuning (link)
What is Microsoft MarkItDown and Why It Matters? (link)
Every Python Built In Function Explained (link)
How Web Search Inside AI Chatbots Works (link)
Testing Google’s Antigravity for Data Engineering: My End-to-End Experience (link)
Insert-Only Design in Modern Data Warehousing: Lessons from Data Vault 2.0, SCD2, and Databricks Performance (link)
10 Hard Window Join Problems in SQL and How to Solve Them (link)
The Serverless Illusion: Why Everyone Is Quietly Moving Back to VMs (link)
How I Built an AI That Talks to Your Database: A Journey into RAG (link)
What Makes dbt So Popular in Modern Data Teams? (link)
Uber Eats Data Warehouse Architecture: A Complete Guide to Trip Data Modeling and Analytics Storage (link)
UV in Python: The Fastest Package & Project Manager (Complete Guide + Example Project) (link)
HTMX Murdered React: Why Nobody Needs JavaScript Frameworks Anymore (link)
Why Anti-Joins in SQL Are More Powerful Than You Think (link)
Why Window Exclusion Clauses in SQL Are More Powerful Than You Think (link)
Python visualization tools to level up from Matplotlib (link)
How to Write Python (and Others) Code Your Future Self Will Thank You For! (link)
Understanding Palantir’s Ontology: Semantic, Kinetic, and Dynamic Layers Explained (link)
Meet TOON — The Fresh Data Format That Could Replace JSON in the AI Era (link)
Storing products, prices and orders in PostgreSQL (link)
The Convergence Nobody Saw Coming: When APIs, AI, and Cloud Storage Accidentally Built the Future (link)
Your Database Is Slow Because You’re Using UUIDs (link)
Solving Many-to-Many & Drill-Across with the Unified Star Schema (link)
Why Scalar Subqueries in SQL Are More Powerful Than You Think (link)
SQL Query Optimization: Modern Techniques and Best Practices (link)
Every Industry Faces the Same Data Problems. My Take on Palantir’s Solution at AIPCON 8 (link)
Designing the Open Metadata Modeling Platform (link)
Self-Describing SQL: Embedding Metadata as YAML Front-Matter in Generated Objects (link)
Lessons from Data Vault: Principles Without the Dogma (link)
Islands and Gaps with Recursive CTE (link)
Graph Analytics for All of Your Data - Oracle (link)
Getting Started with Oracle AI Database AI Vector Search (link)
How and Why Netflix Built a Real-Time Distributed Graph: Part 1 — Ingesting and Processing Data Streams at Internet Scale (link)
From Kimball to Metadata: How Dimensional Thinking Still Shapes Modern Data Architecture (link)
Dual SCD2: The Foundation for True History in Data Warehousing (link)
Time-Based SQL Questions: Gaps, Overlaps, and Intervals (link)
Resampling Imbalanced Datasets for Binary Classification (link)
Robust methods to generate synthetic table data (link)
A Marketer’s Guide to Calculus (link)
Cosine Distance vs Dot Product vs Euclidean in vector similarity search (link)
6 Data Modeling Mistakes That Kill Scalability (and How to Fix Them) (link)
Why Recursive CTEs Are More Powerful Than You Think (link)
Palantir’s Ontology, Kimball’s Star Schema, and Model-Driven Data Engineering: A Comparative View (link)
SQL Complexity Explained: What Your Queries Are Really Doing Behind the Scenes (link)
Synthetic Data: What It Is and How to Use It (link)
The Only 15 SQL Questions I Ask in Every Junior Data Scientist Interview (link)
From Raw Data to Reliable Systems: The Power of Data Modeling in Data Engineering (link)
From Chaos to Clarity: Advanced Data Models Every Data Engineer Must Master. (link)
How to Compare Two or More Distributions (link)
Is Your Training Data Representative? A Guide to Checking with PSI in Python (link)
What Fivetran’s acquisition of dbt Labs would mean for the Data Industry (link)
Agentic AI: Building Long-Term Memory (link)
I built an end-to-end interpretable Machine Learning research pipeline (link)
The Anatomy of a Modern LLM (link)
Waiting for Postgres 18: Accelerating Disk Reads with Asynchronous I/O (link)
Building 17 Agentic AI Patterns and Their Role in Large-Scale AI Systems (link)
MCMC & the art of Sampling without Sampling (link)
Building a Real-Time Profit & Loss Engine with RisingWave and Streaming SQL (link)
What Is The Best Diagramming Software in 2025 (link)
excalidraw (link)
Type Casting in Python (link)
SIMD: The real superpower behind super fast databases (link)
How to Start Learning Machine Learning: A Practical Guide (link)
Autoencoders for Defect Detection in Images (link)
What is an autoencoder? (link)
Integrating LLMs and AI Agents into Data Engineering Workflows (link)
Memory Management in Python (link)
Get Excited About Postgres 18 (link)
Optuna: The Hyperparameter Optimization Framework That Saved My Machine Learning Sanity (link)
7 Wonders of Data Science (link)
SQL Window Functions Explained Like a Story (link)
Why Generative AI Is Forcing Us to Rethink Data Modeling (link)
Stop Using Requests — Try This Modern HTTP Library Instead (link)
LangExtract (Google, Open Source): Turn Unstructured Text into Structured, Auditable Data (link)
Zero Degrees of Separation (link)
HTAP: Still the Dream, a Decade Later (link)
Demystifying Apache Spark (link)
mlflow (link)
Radically Simple Data Lineage (link)
DocumentDB (link)
AWS joins the DocumentDB (link)
A decade of database innovation: The Amazon Aurora story (link)
5 SQL Questions That Stump Even Senior Analysts (link)
SQLModel (link)
Effortless EDA with Sweetviz & YData-Profiling (link)
Introducing LangExtract: A Gemini powered information extraction library (link)
LangExtract (link)
Fun and weirdness with SSDs (link)
Vector Search Isn’t the Answer to Everything. (link)
What You Should Know About B-Trees on Disk (link)
PyCaret (link)
STOP Guessing Who Will Leave — How I Would Predict Customer Churn Before It Happens (link)
What is a t-test and When to Use It in Pandas? (link)
The Complete Beginner’s Guide to Python Modules (link)
Kimball Star Schema vs Palantir’s Ontology (link)
Database Connections in FastAPI: Best Practices for Efficient and Scalable APIs (link)
The Dark Side of @Transactional in Spring Boot—Exposed (link)
Customer-Facing Analytics Without Denormalizing Everything (link)
Building Agentic Adaptive RAG with LangGraph for Production (link)
How CERN Powers Ground-Breaking Physics with TimescaleDB (link)
RAG Without Embeddings? Here’s how OpenAI is doing this… (link)
Object Detection with Python and HuggingFace Transformers (link)
Implementing 12 AI Agent Evaluation Techniques Using LangSmith (link)
Why the Server Should Handle the Web Again (link)
From Chaos to Clarity: Building Modern Python Projects with UV (link)
Experimenting with SQL:2023 Property-Graph Queries in Postgres 18 (link)
A Database Schema for Engineering Project Management (link)
Featherweight - Lightning Fast Analytics with DuckDB and Postgres (link)
Why Your Data Lake Needs BLM, Not LLM (link)
SQL Isn’t a Query Language. It’s a Thinking Framework (link)
I spent $500 testing Replit/Lovable/Bolt/v0 & Cursor so you don’t have to (link)
Billions of Edges Per Second with Postgres (link)
pdot: Interactive Directed Graphs of Your Database (link)
Exploring Databases Visually (link)
pdot: Exploring Databases Visually, Part II (link)
pgai (link)
How to Build Near Real Time Data Pipelines with Incremental Loading (link)
5 Things You Didn’t Know About LocalStorage (link)
Building Real-Time Dashboards with FastAPI and HTMX (link)
HTMX Made Me Like the Web Again (link)
Build an AI Agent That Turns SQL Databases into Dashboards — No Queries Needed (link)
OpenAI: Scaling PostgreSQL to the Next Level (link)
DuckDB vs Databricks SQL Warehouse: Can We Save on Compute? (link)
Dynamic Data Source Routing in Spring Boot: Master Multi-Tenancy & Read-Write Separation (link)
SQLMesh Incremental Modeling with DuckDB: A Hands-On Tutorial (link)
Postgres Language Server: Initial Release (link)
Building a modern Data Warehouse from scratch (link)
A case where SQL joins struggle but MongoDB documents shine (link)
Beyond Materialized Views: Using DuckDB for In-Process Columnar Caching (link)
Life Altering Postgresql Patterns (link)
The Most Comprehensive Explanation of Session, Cookie, Token, and JWT (link)
PgBouncer: Don’t Let Connection Chaos Ruin Your Day (link)
Can Artificial Intelligence Create Better Tables Than You? (link)
Benchmarking PostgreSQL Batch Ingest (link)
This Happens Inside Python…When We Call a Function (link)
SQLAlchemy 2.0: The Most Powerful ORM for Python Yet (link)
Stop Writing Manual Validators! Use Pydantic for Data Validation (link)
Is Kimball Still Relevant in the Modern Data Warehouse Era? (link)
STOP Using Python Dictionaries Like This! (link)
Multi-Tenant Architecture using SpringBoot and PostgreSQL (link)
PyGWalker (link)
Data Warehouse Basics: How to Handle Changing Data with SCDs (link)
Postgres is all you need for vectors (link)
Building a Perfect Million Parameter LLM Like ChatGPT in Python (link)
I “vibe-coded” over 160,000 lines of code. It IS real. (link)
Hard-Earned Lessons from a Year of Building AI Agents (link)
How To Train Your PyTorch Models (Much) Faster (link)
Prompt Decorators: A Simple Way to Improve AI Responses (link)
Postgres query plan visualization tools (link)
Vector Search at 10,000 QPS in PostgreSQL with VectorChord (link)
Optimizing PostgreSQL Performance: Essential Queries for Monitoring and Maintenance (link)
One Line of SQL, All the LiteLLM Embeddings (link)
How to Map Column Values in a Pandas DataFrame? (link)
Top 6 Core App Dashboard Building Tools (link)
How I Learned to Love __init__.py: A Simple Guide (link)
Postgres as a Graph Database: (Ab)using pgRouting (link)
EdgeDB is now Gel and Postgres is the Future (link)
EdgeDB 1.0 (link)
Use PASSING with JSON_TABLE() To Make Calculations (link)
Delta Lake 4.0: Next-Level Big Data Management (link)
From Traditional BI to GenBI: Embracing a Smarter, More Human Approach (link)
ClickBench — a Benchmark For Analytical DBMS (link)
Modern CI-CD Pipelines of REST API Python Project with UV (link)
Real-Time Chat Application with FastAPI and WebSockets (link)
10 Advanced Python Concepts You Should Know To Be a Senior Developer (link)
MkDocs (link)
Representing graphs in Postgresql (link)
Redis with FastAPI for Lightning-Fast Applications (link)
Building the Modern PostgreSQL GUI With PopSQL (link)
The twelve-factor app (link)
20 Advanced Statistical Approaches Every Data Scientist Should Know (link)
How Uber Handles TRILLIONS of Transactions — The Secret (link)
Advanced SQL for Data Professionals (link)
Handling Slowly Changing Dimensions (SCD) in Modern Data Pipelines: A Complete Guide with SQL Examples (link)
RunSQL (link)
ChartDB (link)
Rethinking the frontend with HTMX (link)
documentdb (link)
AlloyDB vs PostgreSQL: Unleash Performance, Slash Costs, Simplify Data Stack (link)
Don’t Fear Async: A Friendly Guide to Python’s Most Powerful Tool (link)
How We Built a Content Recommendation System With Pgai and Pgvectorscale (link)
A Visual Exploration of Semantic Text Chunking (link)
Jupyter Agent: Revolutionizing Data Analysis with LLMs (link)
Combining FastAPI, PostgreSQL, and Leaflet — GIS Tutorial (link)
Python Memory Management: Best Practices for Performance (link)
From query to plot: Exploring GeoParquet Overture Maps with Ibis, DuckDB, and Lonboard (link)
This is How I Use Swagger to Design REST APIs Before Starting the Development (link)