- Detecting Peaks and Valleys: Learn The Essentials for Accurate Analysis.
(link)
- Linear Regression Explained: From Theory to Real-World Implementation
(link)
- Linear Regression for Humans: Predicting the Future in Plain English
(link)
- How Google and Stanford made AI more Interpretable with a 20 year old Technique
(link)
- Visual Intro to Machine Learning
(link)
- Linear Algebra Concepts Every Data Scientist Should Know
(link)
- Table Transformer (TATR)
(link)
- Correlation vs. Regression: A Key Difference That Many Analysts Miss
(link)
- A New Coefficient of Correlation
(link)
- Frustration: One Year With R
(link)
- Precision & Recall
(link)
- An overview of time-aware cross-validation techniques
(link)
- Unsupervised Learning: What, Why, and Where?
(link)
- Does Isolation Forest really perform well in its task?
(link)
- EDA (Exploratory Data Analysis) On Haberman’s Cancer Survival Dataset
(link)
- Data Pre-processing in Python using Scikit-learn - Heart Disease Kaggle
(link)
- Matthews correlation coefficient - Tweet Raschka
(link)
- Supercharge Your Machine Learning Experiments with PyCaret and Gradio
(link)
- Feature Selection — Exhaustive Overview
(link)
- Introduction to Parallel Processing in Machine Learning using Dask
(link)
- Scikit-Learn: A silver bullet for basic machine learning
(link)
- Clustering using PyCaret!!!
(link)
- Data scientist’s guide to efficient coding in Python
(link)
- Applied Machine Learning: Part 1
(link)
- How to avoid machine learning pitfalls: a guide for academic researchers
(link)
- A data science project - Analysis of Berlin rental prices
(link)
- The Normal Distribution Simplified
(link)
- Visualizing Statistics with Python — Telling Stories with Matplot
(link)
- Analyzing the eigenvalues of a covariance matrix to identify multicollinearity
(link)
- Gradient Descent for Machine Learning
(link)
- PM4PY - Process Mining in Python - Fraunhofer
(link)
- 26 Datasets For Your Data Science Projects
(link)
- Practical Machine Learning Tutorial: Part.1 (Exploratory Data Analysis)
(link)
- Data-Driven Artificial Intelligence (AI) for Churn Reduction
(link)
- Feature Transformation for Machine Learning, a Beginners Guide
(link)
- A Reference Notebook for 30+ Statistical Charts in Seaborn
(link)
- Multicollinearity — How does it create a problem?
(link)
- MAE, MSE, RMSE, Coefficient of Determination, Adjusted R Squared — Which Metric is Better?
(link)
- Essential Math for Data Science: Information Theory
(link)
- Bulldozer Prices Prediction
(link)
- 3 must-have projects for your data science portfolio
(link)
- Understand Bayes’ Theorem Through Visualization
(link)
- A Complete Exploratory Data Analysis with Python
(link)
- What’s in the Black Box?
(link)
- How to peek inside a black box model — Understand Partial Dependence Plot
(link)
- Pitfalls To Avoid while Interpreting Machine Learning-PDP/ICE case
(link)
- Understanding Probability Distribution
(link)
- Building 10 Regression Models in Machine Learning with Python
(link)
- First neural network for beginners explained (with code)
(link)
- Data Pre-Processing in Machine Learning with Python and Jupyter
(link)
- Building +10 Classifier Models in Machine Learning
(link)
- A field guide to the most popular parameters
(link)
- Customer Segmentation Analysis with Python
(link)
- Data Preparation and Data Binning
(link)
- Pipelines: Automated machine learning with HyperParameter Tuning!
(link)
- Correlation in Statistics
(link)
- Normal distribution
(link)
- Hierarchical Clustering: It’s just the order of clusters!
(link)
- Understanding AUC - ROC Curve
(link)
- Ridge Regression for Better Usage
(link)
- Data Pre-Processing in Machine Learning with Python+Notebook
(link)
- Entropy is a measure of uncertainty
(link)
- Support Vector Machine
(link)
- Multi-Dimensional Data (PCA) — boon or bane?
(link)
- Intuitions on L1 and L2 Regularisation
(link)
- Top Five Methods to Identify Outliers in Data
(link)
- Bengaluru House Price Prediction
(link)
- Bayes’ Rule Applied
(link)
- Starbucks offers: Advanced customer segmentation with Python
(link)
- How to Not Misunderstand Correlation
(link)
- Logistic Regression — Detailed Overview
(link)
- Introduction to Markov chains
(link)
- Scaling vs. Normalizing Data
(link)
- Chi-Square Test for Feature Selection in Machine learning
(link)
- Handling imbalanced datasets in machine learning
(link)
- Better Heatmaps and Correlation Matrix Plots in Python
(link)
- Logistic Regression Model Tuning with scikit-learn — Part 1
(link)
- Building a Logistic Regression in Python
(link)
- Introduction to Bayesian Linear Regression
(link)
- Understanding Boxplots
(link)
- Patterns, Predictions, and Actions - Book
(link)
- Gradient Descent in Python
(link)
- 17 types of similarity and dissimilarity measures used in data science
(link)
- Linear Regression using Gradient Descent
(link)
- Histograms and Density Plots in Python
(link)
- The Mathematics Behind Principal Component Analysis
(link)
- Probability concepts explained: Maximum likelihood estimation
(link)
- Fundamental Techniques of Feature Engineering for Machine Learning
(link)
- PCA using Python (scikit-learn)
(link)
- Machine Learning Basics with the K-Nearest Neighbors Algorithm
(link)
- Feature Selection with sklearn and Pandas
(link)
- How to Estimate the Bias and Variance with Python
(link)
- Comet - Supercharge Machine Learning
(link)
- Numerical Optimization: Understanding L-BFGS
(link)
- MLPerf
(link)
- Kaggle - Use Data from different Kernels
(link)
- Regular Expressions for Data Scientists
(link)
- Python Machine Learning (2nd Ed.) Code Repository
(link)
- Learning Math for Machine Learning
(link)
- Is R-squared Useless?
(link)
- Google Machine Learning Guides
(link)
- Machine Learning cheatsheets
(link)
- A Comprehensive Guide to Gradient Descent
(link)
- What’s the trade-off between Bias and Variance?
(link)
- Top 5 Machine Learning Algorithms Explained
(link)
- Encoding Categorical Variables in Machine Learning Dataset
(link)
- 17 Clustering Algorithms Used In Data Science and Mining
(link)
- Mathematics Resources For ML
(link)
- LDA vs. PCA
(link)
- How to do matrix derivatives
(link)
- The Clustering Algorithm with Geolocation data
(link)
- The Poisson Distribution
(link)
- 9 Deadly Sins of Dataset Selection in ML
(link)
- Fraud detection — Unsupervised Anomaly Detection
(link)
- There is no classification — here’s why
(link)
- What Is Your Model Hiding? A Tutorial on Evaluating ML Models
(link)
- A Feature Selection Tool for Machine Learning in Python
(link)
- Transforming Scores Into Probability
(link)
- Probability vs Likelihood
(link)
- How to Remove Outliers for Machine Learning?
(link)
- Predicting House Prices in Ames, IA
(link)
- Using Random Forests to predict Housing Prices
(link)
- House Price Prediction using FastAI
(link)
- Customer Segmentation Using K Means Clustering
(link)
- Clustergram: visualisation of cluster analysis
(link)
- Bayes’ Theorem Unbound
(link)
CRF
- Performing Sequence Labelling using CRF in Python
(link)
- sklearn-crfsuite
(link)
- CRFsuite - Documentation
(link)
- Overview of Conditional Random Fields
(link)
- Conditional Random Fields for Sequence Prediction
(link)
- Getting started with Conditional Random Fields
(link)
- Introduction to Conditional Random Fields
(link)
Curse of Dimensionality
- What Is the Curse of Dimensionality?
(link)
- Curse of Dimensionality — A “Curse” to Machine Learning
(link)
- Curse of Dimensionality - Notebook
(link)
- What is the Curse of Dimensionality? Simplest Explanation!
(link)
- Curse of Dimensionality
(link)
- Curse of Dimensionality - notebook
(link)
- The Curse of Dimensionality – Illustrated With Matplotlib
(link)
- The Curse of Dimensionality (part 1)
(link)
- Top 40 Curse of Dimensionality Interview Questions
(link)
Embeddings
- Vector Embeddings Explained for Developers!
(link)
- Explained: Tokens and Embeddings in LLMs
(link)
- Vector Embeddings 101: The New Building Blocks for Generative AI
(link)
- Meet AI’s multitool: Vector embeddings
(link)
- New and improved embedding model
(link)
- openai - embeddings
(link)
- Jurafsky book, Chapter 6
(link)
- Jurafsky book, Chapter 6 - Slides
(link)
- A Guide on Word Embeddings in NLP
(link)
- The Beginner’s Guide to Text Embeddings
(link)
explained.ai
- Twenty-five years of information extraction
(link)
Metrics
- Similarity Metrics in Vector Databases
(link)
- Distance Metrics in Vector Search
(link)
- 9 Distance Measures in Data Science
(link)
- Euclidean vs. Cosine Distance
(link)
- Cosine Similarity Vs Euclidean Distance
(link)
- When to use Cosine Similarity over Euclidean Similarity?
(link)
- Understanding Distance Metrics in Vector Embeddings: Cosine Similarity, Euclidean Distance, and Dot Product
(link)
- Understanding Vector Similarity for Machine Learning
(link)
- How the dot product measures similarity
(link)
- Similarity Measures: Check Your Understanding
(link)
Projects
- Mapping Healthcare Access in Chicago
(link)
- Lessons from the Titanic Kaggle Dataset (Part 1): Aggressive Data Cleaning Doesn’t Always Improve Model Accuracy
(link)
- Lessons from the Titanic Kaggle Dataset (Part 2): Which Features Matter Most in Predicting Survival?
(link)
- Diabetes Prediction Using Machine Learning Classification Approaches: A Capstone Project by Team Nabhan
(link)