Algorithms | Towards Data Science https://towardsdatascience.com/tag/algorithms/ Publish AI, ML & data-science insights to a global community of data professionals. Mon, 15 Dec 2025 20:49:55 +0000

The Machine Learning “Advent Calendar” Day 15: SVM in Excel https://towardsdatascience.com/the-machine-learning-advent-calendar-day-15-svm-in-excel/ Mon, 15 Dec 2025 19:41:01 +0000 https://towardsdatascience.com/?p=607912 Instead of starting with margins and geometry, this article builds the Support Vector Machine step by step from familiar models. By changing the loss function and reusing regularization, SVM appears naturally as a linear classifier trained by optimization. This perspective unifies logistic regression, SVM, and other linear models into a single, coherent framework.
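
The "change the loss, keep the regularization" idea can be sketched in a few lines. This is a minimal illustration, not the article's Excel workbook: the function names, the `lam` penalty weight, and the toy data are all assumptions, and labels are assumed to be coded as -1 and +1.

```python
import math

def linear_score(w, b, x):
    # Shared by both models: a plain linear score w.x + b.
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def logistic_loss(margin):
    # Loss used by logistic regression, written in terms of margin = y * score.
    return math.log1p(math.exp(-margin))

def hinge_loss(margin):
    # Loss used by a linear SVM: zero once the margin reaches 1.
    return max(0.0, 1.0 - margin)

def objective(w, b, X, y, loss, lam=0.1):
    # Average loss over the data plus an L2 penalty on the weights.
    # Only `loss` differs between the two models; `lam` is a hypothetical default.
    data_term = sum(loss(yi * linear_score(w, b, xi)) for xi, yi in zip(X, y)) / len(X)
    penalty = lam * sum(wi * wi for wi in w)
    return data_term + penalty

# Toy one-feature data with labels in {-1, +1} (made up for illustration).
X = [[2.0], [-2.0]]
y = [1, -1]
svm_objective = objective([1.0], 0.0, X, y, hinge_loss)
logreg_objective = objective([1.0], 0.0, X, y, logistic_loss)
```

With these weights both points have margin 2, so the hinge loss contributes nothing and only the penalty remains, while the logistic loss is still strictly positive.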

The post The Machine Learning “Advent Calendar” Day 15: SVM in Excel appeared first on Towards Data Science.

The Machine Learning “Advent Calendar” Day 10: DBSCAN in Excel https://towardsdatascience.com/the-machine-learning-advent-calendar-day-10-dbscan-in-excel/ Wed, 10 Dec 2025 16:30:00 +0000 https://towardsdatascience.com/?p=607882 DBSCAN shows how far we can go with a very simple idea: count how many neighbors live close to each point.
It finds clusters and marks anomalies without any probabilistic model, and it works beautifully in Excel.
But because DBSCAN relies on one fixed radius, HDBSCAN is needed to make the method robust on real data.
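
The neighbor-counting idea can be sketched in plain Python. This is a minimal version of the standard DBSCAN procedure, not the article's spreadsheet: the names `region_query` and `dbscan`, the parameter values, and the toy points are assumptions, and a point is counted as its own neighbor.

```python
import math

def region_query(points, i, eps):
    # Indices of all points within distance eps of point i (including i itself).
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        cluster += 1  # point i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also core: keep expanding
    return labels

# Two tight groups and one isolated point (made-up data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
# labels == [0, 0, 0, 1, 1, 1, -1]; -1 marks the anomaly
```

The single `eps` value applies to every point, which is the fixed-radius limitation the summary mentions.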

The Machine Learning “Advent Calendar” Day 7: Decision Tree Classifier https://towardsdatascience.com/the-machine-learning-advent-calendar-day-7-decision-tree-classifier/ Sun, 07 Dec 2025 14:30:00 +0000 https://towardsdatascience.com/?p=607847 On Day 6, we saw how a Decision Tree Regressor finds its optimal split by minimizing the Mean Squared Error.
Today, for Day 7 of the Machine Learning “Advent Calendar”, we switch to classification. With just one numerical feature and two classes, we explore how a Decision Tree Classifier decides where to cut the data, using impurity measures like Gini and Entropy.
Even without doing the math, we can visually guess possible split points. But which one is best? And do impurity measures really make a difference? Let us build the first split step by step in Excel and see what happens.
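
As a rough sketch of that first split, the snippet below scores every candidate threshold on one feature with either impurity measure. The helper names and the toy data are assumptions for illustration, and candidate thresholds are taken as midpoints between consecutive distinct values, which is one common convention.

```python
import math
from collections import Counter

def gini(labels):
    # Gini impurity: 1 - sum of squared class proportions.
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    # Shannon entropy in bits: sum of p * log2(1 / p) over classes.
    n = len(labels)
    return sum((c / n) * math.log2(n / c) for c in Counter(labels).values())

def best_split(x, y, impurity):
    # Try the midpoint between each pair of consecutive distinct x values and
    # keep the threshold with the lowest size-weighted impurity.
    pairs = sorted(zip(x, y))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    n = len(xs)
    best = None
    for i in range(1, n):
        if xs[i] == xs[i - 1]:
            continue
        threshold = (xs[i - 1] + xs[i]) / 2
        left, right = ys[:i], ys[i:]
        score = (len(left) * impurity(left) + len(right) * impurity(right)) / n
        if best is None or score < best[1]:
            best = (threshold, score)
    return best

# One numerical feature, two classes (made-up data).
x = [1, 2, 3, 10, 11, 12]
y = [0, 0, 0, 1, 1, 1]
gini_split = best_split(x, y, gini)        # (6.5, 0.0)
entropy_split = best_split(x, y, entropy)  # (6.5, 0.0)
```

On this cleanly separated example both measures pick the same threshold; whether they ever disagree is the question the article explores.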

The Machine Learning “Advent Calendar” Day 6: Decision Tree Regressor https://towardsdatascience.com/the-machine-learning-advent-calendar-day-6-decision-tree-regressor/ Sat, 06 Dec 2025 14:30:00 +0000 https://towardsdatascience.com/?p=607840 During the first days of this Machine Learning Advent Calendar, we explored models based on distances. Today, we switch to a completely different way of learning: Decision Trees.
With a simple one-feature dataset, we can see how a tree chooses its first split. The idea is always the same: if humans can guess the split visually, then we can rebuild the logic step by step in Excel.
By listing all possible split values and computing the MSE for each one, we identify the split that reduces the error the most. This gives us a clear intuition of how a Decision Tree grows, how it makes predictions, and why the first split is such a crucial step.
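
The exhaustive search described above can be sketched as follows. This is an illustrative version under assumed names (`mse`, `best_mse_split`) and made-up data, with candidate splits taken as midpoints between consecutive distinct feature values; each side of the split predicts the mean of its targets.

```python
def mse(values):
    # Mean squared error of predicting the mean of `values` for every element.
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def best_mse_split(x, y):
    # List every candidate split on one feature and keep the one with the
    # lowest size-weighted MSE across the two resulting groups.
    pairs = sorted(zip(x, y))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    n = len(xs)
    best = None
    for i in range(1, n):
        if xs[i] == xs[i - 1]:
            continue
        left, right = ys[:i], ys[i:]
        score = (len(left) * mse(left) + len(right) * mse(right)) / n
        if best is None or score < best["mse"]:
            best = {
                "threshold": (xs[i - 1] + xs[i]) / 2,
                "mse": score,
                "left_prediction": sum(left) / len(left),
                "right_prediction": sum(right) / len(right),
            }
    return best

# One feature, a visible jump in the target between x = 2 and x = 3.
split = best_mse_split([1, 2, 3, 4], [1.0, 1.0, 5.0, 5.0])
# split["threshold"] == 2.5, with predictions 1.0 (left) and 5.0 (right)
```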

The Machine Learning “Advent Calendar” Day 4: k-Means in Excel https://towardsdatascience.com/the-machine-learning-advent-calendar-day-4-k-means-in-excel/ Thu, 04 Dec 2025 16:30:00 +0000 https://towardsdatascience.com/?p=607826 How to implement a training algorithm that finally looks like “real” machine learning

The Machine Learning “Advent Calendar” Day 3: GNB, LDA and QDA in Excel https://towardsdatascience.com/the-machine-learning-advent-calendar-day-3-gnb-lda-and-qda-in-excel/ Wed, 03 Dec 2025 16:30:00 +0000 https://towardsdatascience.com/?p=607802 From local distance to global probability

The Machine Learning “Advent Calendar” Day 1: k-NN Regressor in Excel https://towardsdatascience.com/day-1-k-nn-regressor-in-excel-how-distance-drives-prediction/ Mon, 01 Dec 2025 19:52:19 +0000 https://towardsdatascience.com/?p=607778 This first day of the Advent Calendar introduces the k-NN regressor, the simplest distance-based model. Using Excel, we explore how predictions rely entirely on the closest observations, why feature scaling matters, and how heterogeneous variables can make distances meaningless. Through examples with continuous and categorical features, including the California Housing and Diamonds datasets, we see the strengths and limitations of k-NN, and why defining the right distance is essential to reflect real-world structure.
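
To see why feature scaling matters for a distance-based model, here is a small sketch. It does not use the California Housing or Diamonds data: the two-feature toy dataset, the function names, and the choice of standardization (mean 0, standard deviation 1) are assumptions made for illustration.

```python
import math

def knn_regress(X_train, y_train, x_new, k):
    # Predict the average target of the k training points closest to x_new.
    by_distance = sorted(range(len(X_train)), key=lambda i: math.dist(X_train[i], x_new))
    nearest = by_distance[:k]
    return sum(y_train[i] for i in nearest) / len(nearest)

def fit_scaler(X):
    # Per-feature mean and (population) standard deviation.
    cols = list(zip(*X))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds

def scale(row, means, stds):
    return [(v - m) / s for v, m, s in zip(row, means, stds)]

# Feature 1 lives on a small scale, feature 2 on a large one (made-up data).
X_train = [[1.0, 1000.0], [5.0, 1100.0]]
y_train = [10.0, 20.0]
query = [1.2, 1090.0]

raw_prediction = knn_regress(X_train, y_train, query, k=1)

means, stds = fit_scaler(X_train)
X_scaled = [scale(row, means, stds) for row in X_train]
scaled_prediction = knn_regress(X_scaled, y_train, scale(query, means, stds), k=1)
# raw_prediction == 20.0, scaled_prediction == 10.0
```

Without scaling, the large-valued feature dominates the distance and the second point is the nearest neighbor; after standardization the first point is closer, so the prediction changes.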

The Machine Learning and Deep Learning “Advent Calendar” Series: The Blueprint https://towardsdatascience.com/machine-learning-and-deep-learning-in-excel-advent-calendar-announcement/ Sun, 30 Nov 2025 15:00:00 +0000 https://towardsdatascience.com/?p=607760 Opening the black box of ML models, step by step, directly in Excel

The Greedy Boruta Algorithm: Faster Feature Selection Without Sacrificing Recall https://towardsdatascience.com/the-greedy-boruta-algorithm-faster-feature-selection-without-sacrificing-recall/ Sun, 30 Nov 2025 13:00:00 +0000 https://towardsdatascience.com/?p=607762 A modification to the Boruta algorithm that dramatically reduces computation while maintaining high sensitivity

Implementing the Fourier Transform Numerically in Python: A Step-by-Step Guide https://towardsdatascience.com/implementing-the-fourier-transform-numerically-in-python-a-step-by-step-guide/ Tue, 21 Oct 2025 12:30:00 +0000 https://towardsdatascience.com/?p=607445 What if the FFT functions in NumPy and SciPy don’t actually compute the Fourier transform you think they do?

Machine Learning Meets Panel Data: What Practitioners Need to Know https://towardsdatascience.com/machine-learning-meets-panel-data-what-practitioners-need-to-know/ Fri, 17 Oct 2025 17:15:40 +0000 https://towardsdatascience.com/?p=607417 How to avoid overestimating machine learning models’ performance, usefulness, and real-world applicability due to hidden data leakage

From Genes to Neural Networks: Understanding and Building NEAT (Neuro-Evolution of Augmenting Topologies) from Scratch https://towardsdatascience.com/from-genes-to-neural-networks-understanding-and-building-neat-neuro-evolution-of-augmenting-topologies-from-scratch/ Mon, 11 Aug 2025 18:03:44 +0000 https://towardsdatascience.com/?p=606830 Practical Neuroevolution: Reproducing NEAT’s Innovations and Code Walkthrough

The Five-Second Fingerprint: Inside Shazam’s Instant Song ID https://towardsdatascience.com/the-five-second-fingerprint-inside-shazams-instant-song-id/ Tue, 08 Jul 2025 01:35:00 +0000 https://towardsdatascience.com/?p=606516 How Shazam recognizes songs in seconds

A Gentle Introduction to Backtracking https://towardsdatascience.com/a-gentle-introduction-to-backtracking/ Mon, 30 Jun 2025 18:51:47 +0000 https://towardsdatascience.com/?p=606463 Conceptual overview and hands-on examples

Adding Training Noise To Improve Detections In Transformers https://towardsdatascience.com/adding-training-noise-to-improve-detections-in-transformers/ Mon, 28 Apr 2025 17:56:52 +0000 https://towardsdatascience.com/?p=605817 Denoising, explained

Algorithm Protection in the Context of Federated Learning https://towardsdatascience.com/algorithm-protection-in-the-context-of-federated-learning/ Fri, 21 Mar 2025 04:32:38 +0000 https://towardsdatascience.com/?p=605195 A pragmatic look into protecting algorithms and models deployed into real-world federated analysis and learning settings in healthcare.

Recursive Walks down User Referral Trees https://towardsdatascience.com/recursive-walks-down-user-referral-trees-0864e14042ec/ Wed, 15 Jan 2025 15:02:13 +0000 https://towardsdatascience.com/recursive-walks-down-user-referral-trees-0864e14042ec/ Measuring the total influence of users in a user referral program by traversing indirect referrals

Understanding the Optimization Process Pipeline in Linear Programming https://towardsdatascience.com/understanding-the-optimization-process-pipeline-in-linear-programming-15569d92ba94/ Fri, 27 Dec 2024 11:01:35 +0000 https://towardsdatascience.com/understanding-the-optimization-process-pipeline-in-linear-programming-15569d92ba94/ An introduction to the backend and frontend processes in linear programming, including Mathematical Programming System (MPS) files

Core AI For Any Rummy Variant https://towardsdatascience.com/core-ai-for-any-rummy-variant-4ff414da1703/ Sat, 09 Nov 2024 14:38:00 +0000 https://towardsdatascience.com/core-ai-for-any-rummy-variant-4ff414da1703/ Step-by-step guide to a Rummy AI

Cyclic Partition: An Up to 1.5x Faster Partitioning Algorithm https://towardsdatascience.com/cyclic-partition-an-up-to-1-5x-faster-partitioning-algorithm-e38bf7948a5f/ Thu, 10 Oct 2024 17:28:03 +0000 https://towardsdatascience.com/cyclic-partition-an-up-to-1-5x-faster-partitioning-algorithm-e38bf7948a5f/ A sequence partitioning algorithm that does minimal rearrangements of values
