Towards Data Science https://towardsdatascience.com/ Publish AI, ML & data-science insights to a global community of data professionals. Mon, 15 Dec 2025 20:49:55 +0000

The Machine Learning “Advent Calendar” Day 15: SVM in Excel https://towardsdatascience.com/the-machine-learning-advent-calendar-day-15-svm-in-excel/ Mon, 15 Dec 2025 19:41:01 +0000 https://towardsdatascience.com/?p=607912 Instead of starting with margins and geometry, this article builds the Support Vector Machine step by step from familiar models. By changing the loss function and reusing regularization, SVM appears naturally as a linear classifier trained by optimization. This perspective unifies logistic regression, SVM, and other linear models into a single, coherent framework.

The post The Machine Learning “Advent Calendar” Day 15: SVM in Excel appeared first on Towards Data Science.

6 Technical Skills That Make You a Senior Data Scientist https://towardsdatascience.com/6-technical-skills-that-make-you-a-senior-data-scientist/ Mon, 15 Dec 2025 15:43:00 +0000 https://towardsdatascience.com/?p=607905 Beyond writing code, these are the design-level decisions, trade-offs, and habits that quietly separate senior data scientists from everyone else.

Geospatial exploratory data analysis with GeoPandas and DuckDB https://towardsdatascience.com/geospatial-exploratory-data-analysis-with-geopandas-and-duckdb/ Mon, 15 Dec 2025 13:17:00 +0000 https://towardsdatascience.com/?p=607897 In this article, I’ll show you how to use two popular Python libraries to carry out some geospatial analysis of traffic accident data within the UK. I was a relatively early adopter of DuckDB, the fast OLAP database, after it became available, but only recently realised that, through an extension, it offered a large number […]

Lessons Learned from Upgrading to LangChain 1.0 in Production https://towardsdatascience.com/lessons-learnt-from-upgrading-to-langchain-1-0-in-production/ Mon, 15 Dec 2025 10:30:00 +0000 https://towardsdatascience.com/?p=607893 What worked, what broke, and why I did it

The Machine Learning “Advent Calendar” Day 14: Softmax Regression in Excel https://towardsdatascience.com/the-machine-learning-advent-calendar-day-14-softmax-regression-in-excel/ Sun, 14 Dec 2025 18:12:00 +0000 https://towardsdatascience.com/?p=607910 Softmax Regression is simply Logistic Regression extended to multiple classes.

By computing one linear score per class and normalizing them with Softmax, we obtain multiclass probabilities without changing the core logic.

The loss, the gradients, and the optimization remain the same.
Only the number of parallel scores increases.

Implemented in Excel, the model becomes transparent: you can see the scores, the probabilities, and how the coefficients evolve over time.
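
The summary above (one linear score per class, then Softmax normalization) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the coefficients, biases, and input below are made-up values, not numbers from the article or its spreadsheet.

```python
import math

def class_scores(x, weights, biases):
    """One linear score per class: w_k . x + b_k."""
    return [sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
            for w, b in zip(weights, biases)]

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical coefficients for 3 classes and 2 features (assumption)
weights = [[1.0, -0.5], [0.2, 0.3], [-1.0, 0.8]]
biases = [0.0, 0.1, -0.2]
x = [2.0, 1.0]

scores = class_scores(x, weights, biases)
probs = softmax(scores)
print(scores)
print(probs)
```

Adding a class only adds one more row of coefficients and one more score; the normalization step is unchanged.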

The Skills That Bridge Technical Work and Business Impact https://towardsdatascience.com/the-skills-that-bridge-technical-work-and-business-impact/ Sun, 14 Dec 2025 14:30:29 +0000 https://towardsdatascience.com/?p=607866 In the Author Spotlight series, TDS Editors chat with members of our community about their career path in data science and AI, their writing, and their sources of inspiration. Today, we’re thrilled to share our conversation with Maria Mouschoutzi. Maria is a Data Analyst and Project Manager with a strong background in Operations Research, Mechanical […]

Stop Writing Spaghetti if-else Chains: Parsing JSON with Python’s match-case https://towardsdatascience.com/stop-writing-spaghetti-if-else-chains-parsing-json-with-pythons-match-case/ Sun, 14 Dec 2025 10:24:00 +0000 https://towardsdatascience.com/?p=607903 If you work in data science, data engineering, or as a frontend/backend developer, you deal with JSON. For professionals, it’s basically only death, taxes, and JSON parsing that are inevitable. The issue is that parsing JSON is often a serious pain. Whether you are pulling data from a REST API, parsing logs, or reading […]

The Machine Learning “Advent Calendar” Day 13: LASSO and Ridge Regression in Excel https://towardsdatascience.com/the-machine-learning-advent-calendar-day-13-lasso-and-ridge-regression-in-excel/ Sat, 13 Dec 2025 16:56:00 +0000 https://towardsdatascience.com/?p=607908 Ridge and Lasso regression are often perceived as more complex versions of linear regression. In reality, the prediction model remains exactly the same. What changes is the training objective. By adding a penalty on the coefficients, regularization forces the model to choose more stable solutions, especially when features are correlated. Implementing Ridge and Lasso step by step in Excel makes this idea explicit: regularization does not add complexity, it adds preference.
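
To make "same model, different training objective" concrete, here is a minimal sketch for the simplest possible case: one feature and no intercept, where both penalized problems have a closed-form solution. The data and the penalty strength `lam` are invented for illustration, and the single-feature setup is an assumption of this sketch; it does not show the correlated-features behavior mentioned above.

```python
def fit_ols(xs, ys):
    """Minimize sum((y - w*x)^2): plain least squares."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / sxx

def fit_ridge(xs, ys, lam):
    """Minimize sum((y - w*x)^2) + lam * w^2: the penalty enlarges the denominator."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

def fit_lasso(xs, ys, lam):
    """Minimize sum((y - w*x)^2) + lam * |w|: soft-thresholding of sum(x*y)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    shrunk = max(abs(sxy) - lam / 2.0, 0.0)
    return (shrunk if sxy >= 0 else -shrunk) / sxx

# Made-up data, roughly y = 2x (assumption)
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

w_ols = fit_ols(xs, ys)
w_ridge = fit_ridge(xs, ys, lam=10.0)
w_lasso = fit_lasso(xs, ys, lam=10.0)
w_lasso_big = fit_lasso(xs, ys, lam=200.0)
print(w_ols, w_ridge, w_lasso, w_lasso_big)
```

In all three cases the prediction is still `w * x`. Ridge shrinks the coefficient smoothly, while Lasso can set it exactly to zero once the penalty is large enough.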

How to Increase Coding Iteration Speed https://towardsdatascience.com/how-to-increase-coding-iteration-speed/ Sat, 13 Dec 2025 13:30:00 +0000 https://towardsdatascience.com/?p=607895 Learn how to become a more efficient programmer with local testing

NeurIPS 2025 Best Paper Review: Qwen’s Systematic Exploration of Attention Gating https://towardsdatascience.com/neurips-2025-best-paper-review-qwens-systematic-exploration-of-attention-gating/ Sat, 13 Dec 2025 10:16:00 +0000 https://towardsdatascience.com/?p=607899 This one little trick can bring about enhanced training stability, the use of larger learning rates and improved scaling properties

The Machine Learning “Advent Calendar” Day 12: Logistic Regression in Excel https://towardsdatascience.com/the-machine-learning-advent-calendar-day-12-logistic-regression-in-excel/ Fri, 12 Dec 2025 17:15:00 +0000 https://towardsdatascience.com/?p=607901 In this article, we rebuild Logistic Regression step by step directly in Excel.
Starting from a binary dataset, we explore why linear regression struggles as a classifier, how the logistic function fixes these issues, and how log-loss naturally appears from the likelihood.
With a transparent gradient-descent table, you can watch the model learn at each iteration—making the whole process intuitive, visual, and surprisingly satisfying.
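
The gradient-descent table described above can be approximated outside Excel with a short script. This is a sketch under assumptions: the toy dataset, the learning rate, and the number of steps are invented here, and each tuple plays the role of one spreadsheet row (iteration, slope, intercept, log-loss).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(xs, ys, a, b):
    """Mean negative log-likelihood of the labels under p = sigmoid(a*x + b)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(a * x + b)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(xs)

def gradient_descent_table(xs, ys, lr=0.1, steps=200):
    """Return one row per iteration: (step, a, b, loss)."""
    a, b = 0.0, 0.0
    n = len(xs)
    table = [(0, a, b, log_loss(xs, ys, a, b))]
    for step in range(1, steps + 1):
        # Gradient of the mean log-loss: average of (p - y) * x and of (p - y)
        errors = [sigmoid(a * x + b) - y for x, y in zip(xs, ys)]
        grad_a = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        a -= lr * grad_a
        b -= lr * grad_b
        table.append((step, a, b, log_loss(xs, ys, a, b)))
    return table

# Toy binary dataset (assumption): small x is class 0, large x is class 1
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]

table = gradient_descent_table(xs, ys)
for row in table[:3]:
    print(row)
```

Reading the rows top to bottom shows the loss starting at ln 2 (all probabilities equal to 0.5) and decreasing as the coefficients move.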

Decentralized Computation: The Hidden Principle Behind Deep Learning https://towardsdatascience.com/the-power-of-decentralization/ Fri, 12 Dec 2025 15:47:00 +0000 https://towardsdatascience.com/?p=607888 Most breakthroughs in deep learning — from simple neural networks to large language models — are built upon a principle that is much older than AI itself: decentralization. Instead of relying on a powerful “central planner” coordinating and commanding the behaviors of other components, modern deep-learning-based AI models succeed because many simple units interact locally […]

EDA in Public (Part 1): Cleaning and Exploring Sales Data with Pandas https://towardsdatascience.com/eda-in-public-part-1-cleaning-exploring-sales-data-with-pandas/ Fri, 12 Dec 2025 13:20:00 +0000 https://towardsdatascience.com/?p=607886 Hey everyone! Welcome to the start of a major data journey that I’m calling “EDA in Public.” For those who know me, I believe the best way to learn anything is to tackle a real-world problem and share the entire messy process — including mistakes, victories, and everything in between. If you’ve been looking to level up […]

Spectral Community Detection in Clinical Knowledge Graphs https://towardsdatascience.com/spectral-community-detection-in-clinical-knowledge-graphs/ Fri, 12 Dec 2025 10:30:00 +0000 https://towardsdatascience.com/?p=607884 How do we identify latent groups of patients in a large cohort? How can we find similarities among patients that go beyond the well-known comorbidity clusters associated with specific diseases? And more importantly, how can we extract quantitative signals that can be analyzed, compared, and reused across different clinical scenarios? The information associated with […]

The Machine Learning “Advent Calendar” Day 11: Linear Regression in Excel https://towardsdatascience.com/the-machine-learning-advent-calendar-day-11-linear-regression-in-excel/ Thu, 11 Dec 2025 16:31:00 +0000 https://towardsdatascience.com/?p=607891 Linear Regression looks simple, but it introduces the core ideas of modern machine learning: loss functions, optimization, gradients, scaling, and interpretation.
In this article, we rebuild Linear Regression in Excel, compare the closed-form solution with Gradient Descent, and see how the coefficients evolve step by step.
This foundation naturally leads to regularization, kernels, classification, and the dual view.
Linear Regression is not just a straight line, but the starting point for many models we will explore next in the Advent Calendar.
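
The comparison between the closed-form solution and Gradient Descent can be sketched as follows for a single feature with an intercept. The dataset, learning rate, and step count are assumptions chosen for this illustration, not values from the article.

```python
def closed_form(xs, ys):
    """Least-squares slope and intercept from the usual covariance/variance formula."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

def gradient_descent(xs, ys, lr=0.02, steps=5000):
    """Minimize the mean squared error by repeated gradient steps from (0, 0)."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        residuals = [a * x + b - y for x, y in zip(xs, ys)]
        grad_a = 2.0 * sum(r * x for r, x in zip(residuals, xs)) / n
        grad_b = 2.0 * sum(residuals) / n
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b

# Made-up data lying exactly on y = 2x + 1 (assumption)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

a_cf, b_cf = closed_form(xs, ys)
a_gd, b_gd = gradient_descent(xs, ys)
print(a_cf, b_cf)
print(a_gd, b_gd)
```

Both routes minimize the same loss, so with enough iterations and a small enough learning rate the Gradient Descent coefficients approach the closed-form ones.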

Drawing Shapes with the Python Turtle Module https://towardsdatascience.com/drawing-shapes-with-the-python-turtle-module/ Thu, 11 Dec 2025 15:00:00 +0000 https://towardsdatascience.com/?p=607880 A step-by-step tutorial that explores the Python Turtle Module

7 Pandas Performance Tricks Every Data Scientist Should Know https://towardsdatascience.com/7-pandas-performance-tricks-every-data-scientist-should-know/ Thu, 11 Dec 2025 13:30:00 +0000 https://towardsdatascience.com/?p=607878 What I've learned about making Pandas faster after too many slow notebooks and frozen sessions

How Agent Handoffs Work in Multi-Agent Systems https://towardsdatascience.com/how-agent-handoffs-work-in-multi-agent-systems/ Thu, 11 Dec 2025 12:00:00 +0000 https://towardsdatascience.com/?p=607875 Understanding how LLM agents transfer control to each other in a multi-agent system with LangGraph

The Machine Learning “Advent Calendar” Day 10: DBSCAN in Excel https://towardsdatascience.com/the-machine-learning-advent-calendar-day-10-dbscan-in-excel/ Wed, 10 Dec 2025 16:30:00 +0000 https://towardsdatascience.com/?p=607882 DBSCAN shows how far we can go with a very simple idea: count how many neighbors live close to each point.
It finds clusters and marks anomalies without any probabilistic model, and it works beautifully in Excel.
But because DBSCAN relies on a single fixed radius, HDBSCAN is needed to make the method robust on real data.
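
The neighbor-counting idea can be written as a compact, unoptimized DBSCAN in plain Python. Treat this as a sketch under assumptions: the points, the radius `eps`, and the threshold `min_pts` are invented for illustration, and the brute-force neighbor search is chosen for readability rather than speed.

```python
import math

def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point joins as border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # only core points keep expanding the cluster
    return labels

# Two tight squares and one isolated point (assumption)
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

The isolated point has no neighbors other than itself, so it is marked as an anomaly. The single `eps` used for every point is the fixed radius that the summary identifies as the method's limitation.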

How to Maximize Agentic Memory for Continual Learning https://towardsdatascience.com/how-to-maximize-agentic-memory-for-continual-learning/ Wed, 10 Dec 2025 15:00:00 +0000 https://towardsdatascience.com/?p=607873 Learn how to become an effective engineer with continual learning LLMs
