Data Science | Towards Data Science
https://towardsdatascience.com/tag/data-science/ (updated Tue, 16 Dec 2025)
Publish AI, ML & data-science insights to a global community of data professionals.

6 Technical Skills That Make You a Senior Data Scientist
https://towardsdatascience.com/6-technical-skills-that-make-you-a-senior-data-scientist/ (Mon, 15 Dec 2025)
Beyond writing code, these are the design-level decisions, trade-offs, and habits that quietly separate senior data scientists from everyone else.

Lessons Learned from Upgrading to LangChain 1.0 in Production
https://towardsdatascience.com/lessons-learnt-from-upgrading-to-langchain-1-0-in-production/ (Mon, 15 Dec 2025)
What worked, what broke, and why I did it

The Machine Learning “Advent Calendar” Day 14: Softmax Regression in Excel
https://towardsdatascience.com/the-machine-learning-advent-calendar-day-14-softmax-regression-in-excel/ (Sun, 14 Dec 2025)
Softmax Regression is simply Logistic Regression extended to multiple classes.

By computing one linear score per class and normalizing them with Softmax, we obtain multiclass probabilities without changing the core logic.

The loss, the gradients, and the optimization remain the same.
Only the number of parallel scores increases.

Implemented in Excel, the model becomes transparent: you can see the scores, the probabilities, and how the coefficients evolve over time.
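The excerpt's point (one linear score per class, normalized with Softmax) can also be sketched outside Excel. A minimal Python version, where the three class scores are made-up values for illustration:

```python
import numpy as np

def softmax(scores):
    # subtract the max score for numerical stability; this does not
    # change the result because softmax is shift-invariant
    z = scores - np.max(scores)
    e = np.exp(z)
    return e / e.sum()

# one hypothetical linear score per class, exactly as in logistic
# regression, just with three scores instead of one
scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)
```

The normalized outputs sum to one and preserve the ordering of the scores, which is all that changes relative to the binary case.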

The Skills That Bridge Technical Work and Business Impact
https://towardsdatascience.com/the-skills-that-bridge-technical-work-and-business-impact/ (Sun, 14 Dec 2025)
In the Author Spotlight series, TDS Editors chat with members of our community about their career path in data science and AI, their writing, and their sources of inspiration. Today, we’re thrilled to share our conversation with Maria Mouschoutzi. Maria is a Data Analyst and Project Manager with a strong background in Operations Research, Mechanical […]

The Machine Learning “Advent Calendar” Day 13: LASSO and Ridge Regression in Excel
https://towardsdatascience.com/the-machine-learning-advent-calendar-day-13-lasso-and-ridge-regression-in-excel/ (Sat, 13 Dec 2025)
Ridge and Lasso regression are often perceived as more complex versions of linear regression. In reality, the prediction model remains exactly the same. What changes is the training objective. By adding a penalty on the coefficients, regularization forces the model to choose more stable solutions, especially when features are correlated. Implementing Ridge and Lasso step by step in Excel makes this idea explicit: regularization does not add complexity, it adds preference.
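The claim that regularization only changes the training objective is visible in the normal equations: Ridge adds a λI term and nothing else. A small NumPy sketch with two strongly correlated features (synthetic data, not the article's):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=20)   # nearly duplicate feature
y = X[:, 0] + rng.normal(scale=0.1, size=20)

lam = 1.0
# ordinary least squares: solve (X^T X) w = X^T y
w_ols = np.linalg.solve(X.T @ X, X.T @ y)
# Ridge: the identical system, plus a lam * I term on the diagonal
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(2), X.T @ y)
```

With the correlated features, the OLS coefficients blow up while the Ridge ones stay small and stable. Lasso has no closed form (the L1 penalty is not differentiable at zero), which is why it is usually fit iteratively, e.g. by coordinate descent.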

NeurIPS 2025 Best Paper Review: Qwen’s Systematic Exploration of Attention Gating
https://towardsdatascience.com/neurips-2025-best-paper-review-qwens-systematic-exploration-of-attention-gating/ (Sat, 13 Dec 2025)
This one small change can bring enhanced training stability, larger usable learning rates, and improved scaling properties.

The Machine Learning “Advent Calendar” Day 12: Logistic Regression in Excel
https://towardsdatascience.com/the-machine-learning-advent-calendar-day-12-logistic-regression-in-excel/ (Fri, 12 Dec 2025)
In this article, we rebuild Logistic Regression step by step directly in Excel.
Starting from a binary dataset, we explore why linear regression struggles as a classifier, how the logistic function fixes these issues, and how log-loss naturally appears from the likelihood.
With a transparent gradient-descent table, you can watch the model learn at each iteration—making the whole process intuitive, visual, and surprisingly satisfying.
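The gradient-descent table described above can be mirrored in a few lines of Python; the tiny 1-D dataset below is invented for illustration, not taken from the article:

```python
import numpy as np

# toy binary dataset: small x values are class 0, large are class 1
x = np.array([1.0, 2.0, 3.0, 6.0, 7.0, 8.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w, b, lr = 0.0, 0.0, 0.1
for _ in range(2000):
    p = sigmoid(w * x + b)
    # gradient of the average log-loss: the (p - y) residual has the
    # same shape as in linear regression, which is part of the appeal
    w -= lr * np.mean((p - y) * x)
    b -= lr * np.mean(p - y)

preds = (sigmoid(w * x + b) > 0.5).astype(int)
```

Each loop iteration corresponds to one row of such a gradient-descent table: compute probabilities, compute the residuals, nudge the coefficients.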

EDA in Public (Part 1): Cleaning and Exploring Sales Data with Pandas
https://towardsdatascience.com/eda-in-public-part-1-cleaning-exploring-sales-data-with-pandas/ (Fri, 12 Dec 2025)
Hey everyone! Welcome to the start of a major data journey that I’m calling “EDA in Public.” For those who know me, I believe the best way to learn anything is to tackle a real-world problem and share the entire messy process — including mistakes, victories, and everything in between. If you’ve been looking to level up […]

The Machine Learning “Advent Calendar” Day 11: Linear Regression in Excel
https://towardsdatascience.com/the-machine-learning-advent-calendar-day-11-linear-regression-in-excel/ (Thu, 11 Dec 2025)
Linear Regression looks simple, but it introduces the core ideas of modern machine learning: loss functions, optimization, gradients, scaling, and interpretation.
In this article, we rebuild Linear Regression in Excel, compare the closed-form solution with Gradient Descent, and see how the coefficients evolve step by step.
This foundation naturally leads to regularization, kernels, classification, and the dual view.
Linear Regression is not just a straight line, but the starting point for many models we will explore next in the Advent Calendar.
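The comparison between the closed-form solution and Gradient Descent is easy to replicate in Python; the five-point dataset here is illustrative (a noiseless line), not the article's:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x + 1.0          # noiseless line: slope 2, intercept 1

# closed form: least squares on the design matrix [x, 1]
X = np.column_stack([x, np.ones_like(x)])
slope_cf, intercept_cf = np.linalg.lstsq(X, y, rcond=None)[0]

# gradient descent on the mean squared error, step by step
w, b, lr = 0.0, 0.0, 0.02
for _ in range(20000):
    err = (w * x + b) - y
    w -= lr * np.mean(err * x)
    b -= lr * np.mean(err)
```

Both routes recover the same slope and intercept; gradient descent just takes many small steps to get where the normal equations jump in one.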

7 Pandas Performance Tricks Every Data Scientist Should Know
https://towardsdatascience.com/7-pandas-performance-tricks-every-data-scientist-should-know/ (Thu, 11 Dec 2025)
What I've learned about making Pandas faster after too many slow notebooks and frozen sessions

The Machine Learning “Advent Calendar” Day 9: LOF in Excel
https://towardsdatascience.com/the-machine-learning-advent-calendar-day-9-lof-in-excel/ (Tue, 09 Dec 2025)
In this article, we explore LOF through three simple steps: distances and neighbors, reachability distances, and the final LOF score. Using tiny datasets, we see how two anomalies that look obvious to us can be judged completely differently by different algorithms. This reveals the key idea of unsupervised learning: there is no single “true” outlier, only definitions. Understanding these definitions is the real skill.
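The three steps (neighbors, reachability distances, LOF score) fit in a short NumPy sketch. The 1-D points below are invented, with one deliberately far-away point:

```python
import numpy as np

pts = np.array([1.0, 1.1, 1.2, 1.3, 5.0])   # toy data, one outlier
k = 2

d = np.abs(pts[:, None] - pts[None, :])      # pairwise distances
np.fill_diagonal(d, np.inf)                  # ignore self-distance
nn = np.argsort(d, axis=1)[:, :k]            # k nearest neighbors
k_dist = np.sort(d, axis=1)[:, k - 1]        # distance to k-th neighbor

def lrd(i):
    # local reachability density: inverse of the mean reachability
    # distance, reach(i, j) = max(k-distance of j, actual distance)
    reach = [max(k_dist[j], d[i, j]) for j in nn[i]]
    return 1.0 / np.mean(reach)

lrds = np.array([lrd(i) for i in range(len(pts))])
# LOF: how much sparser is my neighborhood than my neighbors'?
lof = np.array([np.mean(lrds[nn[i]]) / lrds[i] for i in range(len(pts))])
```

Points inside the cluster score near 1 (their density matches their neighbors'), while the isolated point gets a much larger LOF score.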

How to Develop AI-Powered Solutions, Accelerated by AI
https://towardsdatascience.com/how-to-develop-ai-powered-solutions-accelerated-by-ai/ (Tue, 09 Dec 2025)
From idea to impact: using AI as your accelerating copilot

A Realistic Roadmap to Start an AI Career in 2026
https://towardsdatascience.com/a-realistic-roadmap-to-start-an-ai-career-in-2026/ (Tue, 09 Dec 2025)
How to learn AI in 2026 through real, usable projects

Bridging the Silence: How LEO Satellites and Edge AI Will Democratize Connectivity
https://towardsdatascience.com/bridging-the-silence-how-leo-satellites-and-edge-ai-will-democratize-connectivity/ (Mon, 08 Dec 2025)
Why on-device intelligence and low-orbit constellations are the only viable path to universal accessibility

The Machine Learning “Advent Calendar” Day 8: Isolation Forest in Excel
https://towardsdatascience.com/the-machine-learning-advent-calendar-day-8-isolation-forest-in-excel/ (Mon, 08 Dec 2025)
Isolation Forest may look technical, but its idea is simple: isolate points using random splits. If a point is isolated quickly, it is an anomaly; if it takes many splits, it is normal.

Using the tiny dataset 1, 2, 3, 9, we can see the logic clearly. We build several random trees, measure how many splits each point needs, average the depths, and convert them into anomaly scores. Short depths become scores close to 1, long depths close to 0.

The Excel implementation is painful, but the algorithm itself is elegant. It scales to many features, makes no assumptions about distributions, and even works with categorical data. Above all, Isolation Forest asks a different question: not “What is normal?”, but “How fast can I isolate this point?”
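The average-depth logic is compact enough to sketch in plain Python on the same 1, 2, 3, 9 dataset: one random split per level, averaged over many random trees.

```python
import random

random.seed(0)
data = [1, 2, 3, 9]   # the tiny dataset from the article

def path_length(x, pts, depth=0):
    # a point counts as isolated once it is alone in its partition
    if len(pts) <= 1:
        return depth
    split = random.uniform(min(pts), max(pts))
    # keep only the side of the random split that x falls on
    pts = [v for v in pts if (v < split) == (x < split)]
    return path_length(x, pts, depth + 1)

def avg_depth(x, trees=1000):
    # shorter average depth  =>  easier to isolate  =>  more anomalous
    return sum(path_length(x, data) for _ in range(trees)) / trees

depths = {x: avg_depth(x) for x in data}
```

The point 9 needs the fewest splits on average. The full algorithm then normalizes these depths into an anomaly score in (0, 1), but the ordering of the raw depths already tells the story.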

The Machine Learning “Advent Calendar” Day 7: Decision Tree Classifier
https://towardsdatascience.com/the-machine-learning-advent-calendar-day-7-decision-tree-classifier/ (Sun, 07 Dec 2025)
In Day 6, we saw how a Decision Tree Regressor finds its optimal split by minimizing the Mean Squared Error.
Today, for Day 7 of the Machine Learning "Advent Calendar", we switch to classification. With just one numerical feature and two classes, we explore how a Decision Tree Classifier decides where to cut the data, using impurity measures like Gini and Entropy.
Even without doing the math, we can visually guess possible split points. But which one is best? And do impurity measures really make a difference? Let us build the first split step by step in Excel and see what happens.
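With one numerical feature and two classes, the candidate-split search can be sketched in Python as well; the feature values and labels below are toy assumptions, not the article's data:

```python
xs = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]   # one numerical feature (toy)
ys = [0, 0, 0, 1, 1, 1]               # two classes

def gini(labels):
    # Gini impurity for binary labels: 1 - p0^2 - p1^2
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 1.0 - p * p - (1.0 - p) * (1.0 - p)

def weighted_gini(t):
    # impurity of the two children, weighted by their sizes
    left = [y for x, y in zip(xs, ys) if x <= t]
    right = [y for x, y in zip(xs, ys) if x > t]
    n = len(ys)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

# candidate thresholds: midpoints between consecutive feature values
cands = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
best = min(cands, key=weighted_gini)
```

Here the midpoint between 3 and 7 produces two pure children (impurity zero), which is exactly the split a human would guess visually. Entropy would pick the same threshold on this data; the two measures mostly differ on harder, impure splits.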

Artificial Intelligence, Machine Learning, Deep Learning, and Generative AI — Clearly Explained
https://towardsdatascience.com/artificial-intelligence-machine-learning-deep-learning-and-generative-ai-clearly-explained/ (Sun, 07 Dec 2025)
Understanding AI in 2026 — from machine learning to generative models

The Machine Learning “Advent Calendar” Day 6: Decision Tree Regressor
https://towardsdatascience.com/the-machine-learning-advent-calendar-day-6-decision-tree-regressor/ (Sat, 06 Dec 2025)
During the first days of this Machine Learning Advent Calendar, we explored models based on distances. Today, we switch to a completely different way of learning: Decision Trees.
With a simple one-feature dataset, we can see how a tree chooses its first split. The idea is always the same: if humans can guess the split visually, then we can rebuild the logic step by step in Excel.
By listing all possible split values and computing the MSE for each one, we identify the split that reduces the error the most. This gives us a clear intuition of how a Decision Tree grows, how it makes predictions, and why the first split is such a crucial step.
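The "list all splits, compute the MSE for each" loop translates directly to Python; the one-feature dataset below is invented, with an obvious jump in the target:

```python
# one-feature toy data: the target jumps between x = 3 and x = 4
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8]

def mse(vals):
    # mean squared error around the mean (the leaf's prediction)
    if not vals:
        return 0.0
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / len(vals)

def split_cost(t):
    # weighted MSE of the two children produced by the split
    left = [y for x, y in zip(xs, ys) if x <= t]
    right = [y for x, y in zip(xs, ys) if x > t]
    n = len(ys)
    return len(left) / n * mse(left) + len(right) / n * mse(right)

# candidate splits: midpoints between consecutive feature values
cands = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
best = min(cands, key=split_cost)
```

The search lands on the midpoint at the jump, 3.5, because that split separates the two flat regions and leaves both children with a tiny MSE.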

The Machine Learning “Advent Calendar” Day 5: Gaussian Mixture Model in Excel
https://towardsdatascience.com/the-machine-learning-advent-calendar-day-5-gmm-in-excel/ (Fri, 05 Dec 2025)
This article introduces the Gaussian Mixture Model as a natural extension of k-Means, by improving how distance is measured through variances and the Mahalanobis distance. Instead of assigning points to clusters with hard boundaries, GMM uses probabilities learned through the Expectation–Maximization algorithm – the general form of Lloyd’s method.

Using simple Excel formulas, we implement EM step by step in 1D and 2D, and we visualise how the Gaussian curves or ellipses move during training. The means shift, the variances adjust, and the shapes gradually settle around the true structure of the data.

GMM provides a richer, more flexible way to model clusters, and becomes intuitive once the process is made visible in a spreadsheet.
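The E and M steps described above can be written directly in NumPy for the 1-D case; the two-cluster data below is synthetic, not the article's:

```python
import numpy as np

rng = np.random.default_rng(42)
# two well-separated 1-D Gaussian clusters, around 0 and 6
data = np.concatenate([rng.normal(0.0, 1.0, 200),
                       rng.normal(6.0, 1.0, 200)])

# crude initialization: means at the extremes, unit variances
mu = np.array([data.min(), data.max()])
var = np.array([1.0, 1.0])
weights = np.array([0.5, 0.5])

for _ in range(50):
    # E-step: responsibility of each component for each point
    dens = weights / np.sqrt(2 * np.pi * var) * np.exp(
        -(data[:, None] - mu) ** 2 / (2 * var))
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate mixing weights, means, and variances
    nk = resp.sum(axis=0)
    weights = nk / len(data)
    mu = (resp * data[:, None]).sum(axis=0) / nk
    var = (resp * (data[:, None] - mu) ** 2).sum(axis=0) / nk
```

Each loop pass is one EM iteration: soft assignments in the E-step, weighted re-estimation in the M-step. Watching `mu` and `var` across iterations shows the same drift toward the true cluster structure that the spreadsheet makes visible.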

TDS Newsletter: How to Design Evals, Metrics, and KPIs That Work
https://towardsdatascience.com/tds-newsletter-how-to-design-evals-metrics-and-kpis-that-work/ (Fri, 05 Dec 2025)
On the challenges of producing reliable insights and avoiding common mistakes
