Never miss a new edition of The Variable, our weekly newsletter featuring a top-notch selection of editors’ picks, deep dives, community news, and more.
Data represents the actions, decisions, and opinions of human beings—or the outputs of the tools humans built. That means that, more often than not, it’s messy.
As the volume and complexity of available data have grown rapidly in recent years, so has the importance of data science and machine learning professionals. To help you stay up to date with effective problem-solving and insight-gathering approaches, we’ve selected several outstanding articles on the workflows that can help keep your data tidy, focused, and actionable. Let’s dive in.
Mining Rules from Data
Mariya Mansurova’s new must-read is both comprehensive and accessible. It shows how you can leverage your available data to generate business-focused rules, and walks us through a hands-on example (with a full Python implementation) demonstrating the power of decision trees.
No More Tableau Downtime: Metadata API for Proactive Data Health
Before you can analyze data, you need to ensure its availability. Alle Sravani explains how you can preemptively make your data pipelines more robust.
Ivory Tower Notes: The Problem
Marina Tosic shares time-tested insights on the steps you should take to tackle business problems methodically and focus on root causes.
How to Measure Real Model Accuracy When Labels Are Noisy
Clear and concise, Krishna Rao’s article unpacks the math behind “true” accuracy and error correlation.
Other Recommended Reads
Why not branch out into some other fascinating areas in the wide world of data, ML, and AI?
- What happens when we combine CNNs and transformers? Eric Chung invites us to explore the new frontiers of hybrid architectures.
- Every complex technology relies on an integrated set of tools and systems. Ed Izaguirre outlines the different layers of the AI stack.
- Focusing more specifically on the components of the generative-AI tech stack, Sarah Schürch covers foundation models, agents, vector databases, and more.
- For an immersive technical deep dive, don’t miss Muhammad Ardi’s excellent guide to implementing a diffusion model from scratch with PyTorch.
Meet Our New Authors
Explore top-notch work from some of our recently added contributors:
- Felipe Bandeira (along with coauthors Giselle Fretta, Thu Than, and Elbion Redenica) presents cutting-edge work on how to evaluate the performance of MCMC samplers.
- Swapnil Patil works on product analytics at Google; his debut TDS article leverages his experience in a thorough discussion of the power of ROC curves.
- Nikhil Dasari just launched a beginner-friendly series on a perennially important topic: time-series forecasting.
Contribute to TDS
We love publishing articles from new authors, so if you’ve recently written an interesting project walkthrough, tutorial, or theoretical reflection on any of our core topics, why not share it with us?







