Never miss a new edition of The Variable, our weekly newsletter featuring a top-notch selection of editors’ picks, deep dives, community news, and more.
More and more data professionals now encounter large language models as integral components in core data science workflows. From basic data analysis to complex extraction processes, LLMs play an increasingly visible role in areas where they used to have a minimal footprint (if any).
How should you navigate this rapid change? What can data scientists do to avoid human-LLM turf wars and instead use these tools to produce better, more streamlined results? This week, we zoom in on data-specific use cases with articles that show how agents, prompts, and other LLM-powered tools can enhance, rather than jeopardize, the value of your work.
Before we jump in: in case you missed it, we recently published our latest Author Spotlight, an insight-filled Q&A with longtime TDS contributor Egor Howell discussing his career journey and offering advice for aspiring ML engineers.
LangChain for EDA: Build a CSV Sanity-Check Agent in Python
Tired of performing the same exploratory data-analysis chores time after time? Sarah Schürch walks us through an automation project — powered by Python and LangChain — that produces agents with the ability to display columns, detect missing values, and retrieve descriptive statistics, among other time-saving benefits.
The End-to-End Data Scientist’s Prompt Playbook
If you’re skeptical about LLMs’ place in the data scientist’s toolkit, Sara Nobrega’s latest exploration of prompting techniques, especially in the area of stakeholder communication, might just change your mind.
Extracting Structured Data with LangExtract: A Deep Dive into LLM-Orchestrated Workflows
Subha Ganapathi offers a hands-on guide to building modular workflows for structured intelligence, ensuring schema alignment and fact completeness.
This Week’s Most-Read Stories
The articles our community has been buzzing about in recent days cover MCP, the future of data generalists, and more:
- Using LangGraph and MCP Servers to Create My Own Voice Assistant, by Benjamin Lee
- The Generalist: The New All-Around Type of Data Professional?, by Loizos Loizou
- The Machine Learning Lessons I’ve Learned This Month, by Pascal Janetzky
Other Recommended Reads
From climate data to Python essentials, here are a few more recent must-reads we wanted to highlight:
- AI FOMO, Shadow AI, and Other Business Problems, by Stephanie Kirmer
- Stochastic Differential Equations and Temperature — NASA Climate Data pt. 2, by Marco Hening Tallarico
- What Being a Data Scientist at a Startup Really Looks Like, by Yu Dong
- MobileNetV1 Paper Walkthrough: The Tiny Giant, by Muhammad Ardi
- Implementing the Coffee Machine in Python, by Mahnoor Javed
Meet Our New Authors
Explore excellent work from some of our recently added contributors:
- James Gibbins is a data scientist with a multidisciplinary background who has been publishing a popular series on hyperparameter tuning.
- Erika G. Gonçalves joins us with expertise in applied math and statistics, as well as deep industry experience; her first article looks under the hood of AI applications.
- Sean Moran has led multiple AI/ML initiatives at large-scale enterprises. His TDS debut looks into a potential future where scientific innovation might be heavily AI-assisted.
We love publishing articles from new authors, so if you’ve recently written an interesting project walkthrough, tutorial, or theoretical reflection on any of our core topics, why not share it with us?







