TDS Newsletter: What You Need to Know About LLMs for Data Science

How new tools are transforming data scientists' essential workflows

Sep 11, 2025

Image by Wayee Tan, via Unsplash

Never miss a new edition of The Variable, our weekly newsletter featuring a top-notch selection of editors’ picks, deep dives, community news, and more.

More and more data professionals now encounter large language models as integral components in core data science workflows. From basic data analysis to complex extraction processes, LLMs play an increasingly visible role in areas where they used to have a minimal footprint (if any).

How should you navigate this rapid change? What can data scientists do to avoid human-LLM turf wars and instead use these models' strengths to produce better, more streamlined results? This week, we zoom in on data-specific use cases with articles that show how agents, prompts, and other LLM-powered tools can enhance, rather than jeopardize, the value of your work.

Before we jump in: in case you missed it, we recently published our latest Author Spotlight, an insight-filled Q&A with longtime TDS contributor Egor Howell discussing his career journey and offering advice for aspiring ML engineers.

LangChain for EDA: Build a CSV Sanity-Check Agent in Python

Tired of performing the same exploratory data-analysis chores time after time? Sarah Schürch walks us through an automation project — powered by Python and LangChain — that produces agents with the ability to display columns, detect missing values, and retrieve descriptive statistics, among other time-saving benefits.
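
To make those three checks concrete, here is a minimal plain-Python sketch of the kind of function an agent could call as a tool. It is not taken from the article: the `sanity_check` name, the report layout, and the sample data are all assumptions for illustration, and it uses only the standard library instead of LangChain.

```python
import csv
import io
import statistics


def sanity_check(csv_text: str) -> dict:
    """Return column names, missing-value counts, and basic numeric stats.

    Hypothetical helper for illustration; not the article's implementation.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = reader.fieldnames or []
    report = {"columns": list(columns), "missing": {}, "stats": {}}

    for col in columns:
        values = [row[col] for row in rows]
        # Treat empty strings and absent fields as missing values.
        present = [v for v in values if v not in (None, "")]
        report["missing"][col] = len(values) - len(present)

        try:
            numbers = [float(v) for v in present]
        except ValueError:
            continue  # non-numeric column: skip descriptive statistics

        if numbers:
            report["stats"][col] = {
                "min": min(numbers),
                "max": max(numbers),
                "mean": statistics.mean(numbers),
            }
    return report


# Made-up sample data with one missing value per column.
SAMPLE = "age,city\n31,Berlin\n,Lisbon\n45,\n"
report = sanity_check(SAMPLE)
print(report["columns"])
print(report["missing"])
print(report["stats"])
```

In an agent setup, a function like this would typically be registered as a tool so the model can decide when to call it; how that wiring looks depends on the framework version you use.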

The End-to-End Data Scientist’s Prompt Playbook

If you’re skeptical about LLMs’ place in data scientists’ toolkit, Sara Nobrega’s latest exploration of prompting techniques — especially in the area of stakeholder communication — might just change your mind.

Extracting Structured Data with LangExtract: A Deep Dive into LLM-Orchestrated Workflows

Subha Ganapathi offers a hands-on guide to building modular workflows for structured intelligence, ensuring schema alignment and fact completeness.
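
As a rough illustration of what "schema alignment" and "fact completeness" can mean in practice, the sketch below checks an extracted record against a target schema. It does not use LangExtract's API: the `Invoice` schema, the `check_record` helper, and the sample record are hypothetical, and the checks are one simple interpretation of those two terms.

```python
from dataclasses import dataclass, fields


@dataclass
class Invoice:
    """Hypothetical target schema for extracted records."""
    vendor: str
    total: float
    currency: str


def check_record(record: dict, schema: type = Invoice) -> dict:
    """Compare an extracted record against the schema.

    Reports fields that are missing or empty (completeness), fields the
    schema does not define, and fields with the wrong type (alignment).
    """
    expected = {f.name: f.type for f in fields(schema)}
    missing = [name for name in expected if record.get(name) in (None, "")]
    unexpected = [name for name in record if name not in expected]
    wrong_type = [
        name
        for name, typ in expected.items()
        if name in record
        and name not in missing
        and not isinstance(record[name], typ)
    ]
    return {"missing": missing, "unexpected": unexpected, "wrong_type": wrong_type}


# Made-up output of an extraction step: one field absent, one extra,
# and one value returned as a string instead of a number.
result = check_record({"vendor": "Acme", "total": "12.50", "notes": "rush"})
print(result)
```

A check like this can run after each extraction step, so that incomplete or misaligned records are flagged before they reach downstream analysis.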

This Week’s Most-Read Stories

The articles our community has been buzzing about in recent days cover MCP, the future of data generalists, and more:

Using LangGraph and MCP Servers to Create My Own Voice Assistant, by Benjamin Lee

The Generalist: The New All-Around Type of Data Professional?, by Loizos Loizou

The Machine Learning Lessons I’ve Learned This Month, by Pascal Janetzky

Meet Our New Authors

Explore excellent work from some of our recently added contributors:

James Gibbins is a data scientist with a multidisciplinary background who has been publishing a popular series on hyperparameter tuning.

A Visual Guide to Tuning Random Forest Hyperparameters

Erika G. Gonçalves joins us with expertise in applied math and statistics, as well as deep industry experience; her first article looks under the hood of AI applications.

AI Operations Under the Hood: Challenges and Best Practices

Sean Moran has led multiple AI/ML initiatives at large-scale enterprises. His TDS debut looks into a potential future where scientific innovation might be heavily AI-assisted.

From Tokens to Theorems: Building a Neuro-Symbolic AI Mathematician

We love publishing articles from new authors, so if you’ve recently written an interesting project walkthrough, tutorial, or theoretical reflection on any of our core topics, why not share it with us?