
TDS Newsletter: What You Need to Know About LLMs for Data Science

How new tools are transforming data scientists' essential workflows

Image by Wayee Tan, via Unsplash

Never miss a new edition of The Variable, our weekly newsletter featuring a top-notch selection of editors’ picks, deep dives, community news, and more.

More and more data professionals now encounter large language models as integral components in core data science workflows. From basic data analysis to complex extraction processes, LLMs play an increasingly visible role in areas where they used to have a minimal footprint (if any). 

How should you navigate this rapid change? What can data scientists do to avoid human-LLM turf wars and instead leverage these models' strengths to produce better, more streamlined results? This week, we zoom in on data-specific use cases with articles that show how agents, prompts, and other LLM-powered tools can enhance, rather than jeopardize, the value of your work.

Before we jump in: in case you missed it, we recently published our latest Author Spotlight, an insight-filled Q&A with longtime TDS contributor Egor Howell discussing his career journey and offering advice for aspiring ML engineers.


LangChain for EDA: Build a CSV Sanity-Check Agent in Python

Tired of performing the same exploratory data-analysis chores time after time? Sarah Schürch walks us through an automation project, powered by Python and LangChain, that produces agents that can display columns, detect missing values, and retrieve descriptive statistics, among other time-saving tasks.
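To make those three checks concrete, here is a minimal, framework-free sketch using only the Python standard library. It is not the code from Sarah's article and does not use LangChain; the function names (load_rows, list_columns, count_missing, describe_numeric) and the sample data are hypothetical, chosen only to illustrate the kind of small, single-purpose functions an agent could be given as tools.

```python
import csv
import io
import statistics


def load_rows(csv_text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def list_columns(rows):
    """Return the column names, taken from the first row."""
    return list(rows[0].keys()) if rows else []


def count_missing(rows):
    """Count empty or absent values per column."""
    counts = {col: 0 for col in list_columns(rows)}
    for row in rows:
        for col in counts:
            value = row.get(col)
            if value is None or value.strip() == "":
                counts[col] += 1
    return counts


def describe_numeric(rows):
    """Summarize columns that contain at least one numeric value."""
    summary = {}
    for col in list_columns(rows):
        values = []
        for row in rows:
            try:
                values.append(float(row[col]))
            except (TypeError, ValueError):
                continue  # skip blanks and non-numeric text
        if values:
            summary[col] = {
                "count": len(values),
                "mean": statistics.mean(values),
                "min": min(values),
                "max": max(values),
            }
    return summary


# Hypothetical sample data, for illustration only.
SAMPLE = "name,age,score\nAna,34,88.5\nBen,,92.0\nCy,29,\n"

rows = load_rows(SAMPLE)
print(list_columns(rows))
print(count_missing(rows))
print(describe_numeric(rows))
```

In an agent setup, each of these functions would typically be registered as a separate tool, so the model can decide which check to run for a given question; how that registration works depends on the framework and version you use.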

The End-to-End Data Scientist’s Prompt Playbook

If you’re skeptical about LLMs’ place in the data scientist’s toolkit, Sara Nobrega’s latest exploration of prompting techniques (especially in the area of stakeholder communication) might just change your mind.

Extracting Structured Data with LangExtract: A Deep Dive into LLM-Orchestrated Workflows

Subha Ganapathi offers a hands-on guide to building modular workflows for structured intelligence, ensuring schema alignment and fact completeness.


This Week’s Most-Read Stories

The articles our community has been buzzing about in recent days cover MCP, the future of data generalists, and more:

Using LangGraph and MCP Servers to Create My Own Voice Assistant, by Benjamin Lee

The Generalist: The New All-Around Type of Data Professional?, by Loizos Loizou

The Machine Learning Lessons I’ve Learned This Month, by Pascal Janetzky


Other Recommended Reads

From climate data to Python essentials, here are a few more recent must-reads we wanted to highlight:

  • AI FOMO, Shadow AI, and Other Business Problems, by Stephanie Kirmer
  • Stochastic Differential Equations and Temperature — NASA Climate Data pt. 2, by Marco Hening Tallarico
  • What Being a Data Scientist at a Startup Really Looks Like, by Yu Dong
  • MobileNetV1 Paper Walkthrough: The Tiny Giant, by Muhammad Ardi
  • Implementing the Coffee Machine in Python, by Mahnoor Javed

Meet Our New Authors

Explore excellent work from some of our recently added contributors:

  • James Gibbins is a data scientist with a multidisciplinary background who has been publishing a popular series on hyperparameter tuning.
  • Erika G. Gonçalves joins us with expertise in applied math and statistics, as well as deep industry experience; her first article looks under the hood of AI applications.
  • Sean Moran has led multiple AI/ML initiatives at large-scale enterprises. His TDS debut looks into a potential future where scientific innovation might be heavily AI-assisted.

We love publishing articles from new authors, so if you’ve recently written an interesting project walkthrough, tutorial, or theoretical reflection on any of our core topics, why not share it with us?

