TDS Newsletter: How to Keep LLMs Effective and Reliable Over Time

On the nitty-gritty details of evaluations, guardrails, and ongoing optimization

Photo by Patricia Serna via Unsplash

Never miss a new edition of The Variable, our weekly newsletter featuring a top-notch selection of editors’ picks, deep dives, community news, and more.

Those of you who’ve worked with LLM-powered applications know this: by now, building and deploying these tools is (relatively) straightforward, but maintaining their reliability and long-term value for the organization is not. 

There’s no magic solution to this challenge, but several approaches have emerged to make your life as data and ML professionals easier. Our weekly highlights zoom in on the nitty-gritty details of evaluations, guardrails, and ongoing optimization, so if you’d like to expand your LLM know-how and be more effective in your role — read on.


AI Engineering and Evals as New Layers of Software Work

Clara Chong’s compelling premise is that “the real work is about solving business problems with the tools we already have.” She unpacks AI’s impact on tech workers’ daily rhythms: writing code might have become a lot easier (or at least faster), but ensuring it follows the best practices of eval-driven development introduces several layers of complexity into your projects.

Notes on LLM Evaluation

If you’re ready to dig deeper into the intricacies of evals, Felipe Adachi recently shared a comprehensive, step-by-step guide to the components that make up a robust pipeline. It zooms in on data preparation, the choices you might face along the way, and the adjustments you’ll need to implement once the results are in.

RAG Explained: Reranking for Better Answers

Retrieval-augmented generation is a technique for improving LLM performance, but it, too, often requires tuning and optimization of its own. Maria Mouschoutzi introduces us to reranking and its potential to boost the relevance of LLM outputs.

Introducing the AI-3P Assessment Framework: Score AI Projects Before Committing Resources

Sometimes, tweaking a tool post-deployment might just be too little, too late. Marina Tosic presents a novel framework to help you avoid that fate by focusing on projects that are likelier to succeed.


This Week’s Most-Read Stories

From DataViz basics to AI agents, here are the recent articles that resonated the most with our audience.

How to Build Effective Agentic Systems with LangGraph, by Eivind Kjosbakken

Data Visualization Explained (Part 2): An Introduction to Visual Variables, by Murtaza Ali

MCP in Practice, by Sruly Rosenblat, Ilan Strauss, Isobel Moure, and Tim O’Reilly


Other Recommended Reads

Cutting-edge research, marketing-data deep dives, AI’s role in the job-search process, and more: don’t miss these standout articles.

  • Are Foundation Models Ready for Your Production Tabular Data?, by Carmen Adriana Martínez Barbosa
  • Building Fact-Checking Systems: Catching Repeating False Claims Before They Spread, by Iva Pezo
  • Prediction vs. Search Models: What Data Scientists Are Missing, by Derek Tran
  • Why Are Marketers Turning To Quasi Geo-Lift Experiments? (And How to Plan Them), by Tomas Jancovic
  • How I Used ChatGPT to Land My Next Data Science Role, by Yu Dong

Meet Our New Authors

We hope you take the time to explore the excellent work from the latest cohort of TDS contributors:

  • Kenneth McCarthy charts the visual “fingerprints” of 20 languages with the help of basic statistics.
  • Ankit Singh Chauhan published a lucid writeup of recent research that promises “a smarter way to scale reasoning tasks without wasting a massive amount of computation.”

We love publishing articles from new authors, so if you’ve recently written an interesting project walkthrough, tutorial, or theoretical reflection on any of our core topics, why not share it with us?

