Inference | Towards Data Science

Optimizing PyTorch Model Inference on AWS Graviton

Deep Learning

Tips for accelerating AI/ML on CPU — Part 2

Chaim Rand

December 10, 2025

11 min read

Optimizing PyTorch Model Inference on CPU

Deep Learning

Flyin’ Like a Lion on Intel Xeon

Chaim Rand

December 8, 2025

20 min read

I Made My AI Model 84% Smaller and It Got Better, Not Worse

Large Language Models

The counterintuitive approach to AI optimization that’s changing how we deploy models

Arjun Kaarat

September 29, 2025

20 min read

Evaluating LLMs for Inference, or Lessons from Teaching for Machine Learning

Large Language Models

It’s like grading papers, but your student is an LLM

Stephanie Kirmer

June 2, 2025

12 min read

The Case for Centralized AI Model Inference Serving

Machine Learning

Optimizing highly parallel AI algorithm execution

Chaim Rand

April 1, 2025

11 min read

Mastering the Poisson Distribution: Intuition and Foundations

Data Science

Take a dive into the foundations and exemplifying use cases of the Poisson distribution

Alejandro Alvarez Perez

March 20, 2025

17 min read

How LLMs Work: Pre-Training to Post-Training, Neural Networks, Hallucinations, and Inference

Large Language Models

With the recent explosion of interest in large language models (LLMs), they often seem almost…

Clara Chong

February 18, 2025

9 min read

Combining Large and Small LLMs to Boost Inference Time and Quality

Artificial Intelligence

Implementing Speculative and Contrastive Decoding

Richa Gadgil

December 5, 2024

8 min read

Dynamic Execution

Artificial Intelligence

Getting your AI task to distinguish between Hard and Easy problems

Haim Barad

November 3, 2024

12 min read

The Poisson Bootstrap

Statistics

Bootstrapping over large datasets

David Clarance

August 12, 2024

10 min read