-

A 27M-parameter model just outperformed giants like DeepSeek R1, o3-mini, and Claude 3.7 on reasoning…
11 min read -

The Tokenizer Has Been a Necessary Evil, but This Radical Approach Shows That It Might…
16 min read -

Exploring Titans: A new architecture equipping LLMs with human-inspired memory that learns and updates itself…
17 min read

