Evaluation | thingsithinkithink

Jul 06, 2025

LLM Evals Course Lesson 3: Building Automated Evaluators

Notes from lesson 3 of Hamel and Shreya's LLM evaluation course - implementing automated evaluators, building reliable LLM-as-judge systems, and avoiding common pitfalls.

Jun 22, 2025

LLM Evals Course: Lesson 2b (office hrs)

A few notes from the office hours following lesson 2 of Hamel and Shreya's LLM evaluation course.

Jun 21, 2025

LLM Evals Lesson 2 Error Analysis

Notes from lesson 2 of Hamel and Shreya's LLM evaluation course - covering error analysis, open and axial coding, and systematic approaches to understanding where AI systems fail.

Jun 08, 2025

Hamel & Shreya's LLM Evals Course: Lesson 1

Notes from the first lesson of Parlance Labs' Maven course on evaluating LLM applications - covering the Three Gulfs model and why eval is where most people get stuck.

Apr 13, 2025

Error Analysis for Improving LLM Applications

A systematic approach to analysing and improving large language model applications through error analysis.

Apr 12, 2025

Why we need Experiment-based Roadmaps in the AI Product Era

Why evaluation-driven experimentation creates better roadmaps in AI products.