
LLM Evals Course Lesson 4: Multi-turn and Collaborative Evaluation
Notes from lesson 4 of Hamel and Shreya's LLM evaluation course - handling multi-turn conversations and building evaluation criteria through collaboration.
How Isaac Flath built a medical flashcard annotation tool for AnkiHub using FastHTML, and why custom annotation tools beat generic ones for complex domains.
Notes from lesson 3 of Hamel and Shreya's LLM evaluation course - implementing automated evaluators, building reliable LLM-as-judge systems, and avoiding common pitfalls.
A few things from office hours following lesson 2 of Hamel and Shreya's LLM evaluation course.
AI Demystified, published by Pearson FT, offers a gentle introduction for business leaders who want to understand how AI might impact their field.
Notes from lesson 2 of Hamel and Shreya's LLM evaluation course - covering error analysis, open and axial coding, and systematic approaches to understanding where AI systems fail.
Notes from the first lesson of Parlance Labs' Maven course on evaluating LLM applications - covering the Three Gulfs model and why eval is where most people get stuck.
Trying to blend two AI framework styles into one that's more practically useful.
I like bits of Brunig's and Mollick's AI frameworks, but neither quite works for me.