
LLM Evals Course Lesson 4: Multi-turn and Collaborative Evaluation
Notes from lesson 4 of Hamel and Shreya's LLM evaluation course - handling multi-turn conversations and building evaluation criteria through collaboration.
How Isaac Flath built a medical flashcard annotation tool for AnkiHub using FastHTML, and why custom annotation tools beat generic ones for complex domains.
Notes from lesson 3 of Hamel and Shreya's LLM evaluation course - implementing automated evaluators, building reliable LLM-as-judge systems, and avoiding common pitfalls.
A systematic approach to analysing and improving large language model applications through error analysis.
Why evaluation-driven experimentation creates better roadmaps in AI products.
Understanding the combinatorial complexity problem that plagues many software systems, and how modern architectures solve it.
Bavaro's approach to strategy: Vision, Strategic Framework, and Roadmap.
Are frameworks actually useful? Exploring how they enable communication, engagement, and focused thinking.
Simon Willison was a guest on Logan Kilpatrick's Google podcast. Topics covered: AI as a 'cyborg enhancement', the non-intuitive challenges of mastering LLM use, and the legitimate need for uncensored language models in fields like journalism.