Posts

Exploring the Claude Agent SDK

Learning in public: experimenting with Anthropic's Agent SDK after hearing that anything Claude Code can do, you can do with the SDK.

Three Things I Did Over Christmas

One of the nice things about time off is the chance to play a little.

LLM Evals Course: Complete Course Recap

This is my recap of Hamel and Shreya's LLM evaluation course. I hope to come back here whenever I need to remind myself how to do this the right way.

LLM Evals Lesson 8: Improving LLM Products

Notes from the final lesson of Hamel and Shreya's LLM evaluation course - practical strategies for improving accuracy and reducing costs through prompt refinement, architecture changes, fine-tuning, and model cascades.

LLM Evals Course Lesson 7: Interfaces for Human Review

Notes from lesson 7 of Hamel and Shreya's LLM evaluation course - interface design principles and strategic sampling.

Building an AI Sandbox with Docker

How to set up a persistent Docker environment for AI coding tools without losing your authentication every time you restart the container.