thingsithinkithink

Artificial intelligence

LLM Evals Course Lesson 7: Interfaces for Human Review

Notes from lesson 7 of Hamel and Shreya's LLM evaluation course - interface design principles and strategic sampling.

Building an AI Sandbox with Docker

How to set up a persistent Docker environment for AI coding tools without losing your authentication every time you restart the container.

LLM Evals Course Lesson 6: Complex Pipelines and CI/CD

Notes from lesson 6 of Hamel and Shreya's LLM evaluation course - debugging agentic systems, handling complex data modalities, and implementing CI/CD for production LLM applications.


Recent Post

AI Demystified Book by Antonio Weiss

Published by Pearson FT, AI Demystified offers a gentle introduction for business leaders who want to understand how AI might affect their field.

LLM Evals Lesson 2 Error Analysis

Notes from lesson 2 of Hamel and Shreya's LLM evaluation course - covering error analysis, open and axial coding, and systematic approaches to understanding where AI systems fail.

Hamel & Shreya's LLM Evals Course: Lesson 1

Notes from the first lesson of Parlance Labs' Maven course on evaluating LLM applications - covering the Three Gulfs model and why evaluation is where most people get stuck.

Cogs / Interns / Human Tasks, a practical framework for AI transformation

Trying to blend two AI framework styles into one that's more practically useful.

Synthesising a new framework for AI Transformation

I like bits of Breunig's and Mollick's AI frameworks, but neither quite works for me.

Error Analysis for Improving LLM Applications

A systematic approach to analysing and improving large language model applications through error analysis.