Braintrust Blog
Latest insights, tutorials, and updates from the Braintrust team. Learn about AI evaluation, LLM observability, and best practices for building reliable AI products.
Braintrust CLI and MCP
3 Apr 2026
Learn when to use the Braintrust CLI and MCP depending on where you are in the AI development workflow.
Evals are the new PRD
27 Mar 2026
Why AI product managers should replace traditional PRDs with evals, and how the eval flywheel becomes the operating system for AI product development.
What is AI observability?
19 Mar 2026
AI observability is a new infrastructure category built on traces, evals, and feedback loops. Learn what it means, why it's technically hard, and how it changes AI product development.
Evals for PMs: A practical guide to AI product quality
17 Mar 2026
Everything a product manager needs to know about evals, from building datasets and scoring criteria to running experiments and integrating evals into your product development process.
Keep building with the Starter plan
16 Mar 2026
Starter is a new Braintrust plan with no platform fee, designed to scale with your needs.
Supporting privacy and compliance for EU teams
12 Mar 2026
Braintrust's decoupled architecture gives EU teams control over where their AI data lives, simplifying GDPR compliance and data residency requirements.
How to build your first offline eval
10 Mar 2026
A 10-step guide to going from a vibe to a working eval system, using a real Mermaid diagram generation project as an example.
Trace keynote recap: See it, improve it, optimize it
25 Feb 2026
Everything we announced at the Trace keynote, including Topics, the Braintrust CLI, and the Braintrust Gateway.
Automatically discover what matters in your production traces with Topics
25 Feb 2026
Topics uses AI-powered clustering to surface recurring patterns, from errors and user intents to sentiment, across thousands of traces.

Braintrust's series B: building the infrastructure for production AI
17 Feb 2026
Braintrust has raised $80M to become the observability layer for shipping quality AI.
The 5 pillars of AI model performance
12 Feb 2026
A framework for evaluating AI models across five dimensions, with Claude Opus 4.6 and GPT-5.3 Codex as case studies.
Testing if "bash is all you need"
22 Jan 2026
Testing whether filesystems and bash provide the optimal abstraction for AI agents through rigorous evaluation.
Security is a choice: how Braintrust lets you decide where your AI data lives
21 Jan 2026
Braintrust offers flexible deployment options so you can keep sensitive AI data in your own infrastructure while still benefiting from a modern SaaS experience.
Building observable AI agents with Temporal
20 Jan 2026
Bringing together durable execution and LLM observability to make AI agents easier to build, monitor, and operate in production.
Debugging Ralph Wiggum with Braintrust Logs
13 Jan 2026
How observability makes autonomous AI development actually work.
Claude Code meets Braintrust
23 Dec 2025
A two-way integration that brings observability into your development loop.
AI observability beyond Python and TypeScript
22 Dec 2025
Braintrust now supports Java, Go, Ruby, and C# with native SDKs.
Brainstore makes AI observability at scale possible
18 Dec 2025
Real-world benchmarks show Brainstore is up to 24x faster than competitors, making it possible to observe AI systems at production scale.
Evals are a team sport: How we built Loop
25 Nov 2025
How we debugged Loop's prompt optimization workflow by combining manual review, Loop analysis, and cross-functional collaboration.

Turn production data into better AI with Loop
24 Nov 2025
Loop is the AI assistant that helps teams query, analyze, and improve AI applications faster.

How Retool uses Loop to turn logs into AI roadmap decisions
24 Nov 2025

The three pillars of AI observability
18 Nov 2025
Why traces, evals, and annotation redefine observability for AI systems.
Braintrust Java SDK: AI observability and evals for the JVM
23 Oct 2025

How Portola empowers subject matter experts to improve AI quality
20 Oct 2025
Braintrust on the Vercel Marketplace
16 Oct 2025

How Dropbox automates evals for conversational AI
15 Oct 2025
Measuring what matters: An intro to AI evals
10 Oct 2025
Claude Sonnet 4.5 analysis
29 Sep 2025
AI that knows your data
13 Sep 2025
A/B testing can't keep up with AI
3 Sep 2025