ai evals claudecode claude tech
TIKTOK


Mar 13, 2026
172 words 70% confidence
I ran an eval that called Claude Code on a bunch of different tasks, but I ran into an issue: because Claude Code was running as a subprocess, its traces weren't actually showing up in my eval. This image of my trace shows the problem. It says "run Claude agent", but I don't actually see any of what the Claude agent did. So first I grabbed my current Braintrust span and got the span ID, root span ID, and experiment ID. I passed those three values as environment variables to my Claude Code subprocess. Inside Claude Code, a hook reads those environment variables. When it creates the root span for the Claude Code session, it uses the parent's root span ID as its own root span ID. Basically, it's telling Braintrust that this trace belongs under that trace. And this is what it looks like with the change. This way we can see all the LLM calls with their inputs and outputs, as well as the command-line executions.
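The parent side of this approach can be sketched in a few lines. This is a minimal illustration, not the code from the video: the `SpanContext` class, the `PARENT_*` environment variable names, and the `run_agent` helper are all hypothetical, and the three IDs are assumed to have already been read from the current Braintrust span.

```python
import os
import subprocess
from dataclasses import dataclass
from typing import Mapping, Optional, Sequence


@dataclass(frozen=True)
class SpanContext:
    """Hypothetical container for the three IDs taken from the current span."""
    span_id: str
    root_span_id: str
    experiment_id: str


def build_child_env(ctx: SpanContext,
                    base_env: Optional[Mapping[str, str]] = None) -> dict:
    """Copy the environment and add the trace IDs for the subprocess.

    The variable names are assumptions made for this sketch; the hook in
    the child process only needs to agree on the same names.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["PARENT_SPAN_ID"] = ctx.span_id
    env["PARENT_ROOT_SPAN_ID"] = ctx.root_span_id
    env["PARENT_EXPERIMENT_ID"] = ctx.experiment_id
    return env


def run_agent(cmd: Sequence[str], ctx: SpanContext) -> subprocess.CompletedProcess:
    """Launch the agent command with the trace IDs in its environment."""
    return subprocess.run(
        list(cmd),
        env=build_child_env(ctx),
        capture_output=True,
        text=True,
        check=True,
    )
```

Here `cmd` stands for whatever command launches the agent in the eval. Copying `os.environ` rather than passing only the three variables keeps `PATH` and any credentials the subprocess already relies on.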

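The video explains how to fix missing traces in Claude Code evaluations: the eval passes its span ID, root span ID, and experiment ID to the subprocess through environment variables, which makes the subprocess's LLM calls and command-line executions visible in the eval trace.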

  1. An eval ran Claude Code on various tasks but hit a tracing issue.
  2. Because Claude Code ran as a subprocess, its traces were not visible in the eval.
  3. The span ID, root span ID, and experiment ID were passed to the subprocess as environment variables.
  4. A hook inside Claude Code reads these variables.
  5. The session's root span uses the parent's root span ID, so the trace is nested under the eval's trace.
  6. With this change, all LLM calls (inputs and outputs) and command-line executions are visible.
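The child side, where a hook reads the environment variables and attaches the session's root span to the parent trace, might look roughly like the following. This is a sketch under stated assumptions: the `PARENT_*` variable names, the `RootSpan` class, and `start_session_span` are hypothetical, and no real Braintrust or Claude Code hook API is used.

```python
import os
import uuid
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RootSpan:
    """Hypothetical record describing the session's top-level span."""
    span_id: str
    root_span_id: str
    parent_span_id: Optional[str]
    experiment_id: Optional[str]


def start_session_span(env: Mapping[str, str] = os.environ) -> RootSpan:
    """Create the session's top-level span, nested under a parent if one is given.

    The environment variable names are assumptions for this sketch.
    """
    span_id = uuid.uuid4().hex
    parent_span_id = env.get("PARENT_SPAN_ID")
    root_span_id = env.get("PARENT_ROOT_SPAN_ID")
    experiment_id = env.get("PARENT_EXPERIMENT_ID")

    if parent_span_id and root_span_id:
        # Reuse the parent's root span ID so the tracing backend files
        # this session under the eval's existing trace.
        return RootSpan(span_id, root_span_id, parent_span_id, experiment_id)

    # No parent information: the session starts its own trace.
    return RootSpan(span_id, span_id, None, None)
```

Falling back to a standalone trace when the variables are absent means the same hook still works when the agent is run outside an eval.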
