What you get
- Root-cause analysis: plain-English explanation of what went wrong in a specific run, using the actual trace inputs/outputs
- Prompt fixes: one sentence to append to the system prompt; applied via Langfuse in one click
- Code/infra fixes: for structural failures (CONTEXT_BLOAT, SLOW_STEP, etc.), opens a draft GitHub PR with a unified diff
- Fix tracking: the dashboard shows whether recurrence dropped after a fix was applied
Prerequisites
- Dunetrace backend running (docker compose up -d)
- Langfuse account (cloud or self-hosted) with a project and API keys
- One LLM API key for the analysis call (ANTHROPIC_API_KEY preferred, OPENAI_API_KEY accepted as fallback)
Step 1: Install
pip install 'dunetrace[langchain,langfuse]'
Step 2: Add credentials to .env
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_HOST=https://cloud.langfuse.com # omit for cloud; set for self-hosted
ANTHROPIC_API_KEY=sk-ant-...
# OPENAI_API_KEY=sk-... # accepted as fallback
Restart the API container after editing .env:
docker compose up -d api
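Before restarting, it can help to confirm which keys the container will actually see. The sketch below mirrors the preference described in the prerequisites (Anthropic first, OpenAI as fallback). The helper names `pick_analysis_provider` and `missing_langfuse_keys` are hypothetical and not part of Dunetrace; only the variable names come from the .env above.

```python
import os
from typing import List, Mapping, Optional


def pick_analysis_provider(env: Mapping[str, str]) -> Optional[str]:
    """Return which LLM key would be used, preferring Anthropic (hypothetical helper)."""
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic"
    if env.get("OPENAI_API_KEY"):
        return "openai"  # fallback, as described in the prerequisites
    return None


def missing_langfuse_keys(env: Mapping[str, str]) -> List[str]:
    """List the Langfuse credentials that are unset or empty."""
    required = ("LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY")
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    print("analysis provider:", pick_analysis_provider(os.environ))
    print("missing Langfuse keys:", missing_langfuse_keys(os.environ))
```

LANGFUSE_HOST is deliberately not checked here, since the .env comment says it can be omitted for Langfuse cloud.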
Step 3: Run both callbacks together
Pass DunetraceCallbackHandler and LangfuseCallbackHandler in the same callbacks list. They are fully independent — no coupling required.
from dunetrace import Dunetrace
from dunetrace.integrations.langchain import DunetraceCallbackHandler
from langfuse import get_client
from langfuse.langchain import CallbackHandler as LangfuseCallbackHandler # v4+
# `agent` and `query` are assumed to be defined elsewhere (your existing agent and the user input).
dt = Dunetrace(endpoint="http://localhost:8001")
dt_cb = DunetraceCallbackHandler(dt, agent_id="my-agent", model="gpt-4o-mini", tools=["web_search"])
lf_cb = LangfuseCallbackHandler() # reads LANGFUSE_* from env
# Both handlers observe the same run; neither depends on the other.
result = agent.invoke(
    {"messages": [("human", query)]},
    config={"callbacks": [dt_cb, lf_cb]},
)
dt.shutdown(timeout=5) # flush pending Dunetrace events
get_client().flush() # ensure trace is uploaded before querying
# IDs for the join:
dt_run_id = dt_cb.last_run_id # e.g. "b5ed23be-e4f0-43bc-..."
lf_trace_id = lf_cb.last_trace_id # e.g. "b5ed23bee4f043bc..." (same UUID, no dashes)
Step 4: Call the explain endpoint
POST /v1/signals/{signal_id}/explain
Content-Type: application/json
Authorization: Bearer <your-key>
{
"langfuse_trace_id": "b5ed23bee4f043bc8625914223875508"
}
Response includes root_cause, fix_content, fix_type, apply_blocked, and langfuse_prompt_name.
| fix_type | Meaning | Dashboard action |
|---|---|---|
| prompt_addition | One sentence to append to the system prompt | Apply via Langfuse: pushes a new prompt version |
| code_change | Code or infra fix (CONTEXT_BLOAT, SLOW_STEP, CASCADING_TOOL_FAILURE, etc.) | Open PR on GitHub ↗: creates a draft PR with a unified diff |
| no_auto_apply | Security signal (PROMPT_INJECTION_SIGNAL), never auto-applied | No apply action; review manually |
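As a rough illustration of Step 4, the sketch below builds the explain request with the standard library and maps the returned fix_type to a follow-up action, following the table above. The base URL is an assumption (it reuses the Dunetrace endpoint from Step 3), and `build_explain_request`, `next_action`, and the action labels are hypothetical names, not part of the Dunetrace API.

```python
import json
import urllib.request

# Assumption: the API is served from the same host as the Dunetrace endpoint in Step 3.
BASE_URL = "http://localhost:8001"


def build_explain_request(signal_id, api_key, langfuse_trace_id=None, base_url=BASE_URL):
    """Build (but do not send) the POST /v1/signals/{signal_id}/explain request."""
    body = {}
    if langfuse_trace_id:  # the trace ID is optional
        body["langfuse_trace_id"] = langfuse_trace_id
    return urllib.request.Request(
        url=f"{base_url}/v1/signals/{signal_id}/explain",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Mirrors the fix_type table; the labels on the right are illustrative only.
ACTIONS = {
    "prompt_addition": "apply_via_langfuse",
    "code_change": "open_github_pr",
    "no_auto_apply": "manual_review",
}


def next_action(explanation):
    """Map a parsed explain response to a follow-up action, defaulting to manual review."""
    return ACTIONS.get(explanation.get("fix_type"), "manual_review")
```

To send the request, pass it to `urllib.request.urlopen(req)` and parse the body with `json.load`. Unknown fix_type values fall back to manual review here, which seems the safer default.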
Step 5a: Apply a prompt fix via Langfuse
When fix_type is prompt_addition and langfuse_prompt_name is returned:
POST /v1/signals/{signal_id}/apply-fix
Content-Type: application/json
Authorization: Bearer <your-key>
{
"fix_content": "Do not repeat a search query you have already executed in this run.",
"langfuse_prompt_name": "research-agent-prompt"
}
The fix is appended to the current prompt text and published as a new version. The dashboard shows "Applied as v4 in Langfuse ↗" with a link.
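The condition for Step 5a can be sketched as a small guard that turns an explain response into an apply-fix body. This is a minimal sketch: `build_apply_fix_payload` is a hypothetical helper, and treating a truthy apply_blocked as "do not apply" is an assumption, since the field's exact semantics are not described here.

```python
def build_apply_fix_payload(explanation):
    """Return the apply-fix JSON body, or None when the fix should not be applied."""
    if explanation.get("fix_type") != "prompt_addition":
        return None  # only prompt fixes go through Langfuse
    if explanation.get("apply_blocked"):
        return None  # assumption: truthy means the API refuses to apply this fix
    prompt_name = explanation.get("langfuse_prompt_name")
    fix = (explanation.get("fix_content") or "").strip()
    if not prompt_name or not fix:
        return None  # both fields are required by the request body above
    return {"fix_content": fix, "langfuse_prompt_name": prompt_name}


if __name__ == "__main__":
    example = {
        "fix_type": "prompt_addition",
        "fix_content": "Do not repeat a search query you have already executed in this run.",
        "langfuse_prompt_name": "research-agent-prompt",
        "apply_blocked": False,
    }
    print(build_apply_fix_payload(example))
```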
Step 5b: Open a GitHub PR for code-change fixes
When fix_type is code_change, the dashboard shows an Open PR on GitHub ↗ button. Prerequisites: set GITHUB_TOKEN (needs repo scope) and GITHUB_REPO (owner/repo) in .env:
GITHUB_TOKEN=ghp_...
GITHUB_REPO=owner/repo
GITHUB_BASE_BRANCH=main # optional, default: main
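A quick way to catch a malformed GitHub configuration before clicking the button is to validate the three variables locally. The sketch below is illustrative only: `load_github_config` is a hypothetical helper, and the owner/repo pattern is a loose approximation rather than GitHub's exact naming rules.

```python
import re


def load_github_config(env):
    """Validate the GitHub settings from .env and apply the documented default branch."""
    token = env.get("GITHUB_TOKEN", "")
    repo = env.get("GITHUB_REPO", "")
    if not token:
        raise ValueError("GITHUB_TOKEN is not set")
    # Loose check for the owner/repo shape: exactly one slash, no whitespace.
    if not re.fullmatch(r"[^/\s]+/[^/\s]+", repo):
        raise ValueError("GITHUB_REPO must look like owner/repo")
    owner, name = repo.split("/")
    return {
        "token": token,
        "owner": owner,
        "repo": name,
        "base_branch": env.get("GITHUB_BASE_BRANCH") or "main",  # optional, default: main
    }
```

The token's scope cannot be verified offline, so this only checks presence and shape.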
Step 6: Track fix effectiveness
GET /v1/signals/{signal_id}/fix-status
Authorization: Bearer <your-key>
Returns runs_after_fix, recurrences_after_fix, and a verdict:
- verified: ≥10 runs, 0 recurrences
- likely_fixed: ≥5 runs, 0 recurrences
- still_occurring
- insufficient_data
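The verdict thresholds can be expressed as a small function. Only the verified and likely_fixed rules are stated above; the sketch assumes that any recurrence yields still_occurring and that fewer than 5 clean runs yields insufficient_data. `classify_fix` is a hypothetical name, not the server's implementation.

```python
def classify_fix(runs_after_fix: int, recurrences_after_fix: int) -> str:
    """Approximate the fix-status verdict from the two counters."""
    if recurrences_after_fix > 0:
        return "still_occurring"  # assumption: any recurrence counts
    if runs_after_fix >= 10:
        return "verified"
    if runs_after_fix >= 5:
        return "likely_fixed"
    return "insufficient_data"  # assumption: too few clean runs to judge


if __name__ == "__main__":
    for runs, recurrences in [(12, 0), (6, 0), (3, 0), (8, 2)]:
        print(runs, recurrences, classify_fix(runs, recurrences))
```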
How the trace lookup works
- Signal fires → run_id stored in Postgres
- Dashboard calls POST /v1/signals/{id}/explain with optional langfuse_trace_id
- API fetches the Langfuse trace (retries up to 4× for ingestion lag)
- Extracts system prompt from GENERATION observation messages[] arrays
- Builds a prompt: signal type + evidence + system prompt + relevant span inputs/outputs
- Calls Anthropic Haiku (or GPT-4o-mini fallback), max 900 tokens
- Returns structured response with fix type classification
The Langfuse trace is never stored — fetched, analysed, discarded.
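Two of the steps above, retrying the trace fetch and pulling the system prompt out of GENERATION observations, can be sketched as follows. This is not the Dunetrace implementation: the function names are hypothetical, the exponential backoff and its delays are assumptions, "up to 4×" is read as four retries after the first attempt, and the observation shape (a `type` field plus an `input` that is either a message list or a dict with a `messages` key) is an assumed simplification of a Langfuse trace.

```python
import time


def fetch_with_retry(fetch, max_retries=4, base_delay=0.5, sleep=time.sleep):
    """Call fetch() until it returns a trace, retrying while ingestion lags."""
    for attempt in range(max_retries + 1):
        trace = fetch()
        if trace is not None:
            return trace
        if attempt < max_retries:
            sleep(base_delay * 2 ** attempt)  # assumed backoff: 0.5s, 1s, 2s, 4s
    return None


def extract_system_prompt(observations):
    """Return the first system message found in GENERATION observation inputs."""
    for obs in observations:
        if obs.get("type") != "GENERATION":
            continue
        payload = obs.get("input")
        # Accept either a bare message list or {"messages": [...]}.
        messages = payload.get("messages") if isinstance(payload, dict) else payload
        if not isinstance(messages, list):
            continue
        for msg in messages:
            if isinstance(msg, dict) and msg.get("role") == "system":
                return msg.get("content")
    return None
```

Passing `sleep` as a parameter keeps the retry loop testable without real delays.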
ID alignment
| System | ID format | Example |
|---|---|---|
| Dunetrace run_id | UUID with dashes | b5ed23be-e4f0-43bc-8625-... |
| Langfuse trace_id | 32-char hex, no dashes | b5ed23bee4f043bc8625... |
The Dunetrace API normalises the format automatically when querying Langfuse.
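If you need to do the same conversion on the client side, for example to look up a trace by hand, Python's `uuid` module handles both directions. The helper names below are hypothetical; the sample value is the trace ID from the Step 4 request body.

```python
import uuid


def to_langfuse_trace_id(run_id: str) -> str:
    """Convert a UUID (with or without dashes) to 32-char lowercase hex."""
    return uuid.UUID(run_id).hex


def to_dunetrace_run_id(trace_id: str) -> str:
    """Convert a 32-char hex trace ID to the dashed UUID form."""
    return str(uuid.UUID(hex=trace_id))


if __name__ == "__main__":
    trace_id = "b5ed23bee4f043bc8625914223875508"
    run_id = to_dunetrace_run_id(trace_id)
    print(run_id)
    print(to_langfuse_trace_id(run_id) == trace_id)
```

`uuid.UUID` raises `ValueError` on malformed input, which doubles as a cheap validity check.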