LLMs Demand Observability-Driven Development
Honeycomb
SEPTEMBER 20, 2023
There is a much longer list of things that make software less than 100% debuggable in practice. Some of these are related to cost/benefit tradeoffs, but most are about weak telemetry, instrumentation, and tooling. ML teams, instead, typically build evaluation systems to assess the effectiveness of the model or prompt.