Basalt, the #1 AI observability tool for teams, is launching its brand-new Agent Builder: prototype, test, and deploy complex AI flows composed of multiple prompts, and run them against a dataset of scenarios.
Replies
UI Bakery
Great launch! How does Basalt prevent overfitting on the evaluation dataset — e.g., if prompts optimize too heavily for the test cases and lose generality?
Basalt
@vladimir_lugovsky Great question! To avoid overfitting, we recommend keeping the dataset dynamic: continuously enrich it with new test cases drawn from your logs (something you can do from Basalt or programmatically).
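The "enrich programmatically" idea can be sketched in plain Python. This is a minimal illustration, not Basalt's actual SDK: the log-entry shape and the `enrich_dataset` helper are assumptions made for the example. The core idea is to append only production cases whose inputs the eval dataset doesn't already cover, so the test set keeps drifting toward real usage instead of being a fixed target prompts can overfit to.

```python
# Hedged sketch: growing an eval dataset from production logs.
# The dict shape and dedup key are illustrative assumptions, not Basalt's API.

def enrich_dataset(dataset, logs, key=lambda case: case["input"]):
    """Append log entries whose inputs aren't already covered by the dataset."""
    seen = {key(case) for case in dataset}
    for entry in logs:
        if key(entry) not in seen:
            dataset.append(entry)
            seen.add(key(entry))
    return dataset

dataset = [{"input": "summarize this email", "expected": "short summary"}]
logs = [
    {"input": "summarize this email", "expected": "short summary"},  # already covered
    {"input": "translate this to French", "expected": "a French translation"},  # new case
]
enrich_dataset(dataset, logs)
print(len(dataset))  # only the novel case was added
```

Running this kind of job on a schedule against fresh logs is one way to keep the dataset "dynamic" in the sense described above.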
Happy to discover the product! No more Excel sheets for evals ;)
Basalt
@steffanb Exactly! Thanks :)
Congrats on the launch! Curious if there's any example workflow with multimodal (video, speech) results?
This looks solid! The evaluation-first approach is exactly what's needed. Curious how you handle workflows that need human approval gates between steps: not just eval metrics, but hard stops for review before continuing?