Launched this week

VizPy
Turn prompt failures into executable rules for AI agents
87 followers
VizPy automatically optimizes your LLM prompts by learning from failures. With a single API call, it improves prompts and reasoning workflows so your apps, agents, and pipelines deliver more reliable results, without manual prompt tweaking.

Hey Product Hunt 👋
Excited to launch VizPy today.
VizPy automatically optimizes your LLM prompts by learning from failures. One API call, dramatically better results.
For most LLM apps, prompt optimization is still manual: test, tweak, repeat. It's slow, inconsistent, and hard to scale.
VizPy makes that process systematic, helping AI apps, agents, and pipelines produce better, more reliable outputs.
You can try it here: https://vizpy.vizops.ai/
Would love to hear how you're handling prompt optimization today.
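To make the pitch concrete, here is a toy sketch of the kind of loop being described: run a prompt over examples, and turn each failure into an executable rule appended to the prompt. Every name here (`model`, `mine_rule`, `optimize`) is hypothetical, and the "model" is a stub; this is not VizPy's actual API.

```python
# Hypothetical failure -> rule loop; all names are illustrative, not VizPy's API.

def model(prompt: str, inp: str) -> str:
    """Stand-in for an LLM call: uppercases the input only if the prompt asks."""
    return inp.upper() if "UPPERCASE" in prompt else inp

def mine_rule(inp: str, got: str, want: str) -> str:
    """Turn one observed failure into an executable prompt rule (toy heuristic)."""
    if want == inp.upper():
        return "ALWAYS RETURN THE ANSWER IN UPPERCASE."
    return f"For inputs like {inp!r}, prefer outputs like {want!r}."

def optimize(prompt: str, dataset: list[tuple[str, str]]) -> str:
    """Append a mined rule for each failing example, so later calls pass."""
    for inp, want in dataset:
        got = model(prompt, inp)
        if got != want:
            prompt += "\n" + mine_rule(inp, got, want)
    return prompt

data = [("hello", "HELLO"), ("world", "WORLD")]
improved = optimize("Answer the question.", data)
assert all(model(improved, i) == w for i, w in data)
```

The point of the sketch is only the shape of the loop: failures become rules, and rules are plain text a model can follow, so no gradient access to the model is required.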
Congrats on the launch!
Automatic prompt optimization is a pain point for a lot of devs.
How does it handle cases where 'failure' is subjective or hard to define?
@moonblood2077 Thanks a lot, and this is a great question. ContraPrompt works best when you have well-defined labels, but it has a structural advantage here: it doesn't need you to define failure absolutely; it only needs you to say "this attempt was better than that attempt on the same input." Relative preference is much easier to define than absolute quality. You can use an LLM-as-judge that picks A vs. B without needing to assign a score. As long as there's a delta between two attempts, it mines a rule. PromptGrad in its current form may be harder to use since it needs a numeric score, but PromptGrad can also benefit from something like LLM-as-judge when we don't have well-defined metrics.
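The pairwise idea in the reply above can be sketched in a few lines: a judge only says which of two attempts is better, never assigns a score. The `judge` here is a deterministic stub standing in for an LLM-as-judge, and `pick_best` is a hypothetical helper, not part of ContraPrompt.

```python
# Toy sketch of pairwise preference: the judge picks A vs. B, no scores.
# judge() is a deterministic stub, not a real LLM-as-judge.

def judge(inp: str, a: str, b: str) -> str:
    """Prefer the attempt that actually mentions the input (toy criterion)."""
    return a if inp in a else b

def pick_best(inp: str, attempts: list[str]) -> str:
    """Reduce attempts pairwise to the judge's favorite; each A-vs-B
    comparison is the kind of delta a rule could be mined from."""
    best = attempts[0]
    for cand in attempts[1:]:
        best = judge(inp, best, cand)
    return best

attempts = ["Paris is nice.", "The capital of France is Paris."]
winner = pick_best("France", attempts)
assert winner == "The capital of France is Paris."
```

Note the judge never produces a number, which is the claimed advantage over a method that requires a numeric score.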
@rishav99 The structure is difficult, so it's not easy for someone like me who vibe-codes to understand, haha.
Awesome, glad to see such a tool. Very useful.
Haven't played around with the libraries yet, but is it possible to supply black-box models like GPT, Claude, or Gemini? Given that PromptGrad is bookkeeping gradients, it's mostly designed for models where we can access such info.
But can your libraries be extended to popular cloud-based LLM APIs? I would love to set a budget in such cases and be able to tune prompts within that budget. I think GPT does provide one, but I'm unsure whether state-of-the-art black-box models can do better; very likely it should be able to outperform its optimizer.
Really solid idea imo. Anyone building with LLMs knows how messy prompt tweaking gets; the test, fail, tweak-again loop is irritating.
Turning those failures into executable rules is actually very smart, I would say.
VizPy looks like a clean way to make AI workflows more reliable without the constant jugaad with prompts.
Excited to see where you guys take this. All the best for the launch 🚀