Deepchecks LLM Evaluation - Validate, monitor, and safeguard LLM-based apps
Continuously validate LLM-based applications, including LLM hallucinations, performance metrics, and potential pitfalls, throughout the entire lifecycle, from pre-deployment and internal experimentation to production.
Very excited for the launch!
Awesome and great
I've been experimenting with LLM evaluation metrics on my own for a while now. This is a pretty good solution, will definitely try it out. How do you imagine the future of CI/CD for LLM applications?
@sakameister Great question. The same question has come up for testing classic ML as well. I can imagine a process similar to GitHub Actions that runs suites of tests, where some of the tests verify that the required manual annotations were actually done.
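To make the idea in the reply above concrete, here is a minimal sketch of such a gate in Python. It is only an illustration under stated assumptions: `Sample`, `run_suite`, `annotation_coverage`, `non_empty`, and the `min_coverage` threshold are hypothetical names invented for this example and are not part of Deepchecks or GitHub Actions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Sample:
    """One logged LLM interaction (hypothetical structure for this sketch)."""
    prompt: str
    response: str
    # Label set by a human reviewer, or None if nobody has reviewed it yet.
    annotation: Optional[str] = None


def annotation_coverage(samples: List[Sample]) -> float:
    """Fraction of samples that carry a manual annotation."""
    if not samples:
        return 0.0
    return sum(s.annotation is not None for s in samples) / len(samples)


def run_suite(
    samples: List[Sample],
    checks: Dict[str, Callable[[Sample], bool]],
    min_coverage: float = 0.8,  # assumed threshold, tune per project
) -> List[str]:
    """Run every automated check, then verify manual annotation coverage.

    Returns a list of failure messages; an empty list means the gate passes.
    """
    failures = []
    for name, check in checks.items():
        failed = [s for s in samples if not check(s)]
        if failed:
            failures.append(f"{name}: {len(failed)} of {len(samples)} samples failed")
    coverage = annotation_coverage(samples)
    if coverage < min_coverage:
        failures.append(
            f"manual annotations: coverage {coverage:.0%} is below {min_coverage:.0%}"
        )
    return failures


def non_empty(sample: Sample) -> bool:
    """Example automated check: the model returned a non-blank response."""
    return bool(sample.response.strip())
```

In a CI job, a wrapper script could call `run_suite` on a batch of logged samples and exit with a non-zero status whenever the returned list is non-empty, which is how a GitHub Actions step would be marked as failed.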