Hey Alex and the Giskard team! I'm amazed at the comprehensive approach Giskard 2.0 takes to ML testing. The fact that it's open source and compatible with multiple platforms makes it even more appealing. Could you shed some light on how Giskard manages the AI risk range and the mitigation process? It's also impressive how you aim to cover different model types. Looking forward to seeing Giskard revolutionize ML testing!
Maker
@builov84 thanks so much for the kind words! Our detailed open-source documentation pages outline the specific vulnerabilities that Giskard 2.0 can detect. For mitigation strategies, our enterprise hub offers robust debugging features, designed not only to identify risks but also to provide actionable insights into the sources of the issues detected. Feel free to dive into our docs and reach out with any further queries!
https://docs.giskard.ai/en/lates...
Maker
Hi @builov84, in order to estimate the risk range:
1. We first curate a list of the most relevant issues to check for, those that reflect critical risks when detected. For tabular and NLP models, we have several categories: Performance, Robustness, Calibration, Data Leakage, Stochasticity, etc. For LLMs, we have Injection attacks, Hallucination & misinformation, Harmful content generation, Stereotypes, Information disclosure, and Output formatting.
2. Under each category, we mostly rely on tailored statistical procedures and metrics to estimate the probability of occurrence, statistical significance, and severity level for each of the issues found. We provide the option to use procedures like Benjamini-Hochberg to decrease the false discovery rate. We also provide an explanation of the impact an issue could have on your ML pipeline.
3. Although our default risk range assessment is carefully crafted, we give users the option to set up their own by configuring the statistical thresholds and severity levels to match their use case if needed.
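The Benjamini-Hochberg step mentioned above can be sketched in a few lines. This is a standard textbook implementation of the procedure (control the false discovery rate at level alpha across many issue checks), not Giskard's internal code; the example p-values are illustrative:

```python
import numpy as np

def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean mask of 'discoveries' controlling the FDR at level alpha.

    Step-up procedure: find the largest k such that p_(k) <= (k/m) * alpha,
    then reject the k smallest p-values.
    """
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)
    ranked = p[order]
    thresholds = (np.arange(1, m + 1) / m) * alpha
    below = ranked <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])  # largest index passing its threshold
        reject[order[: k + 1]] = True     # reject that many smallest p-values
    return reject

# Example: ten hypothetical issue checks; only the two strongest survive FDR control.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(p_vals, alpha=0.05))
```

Without a correction like this, running many statistical checks at alpha = 0.05 each would surface a steady stream of false-positive "issues"; the step-up rule keeps the expected share of false discoveries at or below alpha.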
Our Giskard Hub is then dedicated to the mitigation process:
1. From the issues found during the scan, the user can automatically generate a set of tests and upload them into our Hub. Each of the tests generated reflects an issue found and embeds a quantitative metric (the one we relied on to estimate the severity level).
2. Once uploaded to the Hub, it becomes possible to customize these tests, use them with other models and datasets for comparison, and most importantly, use them to debug a specific model by investigating, one by one, the samples in your data that made these tests fail.
3. While debugging, we equip you with explanation tools like SHAP to shed light on feature importance for tabular and NLP models.
4. For each sample investigated, we automatically provide additional insights that help you detect critical patterns in your data, create additional tests, and assess your model's stability under small data perturbations.
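As an illustration of step 1 above, a generated test that "embeds a quantitative metric" with a severity threshold can be thought of as a small, reusable object. The class and names below are hypothetical sketches of the idea, not the actual Giskard API; the point is that the same test can be re-run against other models and datasets for comparison:

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class MetricTest:
    """A reusable test pairing a quantitative metric with a pass/fail threshold."""
    name: str
    metric: Callable[[Sequence, Sequence], float]  # (y_true, y_pred) -> score
    threshold: float                               # cut-off derived from the scan

    def run(self, y_true: Sequence, y_pred: Sequence) -> dict:
        score = self.metric(y_true, y_pred)
        return {"test": self.name, "score": score, "passed": score >= self.threshold}

def accuracy(y_true: Sequence, y_pred: Sequence) -> float:
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# A scan that flagged weak performance on a data slice could emit a test like this,
# which you can later run against a retrained model on the same slice.
test = MetricTest("accuracy_on_slice", accuracy, threshold=0.80)
result = test.run([1, 0, 1, 1, 0], [1, 0, 1, 0, 0])  # 4 of 5 correct
print(result)
```

Because the threshold and metric travel with the test, comparing two candidate models is just a matter of calling `run` twice with the same test object, which matches the comparison workflow described in step 2.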
Hello, Alex! Congratulations on the exciting launch!
How does the framework navigate the nuanced landscape of ethical considerations, ensuring a balanced approach without introducing unintended biases into its testing processes?
Best wishes for your success!
@kinzarra That's a good question, and precisely why we made the scan and testing functions open-source. This way, you (and the rest of the community) can audit the code, challenge us, contribute, customize, and make it your own.
We're not in the business of black boxes, and we're not claiming to own any kind of truth related to AI ethics.
Rather, we think that AI ethics are a projection of social, cultural and political constructs.
Our only goal is for our project to serve as a pragmatic, open, and quantitative tool that sustains balanced discussions about the ethics of AI projects, nuances included.
Congratulations on launching Giskard! But how do you balance automated testing at scale with the need for some manual validation of results?
Maker
@sarvpriy_arya thank you! Our open-source solution offers a solid base layer through automated testing. This is then supplemented with semi-manual validation (the platform's automated insights give tips and guide users on what else to test). Combining the two saves a lot of the time usually spent on manual QA.