
SkeptAI - The adversarial AI agent that challenges LLM outputs

by Daton Pope
SkeptAI is the adversarial reasoning layer that challenges AI outputs before you act on them. Paste any response from Claude, ChatGPT, or Gemini. CRIT runs four structured passes (Challenge, Reveal, Interrogate, Transmit), then takes action: it generates a revised output with all critical findings addressed, runs web verification on factual claims inline, and exports a GitHub issue template if you need to escalate. We built this because LLMs optimize for confidence. CRIT optimizes for honesty.
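For the technically curious, here is a minimal sketch of what a flow like that could look like. This is illustrative only: the pass names come from the CRIT acronym, but every class, function, and field name below is a hypothetical stand-in, not SkeptAI's actual code or API.

```python
# Hypothetical sketch of a CRIT-style flow. The four pass names come from
# the CRIT acronym; everything else (CritReport, run_crit, the callables)
# is an illustrative stand-in, not SkeptAI's real API.
from dataclasses import dataclass, field
from typing import Callable

CRIT_PASSES = ("challenge", "reveal", "interrogate", "transmit")

@dataclass
class CritReport:
    findings: list[str] = field(default_factory=list)  # issues raised across the four passes
    revised_output: str = ""                           # REMEDIATE: rewrite with findings addressed
    issue_template: str = ""                           # ESCALATE: ready-to-file GitHub issue body

def run_crit(
    output: str,
    critique: Callable[[str, str], list[str]],   # (pass_name, output) -> findings
    remediate: Callable[[str, list[str]], str],  # (output, findings) -> revised output
) -> CritReport:
    """Run the four structured passes, then the action layer."""
    report = CritReport()
    for pass_name in CRIT_PASSES:
        report.findings += critique(pass_name, output)
    report.revised_output = remediate(output, report.findings)
    # VERIFY (inline web fact-checking) is omitted here; it would need network access.
    # ESCALATE: turn findings into a checklist you can paste into a GitHub issue.
    report.issue_template = "\n".join(
        ["## CRIT findings"] + [f"- [ ] {f}" for f in report.findings]
    )
    return report
```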


Replies

Daton Pope
Maker
Hey everyone! My name is Daton Pope, and I am the builder of SkeptAI.

What inspired me to build this was using Claude to analyze a vendor policy (my background is in ERM). The output was almost perfect. However, it cited Anthropic's API as supporting image generation, which it doesn't. That set off an immediate red flag for me. Had I used the output for a real work-related task, it could easily have caused a lot of trouble.

And therein lies the problem: LLMs aren't wrong randomly. They're wrong confidently, and in ways that are hard to catch if you're simply skimming and not being vigilant. So I figured, why not do something about it? In response, I created CRIT (Challenge, Reveal, Interrogate, Transmit), the adversarial framework I built to fix the problem for myself. This eventually turned into SkeptAI.

A few things I know you'll ask:

WHY NOT JUST ASK THE LLM TO CRITIQUE ITSELF?
I tried it. When you ask a model to evaluate its own output, it defends its own logic. CRIT solves this by always routing to a different model for the critique pass: the Claude output gets challenged by GPT-4o, and vice versa. The "CRITIQUED BY" label in every report makes this transparent. (A rough sketch of this routing rule follows at the end of this comment.)

WHY ISN'T THE ORIGINAL PROMPT REQUIRED?
CRIT evaluates outputs as decision-making artifacts (the thing you're about to act on), regardless of how they were generated. An optional prompt field will improve the Completeness score when provided; it won't gate the analysis when absent. Be on the lookout for it in v0.2.

WHAT'S THE MOAT IF THE PROMPTS ARE OPEN SOURCE?
The prompts are the framework. The action layer (REMEDIATE, VERIFY, ESCALATE) is the product, and the domain modules (Compliance CRIT, Contract CRIT, Architecture CRIT) require the kind of expertise that compounds over time. That's exactly why I've made this an open-source project: the community builds the moat.

Hit the playground and click the "Try An Example" button. The example output is deliberately loaded with bad assumptions. Watch what happens.

Thanks for stopping by and trying out the playground! I'm more than happy to answer any questions, and I look forward to your feedback!

Daton
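P.S. As promised, here is a rough sketch of the cross-model routing rule described above, under stated assumptions: the mapping, the default fallback, and the helper name are all hypothetical, not SkeptAI's implementation. Only the pairing (Claude challenged by GPT-4o, and vice versa) and the "CRITIQUED BY" label come from the description above.

```python
# Illustrative sketch of cross-model critique routing; the mapping and
# helper are hypothetical, not SkeptAI's actual code.
CRITIC_FOR = {
    "claude": "gpt-4o",  # Claude outputs get challenged by GPT-4o...
    "gpt-4o": "claude",  # ...and vice versa, so no model grades itself
}

def pick_critic(source_model: str) -> str:
    # Defaulting unlisted sources to GPT-4o is an assumption for this sketch.
    critic = CRITIC_FOR.get(source_model.lower(), "gpt-4o")
    if critic == source_model.lower():
        raise ValueError("critic must differ from the source model")
    return critic

print(f"CRITIQUED BY: {pick_critic('claude')}")  # -> CRITIQUED BY: gpt-4o
```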
Daton Pope

Great news! You can now paste your original prompt for sharper Completeness scoring, and bring your own API key for unlimited analyses!