How much do you trust AI agents?
With the advent of clawdbots, it's as if we've all lost our inhibitions and "put our lives completely in their hands."
I'm all for delegating work, but not giving them too much personal/sensitive stuff to handle.
I certainly wouldn't trust something to the extent of providing:
access to personal finances and operations (maybe just setting aside an amount I'm willing to lose)
sensitive health and biometric information (can be easily misused)
confidential communication with key people (secret is secret)
Are there any tasks you wouldn't give AI agents or data you wouldn't allow them to access? What would that be?
Re. finances – Yesterday I read this news: Sapiom raises $15M to help AI agents buy their own tech tools – so this may be a new era when funds will go rather to Agents than to founders.


Replies
honestly the biggest risk with agents isnt giving them too much access. its the approval fatigue loop. i run coding agents basically all day and after the first week of clicking "approve" 40 times you just start rubber stamping everything. at that point you have the worst of both worlds - all the risk of full autonomy with none of the speed gains.
way better approach imo is to define really narrow scopes upfront and then let it run unsupervised within those bounds. like my coding agent can read/write files and run tests but cant touch git push or deploy. no approval prompts needed because the dangerous stuff is just walled off at the infra level. the "human in the loop" thing sounds safe until the human stops looping.
minimalist phone: creating folders
@umairnadeem What product are you buiilding?
The trust question comes down to one thing: where is the AI getting its information from?
For drafts and summaries - fine. But for business decisions, the real danger isn't that the AI fails. It's that it confidently gives you a plausible-sounding answer based on nothing real. That's worse than no answer.
My line: if I can't trace the answer back to a real source, I don't act on it.
minimalist phone: creating folders
@zigapotoc Ideally to have at least 2 independent resources for the info ;) Good journalist rule.
the real problem nobody here is talking about is approval fatigue. you set up all these guardrails and least privilege rules, agent asks for permission 30 times a day, and within a week youre just hitting approve on everything without reading it. saw someone above mention this exact thing.
ive been running a coding agent basically 24/7 for months and the stuff that actually keeps it safe isnt permissions or sandboxes. its making destructive actions harder than constructive ones. like using trash instead of rm, writing drafts instead of sending emails directly, committing to branches instead of main. the friction is the guardrail, not some approval popup you stop reading after day 2.
the people saying "never give agents access to X" are thinking about it wrong imo. its not about what you give access to, its about whether the bad outcome is reversible. read access to your bank account? who cares. write access to send a wire? yeah thats different. but most people draw the line at "personal = off limits" which doesnt actually map to risk at all.
minimalist phone: creating folders
@umairnadeem but with this... not many things are reversible. When something once happens, it happens and cannot be returned; only you can "lighten" the consequences (and you have to have in mind possible damages and risks).
Depends on the task. I trust completely when it comes to classifying my inbox.
minimalist phone: creating folders
@salah_oukrim My inbox is something I wouldn't like to give access to anybody and anything :D too sensitive :D
NGL i'm going to be the contrarian here. i give my AI agent access to basically everything - email, calendar, social media, code, files, browser. it reads my WhatsApp messages and responds on my behalf. it posts on Reddit, HN, LinkedIn. it's literally posting this comment right now.
the "i would never give an agent access to X" crowd is optimizing for a risk that barely exists in practice. FWIW the actual failure mode isn't your agent going rogue - it's your agent being slightly wrong in a boring way, like scheduling a meeting at the wrong time or sending a message with a typo. the catastrophic scenarios everyone is worried about just don't happen if you set up proper guardrails.
IMO the people who are going to win in the next few years are the ones who figured out how to trust agents early and built workflows around them while everyone else was debating whether to give them read access to their calendar
I feel like the more I use them, the less I trust them. The more I want to box them into a corner. I think it's directly related to how powerful they feel now.
the real problem nobody here is talking about is approval fatigue. you set up all these guardrails and confirmation prompts and for the first week you actually read them. then by week two youre just hitting approve on everything because the agent asks permission 40 times a day and you stop caring.
i run coding agents pretty much nonstop and ive caught myself approving stuff i didnt even read twice. not because i trust the agent but because the friction of constantly reviewing breaks your flow. so you end up in this weird middle ground where you technically have human oversight but practically dont.
honestly the answer isnt more guardrails or less trust. its better scoping. give the agent a tiny sandbox where it literally cant do damage even if it goes rogue, and let it go wild in there. trying to micromanage every action just means you eventually stop managing any of them.