How much do you trust AI agents?
With the advent of clawdbots, it's as if we've all lost our inhibitions and "put our lives completely in their hands."
I'm all for delegating work, but not giving them too much personal/sensitive stuff to handle.
I certainly wouldn't trust something to the extent of providing:
– access to personal finances and operations (maybe just setting aside an amount I'm willing to lose)
– sensitive health and biometric information (can be easily misused)
– confidential communication with key people (secret is secret)
Are there any tasks you wouldn't give AI agents or data you wouldn't allow them to access? What would that be?
Re. finances – yesterday I read this news: Sapiom raises $15M to help AI agents buy their own tech tools – so this may be the start of an era when funding goes to agents rather than to founders.


Replies
For me, trust is heavily tied to how long the "collaboration" has been going on, especially when the context is still pretty dynamic and the input can change over the course of months. Financials are less of a concern in that sense. But when it’s just about quick iterations, like making a greeting card illustration from a personal photo, no worries tbh.
minimalist phone: creating folders
@marces_wiliam As long as the output stays on my computer and needs approval first, I am okay with that ;)
AI agents are great for analysis, automation, and summarising information, but I’d still keep them away from things that require high trust or irreversible actions like direct control over finances, sensitive health data, or private communications.
For now, I see them more as decision-support systems, not decision-makers.
minimalist phone: creating folders
@sansstuti_aggarwal As analysing tools – okay, I'll buy that too :) We are on the same page on this.
i think most people in this thread are conflating "trust" with "giving full access" and those are completely different things. you don't need to trust an agent to use it effectively, you just need to scope its permissions correctly.
i run coding agents basically all day and the stuff that actually burned me wasn't the agent going rogue or leaking data. it was the agent confidently doing the wrong thing in a way that looked right. that's way more dangerous than some theoretical privacy breach because you don't catch it until it's already in production.
the real risk with agents isn't malice or data theft, it's competence drift. they work great for 45 minutes, then the context window fills up and they start hallucinating solutions to problems that don't exist. if you're not checking their work regularly, you end up with a slow accumulation of subtle bugs that no amount of sandboxing or VMs will prevent.
minimalist phone: creating folders
@umairnadeem Is there any way to prevent that hallucination in the future? Someone was presenting me a solution that stores information about you so the AI will remember it, but I was a bit confused about the target audience.
honestly the biggest risk with agents isn't giving them too much access. it's the approval fatigue loop. i run coding agents basically all day, and after the first week of clicking "approve" 40 times you just start rubber-stamping everything. at that point you have the worst of both worlds - all the risk of full autonomy with none of the speed gains.
a way better approach imo is to define really narrow scopes upfront and then let it run unsupervised within those bounds. like my coding agent can read/write files and run tests but can't touch git push or deploy. no approval prompts needed because the dangerous stuff is just walled off at the infra level. the "human in the loop" thing sounds safe until the human stops looping.
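The "walled off at the infra level" idea above can be sketched as a thin gate at the tool-execution layer. This is a minimal illustration, not any specific agent framework's API; the names (`run_tool`, `BLOCKED_PREFIXES`, `ALLOWED_BINARIES`) and the command lists are assumptions.

```python
import shlex
import subprocess

# Command prefixes that are refused outright -- no approval prompt,
# the capability simply does not exist for the agent. (Hypothetical policy.)
BLOCKED_PREFIXES = [("git", "push"), ("rm",), ("curl",), ("ssh",)]
# Binaries the agent may invoke unsupervised. (Hypothetical allowlist.)
ALLOWED_BINARIES = {"ls", "cat", "echo", "pytest", "python", "git"}

def run_tool(command: str) -> str:
    """Execute an agent-requested shell command only if it is in scope."""
    argv = shlex.split(command)
    # Walled-off actions are refused before anything runs.
    for prefix in BLOCKED_PREFIXES:
        if tuple(argv[:len(prefix)]) == prefix:
            return f"refused: '{' '.join(prefix)}' is walled off"
    if not argv or argv[0] not in ALLOWED_BINARIES:
        return f"refused: '{command}' is out of scope"
    result = subprocess.run(argv, capture_output=True, text=True, timeout=60)
    return result.stdout + result.stderr
```

With this shape, `git commit` to a branch goes through, but `git push` never reaches a human for approval because it is simply not executable, which is what removes the approval-fatigue loop.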
minimalist phone: creating folders
@umairnadeem What product are you building?
The trust question comes down to one thing: where is the AI getting its information from?
For drafts and summaries - fine. But for business decisions, the real danger isn't that the AI fails. It's that it confidently gives you a plausible-sounding answer based on nothing real. That's worse than no answer.
My line: if I can't trace the answer back to a real source, I don't act on it.
minimalist phone: creating folders
@zigapotoc Ideally, have at least two independent sources for the info ;) A good journalist's rule.
the real problem nobody here is talking about is approval fatigue. you set up all these guardrails and least-privilege rules, the agent asks for permission 30 times a day, and within a week you're just hitting approve on everything without reading it. saw someone above mention this exact thing.
i've been running a coding agent basically 24/7 for months, and the stuff that actually keeps it safe isn't permissions or sandboxes. it's making destructive actions harder than constructive ones. like using trash instead of rm, writing drafts instead of sending emails directly, committing to branches instead of main. the friction is the guardrail, not some approval popup you stop reading after day 2.
the people saying "never give agents access to X" are thinking about it wrong imo. it's not about what you give access to, it's about whether the bad outcome is reversible. read access to your bank account? who cares. write access to send a wire? yeah, that's different. but most people draw the line at "personal = off limits", which doesn't actually map to risk at all.
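The "friction is the guardrail" pattern above - trash instead of rm, drafts instead of sends - can be sketched like this. A minimal sketch under assumed names (`TRASH`, `DRAFTS`, `delete_file`, `send_email`); it is not from any particular agent stack.

```python
import shutil
from pathlib import Path

TRASH = Path(".agent_trash")   # hypothetical trash directory
DRAFTS = Path("drafts")        # hypothetical outbox for human review

def delete_file(path: str) -> str:
    """'Delete' by moving into a trash dir, so the action stays reversible."""
    TRASH.mkdir(exist_ok=True)
    dest = TRASH / Path(path).name
    shutil.move(path, dest)
    return f"moved {path} -> {dest}"

def send_email(to: str, body: str) -> str:
    """Never send directly: write a draft that a human reviews and sends."""
    DRAFTS.mkdir(exist_ok=True)
    draft = DRAFTS / f"to_{to}.txt"
    draft.write_text(body)
    return f"draft written to {draft}"
```

The point is that the dangerous verb is replaced with a reversible one at the tool level, so the worst-case outcome of a bad agent decision is an extra file to clean up, not a lost one or a sent email.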
minimalist phone: creating folders
@umairnadeem but with this... not many things are reversible. Once something happens, it cannot be undone; you can only soften the consequences (and you have to keep possible damages and risks in mind).
Depends on the task. I trust completely when it comes to classifying my inbox.
minimalist phone: creating folders
@salah_oukrim My inbox is something I wouldn't give access to anybody or anything :D too sensitive :D
the real problem nobody here is talking about is approval fatigue. you set up all these guardrails and confirmation prompts, and for the first week you actually read them. then by week two you're just hitting approve on everything because the agent asks permission 40 times a day and you stop caring.
i run coding agents pretty much nonstop, and i've caught myself approving stuff i didn't even read twice. not because i trust the agent, but because the friction of constantly reviewing breaks your flow. so you end up in this weird middle ground where you technically have human oversight but practically don't.
honestly the answer isn't more guardrails or less trust. it's better scoping. give the agent a tiny sandbox where it literally can't do damage even if it goes rogue, and let it go wild in there. trying to micromanage every action just means you eventually stop managing any of them.
Murror
Building Murror has made me think about this from both sides. I am building an AI that people trust with something really personal: their loneliness and emotional state. So the question of trust is not abstract for me.
My take is that trust with AI agents is earned the same way it is with people: through consistency and transparency about what they are actually doing. The fear is not the autonomy itself, it is not knowing what happened.
For Murror I think about it this way: the AI should be a safe space, not a black box. Users need to feel like they understand what it does with what they share. That is the design challenge I care about most right now.
minimalist phone: creating folders
@astrovinh For any agent, I would like a middle step: an approval process – dunno, I am kind of paranoid :D
I think AI is great for automation and assistance, but there are clear boundaries. I wouldn’t give it full access to personal finances, biometric data, or private conversations. Those are areas where a mistake, breach, or misuse could have serious consequences.
For me, AI works best as a tool that suggests and helps, not something that has complete control over sensitive parts of my life.