Control your Mac with detailed mouse, keyboard, screen, and window management capabilities.
Works really well with Gemini 2.5 Pro + good prompting :D
Let me know if you have any feedback or run into bugs :)
Hey guys, thanks for checking out the automation mcp. Wrote this in a night or two after a conversation with a friend to see whether this was possible.
Turns out it was and this didn't exist, so here we are :D
This looks super cool!
Do you have some example use cases that you’ve tested? For example, can it launch a browser and perform a web search?
@tleyden Yes, it can technically do anything you can, but performance varies by model.
I've tested browser searches and sending WhatsApp and Discord messages.
Gemini 2.5 Pro works well, but it isn't 100% there yet.
I suggest adding this at the end of the prompt:
screenInfo and screenshot are required for you to move the mouse to the correct position. Take a screenshot after each action to be sure you execute correctly.
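If you build prompts in code, here's a rough sketch of tacking that reminder onto the end of each task prompt. The reminder text is the suggestion above, quoted as-is; the helper name `with_screenshot_reminder` and the example task are made up for illustration and aren't part of the MCP server.

```python
# Reminder text quoted from the suggestion above.
# The constant and function names are hypothetical, not part of the MCP server.
SCREENSHOT_REMINDER = (
    "screenInfo and screenshot are required for you to move the mouse to the "
    "correct position. Take a screenshot after each action to be sure you "
    "execute correctly."
)


def with_screenshot_reminder(task_prompt: str) -> str:
    """Return the task prompt with the reminder appended as its final paragraph."""
    body = task_prompt.rstrip()
    if body.endswith(SCREENSHOT_REMINDER):
        return body  # already present, so don't append it twice
    return f"{body}\n\n{SCREENSHOT_REMINDER}"


if __name__ == "__main__":
    # Example task; swap in whatever you want the model to do.
    print(with_screenshot_reminder("Open the browser and search for 'MCP servers'."))
```

Putting the reminder last keeps it close to the point where the model starts acting, which is the placement suggested above.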
Full device access? 🚨 What's the sandboxing model - does it use virtual input drivers or just raw hooks? And please tell me there's permission granularity beyond 'yes/all'! 🔐🛡️