OmniParser V2

Turn any LLM into a Computer Use Agent

5.0
1 review

331 followers

OmniParser ‘tokenizes’ UI screenshots, converting raw pixels into structured elements that LLMs can interpret. Given the resulting set of parsed, interactable elements, an LLM can perform retrieval-based next-action prediction: it selects which element to act on rather than guessing screen coordinates.
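
To make the idea concrete, here is a minimal Python sketch of what "structured elements" and "retrieval-based next-action prediction" could look like in practice. It is not OmniParser's actual API or output schema: the `UIElement` fields, the prompt wording, and the helper names `build_prompt` and `resolve_click` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class UIElement:
    """One parsed screen element (hypothetical schema, not OmniParser's own)."""
    element_id: int
    kind: str                                  # e.g. "icon" or "text"
    label: str                                 # short description of the element
    bbox: tuple[float, float, float, float]    # normalized (x1, y1, x2, y2)
    interactable: bool


def build_prompt(task: str, elements: list[UIElement]) -> str:
    """List the interactable elements so an LLM can pick one by ID."""
    lines = [f"Task: {task}", "Interactable elements:"]
    for e in elements:
        if e.interactable:
            lines.append(f"[{e.element_id}] {e.kind}: {e.label}")
    lines.append("Reply with the ID of the element to act on next.")
    return "\n".join(lines)


def resolve_click(reply: str, elements: list[UIElement],
                  width: int, height: int) -> tuple[int, int]:
    """Map the LLM's chosen ID back to a pixel coordinate (bbox center)."""
    chosen = int(reply.strip())
    by_id = {e.element_id: e for e in elements}
    x1, y1, x2, y2 = by_id[chosen].bbox
    return round((x1 + x2) / 2 * width), round((y1 + y2) / 2 * height)
```

In this sketch the LLM never sees pixels. It receives a short text list of candidates, answers with an ID, and the caller translates that ID into a click position on the original screenshot.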
Free