Ferret

Refer and ground anything anywhere at any granularity

5.0 • 1 review

207 followers

Ferret is a multimodal large language model (MLLM) from Apple that combines image understanding with language processing and is particularly strong at understanding spatial references.
Free
Launch tags: Open Source • Artificial Intelligence • GitHub