Stan Girard

Megaparse [LW24] - Open-source Document Parser to Markdown with OCR/LLMs

Megaparse is a file parser optimized for LLM ingestion. It parses PDF, DOCX, and PPTX files into a format that is ideal for LLMs. All of this is accessible from a Python package, an API, or a queue.

Replies

Ioannis Tsiokos
Love it. Markdown is becoming the de facto standard for AI input processing, and proper conversion to it (without having to install a million packages) will be paramount.
Robin Philibert
Really nice! Open source, with OCR and table optimization, perfect for LLM workflows. Congrats to the team! 🙌