markitdown
K-Dense-AI/scientific-agent-skills
MarkItDown converts many file formats, including PDF, DOCX, PPTX, XLSX, images, and audio, into structured Markdown. Its output is clean, token-efficient text intended for use with Large Language Models (LLMs). Features include Optical Character Recognition (OCR) for scanned documents, automatic transcription for audio, and integrations such as Azure Document Intelligence, which makes it well suited to digital content processing and LLM data preparation.
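
As a rough illustration of the conversion workflow described above, the following sketch converts every supported document in a folder to a Markdown file. It assumes the `markitdown` Python package is installed and relies only on its basic `MarkItDown().convert(...)` call and the `text_content` attribute of the result. The helper names (`markitdown_to_text`, `convert_folder`), the `DOCUMENT_SUFFIXES` set, and the folder layout are hypothetical and exist only for this example. OCR, audio transcription, and Azure Document Intelligence are not configured here.

```python
from pathlib import Path
from typing import Callable, List

# Extensions taken from the description above; the tool itself may accept more.
DOCUMENT_SUFFIXES = {".pdf", ".docx", ".pptx", ".xlsx"}


def markitdown_to_text(path: Path) -> str:
    """Convert one file with MarkItDown and return its Markdown text."""
    # Imported lazily so convert_folder can be used with another converter
    # when the markitdown package is not installed.
    from markitdown import MarkItDown

    result = MarkItDown().convert(str(path))
    return result.text_content


def convert_folder(
    src: Path,
    dst: Path,
    to_text: Callable[[Path], str] = markitdown_to_text,
) -> List[Path]:
    """Write one .md file per supported document in src; return the paths written."""
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(src.iterdir()):
        # Skip subdirectories and file types outside the hypothetical suffix set.
        if not path.is_file() or path.suffix.lower() not in DOCUMENT_SUFFIXES:
            continue
        out = dst / (path.stem + ".md")
        out.write_text(to_text(path), encoding="utf-8")
        written.append(out)
    return written
```

A call such as `convert_folder(Path("docs"), Path("markdown"))` would then produce one Markdown file per source document. Passing the converter as a parameter is a design choice made for this sketch so that the folder logic can be exercised without the real library.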