Supported file formats are TXT, MD, DOCX, PPTX, PDF, and HTML. When possible, the engine uses document headers to divide the content into extraction units.
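To illustrate what dividing content by headers could look like, here is a minimal sketch for Markdown input only. It is not the engine's actual implementation: the function name `split_by_headers`, the unit shape (a dict with `title` and `body`), and the choice to treat only ATX-style `#` headings as headers are all assumptions made for this example. Text with no headings falls back to a single untitled unit.

```python
import re

# Hypothetical helper, not part of the engine: matches ATX headings such as "# Title".
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_by_headers(text):
    """Split Markdown text into units, starting a new unit at each heading."""
    units = []
    title, body = None, []

    def flush():
        # Emit the unit collected so far, skipping an empty untitled preamble.
        if title is not None or any(line.strip() for line in body):
            units.append({"title": title, "body": "\n".join(body).strip()})

    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            title, body = match.group(2).strip(), []
        else:
            body.append(line)
    flush()
    return units
```

Calling `split_by_headers("intro\n# A\ntext a\n## B\ntext b")` yields three units: an untitled one holding the preamble, then one each for `A` and `B`.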
Supported file formats are PPTX, PPT, DOCX, DOC, ODP, ODT, ODS, PDF, JPG, JPEG, and PNG. This engine is suited to documents that include visual elements such as charts, graphics, and tables.
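Because the two format lists overlap (DOCX, PPTX, and PDF appear in both), a quick lookup by file extension can show which engine accepts a given file. The sketch below copies the extensions from the two lists above. The labels `"text"` and `"visual"` and the function `engines_for` are hypothetical names chosen for this example, not identifiers from the product.

```python
from pathlib import Path

# Extensions copied from the two format lists; the engine labels are hypothetical.
TEXT_ENGINE = {"txt", "md", "docx", "pptx", "pdf", "html"}
VISUAL_ENGINE = {
    "pptx", "ppt", "docx", "doc", "odp", "odt", "ods",
    "pdf", "jpg", "jpeg", "png",
}


def engines_for(filename):
    """Return the labels of the engines whose format list includes this file."""
    ext = Path(filename).suffix.lower().lstrip(".")
    names = []
    if ext in TEXT_ENGINE:
        names.append("text")
    if ext in VISUAL_ENGINE:
        names.append("visual")
    return names
```

For example, `engines_for("report.PDF")` returns both labels, while `engines_for("scan.png")` returns only `"visual"` and an unsupported file such as `data.csv` returns an empty list.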