Implementing AI tools in real-life work often costs as much time and money in setup and troubleshooting as users hope to save with the tools themselves. Getting the process right means focusing AI’s terrific (and expensive) computing power where it can be most useful.

One frequent hurdle is sifting through “unstructured data”—various files like PDFs and documents on company drives—and converting them into a format that AI can easily process. Unstructured Technologies was founded in July 2022, and two months later, it released its first open-source product—a platform that converts these complex, unstructured data formats into AI-friendly JSON files. The company spent 16 months analyzing more than 1 million annotated documents and stitched together over 400 different code libraries, plus its own custom code, to build a platform that can break a document into its constituent parts and identify its components, such as its header, tables, and more.
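To make the idea of breaking a document into labeled parts concrete, here is a minimal sketch in Python. It is not Unstructured's code or API: the function name `partition_text`, the labels it assigns, and the crude rules it uses are all assumptions chosen for illustration, and a real platform would handle PDFs, layout, and far messier input.

```python
import json


def partition_text(text):
    """Split plain text into typed elements using crude, illustrative rules.

    Hypothetical helper: the labels and heuristics are assumptions for this
    sketch, not the rules any real product applies.
    """
    elements = []
    # Treat each blank-line-separated block as one candidate element.
    for block in text.strip().split("\n\n"):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        if not lines:
            continue
        if all("|" in line for line in lines):
            # Every row contains a column separator: call it a table.
            kind = "Table"
        elif len(lines) == 1 and len(lines[0]) < 60 and not lines[0].endswith("."):
            # A short single line with no closing period: call it a title.
            kind = "Title"
        else:
            kind = "NarrativeText"
        elements.append({"type": kind, "text": "\n".join(lines)})
    return elements


sample = """Quarterly Report

Revenue grew in the third quarter.

Region | Revenue
East | 10
West | 12"""

# Serialize the labeled elements as JSON, the format downstream AI tools read.
print(json.dumps(partition_text(sample), indent=2))
```

Run on the sample, the sketch emits a JSON list of three elements: a title, a paragraph of narrative text, and a table. Each element carries its own label, so a downstream model can treat a header differently from a table instead of receiving one undifferentiated blob of text.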

Since launch, the company’s open-source software has been downloaded over 6 million times, and Unstructured has been used by more than 35,000 companies and government organizations, including McKinsey & Company, BlackRock, the U.S. Air Force, U.S. Space Force, and United States Special Operations Command.

The company—which says it’s expecting to hit $5 million in revenue in 2023 and $12 million in 2024—raised $25 million from the likes of Bain Capital Venture Associates and MongoDB Ventures in July 2023, and another $40 million earlier this March.