Template-driven extraction
Define fields once — name, type, and AI instructions. Every document runs through your template.
AI Document Extraction
Upload PDFs and Word docs. AI extracts the fields you care about — with confidence scores. Review, approve, and push directly to your CMS.
Drag in PDFs or Word files — one at a time or in bulk. Pith reads the document and counts pages before extraction begins.
Your extraction template tells the AI exactly which fields to find. Every value comes back with a confidence score, so you know what to verify.
Edit any field before it ships. Approve, and Pith pushes structured content directly to your CMS — Portable Text, references, and all.
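As a rough illustration of the upload, extract, and review flow described above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `TemplateField` and `ExtractedValue` shapes, the 0.0 to 1.0 confidence scale, the 0.85 cutoff, and the sample values are hypothetical and do not reflect Pith's actual API or defaults.

```python
from dataclasses import dataclass

# Hypothetical shapes for illustration only; not Pith's actual API.
@dataclass
class TemplateField:
    name: str          # field name, e.g. "order_number"
    type: str          # field type, e.g. "string" or "date"
    instructions: str  # plain-language guidance for the AI

@dataclass
class ExtractedValue:
    field: str
    value: str
    confidence: float  # assumed to be a score between 0.0 and 1.0

REVIEW_THRESHOLD = 0.85  # assumed cutoff; a real setup would tune this

def split_for_review(values, threshold=REVIEW_THRESHOLD):
    """Separate values that can ship from values an editor should check."""
    ready = [v for v in values if v.confidence >= threshold]
    flagged = [v for v in values if v.confidence < threshold]
    return ready, flagged

# A template is defined once and reused for every document.
template = [
    TemplateField("order_number", "string", "The identifier printed in the heading."),
    TemplateField("effective_date", "date", "The date the order takes effect."),
]

# Made-up extraction results for one document.
extracted = [
    ExtractedValue("order_number", "2021-04", 0.97),
    ExtractedValue("effective_date", "2021-03-15", 0.62),
]

ready, flagged = split_for_review(extracted)
print([v.field for v in ready])    # ['order_number']
print([v.field for v in flagged])  # ['effective_date']
```

In this sketch an editor would only need to open `effective_date`; the other field clears the assumed threshold and could be approved as-is.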
Purpose-built for teams migrating document content into a CMS.
Every extracted value shows its confidence level. High-confidence values ship; low-confidence values get flagged for review.
Content goes in as proper structured data — Portable Text for Sanity, not pasted HTML.
AI automatically links related documents using your CMS's reference system. No manual cross-linking.
Upload hundreds of documents at once. Track progress in real time, review in bulk.
Configurable slug rules with prefix support and multi-field composition. Clean URLs out of the box.
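To make the slug feature concrete, here is a small Python sketch of prefix support and multi-field composition. The `slugify` and `build_slug` helpers, the field names, and the sample record are all hypothetical; Pith's own slug rules may normalize text differently.

```python
import re
import unicodedata

def slugify(text: str) -> str:
    """Lowercase, strip accents, and collapse non-alphanumerics to hyphens."""
    ascii_text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")

def build_slug(record: dict, fields: list[str], prefix: str = "") -> str:
    """Compose a slug from several fields, with an optional path prefix."""
    # Skip fields that are missing or empty so the slug has no stray hyphens.
    parts = [slugify(str(record[f])) for f in fields if record.get(f)]
    slug = "-".join(p for p in parts if p)
    return f"{prefix.strip('/')}/{slug}" if prefix else slug

# Made-up record for illustration.
record = {"year": 2021, "title": "Administrative Order No. 4: Remote Hearings"}
print(build_slug(record, ["year", "title"], prefix="orders"))
# orders/2021-administrative-order-no-4-remote-hearings
```

Composing from more than one field (here, year plus title) is one way to keep slugs unique when many documents share similar titles.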
All-in AI extraction cost — baked into our pricing
Per month, no overages
Per month, with batch upload
Starter tier, no charge until day 15
“We had thousands of administrative orders sitting as PDFs. Pith let us migrate the entire backlog to our CMS in weeks instead of months.”
“The confidence scores changed how we work. Our editors only touch the fields that actually need a second look. Everything else ships automatically.”
“We manage CMS migrations for clients constantly. Pith cut the document-to-content step down from days to hours per project.”
A 1-page order uses 1 credit. A 200-page manual uses 200. Fair pricing aligned with actual processing cost.
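The per-page arithmetic can be sketched in a few lines of Python. This assumes exactly one credit per page with no minimums, rounding, or bundled discounts; the `credits_for` helper is hypothetical, and the 12-page document is a made-up addition to the two examples above.

```python
def credits_for(page_counts):
    """Total credits for a batch, assuming one credit per page."""
    return sum(page_counts)

print(credits_for([1]))           # 1   (a 1-page order)
print(credits_for([200]))         # 200 (a 200-page manual)
print(credits_for([1, 200, 12]))  # 213 (a mixed batch of three documents)
```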
For small teams and solo operators getting started with document extraction.
For teams processing high document volumes who need batch upload and advanced templates.
For agencies managing multiple client organizations with high volume and API access.
Moving PDFs into Sanity isn't a copy-paste job. Here's how to do it with structured output — Portable Text, references, and all.
Courts are required to make administrative orders accessible online. Here's how the First Judicial Circuit moved thousands of PDFs to structured web content.
Government agencies hold document backlogs that represent decades of policy, procedure, and public record. AI extraction is now fast and accurate enough to process them at scale.