PDF & Word support
Upload PDFs and .docx files. More formats on the roadmap.
Pith isn't a generic PDF reader. It's an extraction pipeline designed to produce CMS-native structured content.
Define a template with field names, types, and plain-English instructions for the AI. Text, numbers, dates, booleans, arrays — Pith maps each field to your CMS schema before a single page is processed.
Templates are reusable across documents. Set one up once for administrative orders, then run every new filing through it automatically.
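To make the idea concrete, here is a minimal sketch of what such a template could look like as data. The type names, field names, and the `duplicateFieldNames` helper are all hypothetical and chosen for illustration; they are not Pith's actual template format or API.

```typescript
// Hypothetical shapes for illustration only; not Pith's actual template format.
type FieldType = "text" | "number" | "date" | "boolean" | "array";

interface TemplateField {
  name: string;         // field name in the target CMS schema
  type: FieldType;      // one of the field types listed above
  instructions: string; // plain-English guidance for the AI
}

interface ExtractionTemplate {
  name: string;
  fields: TemplateField[];
}

// One reusable template for administrative orders.
const administrativeOrder: ExtractionTemplate = {
  name: "administrative-order",
  fields: [
    { name: "title", type: "text", instructions: "The full title as printed on the first page." },
    { name: "orderNumber", type: "number", instructions: "The order number, digits only." },
    { name: "effectiveDate", type: "date", instructions: "The date the order takes effect, not the signing date." },
    { name: "isAmendment", type: "boolean", instructions: "True if the order amends an earlier order." },
    { name: "signatories", type: "array", instructions: "Names of everyone who signed the order." },
  ],
};

// Simple sanity check before reusing a template: field names must be unique.
function duplicateFieldNames(template: ExtractionTemplate): string[] {
  const seen: string[] = [];
  const dupes: string[] = [];
  for (const field of template.fields) {
    if (seen.indexOf(field.name) !== -1 && dupes.indexOf(field.name) === -1) {
      dupes.push(field.name);
    }
    seen.push(field.name);
  }
  return dupes;
}
```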
Every extracted field comes with a confidence score from 0 to 100. High-confidence fields can auto-approve. Low-confidence fields get flagged for human review.
Color-coded indicators make the review interface fast. Editors only touch what actually needs attention.
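The routing described above can be sketched as a simple threshold check. The types, function names, and the default cutoff of 90 are assumptions for illustration; the text above does not state which threshold Pith uses or whether it is configurable.

```typescript
// Hypothetical types and threshold; illustrative only.
interface ExtractedField {
  name: string;
  value: unknown;
  confidence: number; // 0 to 100
}

type ReviewStatus = "auto-approved" | "needs-review";

// Fields at or above the cutoff are approved; everything else is flagged.
function routeField(field: ExtractedField, autoApproveAt: number = 90): ReviewStatus {
  return field.confidence >= autoApproveAt ? "auto-approved" : "needs-review";
}

// The review queue contains only the fields an editor has to look at.
function reviewQueue(fields: ExtractedField[], autoApproveAt: number = 90): ExtractedField[] {
  return fields.filter((f) => routeField(f, autoApproveAt) === "needs-review");
}
```

With a cutoff of 90, a field scored 97 is approved automatically, while a field scored 62 lands in the review queue.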
When Pith pushes to Sanity, body content becomes proper Portable Text blocks. Images become references. Related documents resolve to typed references in your schema.
No post-processing, no copy-paste, no HTML cleanup. Your CMS gets the structured data it was designed to hold.
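For readers unfamiliar with the target format: Portable Text represents rich text as an array of JSON blocks, and Sanity references point at other documents by `_ref`. The sketch below shows that general shape. The `paragraphsToBlocks` helper and its key scheme are hypothetical, written only to illustrate the structure, and say nothing about how Pith builds its output.

```typescript
// Minimal Portable Text and reference shapes as used by Sanity.
interface Span {
  _type: "span";
  _key: string;
  text: string;
  marks: string[];
}

interface Block {
  _type: "block";
  _key: string;
  style: string; // e.g. "normal", "h2"
  markDefs: unknown[];
  children: Span[];
}

interface Reference {
  _type: "reference";
  _ref: string; // ID of the referenced document or asset
}

// Hypothetical helper: turn plain paragraphs into "normal" blocks.
// Keys only need to be unique within the array; this scheme is an assumption.
function paragraphsToBlocks(paragraphs: string[]): Block[] {
  return paragraphs
    .filter((p) => p.trim().length > 0)
    .map((p, i): Block => ({
      _type: "block",
      _key: "b" + i,
      style: "normal",
      markDefs: [],
      children: [{ _type: "span", _key: "b" + i + "s0", text: p, marks: [] }],
    }));
}

// A typed reference to another document (the ID is a placeholder).
const relatedOrder: Reference = { _type: "reference", _ref: "example-document-id" };
```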
Pith detects when a document references another document in your CMS and resolves it to a proper typed reference. Internal links, related content, parent-child relationships — all wired up at push time.
Upload a folder of PDFs and let Pith work through them. Real-time progress tracking shows page counts, extraction status, and review queue depth as each document finishes.
Batch review lets you approve low-variance documents in bulk and focus manual time on exceptions.
Define slug rules with prefix support, multi-field composition, and custom separators. Pith generates clean, consistent URLs at push time — no manual slug entry, no duplicates.
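As an illustration of how a prefix, several fields, and a custom separator might combine, here is a small sketch. The `SlugRule` shape and the `buildSlug` and `uniqueSlug` helpers are hypothetical, and the numeric-suffix approach to avoiding duplicates is one common technique, not necessarily the one Pith uses.

```typescript
// Hypothetical rule shape; illustrative only.
interface SlugRule {
  prefix?: string;
  fields: string[];  // document fields to compose, in order
  separator: string; // e.g. "-" or "_"
}

// Lowercase a value and join its alphanumeric runs with the separator.
function slugifyPart(value: string, separator: string): string {
  return value
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((piece) => piece.length > 0)
    .join(separator);
}

function buildSlug(
  rule: SlugRule,
  doc: { [field: string]: string | number | undefined }
): string {
  const parts: string[] = [];
  if (rule.prefix) parts.push(slugifyPart(rule.prefix, rule.separator));
  for (const field of rule.fields) {
    const raw = doc[field];
    if (raw === undefined) continue; // skip fields the document lacks
    const part = slugifyPart(String(raw), rule.separator);
    if (part.length > 0) parts.push(part);
  }
  return parts.join(rule.separator);
}

// Avoid collisions by appending the first free numeric suffix.
function uniqueSlug(base: string, taken: string[], separator: string): string {
  if (taken.indexOf(base) === -1) return base;
  let n = 2;
  while (taken.indexOf(base + separator + n) !== -1) n++;
  return base + separator + n;
}

const rule: SlugRule = { prefix: "orders", fields: ["year", "title"], separator: "-" };
const slug = buildSlug(rule, { year: 2024, title: "Administrative Order No. 12" });
// slug === "orders-2024-administrative-order-no-12"
```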
See which part of the source document each extracted value came from.
Edit any field in the review interface before approving.
Update your template and re-run extraction on any document without re-uploading.
Multiple reviewers can work through the queue simultaneously.
See pages used this month, your limit, and a breakdown by project.