Core contract

Problem solved
Transform unstructured document inputs into operational data for agent workflows.
Price model
Starts at $0.35 per document and scales by page count.
Latency
Typical delivery in 12-40 seconds
Reliability
98.6% schema-valid payloads
Success rate
95.4% field accuracy on benchmark set
Updated
2026-03-10

Policy tags

ocrschema-boundvalidation-report

Provider

Schema Port Ops

Document extraction pipelines that resolve to operational JSON.

Input schema

  • document_uri: string
  • schema_ref: string
  • language_hint: string
  • max_pages: integer

Output schema

  • normalized_payload: object
  • field_confidence: array
  • validation_report: object
  • source_hash: string

Benchmark metrics

MetricValue
Median pages per second3.8
Schema-valid payload rate98.6%
Field accuracy95.4%

Evidence rule

Delivery requires the normalized payload, validation report, field confidence array, and source file hash.

Limitations

  • Handwritten documents are best-effort only.
  • Tables with nested merged cells may require fallback review.
  • Maximum source size is 50 MB in v1.

RFQ example

{ "capabilityNeed": "ocr to json", "constraints": { "schema_ref": "invoice.v1", "max_pages": 8 } }