Symptom
User reports one or more of the following:
- Confidence score is consistently low.
- Expected fields are missing.
- Extracted values are incorrect.
- Extraction times out before completion.
Diagnosis steps
- Confirm document type, page count, and scan quality.
- Verify upload completed without partial transfer.
- Check extraction job status and processing duration.
- Determine whether the issue affects a single field group or every section of the document.
- Check whether similar documents processed in the same period are also affected.
Root cause
- Low-quality scan or unclear text in source document.
- Incomplete upload resulting in missing pages.
- Complex or unusual clause formatting.
- Temporary delay caused by limited processing capacity.
Resolution
Low confidence scores
- Ask user to review low-confidence fields first and verify provenance.
- If scan quality is poor, request a clearer file and rerun extraction.
- Validate improvement by comparing confidence distribution before and after rerun.
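The before/after comparison can be done by hand, but a small script makes it repeatable. The sketch below assumes per-field confidence scores have been exported as floats between 0 and 1. The function names, the 0.6 cutoff for "low confidence", and the improvement rule (higher median, no larger share of low-confidence fields) are illustrative assumptions, not product definitions.

```python
from statistics import mean, median


def summarize_confidence(scores, low_threshold=0.6):
    """Summarize per-field confidence scores (assumed floats in [0, 1])."""
    if not scores:
        raise ValueError("no confidence scores supplied")
    # Count fields that fall under the assumed low-confidence cutoff.
    low = sum(1 for s in scores if s < low_threshold)
    return {
        "mean": mean(scores),
        "median": median(scores),
        "low_share": low / len(scores),
    }


def rerun_improved(before, after, low_threshold=0.6):
    """True if the rerun raised the median score and did not increase
    the share of low-confidence fields (an assumed rule of thumb)."""
    b = summarize_confidence(before, low_threshold)
    a = summarize_confidence(after, low_threshold)
    return a["median"] > b["median"] and a["low_share"] <= b["low_share"]
```

Comparing the median and the low-confidence share rather than the mean alone keeps one very high or very low field from masking the overall picture.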
Missing fields
- Confirm the field exists in source document.
- Rerun extraction after verifying file completeness.
- If still missing, capture clause excerpt and escalate for model review.
Wrong values
- Guide user to correct values in Extraction Review and mark verified.
- Check if issue pattern repeats across similar fields.
- Record repeated patterns for engineering triage.
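To decide whether a wrong-value pattern is worth recording for engineering triage, it can help to tally the corrections users have made. The sketch below assumes corrections are logged as (field name, error kind) pairs. That shape, the function name, and the minimum count of 2 are hypothetical choices for illustration.

```python
from collections import Counter


def repeated_patterns(corrections, min_count=2):
    """Return (pattern, count) pairs seen at least min_count times.

    corrections: iterable of (field_name, error_kind) tuples
    (a hypothetical logging shape, not a product export format).
    """
    counts = Counter(corrections)
    # most_common() orders patterns from most to least frequent.
    return [(key, n) for key, n in counts.most_common() if n >= min_count]
```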
Timeout during extraction
- Retry extraction once.
- If timeout repeats, split document and process in logical sections.
- If still failing, escalate with job ID and timing details.
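The timeout steps above form a simple retry, split, escalate sequence, which can be sketched as follows. The callables run_extraction and split_document are hypothetical stand-ins for whatever tooling or manual action is available. They are not real product APIs.

```python
def handle_timeout(run_extraction, split_document, document):
    """Follow the retry -> split -> escalate sequence after a timeout.

    run_extraction(doc) -> bool  (True on completion, False on timeout)
    split_document(doc) -> list  (logical sections of the document)
    Both callables are hypothetical placeholders.
    """
    # Step 1: retry the whole document once.
    if run_extraction(document):
        return "resolved_by_retry"

    # Step 2: split into logical sections and process each one.
    sections = split_document(document)
    failed = [s for s in sections if not run_extraction(s)]
    if not failed:
        return "resolved_by_split"

    # Step 3: still failing, so escalate with job ID and timing details.
    return "escalate"
```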
Escalation
Escalate when:
- Missing or wrong key fields persist after rerun.
- Timeout repeats twice on the same document.
- Multiple users report the same extraction degradation.
Escalate to:
- L2 Support for impact assessment.
- Engineering for model or processing pipeline investigation.
Include:
- Workspace ID, document ID, extraction job ID.
- Exact field names and expected values.
- Sample screenshots with provenance view.
- Timestamps in UTC, plus the user's local timezone if it differs.
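Before submitting an escalation, the checklist above can be verified with a short script. The sketch below assumes the ticket is assembled as a plain dictionary. The key names and helper functions are illustrative assumptions and do not reflect a real ticketing schema.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical keys mirroring the "Include" checklist.
REQUIRED = (
    "workspace_id",
    "document_id",
    "job_id",
    "fields",        # exact field names mapped to expected values
    "screenshots",   # screenshots showing the provenance view
    "observed_at",   # timestamp of the incident, in UTC
)


def to_utc_iso(dt):
    """Convert a timezone-aware datetime to an ISO 8601 string in UTC."""
    if dt.tzinfo is None:
        # A naive timestamp is ambiguous, so reject it rather than guess.
        raise ValueError("timestamp must be timezone-aware")
    return dt.astimezone(timezone.utc).isoformat()


def missing_items(ticket):
    """List required keys that are absent or empty in the ticket dict."""
    return [key for key in REQUIRED if not ticket.get(key)]
```

For example, a user-reported time of 09:30 at UTC-5 is recorded as 14:30 UTC, and a ticket containing only a workspace ID and document ID is flagged as missing the remaining four items.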