Tutorial · January 8, 2026 · 15 min read

Building a Document Risk Analyzer with AI Guardrails

A step-by-step guide to building an RFP analyzer with guardrails that block hallucinated facts. Includes code samples and architecture patterns.

Enterprise document analysis is one of the highest-value applications for AI agents. RFPs, contracts, and technical specifications contain critical information that, if misinterpreted, can cost millions. In this tutorial, we'll build a document risk analyzer that extracts facts without inventing them.

The "Ground Truth First" Architecture

The key insight is simple: never let the LLM generate facts. Instead, we use a retrieval-augmented generation (RAG) approach where the LLM can only reference content that exists in the source document.

Architecture Overview
1. Document Ingestion - Parse PDF/DOCX to structured text
2. Chunking & Embedding - Create searchable vector index
3. Risk Pattern Matching - Identify clauses matching risk signatures
4. Contextual Analysis - LLM explains why each clause is risky
5. Confidence Scoring - Rate certainty for human review
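Wired together, the pipeline looks roughly like this (a sketch: parse_pdf and build_analyzer_graph are defined in the steps below, and the embedding index is omitted for brevity):

def analyze_document(file_path: str) -> dict:
    # Stage 1: ingest the document into structured chunks
    chunks = parse_pdf(file_path)

    # Stages 3-5: run the chunks through the LangGraph state machine
    analyzer = build_analyzer_graph()
    return analyzer.invoke({
        "document_chunks": chunks,
        "matched_patterns": [],
        "risk_assessments": [],
        "confidence_scores": {},
        "needs_human_review": False,
    })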

Step 1: Document Ingestion

First, we need to extract text from documents while preserving structure. We use pymupdf for PDFs and python-docx for Word documents. The section-splitting helpers below are simplified heuristics; real documents usually need format-specific rules.

document_parser.py
import re
import fitz  # pymupdf
from dataclasses import dataclass

@dataclass
class Section:
    header: str
    content: str

@dataclass
class DocumentChunk:
    text: str
    page: int
    section: str
    clause_id: str | None

def extract_sections(text: str) -> list[Section]:
    # Simplified heuristic: treat numbered headings like "4.2 Support Terms"
    # as section boundaries.
    parts = [p.strip() for p in re.split(r"\n(?=\d+(?:\.\d+)*\s+\S)", text)]
    return [Section(header=p.splitlines()[0], content=p) for p in parts if p]

def extract_clause_id(content: str) -> str | None:
    # Pull a leading clause number such as "4.2.1" if one is present.
    match = re.match(r"\s*(\d+(?:\.\d+)+)", content)
    return match.group(1) if match else None

def parse_pdf(file_path: str) -> list[DocumentChunk]:
    doc = fitz.open(file_path)
    chunks = []

    for page_num, page in enumerate(doc):
        text = page.get_text()
        # Split by section headers
        sections = extract_sections(text)

        for section in sections:
            chunks.append(DocumentChunk(
                text=section.content,
                page=page_num + 1,
                section=section.header,
                clause_id=extract_clause_id(section.content)
            ))

    return chunks
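For Word documents, a parallel parser can be built on python-docx, reusing DocumentChunk and extract_clause_id from document_parser.py. A minimal sketch, assuming headings use Word's built-in "Heading" styles (DOCX has no fixed pagination, so page is set to 0):

from docx import Document  # python-docx

def parse_docx(file_path: str) -> list[DocumentChunk]:
    doc = Document(file_path)
    chunks = []
    current_header = "Preamble"
    buffer: list[str] = []

    def flush():
        if buffer:
            content = "\n".join(buffer)
            chunks.append(DocumentChunk(
                text=content,
                page=0,  # DOCX is not paginated until rendered
                section=current_header,
                clause_id=extract_clause_id(content),
            ))
            buffer.clear()

    for para in doc.paragraphs:
        if para.style.name.startswith("Heading"):
            flush()
            current_header = para.text.strip()
        elif para.text.strip():
            buffer.append(para.text)
    flush()

    return chunks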

Step 2: Risk Pattern Library

We maintain a library of known risk patterns. These aren't generated by the LLM - they're curated by legal and engineering teams.

risk_patterns.py
RISK_PATTERNS = {
    "ambiguous_sla": {
        "keywords": ["reasonable", "best effort", "commercially reasonable"],
        "severity": "HIGH",
        "description": "SLA terms that lack measurable commitments"
    },
    "unlimited_liability": {
        "keywords": ["unlimited liability", "no cap on damages"],
        "severity": "CRITICAL",
        "description": "Clauses exposing unlimited financial risk"
    },
    "unilateral_termination": {
        "keywords": ["terminate at any time", "sole discretion"],
        "severity": "MEDIUM",
        "description": "One-sided termination rights"
    },
    "ip_assignment": {
        "keywords": ["all intellectual property", "work product"],
        "severity": "HIGH",
        "description": "Broad IP transfer clauses"
    }
}
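A matcher over this library can be as simple as a case-insensitive keyword scan against each chunk (a sketch; production systems often add regex or embedding-based matching on top):

def match_patterns(chunks: list[DocumentChunk]) -> list[dict]:
    matches = []
    for chunk in chunks:
        text = chunk.text.lower()
        for name, pattern in RISK_PATTERNS.items():
            # Record every keyword from the pattern that appears in this chunk
            hits = [kw for kw in pattern["keywords"] if kw in text]
            if hits:
                matches.append({
                    "pattern": name,
                    "severity": pattern["severity"],
                    "keywords_found": hits,
                    "page": chunk.page,
                    "section": chunk.section,
                })
    return matches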

Step 3: The LangGraph Agent

Here's where LangGraph shines. We define a state machine that ensures the agent follows a strict analysis path.

risk_analyzer_agent.py
from langgraph.graph import StateGraph, END
from typing import TypedDict

from document_parser import DocumentChunk

class AnalysisState(TypedDict):
    document_chunks: list[DocumentChunk]
    matched_patterns: list[dict]
    risk_assessments: list[dict]
    confidence_scores: dict
    needs_human_review: bool

def build_analyzer_graph():
    graph = StateGraph(AnalysisState)

    # Define nodes
    graph.add_node("pattern_match", pattern_matching_node)
    graph.add_node("context_analysis", context_analysis_node)
    graph.add_node("confidence_score", confidence_scoring_node)
    graph.add_node("human_checkpoint", human_review_node)

    # Define edges
    graph.add_edge("pattern_match", "context_analysis")
    graph.add_edge("context_analysis", "confidence_score")
    graph.add_conditional_edges(
        "confidence_score",
        route_by_confidence,
        {
            "high_confidence": END,
            "low_confidence": "human_checkpoint"
        }
    )
    graph.add_edge("human_checkpoint", END)

    graph.set_entry_point("pattern_match")
    return graph.compile()
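The node functions referenced above are plain callables that take and return AnalysisState. The router, for example, might look like this (a sketch; the 0.8 cutoff is an assumed value you would tune against labeled review data):

CONFIDENCE_THRESHOLD = 0.8  # assumed cutoff, not from production data

def route_by_confidence(state: AnalysisState) -> str:
    # Escalate the whole report if any single assessment scored below threshold.
    scores = list(state["confidence_scores"].values())
    if scores and min(scores) < CONFIDENCE_THRESHOLD:
        return "low_confidence"
    return "high_confidence"

Routing on the minimum score is deliberately conservative: one uncertain clause sends the entire report to a human rather than shipping a partially verified analysis.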

Step 4: Guardrails That Prevent Hallucination

The critical guardrail: every claim the LLM makes must reference a specific location in the source document.

guardrails.py
ANALYSIS_PROMPT = """
You are analyzing a document for risk clauses.

STRICT RULES:
1. ONLY reference text that appears in the provided chunks
2. ALWAYS include the exact quote and page number
3. If uncertain, output "AMBIGUITY_ALERT" instead of guessing
4. Never infer information not explicitly stated

Document chunks:
{chunks}

Identified pattern: {pattern}

Provide analysis in this exact format:
- Quote: "[exact text from document]"
- Location: Page X, Section Y
- Risk Level: {severity}
- Explanation: [why this is risky]
- Confidence: [HIGH/MEDIUM/LOW]
"""

def validate_response(response: str, chunks: list[DocumentChunk]) -> bool:
    """Verify that quoted text exists in source document"""
    quotes = extract_quotes(response)
    for quote in quotes:
        if not any(quote in chunk.text for chunk in chunks):
            raise HallucinationError(f"Quote not found: {quote}")
    return True
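validate_response relies on two pieces not shown above. A minimal sketch, assuming quotes follow the Quote: "..." line format requested in the prompt:

import re

class HallucinationError(Exception):
    """Raised when the model quotes text that does not exist in the source."""

def extract_quotes(response: str) -> list[str]:
    # Capture the quoted strings from lines like: - Quote: "exact text"
    return re.findall(r'Quote:\s*"([^"]+)"', response)

In practice you may also want to normalize whitespace on both sides before the substring check, since PDF extraction often mangles spacing.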

Step 5: Output Format

The final output is a structured risk report that humans can verify.

# Sample Output
RISK: Ambiguous SLA (HIGH)
Quote: "Vendor shall provide reasonable support during business hours"
Location: Page 12, Section 4.2.1
Issue: "Reasonable" is undefined. Recommend specifying response times.
Confidence: HIGH

RISK: IP Assignment (HIGH)
Quote: "All work product shall become property of Client"
Location: Page 8, Section 3.1
Issue: Broad scope may include pre-existing IP.
Confidence: MEDIUM - Flagged for legal review

Results in Production

We've deployed this architecture for clients analyzing RFPs and contracts.

Key Takeaways

  1. Never let LLMs generate facts - Use RAG to ground all outputs
  2. Validate every claim - Check that quoted text exists in source
  3. Score confidence - Route uncertain outputs to humans
  4. Maintain audit trails - Log every state transition (see the sketch below)
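For the audit-trail takeaway, one lightweight option is to wrap each node in a logging decorator before registering it with the graph (a sketch, not the deployed implementation):

import functools
import logging
import time

logger = logging.getLogger("risk_analyzer.audit")

def audited(node_fn):
    # Wrap a graph node so every state transition is logged with timing.
    @functools.wraps(node_fn)
    def wrapper(state):
        start = time.monotonic()
        result = node_fn(state)
        logger.info("node=%s duration=%.2fs updated_keys=%s",
                    node_fn.__name__, time.monotonic() - start, sorted(result))
        return result
    return wrapper

# Usage: graph.add_node("pattern_match", audited(pattern_matching_node))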

Want us to build this for your documents?

Book a 2-week sprint and we'll deploy a custom document analyzer for your workflow.

Start Assessment