Halluci-NOT
Two-layer governance pipeline: regex and allowlists catch structural violations deterministically, and GPT-4o mini handles context-sensitive reasoning. Every output receives a score from 0 to 100 and is routed to PASS, REVIEW, or BLOCK.
Catch rate: 100% (listed test scenarios)
False positives: 0 (stated test set)
Triage: policy-gated output (project README)
Pipeline: Input (LLM output) → Layer 1 (regex + allowlists) → Layer 2 (GPT-4o mini reasoning) → one of Allowed (clean output), Human review (needs judgment), or Blocked risk (rejected output).
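The 0-100 score and the three routes could be wired together roughly as follows. This is a minimal sketch, not the project's code: the names `Route` and `route_for`, both thresholds, and the convention that a higher score means higher risk are all assumptions, since the page does not state them.

```python
from enum import Enum


class Route(str, Enum):
    PASS = "PASS"
    REVIEW = "REVIEW"
    BLOCK = "BLOCK"


# Hypothetical cut-offs: the source gives neither the real thresholds
# nor the direction of the scale (higher = riskier is assumed here).
REVIEW_THRESHOLD = 40
BLOCK_THRESHOLD = 70


def route_for(score: int) -> Route:
    """Map a 0-100 risk score to PASS, REVIEW, or BLOCK."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be in 0..100, got {score}")
    if score >= BLOCK_THRESHOLD:
        return Route.BLOCK
    if score >= REVIEW_THRESHOLD:
        return Route.REVIEW
    return Route.PASS
```

Keeping the thresholds as named constants means the REVIEW band can be widened or narrowed without touching the routing logic.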
Problem
Generic filters do not know business policy, approved vendors, or real SAP terms.
- PII leaks create compliance risk.
- Hallucinated commands create operational risk.
- Policy violations need business-specific rules.
System
Halluci-NOT runs deterministic checks first, then applies model reasoning for context-sensitive classification.
- Regex catches structural PII patterns.
- Allowlists verify SAP terms and vendors.
- GPT-4o mini reasons over policy context.
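A deterministic-first check along these lines might look like the sketch below. Everything in it is illustrative: the PII patterns, the allowlist contents, the "T-code XXXX" mention convention, and the `layer2` hook standing in for the GPT-4o mini call are assumptions, not details taken from the project. A vendor allowlist would follow the same shape as the T-code one.

```python
import re
from typing import Callable, List, Optional

# Illustrative patterns only; real PII coverage needs far more than this.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

# Hypothetical allowlist; a deployment would load this from policy data.
ALLOWED_TCODES = {"ME21N", "VA01", "FB60"}

# Assumed convention: outputs mention transaction codes as "T-code XXXX".
TCODE_MENTION = re.compile(r"\bT-code\s+([A-Z0-9_/]+)", re.IGNORECASE)


def layer1_findings(text: str) -> List[str]:
    """Return deterministic findings: PII pattern hits and unknown T-codes."""
    findings = []
    for name, pattern in PII_PATTERNS.items():
        if pattern.search(text):
            findings.append(f"pii:{name}")
    for match in TCODE_MENTION.finditer(text):
        code = match.group(1).upper()
        if code not in ALLOWED_TCODES:
            findings.append(f"unknown_tcode:{code}")
    return findings


def check(text: str,
          layer2: Optional[Callable[[str], List[str]]] = None) -> List[str]:
    """Run Layer 1; call the (hypothetical) model hook only if it finds nothing."""
    findings = layer1_findings(text)
    if findings:
        return findings  # deterministic hit, so the model call is skipped
    if layer2 is not None:
        return layer2(text)
    return []
```

For example, `check("Run T-code ZFAKE99 to reset stock")` returns `["unknown_tcode:ZFAKE99"]` without invoking `layer2`, which is the sense in which deterministic checks save model work.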
Shipped proof
The test set exercises failure modes directly.
- PII exposure blocked.
- Fake T-code blocked.
- Clean query passed.
Lesson
AI safety improves when checks produce structured decisions that downstream systems can enforce.
- Severity scoring beats vague warnings.
- JSON contracts support audit workflows.
- Deterministic allowlists reduce avoidable model work.
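One way to picture a structured decision that downstream systems can enforce is a small validated record serialized to JSON. The field names (`score`, `route`, `findings`, `layer`) and the `Decision` class are hypothetical; the page does not publish the actual contract.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List

VALID_ROUTES = {"PASS", "REVIEW", "BLOCK"}


@dataclass
class Decision:
    """Hypothetical decision record; field names are assumptions."""
    score: int
    route: str
    findings: List[str] = field(default_factory=list)
    layer: str = "layer1"

    def __post_init__(self) -> None:
        # Reject malformed decisions before they reach an audit log.
        if not 0 <= self.score <= 100:
            raise ValueError(f"score out of range: {self.score}")
        if self.route not in VALID_ROUTES:
            raise ValueError(f"unknown route: {self.route}")

    def to_json(self) -> str:
        # sort_keys gives stable output, which makes audit diffs easier.
        return json.dumps(asdict(self), sort_keys=True)


decision = Decision(score=95, route="BLOCK", findings=["pii:email"])
payload = decision.to_json()
```

A consumer can parse `payload` and act on `route` directly instead of interpreting free-text warnings.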
Evidence links