AI Safety Policy

Purpose

Eko uses LLMs only to interpret a detected change. This policy defines guardrails for hallucinations, over-quoting, tone drift, and prompt-failure handling.

When AI Is Allowed

LLMs may run only after the system has:

detected a change event (diff-first), and
prepared a bounded, non-substitutive input (delta + minimal context).

LLMs must not be used to:

fetch, crawl, or “fill in” missing page content,
infer paywalled/blocked text,
summarize entire pages without a delta.
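
The preconditions above can be sketched as a gate that runs before any model call. This is a minimal illustration, not Eko's actual implementation: the `ChangeEvent` type, the `llm_allowed` function, and the `MAX_INPUT_CHARS` budget are all hypothetical names and values, since the policy only requires that the input be "bounded".

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical size budget; the policy does not specify a number.
MAX_INPUT_CHARS = 4000


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical container for a detected change."""
    delta: str          # the diff text produced by change detection
    context: str = ""   # minimal surrounding context, if any


def llm_allowed(event: Optional[ChangeEvent]) -> bool:
    """Return True only if a change was detected and the input is bounded."""
    if event is None or not event.delta.strip():
        # No delta means no LLM call: whole-page summaries are not allowed.
        return False
    return len(event.delta) + len(event.context) <= MAX_INPUT_CHARS
```

A caller would check `llm_allowed(event)` first and skip the model entirely when it returns `False`.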

Output Requirements

All summaries must:

Be delta-first (what changed → why it matters → confidence).
Avoid reconstructing source content.
Use plain language and label uncertainty.
Include a clear confidence rating (High/Medium/Low).
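
One way to make the delta-first shape hard to skip is to give summaries a fixed structure. The sketch below assumes a hypothetical `DeltaSummary` type and field names; only the three confidence levels come from this policy.

```python
from dataclasses import dataclass
from enum import Enum


class Confidence(Enum):
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"


@dataclass(frozen=True)
class DeltaSummary:
    """Hypothetical summary record: what changed, why it matters, confidence."""
    what_changed: str
    why_it_matters: str
    confidence: Confidence

    def render(self) -> str:
        # Fixed ordering keeps every summary delta-first.
        return (
            f"What changed: {self.what_changed}\n"
            f"Why it matters: {self.why_it_matters}\n"
            f"Confidence: {self.confidence.value}"
        )
```

Because `confidence` is a required field, a summary without a rating cannot be constructed at all.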

Hallucination Controls

Hard constraints

The model may only reference facts present in the provided diff/context.
If information is missing, the model must say so explicitly.

Heuristics that trigger a downgrade

Claims about motives/intent (“they changed this because…”) without evidence.
New numbers, dates, or names not present in the input.
Overly specific operational guidance that isn’t supported by the delta.
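
The "new numbers" heuristic lends itself to a simple automated check. The sketch below is illustrative only: the function names are hypothetical, the regular expression covers plain numeric tokens (not names or spelled-out dates), and it assumes that "downgrade" means lowering the confidence rating by one level, which the policy does not state explicitly.

```python
import re

# Digits, optionally with internal '.' or ',' separators (e.g. 1,200 or 9.99).
NUMBER = re.compile(r"\d+(?:[.,]\d+)*")
LEVELS = ["High", "Medium", "Low"]


def unsupported_numbers(summary: str, model_input: str) -> set:
    """Numeric tokens that appear in the summary but not in the model input."""
    return set(NUMBER.findall(summary)) - set(NUMBER.findall(model_input))


def downgrade_if_unsupported(confidence: str, summary: str, model_input: str) -> str:
    """Lower confidence by one level when the summary introduces new numbers."""
    if not unsupported_numbers(summary, model_input):
        return confidence
    return LEVELS[min(LEVELS.index(confidence) + 1, len(LEVELS) - 1)]
```

For a delta that mentions only $10 and $12, a summary that adds "starting in 2025" would be flagged because "2025" never appears in the input.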

Required phrasing when uncertain

“The page now suggests…”
“No explicit announcement accompanies this change.”
“This appears to be a wording/structure change; impact is unclear.”

Over-Quoting Controls

Prefer paraphrase.
Quotes are permitted only when strictly necessary to pinpoint a change.
Never include long passages or multiple paragraphs from the source.
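
An output validator could enforce these rules mechanically. The following is a rough sketch under stated assumptions: the limits `MAX_QUOTE_WORDS` and `MAX_QUOTES` are hypothetical values (the policy sets no numeric thresholds), and the check only recognizes text inside straight or curly double quotes.

```python
import re

# Text inside straight ("...") or curly (“...”) double quotes.
QUOTED = re.compile(r'"([^"]+)"|“([^”]+)”')

MAX_QUOTE_WORDS = 12  # hypothetical limit on the length of one quote
MAX_QUOTES = 2        # hypothetical limit on quotes per summary


def quote_violations(summary: str) -> list:
    """Return a list of over-quoting problems found in a summary."""
    # findall yields (straight, curly) tuples; exactly one side is non-empty.
    quotes = [straight or curly for straight, curly in QUOTED.findall(summary)]
    problems = []
    if len(quotes) > MAX_QUOTES:
        problems.append("too_many_quotes")
    if any(len(q.split()) > MAX_QUOTE_WORDS for q in quotes):
        problems.append("quote_too_long")
    return problems
```

An empty list means the summary passes; any entry would map to the `excessive_quotes` failure reason.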

Prompt-Failure Handling

If the model fails any guardrail (hallucination, substitution, excessive quoting, tone drift):

Suppress the summary or truncate to safe minimal bullets.
Fall back to a deterministic change report (structure + counts + timestamps).
Log the failure with a categorized reason:
- hallucination_risk
- substitutive_output_risk
- excessive_quotes
- tone_drift
- input_too_large
- ambiguous_diff
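
The suppress, fall back, and log sequence might look like the sketch below. The reason codes are taken from the list above; everything else (`DiffStats`, `deterministic_report`, `handle_output`, the logger name, and the report format) is a hypothetical illustration rather than Eko's actual code.

```python
import logging
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional

logger = logging.getLogger("eko.ai_safety")  # hypothetical logger name


class FailureReason(Enum):
    HALLUCINATION_RISK = "hallucination_risk"
    SUBSTITUTIVE_OUTPUT_RISK = "substitutive_output_risk"
    EXCESSIVE_QUOTES = "excessive_quotes"
    TONE_DRIFT = "tone_drift"
    INPUT_TOO_LARGE = "input_too_large"
    AMBIGUOUS_DIFF = "ambiguous_diff"


@dataclass(frozen=True)
class DiffStats:
    """Hypothetical deterministic facts about a change."""
    sections_changed: int
    lines_added: int
    lines_removed: int
    detected_at: datetime


def deterministic_report(stats: DiffStats) -> str:
    """Build a report from structure, counts, and timestamps only (no LLM)."""
    return (
        f"sections_changed={stats.sections_changed} "
        f"lines_added={stats.lines_added} "
        f"lines_removed={stats.lines_removed} "
        f"detected_at={stats.detected_at.isoformat()}"
    )


def handle_output(summary: str, failure: Optional[FailureReason], stats: DiffStats) -> str:
    """Return the summary if it passed; otherwise log and fall back."""
    if failure is None:
        return summary
    logger.warning("summary suppressed: %s", failure.value)
    return deterministic_report(stats)
```

This sketch always suppresses the failed summary; truncating to safe minimal bullets, the other option the policy allows, is left out for brevity.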

Safety Tiers

Tier 0 (Deterministic): structural delta only (always safe)
Tier 1 (Constrained summary): delta + minimal context; no quotes
Tier 2 (Quoted pinpoints): very short quotes to disambiguate a change

Default is Tier 1. Tier 2 requires an explicit need (ambiguity) and must remain minimal.
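
Tier selection can be expressed as a small function. This sketch assumes two hypothetical boolean inputs and also assumes that Tier 0 is used whenever no bounded delta is available for the model, which is an interpretation rather than something the policy spells out.

```python
from enum import IntEnum


class Tier(IntEnum):
    DETERMINISTIC = 0        # structural delta only
    CONSTRAINED_SUMMARY = 1  # delta + minimal context; no quotes
    QUOTED_PINPOINTS = 2     # very short quotes to disambiguate a change


def select_tier(has_bounded_delta: bool, ambiguous: bool) -> Tier:
    """Pick the lowest tier that is sufficient for the change."""
    if not has_bounded_delta:
        # Assumption: without a usable LLM input, only Tier 0 is safe.
        return Tier.DETERMINISTIC
    # Tier 1 is the default; Tier 2 needs an explicit ambiguity.
    return Tier.QUOTED_PINPOINTS if ambiguous else Tier.CONSTRAINED_SUMMARY
```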

Human Review Triggers

Flag for review when:

The page is legal/policy/ToS and the change is non-trivial.
The diff suggests price, eligibility, security, or compliance impact.
The model's confidence is Low but the change score is high.
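
The three triggers can be combined into a single predicate. The sketch below is illustrative: the `Change` record, its field names, the page-type and impact labels, and the `HIGH_CHANGE_SCORE` threshold are all assumptions, since the policy does not define how a "high" change score is measured.

```python
from dataclasses import dataclass

SENSITIVE_PAGE_TYPES = {"legal", "policy", "tos"}
SENSITIVE_IMPACTS = {"price", "eligibility", "security", "compliance"}
HIGH_CHANGE_SCORE = 0.7  # hypothetical threshold on a 0..1 scale


@dataclass(frozen=True)
class Change:
    """Hypothetical record describing one summarized change."""
    page_type: str
    confidence: str                   # "High", "Medium", or "Low"
    change_score: float
    impacts: frozenset = frozenset()  # impact labels suggested by the diff
    trivial: bool = False


def needs_human_review(c: Change) -> bool:
    """Return True if any of the three review triggers applies."""
    if c.page_type in SENSITIVE_PAGE_TYPES and not c.trivial:
        return True
    if c.impacts & SENSITIVE_IMPACTS:
        return True
    return c.confidence == "Low" and c.change_score >= HIGH_CHANGE_SCORE
```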

Compliance

This policy complements /docs/policies/fair-use-and-non-substitutive.md and is enforced by prompt templates, output validators, and fallback behavior.
