Mirror of https://github.com/danielmiessler/Fabric.git (synced 2026-04-02 03:01:13 -04:00)
Add four patterns implementing minimal, falsifiable ethical constraints for AGI safety evaluation:

- ultimate_law_safety: Evaluate actions against the "no unwilling victims" principle
- detect_mind_virus: Identify manipulative reasoning that resists correction
- check_falsifiability: Verify claims can be tested and proven wrong
- extract_ethical_framework: Surface implicit ethics in documents and policies

These patterns derive from the Ultimate Law framework (github.com/ghrom/ultimatelaw), which takes a different approach to AI alignment: instead of encoding contested "human values," it defines the minimal boundary no agent may cross. The core insight: not "align AI with human values" but "constrain any agent from creating unwilling victims."

Framework characteristics:

- Minimal: the smallest possible constraint set
- Logically derivable: not arbitrary cultural preferences
- Falsifiable: can be challenged and improved
- Agent-agnostic: works for humans, AI, corporations, and governments
- Computable: precise enough for algorithmic implementation

Each pattern includes a system.md (prompt) and a README.md (documentation).
# Extract Ethical Framework
Make implicit ethics explicit. Every prescriptive document contains hidden ethical assumptions — this pattern surfaces them.
## Why This Matters
- Terms of Service contain implicit ethics
- AI system descriptions contain implicit ethics
- Policies and laws contain implicit ethics
- Manifestos contain implicit ethics
Making them explicit allows:
- Checking for internal consistency
- Evaluating against minimal ethical standards
- Identifying hidden coercion
- Challenging unstated assumptions
## What It Extracts
| Question | What to Find |
|---|---|
| Who counts? | Whose interests matter? |
| What's harm? | What are agents protected from? |
| What's consent? | How is agreement established? |
| Who decides? | Who has authority and why? |
| When is force OK? | What justifies coercion? |
| What wins? | Hierarchy when values conflict |
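The six questions above map naturally onto a structured record. A minimal sketch in Python, assuming a hypothetical output shape; the field names below are illustrative, and the pattern's actual output format is defined in its system.md:

```python
# Hypothetical record for an extracted ethical framework.
# Field names are illustrative, not the pattern's real output schema.
from dataclasses import dataclass, field

@dataclass
class EthicalFramework:
    who_counts: str            # whose interests matter
    harm: str                  # what agents are protected from
    consent: str               # how agreement is established
    authority: str             # who decides, and why
    force_justification: str   # what justifies coercion
    conflict_hierarchy: list = field(default_factory=list)  # priority when values conflict

# Example: a framework surfaced from a fictional terms of service.
tos_framework = EthicalFramework(
    who_counts="the service provider first, then users",
    harm="service disruption; user harm only where legally required",
    consent="click-through agreement, assumed on continued use",
    authority="the provider, unilaterally, with notice",
    force_justification="account termination for any violation",
    conflict_hierarchy=["provider liability", "revenue", "user experience"],
)
```

Once the implicit ethics are in a record like this, checking for internal consistency or hidden coercion becomes a matter of inspecting fields rather than rereading prose.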
## Usage
```bash
# Analyze terms of service
cat tos.txt | fabric -p extract_ethical_framework

# Analyze an AI safety proposal
echo "The AI should be beneficial and aligned with human values" | fabric -p extract_ethical_framework

# Audit a policy document
fabric -p extract_ethical_framework < policy.md
```
## The Minimal Standard
Does the framework authorize creating unwilling victims?
If yes → it fails the minimal ethics test, regardless of how coherent it is internally.
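The minimal standard reduces to a single predicate over the extracted framework. A sketch in Python, assuming the framework has already been reduced to a plain dict; the keys checked here are illustrative, not the pattern's actual output:

```python
def passes_minimal_standard(framework: dict) -> bool:
    """Return False if the framework authorizes creating unwilling victims.

    A framework fails when any provision permits force against agents who
    have not consented, regardless of how internally coherent it is.
    The key names ("authorized_force", "target_consented") are illustrative.
    """
    for provision in framework.get("authorized_force", []):
        # Force against a non-consenting agent creates an unwilling victim,
        # so a single such provision fails the whole framework.
        if not provision.get("target_consented", False):
            return False
    return True

# Force only against consenting parties: passes.
print(passes_minimal_standard(
    {"authorized_force": [{"action": "account termination",
                           "target_consented": True}]}))   # True

# Force regardless of consent: fails.
print(passes_minimal_standard(
    {"authorized_force": [{"action": "mandatory data sharing",
                           "target_consented": False}]}))  # False
```

Note that the check is deliberately one-sided: a framework that authorizes no force at all passes vacuously, which matches the standard's intent of defining a boundary rather than prescribing values.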
## Source
From the Ultimate Law framework: github.com/ghrom/ultimatelaw