Embedded Agency & Decision Theory

AI agents embedded in their environment

⏱️ 12 hours · Advanced

Embedded Agency

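Traditional decision theory treats the agent as separate from its environment, interacting with it only through well-defined observations and actions. An embedded agent is instead part of the world it reasons about: it runs on the same physics, is smaller than its environment, and can be changed by it.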

Key Challenges

Self-Reference: The agent's own computations are part of, and affect, the world it is modeling
Logical Uncertainty: With limited compute, the agent is uncertain about logical facts it has not yet worked out
Naturalized Induction: Learning about an environment that contains the learner
Robust Delegation: Creating successors or modifying oneself without losing the original goals
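
To make logical uncertainty concrete, here is a minimal sketch, not a formal treatment. Whether a given integer is prime is a fixed logical fact, yet an agent that has not run the computation can only hold a credence about it. The function names are hypothetical, and using the prime number theorem's density 1/ln(n) as the prior is an illustrative assumption.

```python
import math


def prior_prime_credence(n: int) -> float:
    """Cheap guess before computing: prime density near n is roughly 1/ln(n).

    Assumption for illustration only; clamped so it stays a valid probability.
    """
    if n < 2:
        return 0.0
    return min(1.0, 1.0 / math.log(n))


def is_prime(n: int) -> bool:
    """The expensive step: settle the logical fact by trial division."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True


def credence_after_compute(n: int) -> float:
    """Once the computation has run, the uncertainty collapses to 0 or 1."""
    return 1.0 if is_prime(n) else 0.0


if __name__ == "__main__":
    n = 1_000_003
    print("before computing:", prior_prime_credence(n))
    print("after computing: ", credence_after_compute(n))
```

The fact never changes between the two calls; only the agent's computational state does. That gap between "true" and "known to the agent" is what bounded reasoners have to manage.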

Decision Theory Problems

Newcomb's Problem and decision theory paradoxes
Logical counterfactuals and updateless decision theory
Coordination without communication
Reflective stability and self-modification
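
The disagreement in Newcomb's Problem can be sketched as two expected-value calculations. This is a simplified illustration: the payoffs ($1,000 in the transparent box, $1,000,000 in the opaque one) follow the usual telling of the problem, while the function names and the `accuracy` and `p_full` parameters are assumptions made for this example.

```python
def evidential_values(accuracy: float, small: int = 1_000, big: int = 1_000_000) -> dict:
    """Treat the choice as evidence about the prediction.

    `accuracy` is the probability that the predictor foresaw the actual choice.
    """
    one_box = accuracy * big                # opaque box is full iff predicted correctly
    two_box = (1 - accuracy) * big + small  # opaque box is full only if predictor erred
    return {"one_box": one_box, "two_box": two_box}


def causal_values(p_full: float, small: int = 1_000, big: int = 1_000_000) -> dict:
    """Treat the box contents as already fixed, with probability `p_full` of being full."""
    one_box = p_full * big
    two_box = p_full * big + small          # same contents, plus the visible money
    return {"one_box": one_box, "two_box": two_box}


if __name__ == "__main__":
    print("evidential, accuracy 0.99:", evidential_values(0.99))
    print("causal, p_full 0.5:       ", causal_values(0.5))
```

Under the evidential calculation, one-boxing wins whenever the predictor's accuracy exceeds (big + small) / (2 * big), which is 0.5005 with these payoffs. Under the causal calculation, two-boxing is ahead by exactly `small` for every value of `p_full`, which is why the two approaches recommend different actions.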

Implications for AI Safety

AIs reasoning about their own training
Self-fulfilling prophecies and fixed points
Corrigibility and shutdown problems
Value stability under self-improvement
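
A self-fulfilling prophecy can be framed as a fixed-point problem: a published forecast changes the outcome it forecasts, so a consistent forecast p must satisfy p = response(p). The sketch below is a toy under stated assumptions. The linear `response` function and its coefficients are invented for illustration, and plain iteration is only one way to search for a fixed point; it converges here because the response changes more slowly than the forecast does.

```python
from typing import Callable


def find_fixed_point(
    response: Callable[[float], float],
    guess: float = 0.0,
    tol: float = 1e-9,
    max_iter: int = 10_000,
) -> float:
    """Iterate p <- response(p) until successive forecasts differ by less than `tol`."""
    p = guess
    for _ in range(max_iter):
        nxt = response(p)
        if abs(nxt - p) < tol:
            return nxt
        p = nxt
    raise RuntimeError("iteration did not converge")


def response(p: float) -> float:
    """Hypothetical world: the event's true probability rises with the published forecast."""
    return 0.2 + 0.6 * p


if __name__ == "__main__":
    print("self-consistent forecast:", find_fixed_point(response))
```

For this particular `response`, solving p = 0.2 + 0.6p by hand gives p = 0.5, and the iteration approaches the same value. A world with several fixed points would leave the predictor a choice of which prophecy to fulfil, which is where the safety concern comes in.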

Research Directions

Logical induction and bounded rationality
Functional decision theory
Cartesian frames and boundaries
Finite factored sets
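
The intuition behind functional decision theory and coordination without communication is often illustrated with a prisoner's dilemma played against an exact copy of oneself. The code below is a toy sketch of that intuition, not a formalization of FDT. The payoff numbers and function names are assumptions chosen for this example.

```python
# Payoff to "me" for (my_action, other_action); values are illustrative.
PAYOFF = {
    ("C", "C"): 3,
    ("C", "D"): 0,
    ("D", "C"): 5,
    ("D", "D"): 1,
}


def best_action_independent(other_action: str) -> str:
    """Hold the other player's action fixed and pick the best reply to it."""
    return max("CD", key=lambda a: PAYOFF[(a, other_action)])


def best_action_mirrored() -> str:
    """Assume the other player runs the same decision procedure,

    so whatever this function outputs, the twin outputs too.
    """
    return max("CD", key=lambda a: PAYOFF[(a, a)])


if __name__ == "__main__":
    print("vs. fixed C:", best_action_independent("C"))
    print("vs. fixed D:", best_action_independent("D"))
    print("vs. own copy:", best_action_mirrored())
```

Treating the other action as fixed makes defection the best reply in both cases. Treating both outputs as produced by one shared procedure makes cooperation the better choice. No messages are exchanged in either case; the coordination comes entirely from reasoning about what the common algorithm will output.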
