Embedded Agency & Decision Theory
AI agents embedded in their environment
⏱️ 12 hours · Advanced
Embedded Agency
Traditional decision theory assumes agents are separate from their environment. Embedded agents are part of the world they're reasoning about.
Key Challenges
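- Self-Reference: The agent's own computations are part of, and affect, the world it is modeling
- Logical Uncertainty: With limited compute, the agent is uncertain even about logical facts, such as the result of a computation it has not finished
- Naturalized Induction: Learning about the world while being a physical part of it, rather than observing it from outside
- Robust Delegation: Creating successors or modifying oneself while keeping the resulting agent's goals intact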
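To make the logical uncertainty challenge concrete, here is a minimal Python sketch of a compute-bounded agent deciding whether a number is prime. The answer is fully determined by logic, yet the agent may run out of budget before finding it and must then fall back on a credence. The function name `credence_is_prime`, the `budget` parameter, and the fallback prior of roughly 1/ln(n) (the density of primes near n) are illustrative assumptions, not a method from this module.

```python
import math


def credence_is_prime(n: int, budget: int) -> float:
    """Return a credence in [0, 1] that n is prime.

    Hypothetical illustration: the agent may perform at most `budget`
    trial divisions. If it finishes, it knows the answer with certainty.
    If it runs out of compute, it falls back on a crude prior.
    """
    if n < 2:
        return 0.0
    limit = math.isqrt(n)
    steps = 0
    d = 2
    while d <= limit:
        if steps >= budget:
            # Out of compute: the fact is settled, but not known to the agent.
            # Assumed fallback: prime density near n is about 1 / ln(n).
            return min(1.0, 1.0 / math.log(n))
        if n % d == 0:
            return 0.0  # found a divisor, so n is certainly composite
        d += 1
        steps += 1
    return 1.0  # every candidate divisor was ruled out


# Same question, different compute budgets.
for budget in (5, 1000):
    print(budget, credence_is_prime(10007, budget))
```

With a small budget the agent reports an intermediate credence. With enough budget the same question collapses to 0 or 1. The fallback prior is deliberately crude; it ignores the evidence that small divisors were already ruled out.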
Decision Theory Problems
- Newcomb's Problem and decision theory paradoxes
- Logical counterfactuals and updateless decision theory
- Coordination without communication
- Reflective stability and self-modification
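Newcomb's Problem, listed above, can be sketched numerically. The code below uses the standard payoffs ($1,000,000 in the opaque box if the predictor foresaw one-boxing, $1,000 in the transparent box) and compares two ways of scoring an action. The function names and the 0.99 predictor accuracy are assumptions chosen for illustration.

```python
BIG = 1_000_000   # opaque box content if one-boxing was predicted
SMALL = 1_000     # transparent box content, always present
ACTIONS = ("one-box", "two-box")


def payoff(action: str, box_full: bool) -> int:
    """Money received for an action given the opaque box's state."""
    opaque = BIG if box_full else 0
    return opaque + (SMALL if action == "two-box" else 0)


def edt_value(action: str, accuracy: float) -> float:
    """Evidential scoring: the action is treated as evidence about the prediction."""
    p_full = accuracy if action == "one-box" else 1.0 - accuracy
    return p_full * payoff(action, True) + (1.0 - p_full) * payoff(action, False)


def cdt_value(action: str, p_full: float) -> float:
    """Causal scoring: the box content is already fixed, independent of the action."""
    return p_full * payoff(action, True) + (1.0 - p_full) * payoff(action, False)


accuracy = 0.99  # assumed predictor accuracy
for action in ACTIONS:
    print(action,
          "EDT:", round(edt_value(action, accuracy)),
          "CDT (p_full=0.5):", round(cdt_value(action, 0.5)))
```

Under causal scoring, two-boxing is worth exactly $1,000 more for any fixed `p_full`, so it always wins. Under evidential scoring, one-boxing wins whenever the predictor's accuracy exceeds 0.5005. The two theories therefore disagree for any reasonably accurate predictor, and that disagreement is the paradox.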
Implications for AI Safety
- AIs reasoning about their own training
- Self-fulfilling prophecies and fixed points
- Corrigibility and shutdown problems
- Value stability under self-improvement
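The link between self-fulfilling prophecies and fixed points can be illustrated with a toy model. Suppose a forecast, once announced, changes the probability of the event it forecasts. A forecast is then self-consistent only if it is a fixed point of the world's response: p = f(p). The linear `response` function below is an invented example, and simple iteration is only one way to find such a point; it converges here because the response is a contraction.

```python
from typing import Callable


def response(forecast: float) -> float:
    """Hypothetical world: the event's true probability after the forecast is announced."""
    return 0.2 + 0.5 * forecast


def find_fixed_point(f: Callable[[float], float],
                     start: float = 0.0,
                     tol: float = 1e-9,
                     max_iter: int = 1000) -> float:
    """Iterate p -> f(p) until successive values differ by less than tol."""
    p = start
    for _ in range(max_iter):
        nxt = f(p)
        if abs(nxt - p) < tol:
            return nxt
        p = nxt
    raise RuntimeError("iteration did not converge")


p_star = find_fixed_point(response)
print("self-consistent forecast:", round(p_star, 6))
print("world's response to it:  ", round(response(p_star), 6))
```

Any forecast other than the fixed point is falsified by its own announcement. When a response function has several fixed points, the predictor's choice among them effectively selects the outcome, which is where the safety concern arises.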
Research Directions
- Logical induction and bounded rationality
- Functional decision theory
- Cartesian frames and boundaries
- Finite factored sets