Embedded Agency & Decision Theory

AI agents embedded in their environment

⏱️ 12 hours · Advanced

Embedded Agency

Traditional decision theory assumes that an agent is separate from its environment and acts on it from the outside. An embedded agent is instead part of the world it reasons about, so its model of the world has to include itself.

Key Challenges

  • Self-Reference: The agent's own computations are part of the world it is modeling, and they affect it
  • Logical Uncertainty: With limited compute, the agent is uncertain even about logical facts
  • Naturalized Induction: Learning about an environment while being embedded in it
  • Robust Delegation: Creating successors or modifying oneself

Decision Theory Problems

  • Newcomb's Problem and decision theory paradoxes
  • Logical counterfactuals and updateless decision theory
  • Coordination without communication
  • Reflective stability and self-modification
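
As a rough illustration of why Newcomb's Problem splits decision theories, the sketch below compares two ways of scoring the same choice. It assumes the conventional payoffs ($1,000 in the transparent box, $1,000,000 in the opaque box if one-boxing was predicted); these amounts, the function names, and the `accuracy` and `prob_filled` parameters are illustrative assumptions, not part of the outline above.

```python
from fractions import Fraction

# Assumed payoffs (the conventional textbook values, not taken from this outline).
SMALL = 1_000        # transparent box, always contains this amount
LARGE = 1_000_000    # opaque box, filled only if one-boxing was predicted


def evidential_values(accuracy):
    """Score each action by treating the action as evidence about the prediction.

    `accuracy` is the probability that the predictor guessed the agent's
    actual choice correctly.
    """
    return {
        "one_box": accuracy * LARGE,
        "two_box": (1 - accuracy) * LARGE + SMALL,
    }


def causal_values(prob_filled):
    """Score each action while holding the box contents fixed.

    `prob_filled` is the agent's belief that the opaque box is already full;
    the action is assumed to have no influence on it.
    """
    return {
        "one_box": prob_filled * LARGE,
        "two_box": prob_filled * LARGE + SMALL,
    }


def best_action(values):
    """Return the action with the highest expected payoff."""
    return max(values, key=values.get)


if __name__ == "__main__":
    print(best_action(evidential_values(Fraction(99, 100))))
    print(best_action(causal_values(Fraction(1, 2))))
```

Under these assumptions the evidential scoring favors one-boxing whenever the predictor's accuracy exceeds (LARGE + SMALL) / (2 · LARGE), while the causal scoring favors two-boxing for every fixed belief about the box contents, since taking both boxes always adds SMALL.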

Implications for AI Safety

  • AIs reasoning about their own training
  • Self-fulfilling prophecies and fixed points
  • Corrigibility and shutdown problems
  • Value stability under self-improvement
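
To make the idea of a self-fulfilling prophecy as a fixed point more concrete, here is a minimal sketch. It assumes a toy world in which announcing a probability `p` for an event changes the event's true probability to `response(p)`; a prediction is self-consistent exactly when `response(p) == p`. The linear `response` function and the helper name `find_fixed_point` are hypothetical, chosen only so that simple iteration converges.

```python
def response(p):
    """Toy assumption: the true probability of the event once p is announced."""
    return 0.2 + 0.6 * p


def find_fixed_point(f, start=0.0, tol=1e-12, max_iter=1000):
    """Iterate p <- f(p) until successive values differ by less than tol.

    This simple scheme is only guaranteed to converge when f is a
    contraction, which holds for the toy `response` above (slope 0.6).
    """
    p = start
    for _ in range(max_iter):
        new_p = f(p)
        if abs(new_p - p) < tol:
            return new_p
        p = new_p
    raise RuntimeError("no fixed point found within max_iter iterations")


if __name__ == "__main__":
    p_star = find_fixed_point(response)
    print(f"self-consistent prediction: {p_star:.6f}")
```

In this toy setup the only self-consistent announcement is p = 0.5, because 0.2 + 0.6 · 0.5 = 0.5. Real predictors may face several fixed points or none, which is part of what makes the problem relevant to safety.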

Research Directions

  • Logical induction and bounded rationality
  • Functional decision theory
  • Cartesian frames and boundaries
  • Finite factored sets