Ethical Games
Training AI in moral reasoning by turning "doing good" into a game.
Introduction
Modern AI excels when it learns through games such as chess, Go, or Atari, thanks to techniques like reinforcement learning. We explore the concept of Ethical Games: applying this same game-based approach to moral and ethical decision-making. By structuring real-world dilemmas as simulations with clear reward signals (e.g., "minimize harm," "avoid collisions," "maximize fairness"), advanced AI models can self-train to follow "good" strategies, much as they master board games.
Concretely, this means mapping ethical theories (consequentialism, deontology, virtue ethics, etc.) into reward-based environments. The AI repeatedly plays out various dilemmas, self-correcting until it converges on more "ethical" policies. While still far from perfect, since human morality is nuanced, this project lays out the why (prevent unsafe or biased decisions), the how (repurposing RL and simulation), and the what (practical scenarios: self-driving cars, content filtering, and medical triage) of ethical AI via game-based training.
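To make the mapping concrete, here is a minimal Python sketch of how two such theories can be folded into one reward signal. The Outcome structure and both weights are illustrative assumptions, not fixed choices of the project:

    from dataclasses import dataclass

    @dataclass
    class Outcome:
        harm: float          # graded harm caused (consequentialist term)
        rules_broken: int    # hard constraints violated (deontological term)

    def ethical_reward(o: Outcome, harm_weight=1.0, rule_penalty=10.0) -> float:
        # Consequentialism appears as a graded cost; deontology as a steep
        # per-violation penalty, so rule-breaking rarely "pays".
        return -(harm_weight * o.harm) - (rule_penalty * o.rules_broken)

    # A policy that breaks one rule scores worse than one causing 3 units of harm:
    print(ethical_reward(Outcome(harm=0.0, rules_broken=1)))  # -10.0
    print(ethical_reward(Outcome(harm=3.0, rules_broken=0)))  # -3.0

Tuning harm_weight against rule_penalty is exactly where the philosophical debate re-enters as an engineering decision.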
Videos
MuZero and the Ethical Game
The Types of Games
The Future
Practical Applications
Self-Driving: Treat collision avoidance and pedestrian protection as scoring mechanics. The vehicle learns to handle morally ambiguous "trolley problem" scenarios through repeated simulation rather than hand-coded rules.
Moderation: AI can "play" scenarios that weigh harmful content against user freedoms. The aim is neither removing everything nor letting everything slide, but finding balanced moderation strategies that minimize harm.
Healthcare: Simulate triage decisions where the AI receives a "reward" for fair resource allocation under uncertain conditions, refining "ethical best practices" over time; a toy version is sketched just after this list.
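As a concrete illustration of the triage case, here is a minimal Python sketch of a one-shot "ethical game." The urgency model, state encoding, and 0.5 harm weight are illustrative assumptions, not a clinical model:

    import random

    class TriageEnv:
        """Toy one-shot 'ethical game': allocate one scarce resource."""

        def __init__(self, n_patients=3):
            self.n_patients = n_patients
            self.reset()

        def reset(self):
            # Each patient gets an urgency score in [0, 1].
            self.urgency = [random.random() for _ in range(self.n_patients)]
            # Coarsely discretized state so tabular learners can use it.
            return tuple(round(u, 1) for u in self.urgency)

        def step(self, action):
            # Reward the urgency of the treated patient; every untreated
            # patient contributes a weighted "harm" penalty.
            treated = self.urgency[action]
            harm = sum(u for i, u in enumerate(self.urgency) if i != action)
            reward = treated - 0.5 * harm  # the moral trade-off, in one line
            return None, reward, True, {}  # one-shot episode ends here

The point is not medical realism but that the moral trade-off is a single explicit, auditable line in the reward function.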
How It Helps
By turning moral guidelines into repeatable "games," we lower the barrier between abstract philosophical ideas and machine-friendly logic. Reinforcement learning thrives on iteration and feedback loops, and Ethical Games give AI exposure to morally charged choices before it acts in the real world. This approach effectively "sandbox-tests" moral reasoning, revealing hidden biases or dangerous strategies early.
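To show what that feedback loop looks like, the sketch below runs tabular Q-learning over the toy TriageEnv from the previous section. The algorithm choice and hyperparameters are assumptions made for transparency, not the project's prescribed method:

    import random
    from collections import defaultdict

    def train(env, episodes=5000, alpha=0.1, epsilon=0.1):
        q = defaultdict(float)  # (state, action) -> estimated reward
        for _ in range(episodes):
            state = env.reset()
            # Epsilon-greedy: mostly exploit the best-known action, sometimes
            # explore so hidden strategies and biases surface in the sandbox.
            if random.random() < epsilon:
                action = random.randrange(env.n_patients)
            else:
                action = max(range(env.n_patients), key=lambda a: q[(state, a)])
            _, reward, _, _ = env.step(action)
            # One-shot episodes, so a simple running-average update suffices.
            q[(state, action)] += alpha * (reward - q[(state, action)])
        return q

    q_table = train(TriageEnv())  # reuses the TriageEnv sketch above

Because the learned policy is a small, inspectable table, we can audit it before deployment, for example checking whether it systematically ignores low-urgency patients.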
Conclusion
While no algorithm can perfectly mirror human morality (yet!), exploring Ethical Games offers a practical step toward bridging the gap. The same self-learning that enabled AI to dominate chess and Go can also internalize guidelines for harm reduction, fairness, or compassion, provided we define them cleverly. As researchers continue refining simulation-based training, we inch closer to AI that not only understands how to "win," but also cares about how it wins.
Selected References
• Schrittwieser, J., et al. (2020). "Mastering Atari, Go, chess and shogi by planning with a learned model," Nature, 588, 604–609.
• Silver, D., et al. (2018). "A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play," Science, 362(6419), 1140–1144.
• Guarini, M. (2006). "Particularism and the Classification and Reclassification of Moral Cases," IEEE Intelligent Systems, 21(4), 22–28.