The Limits of Artificial Intelligence in Ethical Decision Making
Judges and doctors often face deep uncertainty in decision making. Artificial intelligence might help, but only if it can learn to reflect human values.
Every day, physicians and judges make high-stakes decisions under intense time pressure and with imperfect information. A new paper by Yale economist W. Bentley MacLeod develops an economic framework to explain how these professionals reason under uncertainty—and when artificial intelligence can meaningfully assist them.
MacLeod begins by distinguishing between slow, deliberative decisions and fast, intuitive ones, a distinction rooted in behavioral science (often described as System 2 vs. System 1). Slow decisions resemble classical rational choice: gather information, weigh probabilities, and choose the best option. Fast decisions rely on pattern recognition—doctors quickly identifying a heart attack, judges instantly recognizing a high-risk defendant. Both forms of reasoning are essential, but both are constrained by uncertainty and limited feedback.
Professional work is uniquely challenging because no choice guarantees the correct outcome: a patient may not recover even after ideal treatment; a low-risk defendant may still reoffend. As MacLeod notes, this means neither the client nor the professional ever fully knows whether a given decision was right. In these gray zones—what the economist Frank Knight called uncertainty—human values matter. Professionals rely on ethical norms (such as judicial impartiality or physicians’ duty to prioritize patient welfare) to guide choices when evidence is ambiguous.
“It remains an open question whether AI systems can be trained to make high-quality decisions in complex cases where outcomes are uncertain. In such cases, the way a client evaluates trade-offs is an important input into the optimal decision; this raises a legitimate concern that AI systems may not appropriately incorporate human values.”
AI enters this landscape as both promise and puzzle. When high-quality, labeled data exist—as in radiology or certain bail-risk predictions—machine-learning systems can outperform humans or enhance human accuracy. But MacLeod emphasizes that most professional decisions occur in the “long tail”: rare, complex, poorly labeled situations where rules are unclear and values differ across clients. In these cases, AI trained purely to minimize prediction error can produce recommendations misaligned with human preferences—for example, undervaluing patient risk tolerance or misinterpreting judicial fairness standards.
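To make the point concrete, here is a minimal sketch (not taken from MacLeod's paper) of how a purely predictive rule and a value-aware rule can diverge. The function names, threshold, and cost figures are hypothetical assumptions chosen only for illustration.

```python
# Illustrative sketch: the same predicted risk can yield different "optimal"
# decisions once a client's own valuation of the trade-offs is included.
# All names and numbers are hypothetical, not from MacLeod's paper.

def prediction_only_recommendation(p_bad_outcome: float) -> str:
    """A purely predictive rule: recommend intervening whenever the
    predicted probability of a bad outcome crosses a fixed 0.5 threshold."""
    return "intervene" if p_bad_outcome > 0.5 else "do not intervene"

def value_aware_recommendation(p_bad_outcome: float,
                               cost_of_intervention: float,
                               cost_of_bad_outcome: float) -> str:
    """A decision rule that weighs the client's valuation of the trade-off:
    intervene only if the expected cost of waiting exceeds the cost the
    client attaches to the intervention itself."""
    expected_cost_of_waiting = p_bad_outcome * cost_of_bad_outcome
    if expected_cost_of_waiting > cost_of_intervention:
        return "intervene"
    return "do not intervene"

# Two hypothetical patients share the same predicted risk but hold different values.
p = 0.4  # model's predicted probability of a bad outcome

# The prediction-only rule gives one answer for everyone.
print(prediction_only_recommendation(p))  # -> "do not intervene"

# Patient A tolerates risk and dreads the burdens of treatment.
print(value_aware_recommendation(p, cost_of_intervention=50,
                                 cost_of_bad_outcome=100))  # -> "do not intervene"

# Patient B views the bad outcome as catastrophic relative to treatment.
print(value_aware_recommendation(p, cost_of_intervention=50,
                                 cost_of_bad_outcome=500))  # -> "intervene"
```

The sketch is deliberately simple, but it captures the article's claim: minimizing prediction error alone fixes the decision rule for every client, while the decision MacLeod calls optimal depends on how each client weighs the possible outcomes.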
The paper argues that AI decision tools must ultimately incorporate human values—not just data—to be useful and trustworthy. Guidelines, defaults, and professional norms will continue to play a critical role, and AI must be designed to work within these structures rather than override them. For now, MacLeod concludes, AI can assist professional judgment but cannot replace the human reasoning required when information is limited and values matter most.