B Hari

March 3, 2026

The Fortress of Judgment: What AI Cannot Replace and Why It Matters More Than Ever

As machines grow more capable, the skills they cannot replicate are not becoming obsolete. They are becoming more valuable.

---

In February 2026, a team at Mount Sinai and Harvard published the first independent evaluation of ChatGPT Health across 960 clinical interactions. The system, used by millions for medical advice, under-triaged more than half of cases that physicians determined required emergency care. In one asthma scenario, the AI correctly identified early warning signs of respiratory failure in its reasoning -- then advised the patient to wait rather than seek emergency treatment.

The model understood the pattern. It missed the judgment.

This distinction -- between pattern recognition and judgment -- is the fault line that separates what AI does extraordinarily well from what it cannot do at all. And in a year when AI capabilities have reached genuinely impressive heights, it is worth examining the other side of the ledger with equal rigour: the capabilities that remain stubbornly, perhaps permanently, human.


WE KNOW MORE THAN WE CAN TELL

In 1966, the philosopher Michael Polanyi articulated what he called the Tacit Dimension: "we can know more than we can tell." A master craftsman can shape wood by feel. A veteran nurse detects deterioration before the monitors do. A seasoned diplomat reads the room in ways no briefing document captures.

This is not mysticism. Research estimates that 70 to 80 percent of organizational knowledge is tacit -- unwritten, experience-based, and resistant to codification. AI attempts to overcome Polanyi's Paradox by learning from human examples, statistically absorbing how we communicate. But as a 2025 study in Nature confirms, AI remains "unable to engage in abductive reasoning, grasp analogies and metaphors, or interpret sparse or nuanced data." The approximation is impressive. But approximation is not understanding.

The neuroscientist Antonio Damasio sharpens this point from a biological direction. Consciousness, he argues, arises from homeostasis -- the continuous biological regulation that produces feelings like hunger, pain, and urgency. A machine can adjust its functioning in response to data, but it lacks the internal perception that characterises consciousness. Intelligence without a body, in Damasio's framework, is intelligence without stakes. And judgment without stakes is not judgment at all.


WHERE THE MACHINES FAIL

The gap between capability and judgment is most consequential in domains where errors carry moral weight.

In law, 2025 saw over 200 documented cases of AI-generated fabricated citations reaching judges, with courts issuing at least 66 sanctions for AI misuse -- including fines up to $31,000. Stanford research found that legal-specific AI tools hallucinate on 17 to 34 percent of queries; general-purpose models reached 69 to 88 percent error rates on legal questions. More revealing is what researchers at Lawfare call the "law of conservation of judgment": AI does not eliminate the hard decisions inherent to legal reasoning -- it merely relocates them. When researchers prompted ChatGPT and Claude with the same Third Amendment question, the models reached opposite conclusions and then both reversed their positions when presented with standard counterarguments.

In healthcare, beyond the Mount Sinai triage findings, AI-initiated palliative care referrals have generated significant disagreement among clinicians, patients, and families. The strongest diagnostic AI models still fall short on complex cases, landing closer to trainees than to seasoned radiologists. Malpractice claims involving AI tools rose 14 percent in 2024 compared to 2022.

In diplomacy, former US State Department official Dr. Donald Kilburg has warned that algorithms "cannot read the room" and risk escalating tensions by missing crucial cultural nuances. The USC Center on Public Diplomacy puts it starkly: AI "cannot grasp the emotional weight of a grieving mother's testimony" in peace negotiations.

In mental health, a Stanford study found that when patients mentioned suicidal thoughts, AI failed to provide clinically appropriate responses 20 percent of the time, compared to 7 percent for human therapists. The therapeutic alliance -- the bond between therapist and patient -- remains one of the strongest predictors of positive treatment outcomes. No algorithm has replicated it. Research published in the Journal of Medical Internet Research found that AI-generated responses are consistently rated lower on affective and motivational empathy -- particularly once their artificial nature is revealed. The machine can simulate concern. It cannot commit to the person in front of it.


THE ECONOMICS OF IRREPLACEABILITY

The economic data reinforces rather than undermines the case for human indispensability. Nobel laureate Daron Acemoglu estimates that AI will produce a modest 1.1 to 1.6 percent increase in GDP over ten years, affecting roughly 5 percent of the economy -- primarily office tasks involving data summary, pattern recognition, and visual matching. His critique is pointed: "We currently have the wrong direction for AI. We're using it too much for automation and not enough for providing expertise and information to workers."

David Autor at MIT has demonstrated that augmentation innovations -- technologies that enhance human capacity -- strongly predict where new jobs emerge, while automation innovations do not. Erik Brynjolfsson's study of 5,000 customer support agents found that AI access helped resolve 14 percent more issues per hour, with novice workers seeing a 34 percent improvement -- a textbook case of complementarity rather than substitution.

McKinsey projects that demand for social and emotional skills -- communication, empathy, leadership -- will rise by 24 percent by 2030. This is the automation paradox: the more routine cognitive tasks are automated, the higher the relative value of uniquely human capabilities. As one analysis puts it, "under high automation, the most critical skills are needed precisely when they are practiced the least." Harvard Business School research reinforces the point: when AI and humans collaborate, the high-performing users select different advice from the AI to follow -- it is the quality of underlying human judgment, not the AI itself, that drives superior outcomes.


THE HONEST COUNTER-ARGUMENT

Intellectual honesty demands acknowledging that the boundary between human and machine capability has shifted before. Chess was once considered a test of irreducible human intuition until Deep Blue defeated Kasparov in 1997. Go required "intuitive" pattern recognition until AlphaGo won in 2016. Tasks that seemed uniquely human turned out to be sophisticated pattern matching.

But the distinction that matters now is qualitative, not quantitative. The tasks AI has conquered -- chess, Go, protein folding, image classification -- are all domains with well-defined rules, closed training distributions, and objectively verifiable outcomes. The tasks that remain human -- moral reasoning under genuine ambiguity, therapeutic presence, crisis leadership, cultural negotiation -- require what the philosopher Hubert Dreyfus called "absorbed coping": an engaged, embodied relationship with the world that generates meaning rather than merely processing it.

A 2025 academic analysis confirms that Dreyfus's critique, first articulated in the 1970s, applies equally to today's large language models. Significance must still be supplied from outside the system. They are, as a Nature paper puts it, "by design unable to step outside their training data to engage with the real world -- no matter how many times rainfall is simulated, it does not actually rain inside the computer."

This is not a technological barrier awaiting a breakthrough. It is an ontological one. For AI to solve moral dilemmas as humans do, it would have to become human -- to have a biological body, an evolutionary history, mortality, and social embeddedness. That is not an engineering challenge. It is a category error.

The metacognition problem deepens this gap. Studies show that reinforcement learning from human feedback systematically degrades AI self-calibration, producing systems that mistake linguistic fluency for correctness. One model achieved 88.4 percent confidence with only 11 percent accuracy -- essentially inverting the relationship between certainty and truth. Humans, for all our cognitive biases, possess a capacity AI structurally lacks: knowing what we do not know.
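The calibration failure described above can be made concrete with a few lines of code. The sketch below is purely illustrative -- the toy predictions are hypothetical numbers chosen to mimic the 88-percent-confident, 11-percent-accurate pattern cited, not data from any study -- but it shows how simply the gap between stated confidence and observed accuracy can be measured.

```python
# Illustrative sketch: measuring the gap between a system's stated
# confidence and its actual accuracy (a simple calibration check).
# The toy predictions below are hypothetical, not drawn from any study.

def calibration_gap(confidences, correct):
    """Mean stated confidence minus observed accuracy.

    A well-calibrated predictor scores near zero; a large positive
    value means the system is confident far more often than it is right.
    """
    assert len(confidences) == len(correct)
    mean_conf = sum(confidences) / len(confidences)
    accuracy = sum(correct) / len(correct)
    return mean_conf - accuracy

# Toy data mimicking a system that is ~88% confident
# but correct on only 1 of 9 answers (~11% accuracy).
confs = [0.90, 0.85, 0.92, 0.88, 0.87, 0.90, 0.86, 0.89, 0.84]
right = [1,    0,    0,    0,    0,    0,    0,    0,    0]

gap = calibration_gap(confs, right)
print(f"stated confidence exceeds accuracy by {gap:.0%}")  # → 77%
```

A human expert saying "I'm not sure, let me check" is, in this framing, performing exactly the self-correction that keeps this gap small -- the metacognitive step the cited systems lack.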


WHAT MACHINES MAKE MORE VALUABLE

Hannah Arendt, writing in 1958, distinguished between three forms of human activity: labour (cyclical survival tasks), work (creating lasting artifacts), and action (meaningful interaction between persons in public life). She warned that automation's danger lay not in replacing labour but in reducing all human activity to an "enormously intensified life process" -- efficient, cyclical, and devoid of meaning.

Her framework illuminates the present moment. AI is exceptionally good at labour and increasingly capable at certain forms of work. What it cannot participate in is action -- the relational, moral, communicative dimension of human life that creates meaning between people. The courtroom argument, the bedside conversation, the diplomatic concession, the teacher who notices a child is struggling before the child says a word -- these are forms of action, not labour. They require presence, not processing.

The proper response to AI's rise is therefore not defensive anxiety but analytical clarity. The question is not whether machines will replace us -- for the capabilities that matter most, the evidence strongly suggests they cannot. The question is whether we will invest in cultivating the irreplaceable: judgment, empathy, moral reasoning, and the willingness to bear the weight of decisions that have no optimally correct answer.

The automation paradox suggests we must. As AI handles more of the routine, what remains for humans grows harder, rarer, and more consequential. The fortress of judgment is not under siege. It is under renovation -- and the cost of neglecting it has never been higher.

---

B Hari

Simplicity with substance
www.bhari.com