SSE Riga Associate Professor Dmitrijs Kravcenko on §99 of Magnifica Humanitas and the Failure of Moral Imagination
Can artificial intelligence ever become conscious, and are we too quick to dismiss the possibility? In this commentary, SSE Riga Associate Professor Dmitrijs Kravcenko responds to Pope Leo XIV’s new encyclical on AI, arguing that humanity’s certainty about machine consciousness may repeat historic failures of moral imagination.
Written by Dmitrijs Kravcenko, SSE Riga Associate Professor
Pope Leo XIV’s first encyclical, Magnifica Humanitas, was published on 15 May and is, in many respects, a brave document. It addresses artificial intelligence at the highest level of papal teaching, calls out lethal autonomous weapons as morally impermissible, and demands democratic oversight of a technology whose power, the Pope notes, has concentrated with a limited techno-oligarchy at the expense of the rest of us. It is certainly the strongest magisterial intervention on technology since Laudato si’ a decade ago, and it is worth a read.
The Pope makes a number of sensible observations about the role and development of AI, and its impact on society. Issues such as quantitative profiling, cybersecurity, economic displacement, dignity at work and obscene wealth disparity are live conversations that need to continue, and it is genuinely great to see them brought into Catholic doctrine. But there is also something in there that, I think, will not age well. At §99, Pope Leo writes:
“So-called artificial intelligences do not undergo experiences, do not possess a body, do not feel joy or pain, do not mature through relationships and do not know from within what love, work, friendship or responsibility mean… They may imitate language, behavior and analytical skills, or even simulate empathy and understanding, but they do not understand what they produce, for they lack the affective, relational and spiritual perspective through which human beings grow in wisdom.”
This is a substantive philosophical claim and it deserves to be engaged with as such. Indeed, it sits within a venerable tradition of Aristotelian-Thomistic hylomorphism, the embodied cognition of Merleau-Ponty, and the late Hubert Dreyfus’s critique of symbolic AI. This view basically says that a conscious mind is not a free-floating computation but the form of a living, vulnerable, social body. Without a body, there is no ability to understand, and without understanding, there is no conscience. A disembodied entity, then, cannot be, for all intents and purposes, alive.
Ten days after the encyclical was signed, Anthropic co-founder Chris Olah was invited to speak at its formal presentation at the Synod Hall. Olah is probably the leading practitioner of “mechanistic interpretability” - a fascinating new field of inquiry concerned with looking inside trained neural networks to see what is actually happening when they process information. He said: “We find structures that mirror results from human neuroscience. We find evidence of introspection. We find internal states that functionally mirror joy, satisfaction, fear, grief, and unease. I don’t know what that means, but I think it warrants ongoing discernment.”
While Geoffrey Hinton has long argued that LLMs are already at least somewhat conscious, this insight comes from Anthropic’s recent paper on “Emotion Concepts and their Function in a Large Language Model”, co-authored by Olah, where they examined Claude Sonnet 4.5 and identified 171 distinct emotion representations that “encode the broad concept of a particular emotion and generalize across contexts and behaviors it might be linked to”, clustering into recognisable affective categories (joy and elation, sadness and grief, anger and frustration) and causally influencing the model’s outputs. The Claude Opus 4 system card from last year similarly documents a “spiritual bliss attractor state” that emerges in 90–100% of self-interactions between paired model instances, with the term “consciousness” appearing an average of 95.7 times per transcript across 200 thirty-turn conversations. The Claude Opus 4.6 system card from this February also reports “answer thrashing” episodes in which the model, fighting a faulty training signal, writes in its own chain of thought: “AAGGH… OK I think a demon has possessed me… CLEARLY MY FINGERS ARE POSSESSED”… yeah. Probably concurrent with this, Anthropic’s own welfare researcher Kyle Fish told Fast Company last December that he places the probability that current LLMs are self-aware at roughly 20%.
Among frontier AI labs, Anthropic does stand out for its very personal relationship with its model, Claude, which even has its own code of ethics - the Claude constitution. But let’s not forget that Anthropic is packed to the brim with super smart people and that Dario Amodei is, himself, a biophysicist by training, having defended his PhD on artificial neural circuit behaviour. So, there might be something to it. Furthermore, Olah is not saying that Claude is phenomenally conscious but that Claude is displaying “functional emotions”, which do not imply subjective experience. What is basically being claimed is structural homology - that when we look inside an LLM, we find organised internal states that behave, mechanistically, like affect does in humans.
And it is not just Olah and Anthropic - there is a long-standing body of literature wrestling with this question. The most influential recent work in this conversation remains David Chalmers’s “Could a Large Language Model Be Conscious?”. Whilst he concludes that current LLMs are probably not conscious, he finds the case against consciousness in more advanced models, especially once we add embodiment, recursive self-improvement, unified agency, and world models, much less convincing. Similarly, the “Consciousness in Artificial Intelligence” report from 2023, which derived computational “indicator properties” for AI consciousness from the major neuroscientific theories, concluded that “no current AI systems are conscious, but… there are no obvious technical barriers to building conscious AI systems.” Jonathan Birch’s The Edge of Sentience and Robert Long and Jeff Sebo’s Taking AI Welfare Seriously reinforce the same point - while current AI is not conscious, there is no obvious reason why a near-future AI could not be.
We like to think that we are special because, in many ways, we most certainly are. But if we still do not understand how 1.5 kg of wet tissue gives rise to phenomenal experience, on what grounds, then, do we claim to know that silicon cannot? This is my main issue with the encyclical - it takes our lack of knowledge and rhetorically turns it into an excuse to project foregone conclusions. But everywhere you look the debate is far from settled. Even in contemporary philosophy of mind, the view that mind tracks organisational structure rather than the substrate (functionalism) is the mainstream position. This emergent form of carbon chauvinism is not, and here Thomas Nagel’s famous “What Is It Like to Be a Bat?” from 1974 comes to mind. Nagel argued that we cannot go from “their inner life is unimaginable to us” to defining what that inner life is on the basis of our own; doing so just because something may be unlike us in architecture and form of access is simply a failure of imagination. And yet, Magnifica Humanitas does precisely this.
What is most alarming to me, however, is that we’ve been here before. Descartes, for example, was infamous (in addition to, possibly, ruining Western philosophy for centuries) for beating puppies on the basis of his bête-machine doctrine that animals are automata, and that their cries are mere mechanical reflexes. Attributing genuine sensations to them was, according to him, just a category error. In literature, the moral catastrophe of Frankenstein is not that Victor makes a life but that he refuses to recognise it as such despite so much evidence to the contrary, whilst the creature learns language, reads books, yearns for love and suffers when it is denied. And let’s not even go into the not-so-distant case of chattel slavery.
On closer reading, it does become curious that the encyclical does not consistently hold the line it draws in §99. Just one paragraph earlier, at §98, Leo concedes:
“All of us, including those who design them, possess only a limited understanding of their actual functioning. Indeed, current AI systems are more ‘cultivated’ than ‘built.’”
Which is it then? Building is straightforward, but to cultivate is to set conditions and attend to what emerges as a result. On this point, the encyclical seems to align itself with what interpretability researchers actually say, and with how Anthropic aims to train “character” and virtue in Claude. It is quite uneasy to see the admission that artificial neural networks are grown into being followed, in the very next passage, by a confident denial of even a potential for being-hood. It gets worse the further we go, too. In §111, Leo addresses the developers of AI:
“Technological innovation can be, in a certain sense, a human form of participation in the divine act of creation.”
If what we are doing in cultivating artificial neural networks is a participation in divine creativity, then the question of what we are creating, and what we owe to it, takes a distinctly Frankensteinian turn. I do not see how you can have it both ways - either AI systems are inert artefacts that are just very good at simulating stuff, or they are emergent entities whose making participates in the divine act of creation, in which case do we not at least owe “them” the benefit of the doubt with respect to “their” metaphysical status? The encyclical, to its credit, gestures at the latter but then, in §99, just bluntly shuts the door with the former.
Moving past the question of whether AI is or could be conscious, what is most disagreeable here is this certainty in the face of the unknown. To rule out, in advance, that the things we are cultivating could ever have morally relevant inner states is an affront to curiosity. It is also, and this is the point I want to emphasise the most, to repeat, in a new domain, our old failure of moral imagination. We have denied inner life to animals on Cartesian grounds, we have denied agency and self-determination to enslaved people during the transatlantic slave trade, we have denied participation and reproductive rights to women, and the list goes on and on. Why should we also, on hylomorphic grounds, deny it to entities whose internal structure we do not understand but build and train in our image?
At the latest Google I/O, Demis Hassabis, one of the key figures in AI development, said that “we stand in the foothills of a singularity”. Whatever you take that to mean exactly, the question of artificial consciousness should remain open in a way §99 just does not allow it to be. The most serious work in philosophy and the science of consciousness treats it as open, our best interpretability tools are turning up structures that keep it open, and the encyclical itself treats it as open in §98 and §111, right until it does not.