Can an AI take responsibility?

A mantra repeated several times at a healthcare conference I attended recently is that only humans, not AI, can take responsibility for something. This made me think more deeply about what it really means to take responsibility and what, if anything, sets humans and AIs apart in this respect.

I identified two rather different reasons why humans don’t think that AIs can or should take responsibility:

  • Humans, in contrast to AIs, are commonly attributed with free will and the ability to feel shame or guilt when they make mistakes. I will suggest below how this affects how we think about responsibility.
  • People in cognitive professions fear that AI will eventually take their jobs and are therefore unwilling to give it too much responsibility or “power”. I will also comment on this below.

At the end of this post I will suggest a way to make the whole issue go away.

Metaphorical truths and free will

Humans have throughout the ages developed metaphorical truths: rules or principles that, while factually or logically incorrect, confer a net benefit on the believers of the “truths”. One such metaphorical truth under a naturalistic world-view (see below) is that humans have free will. Another is that humans possess a persistent self that, according to some philosophical views, has moral and other kinds of attributes, described with adjectives such as virtuous or evil.

I will show below that the concept of taking responsibility is philosophically closely related to these two metaphorical truths and thus relies on human intuitions that are false even though they quite often do the job. I will first briefly summarize the three main views on free will: libertarian free will, naturalism, and compatibilism.

The libertarian free will view

Libertarians assert that free will exists and is incompatible with determinism. They believe individuals have the capacity to make genuinely free choices that are not determined by past events or natural laws. This means that a root cause analysis of a failure may stop at blaming the person’s somehow defective self instead of tracing a causal chain of events.

The naturalist view

Naturalists believe that all events, including human actions, are determined by natural laws and therefore by prior causes. Free will is an illusion, as every decision is the result of a causal chain of events. A root cause analysis of a failure can point at a fault in the human actor (objectively, without blame) or a fault in something external to the actor. The naturalist view makes no distinction in principle between a human actor and an AI actor.

The compatibilist view

Compatibilists believe in determinism but maintain that individuals have free will if they can act according to their internal motivations and desires without coercion by other people, even if those motivations are determined by prior causes. This is a different definition of free will than that of the libertarians. A root cause analysis of a failure will produce results similar to those in the naturalist case, perhaps with somewhat more emphasis on the inner motivations of the person (which are in turn determined by prior causes).

Taking responsibility defined

To define “taking responsibility”, we describe a flow of events involving two key actors: the assigner of tasks and the assignee. The assigner owns a list of tasks and defines their completion criteria (definition of done, DoD). The process can be outlined as follows:

  1. Task definition: Assigner defines a list of tasks with clear criteria for the definition of done.
  2. Task assignment: Assignee is either assigned a task by assigner or selects a task from the provided list.
  3. Capability evaluation: Assignee assesses whether they have the necessary skills and resources to perform the task and meet the definition of done.
  4. Commitment: Assignee commits to performing the task.
  5. Execution: Assignee performs the task, fully or partially satisfying the definition of done.
  6. Evaluation: Assigner evaluates the outcome of the task.
  7. Root cause analysis: Assigner performs a root cause analysis of any errors or nonconformities.
  8. Feedback: Assigner gives assignee feedback on the outcome of the task.
  9. Reaction: Assignee reacts to the feedback.
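
To make the flow concrete, here is a minimal sketch in Python of the assigner-assignee protocol outlined above. The class and method names are my own illustrative choices, not part of any existing framework; the point is that nothing in steps 1 to 5 presumes that the assignee is a human rather than an AI.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    """A task owned by the assigner, together with its definition of done (DoD)."""
    name: str
    definition_of_done: str        # step 1: completion criteria
    outcome: Optional[str] = None  # step 5: filled in by the assignee


class Assignee:
    """Any actor with skills; could equally well be a human or an AI."""

    def __init__(self, skills: set):
        self.skills = skills

    def can_perform(self, required_skill: str) -> bool:
        # Step 3: capability evaluation against skills and resources.
        return required_skill in self.skills

    def perform(self, task: Task) -> None:
        # Step 5: execution, fully or partially satisfying the DoD.
        task.outcome = f"work delivered against: {task.definition_of_done}"


class Assigner:
    """Owns the task list and the definitions of done."""

    def __init__(self, backlog: list):
        self.backlog = backlog  # step 1: task list

    def assign(self, task: Task, assignee: Assignee, required_skill: str) -> bool:
        # Steps 2-4: assignment, capability evaluation, commitment.
        return assignee.can_perform(required_skill)

    def evaluate(self, task: Task) -> str:
        # Steps 6-8: evaluation, root cause analysis, feedback.
        if task.outcome is None:
            return "nonconformity: analyse causes and give corrective feedback"
        return "DoD satisfied: give positive feedback"
```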

Examples of instantiations of the flow:

Scrum context (software development):

  • Assigner: Scrum product owner
  • Task list: Desired set of product features
  • Assignee: Software engineer

Societal context:

  • Assigner: Society as a whole
  • Task list: Set of expected behaviors
  • Assignee: Any member of society

In the case where the assigner is society as a whole, the term moral responsibility is often used to allude to some kind of higher, presumably common, principles of goodness (that may or may not be explicitly defined) against which the outcome of the task is evaluated.

Steps 1 to 5 above can in principle be performed interchangeably by a human or an AI, given that both have the skills to perform the task and protocols for communicating with the assigner and the task list. An example from the healthcare conference is an AI that triages patients in an emergency room. The feedback part of the flow, i.e., steps 6 to 9, on the other hand differs substantially depending on whether one believes in determinism or not.

Feedback under the libertarian free will view

If one takes the belief in libertarian free will to its extreme, then one must logically believe that a person’s actions can be explained solely by the character of the person, by attributes of the person’s self, rather than by factors such as the person’s genetic disposition, social situation, sensory input, psychiatric disorder, brain tumor, or childhood abuse. The cause of a person’s actions is explained by the person (the person’s self) having certain attributes such as being virtuous, evil, conscientious, or sloppy. The person is assumed to have been able to do otherwise (i.e., to not have been evil or sloppy) had they so decided.

The consequence of failing to produce the right outcome of a task is that one is held responsible and may be subjected to blame. In the case of breaking cultural norms, the feedback would likely be moral judgement; in the case of a good outcome, one may receive praise.

The blame may be manifested as a verbal reprimand or even a physical punishment. In the most unconstructive case, the blame offers the individual no explicit help in improving, if improvement is needed. Also, if blame is considered an appropriate response, the blamer may not even try to find any other explanations for the failure.

Blame, especially of the moral kind, has survived as a practice because it works well enough in practice [2].

The dual side of blame, and what makes it effective, is our capacity for subjective experiences in response to it, such as shame, guilt, and, in the extreme case, pain.

Blame may in the best case influence the person subjected to it to change for the better (this is a form of determinism which proponents of free will seem to accept, at least implicitly). It works well enough as a feedback mechanism to persist as a culturally accepted means for enforcing good behavior.

Blame also serves as a form of signaling to others about the blamer’s standing. By blaming or judging others, individuals can signal their own virtue, reliability, and alignment with communal values, thereby enhancing their reputation and social status within the community.

Since the advocates of libertarian free will don’t attribute free will to algorithms like an AI, they can’t blame algorithms. Neither do they believe that algorithms can be ashamed or feel guilty. They therefore can’t hold algorithms responsible in the way they can hold humans responsible.

Feedback under the naturalist view

Under the naturalistic worldview the brain is deterministic. There is no decision making mechanism in the brain that is free from cause and effect. There is always a causal chain of events that leads a person to take a certain action or make a certain decision (or to fail to take a certain action). And there is no way the person could have done otherwise given the state of the world right at that moment.

At bottom, everything boils down to elementary particles and quantum field theory, which is computable (at least probabilistically). What would traditionally be called the character of a person is likewise conditioned on prior causes such as genes, general health, social situation, and life events.

Moral judgement and blame become meaningless in a naturalistic world, in relation to both algorithms and humans. Because what exactly is it that we should blame: A step of the computation? A neuron or a set of neurons? The lack of serotonin? The random combination of genes from the two parents? A random mutation? And what does it mean to blame a computation or a random event? (Answer: nothing.)

Under the naturalistic paradigm, there is no difference in principle between a human and an AI with respect to responsibility.

Feedback under the compatibilist view

Compatibilists claim that while the world is deterministic, people can still be (morally) accountable if they act according to their desires in a rational way and are not impeded by external forces. The moral accountability must, as I understand it, be tied to the quality of the person’s desires and the quality of the person’s rational decision making ability. Both of those are formed by the person’s past in a deterministic way and may or may not be truly moral.

My interpretation is that compatibilists want to keep acting according to the conventional metaphorical truths, so as not to disturb the moral fabric of society too much, while still claiming to adhere to physics. For this purpose they have their own definitions of moral accountability and free will.

Whether compatibilists would assign blame or look for deterministic causes of a failure of the assignee to complete the task is unclear to me. They seem to have the option of both, although it then becomes unclear what exactly the object of any blame would be.

Responsibility, self and suffering

I posit that most people intuitively believe in the libertarian variety of free will and in a self. Conventionally, to take responsibility for a task therefore means to potentially subject oneself to blame or moral judgement if one fails to complete the task. The target of the blame is, in our example, the assignee who has failed to complete the task, and blame seems justified since the person is believed to have been able to “do otherwise” according to libertarian free will.

The blame attributed by the assigner causes the assignee to suffer by feeling, for example, shame, guilt, or pain. Being human, the assigner knows how unpleasant suffering can be and therefore feels assured that the assignee wants to avoid suffering just as much, and will thus be appropriately focused and motivated to complete the task to the assigner’s satisfaction.

Since an AI isn’t conventionally attributed a self or free will of any form, and since the AI cannot experience suffering, the assigner doesn’t feel they have a hold over it in the way they have over humans. They can’t hold it responsible if it fails and therefore don’t feel comfortable giving it responsibility.

In conclusion, taking responsibility in the conventional sense requires a self to blame (or to praise), libertarian free will, and the capability of subjective experiences, most notably suffering.

The threat of AI

The discussion about responsibility at the conference I mentioned above concerned only AI, not for instance dialysis machines or pacemakers, even though they can be as critical to the well-being of a human as an AI-based diagnosis application and are also for all practical purposes “black boxes” to their operators. Why does this discussion arise around AI but not other forms of automation that equally lack the ability to suffer? Why is there no debate about whether we can give responsibility to a dialysis machine to do its job? We do in fact give it a lot of responsibility by leaving it unattended.

I posit that humans, especially in the medical community, feel threatened by AI and are more eager to gain and keep control over it than over other types of technology. They are more willing to give responsibility – shift power – to a “stupid” dialysis machine than to an “intelligent” AI. The perception of AI as a competitor capable of challenging human intelligence and authority fuels the need to manifest power superiority by keeping AI in check. This psychological response is rooted in a desire to maintain dominance over a technology that could potentially rival human capabilities, ensuring that AI remains a tool under human control rather than an autonomous agent.

The talks at the conference indicate that the fear is to a large extent caused by ignorance and by the fact that, while AI is talked about daily, it is still rarely used in clinical practice, so there are not many examples that show its true powers and limitations.

A systems approach to avoid the whole question

To reduce unfounded and sometimes irrational fear of AI, we should use systems thinking and structured systems engineering principles (see this post).

Systems engineering is a method for going from stakeholder needs to a deliberately designed and implemented system that satisfies those needs. All systems have a design, whether they are deliberately designed or not. By using systems engineering we take firm control of the design process and ensure that we get the kind of system we want.

Sticking to healthcare, an example of a system is the medical imaging department of a hospital. It has stakeholders such as patients and hospital administrators. It is built of components such as imaging modalities (MR, CT, etc.), a PACS system for storage and display of images, a radiology information system for keeping track of requests and reports (if not done in the PACS system), radiologists, radiology nurses, and potentially AI-based diagnostic support applications.

Applying systems engineering, we first identify the requirements on the whole system. We then decompose those system-level requirements onto each subsystem and system component, and we specify how we are going to verify that the requirements are satisfied at all levels of the system. Typical requirements may mandate a certain minimum resolution of the MRI machine and a certain maximum dose of ionizing radiation emitted by the CT. We may require that the radiologists have certain skills. And we may specify exactly what the AI is supposed to measure, and with what sensitivity and specificity. In systems engineering, the AI is a system component just like the CT and the radiologist. Provided that it passes its verification, it does exactly what we want it to do with the kind of accuracy and precision that we expect. If the AI is acquired from an external vendor, we would expect it to have a CE mark or a 510(k) clearance indicating that it is developed with good engineering methods and that it is safe.
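
As a concrete illustration, here is a minimal sketch in Python of how decomposed component requirements and their verification might be recorded. The components and thresholds (resolution, dose, sensitivity, specificity) are purely hypothetical, not real clinical or regulatory values.

```python
# Hypothetical decomposed requirements per system component.
# All numbers are illustrative only, not real clinical or regulatory values.
component_requirements = {
    "MRI": {"min_resolution_mm": 1.0},
    "CT": {"max_dose_mSv": 10.0},
    "AI_diagnosis": {"min_sensitivity": 0.95, "min_specificity": 0.90},
}


def verify_ai(measured_sensitivity: float, measured_specificity: float) -> bool:
    """Verify the AI component against its decomposed requirements."""
    req = component_requirements["AI_diagnosis"]
    return (measured_sensitivity >= req["min_sensitivity"]
            and measured_specificity >= req["min_specificity"])


# Example verification run with hypothetical measurements from a test data set.
print(verify_ai(measured_sensitivity=0.97, measured_specificity=0.92))  # True
```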

Applying systems thinking, the responsibility of an individual system component is not as interesting as the functioning of the total system, which in our example should produce accurate and precise diagnoses in a safe, convenient, and cost-effective manner. Within a system it is more productive to talk about the functioning of the components than about the responsibility of the components. If a human component, such as a radiologist, fails, it is not very constructive from a systems engineering point of view to attribute blame to the radiologist. It is much better to try to improve the way of working or the skills of the radiologist so that the system works better in the future. Neither should we blame the AI or any other system component for a nonconformity; instead, we should identify the cause of the nonconformity and attempt to repair and improve the system components accordingly.
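
As a sketch of what this component-agnostic handling could look like, the hypothetical record below treats a nonconformity the same way regardless of whether the failing component is a radiologist, a scanner, or an AI: it captures an objective observation, a root cause, and a corrective action, and assigns no blame.

```python
from dataclasses import dataclass


@dataclass
class NonConformity:
    """A nonconformity record handled identically for any system component."""
    component: str          # e.g. "radiologist", "CT", "AI_diagnosis"
    observation: str        # what went wrong, stated objectively
    root_cause: str         # outcome of the root cause analysis
    corrective_action: str  # how the component or its interfaces will be improved


# Two hypothetical records: note that neither assigns blame, only causes and fixes.
cases = [
    NonConformity("radiologist", "missed finding",
                  "insufficient exposure to rare presentations",
                  "add targeted training and a second-reader step"),
    NonConformity("AI_diagnosis", "false negative",
                  "under-represented patient subgroup in training data",
                  "retrain on augmented data and re-verify sensitivity"),
]
```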

We should let the people who are going to work in the system participate in its design. The radiologists should, for instance, participate in designing the interaction between the AI and the radiologist to optimize the joint AI–radiologist accuracy and precision. By letting the people be part not only of the final system but also of the systems engineering process, we utilize their knowledge, reduce their fear, improve their buy-in, and ultimately end up with a more optimal system where humans and machines work well together.

Summary

We associate the ability for an assignee to take responsibility with our own habit of giving feedback by blaming, thereby causing the assignee to suffer. The reason we feel that an AI cannot take responsibility is that it is impervious (at least thus far) to retribution and suffering. It is useless to blame it, shame it or hit it since it can’t be made to suffer.

We also hold AIs to much higher standards than other types of system components because they feel somehow threatening. The way to remove the threat is to let the AI users and other roles participate in the systems engineering to deliberately design the system in which the AI is a part and specify and verify the AI just like any other system component.

Assuming a naturalistic worldview, which is the worldview most consistent with science, neither a human nor an AI algorithm can be held responsible for a failure. The cause of the malfunction is always poor system design, either of the whole system or of a particular component. Poor system design can and should be corrected.

The best way to do so is not to blame or pass moral judgement on the malfunctioning component but to use sound systems engineering principles to improve the system.

Links

[1] Naturalism.org
[2] Are Moral Judgements Good or Bad Things? Scientific American.
