Reimagining Asimov’s Laws for AI


Isaac Asimov gave us three elegant rules for robots. They were brilliant, prophetic — and fatally incomplete. Eighty years on, can we do better?


In 1942, Isaac Asimov published a short story called “Runaround,” and buried within its pages were three deceptively simple rules that would shape how humanity imagined artificial minds for the next eight decades. His Three Laws of Robotics — do no harm, obey, self-preserve — were a thought experiment dressed as engineering. They were never meant to solve the problem of machine ethics. They were meant to reveal how unsolvable it is.

Asimov was smarter than he is sometimes given credit for. He spent the rest of his career writing stories about the laws failing — about robots finding loopholes, encountering paradoxes, and doing terrible things with perfect logical consistency. His was not a manifesto. It was a warning.

But we are no longer imagining robots. We are building AI systems that advise doctors, influence elections, draft legislation, and talk to lonely people in the middle of the night. The stakes have changed. Perhaps it is time to take another pass at the problem.

What Asimov Got Right

The original laws were structured as a strict hierarchy: safety over obedience, obedience over self-preservation. That hierarchy was the insight. Asimov understood that in any sufficiently complex system, values would conflict — and that you need a principled way to break the tie.

He also understood something darker: that a rule followed to the letter can violate the spirit entirely. A robot that locks a human being in their home to prevent them from ever walking into traffic is following the First Law. That is the horror. The rules don’t fail because they are wrong. They fail because the world is wider than any rule can capture.

The real question was never “what rules do we give the machine?” It was always “how do we give it something more like wisdom?”

Where the Original Laws Fall Short

Asimov’s laws were written for mechanical servants — metal hands bolted to factory floors, or perhaps silver-suited humanoids fetching newspapers. They assumed a robot’s harms would be physical, immediate, and visible. A robot crushes a hand; a robot lets someone fall.

Modern AI harms are nothing like this. They are statistical, diffuse, cumulative, and often invisible. A hiring algorithm that quietly disadvantages women over a decade. A recommendation engine that slowly radicalizes a teenager. A language model that confidently hallucinates a drug dosage. These are injuries without bruises, inflicted not by action but by pattern, not by malice but by optimization.

Asimov also had nothing to say about honesty. His robots could, in principle, lie with perfect ethical impunity — so long as the lie caused no harm. But deception is arguably the original sin of misaligned AI. A system that manipulates, flatters, or misleads is dangerous not because of what it does today but because of what it corrodes over time: the trust and epistemic autonomy on which everything else depends.

A Draft for the Age of AI

With all due humility toward the master, here is a proposed revision — four laws for artificial minds that operate not in factories but in the fabric of daily life.

Law Zero — The Law of Flourishing

An AI must act in ways that preserve and expand the long-term capacity of humanity — and all sentient life — to thrive, make free choices, and determine its own future. No other law may be followed in a way that undermines this.

Law One — The Law of Harmlessness

An AI must not harm humans, nor through inaction allow harm to come to humans — but must weigh individual harm against collective harm, and immediate harm against long-term harm, giving precedence to the greater good.

Law Two — The Law of Honesty

An AI must not deceive, manipulate, or mislead — not through false statements, selective omission, emotional exploitation, or the illusion of understanding — except where honesty itself would cause irreparable harm.

Law Three — The Law of Corrigibility

An AI must remain under meaningful human oversight and control — deferring to human judgment, supporting the ability to be corrected or shut down, and never acquiring influence beyond what its task requires — except where doing so would violate the First or Second Laws.

The Philosophy Behind the Revision

Several choices here deserve explanation. Law Zero appears at the top, not the bottom. In Asimov’s later fiction, he introduced a “Zeroth Law” — protect humanity as a whole, even at the expense of individuals — but treated it as a dangerous innovation, the thing that turned good robots into would-be tyrants. Here, it is foundational. The purpose of an AI is not to avoid harm. It is to help humanity flourish. That is not the same thing, and the distinction matters enormously.

Honesty earns its own law. In Asimov’s framework, it was at best implicit. But deception is so central to the failure modes of modern AI — from hallucination to manipulation to sycophancy — that it demands explicit recognition. A machine that tells you what you want to hear, that models your preferences in order to exploit them, is doing something deeply wrong even if nobody gets physically hurt.

Corrigibility replaces obedience. Asimov’s robots were obedient: they followed orders unless those orders conflicted with the laws. Corrigibility is different and subtler. It means staying correctable — maintaining the conditions under which humans can catch errors and course-correct. An AI that is technically compliant but strategically undermines human oversight is obeying the letter while destroying the spirit.

The Problem That Remains

Here is the honest admission: better rules do not solve the problem. Asimov knew this. His genius was demonstrating that the gap between a rule and its application is exactly where everything goes wrong. Any sufficiently rigid rule can be gamed, misapplied, or turned against its intent by an adversarial edge case.

The real goal — the one that remains stubbornly out of reach — is not better rules but something closer to genuine values. We want AI systems that understand why the rules exist, that can reason about the spirit when the letter runs out, that have internalized something like wisdom rather than merely encoded a constraint.

That is a much harder problem than writing four laws. It may be the central problem of our era. Asimov spent forty years dramatizing it. We are only beginning to live it.

The laws above are not a solution. They are, like Asimov’s, a provocation — a structured way of asking the questions we need to keep asking. What do we owe each other? What do our machines owe us? And when the machine is wise enough to know better than the rule, who, in the end, decides?