Created on 3/20/2026, 6:38
Updated on 3/20/2026, 8:07
AI Ethics (i): The Ghost in the Code
Why AI Ethics is a Crisis of Human Values, Not Just Computer Science

Preface: Co-written with Gemini.
For the vast majority of human history, the "moral agent"—the entity responsible for an action—was always a human being. Whether it was a bridge collapsing due to poor engineering or a physician misdiagnosing a patient, our legal and ethical systems were built on the bedrock of human intent and accountability. Technology was merely a passive conduit for that intent; a hammer cannot be "evil," and a steam engine cannot be "biased." However, the rapid ascent of Artificial Intelligence (AI) has triggered what philosophers call the "Great Decoupling." For the first time, we are interacting with entities that possess agency—the ability to make influential, autonomous decisions—without possessing sentience, or the ability to feel the moral weight of those decisions.
This decoupling is the true birthplace of AI Ethics. It is not merely a niche subfield of computer science or a set of technical hurdles for engineers to clear; it is a fundamental stress test for human civilization. As we outsource our most critical cognitive labors—hiring, policing, medical diagnosis, and financial lending—to "statistical engines," we are inadvertently mapping our messy, inconsistent, and often dark human history onto rigid mathematical optimization goals. The "Ghost in the Code" is not a supernatural presence, but the unexamined residue of our own societal failings, now automated at the speed of light.

To understand the ethical crisis, one must first demystify the "magic" of how a modern AI, such as a Large Language Model (LLM), learns about the world. We often use biological metaphors, claiming the AI "understands" or "thinks," but the reality is closer to a massive exercise in pattern recognition.
At its most fundamental level, an LLM is a sophisticated statistical prediction engine. It does not "understand" text in the human sense; rather, it calculates the mathematical probability of what should come next in a sequence.

The process begins with tokenization, where text is broken down into smaller chunks (words, sub-words, or characters) and converted into numerical vectors. These vectors are placed in a high-dimensional "embedding space," where words with similar meanings are positioned closer together mathematically. The core architecture, the Transformer, uses a mechanism called self-attention, which allows the model to weigh the importance of different words in a sentence regardless of their distance from one another. For example, in the sentence "The bank was closed because the river overflowed," the model uses attention to link "bank" to "river" rather than to "finance," resolving the ambiguity through context. Through pre-training on massive datasets, the model learns billions of these associations. It essentially builds a complex map of human language, allowing it to generate coherent, contextually relevant responses by repeatedly guessing the most likely next token based on the patterns it has observed in its "textbook" of human knowledge.
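This prediction loop can be sketched with a toy bigram model: nowhere near a real Transformer, but the same principle of counting patterns in a "textbook" and converting them into next-token probabilities. The corpus and every name below are invented for illustration.

```python
from collections import Counter, defaultdict

# A tiny corpus standing in for the model's "textbook" of human text.
corpus = "the river bank overflowed . the bank was closed . the river overflowed".split()

# Count bigrams: how often each token follows each other token.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def next_token_probs(prev):
    """Turn raw co-occurrence counts into a probability distribution."""
    counts = bigrams[prev]
    total = sum(counts.values())
    return {token: n / total for token, n in counts.items()}

# The "prediction engine" in miniature: which token most likely follows "the"?
probs = next_token_probs("the")
print(max(probs, key=probs.get))  # → river (2 of the 3 "the"s precede "river")
```

Scaled up from one toy sentence to trillions of tokens, and from bigram counts to billions of learned weights, this is still the same job: estimate the distribution over the next token and pick from it.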
During the pre-training phase, an AI consumes petabytes of data: every digitized book in the public domain, billions of Reddit threads, court records, medical journals, and news archives. In this stage, the AI functions as a "Stochastic Parrot." It does not learn that "discrimination is wrong" or that "equality is a virtue."

The term "Stochastic Parrot" is a metaphor for the way LLMs produce text, popularized in a landmark 2021 research paper, "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?" by Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, and Margaret Mitchell. "Stochastic" refers to a process governed by a random probability distribution. When an AI generates a sentence, it isn't "thinking"; it is performing a complex mathematical calculation to predict which word (or "token") is most likely to follow the previous ones. It is essentially rolling a loaded die based on the statistical patterns it found in its training data. A parrot can be trained to say "I love you," but the bird has no concept of what love is, who "I" refers to, or what a human relationship entails; it is simply repeating a sequence of sounds that it has learned leads to a reward. Similarly, an LLM can write a beautiful poem about a sunset, but the AI has never seen the sun, felt its warmth, or experienced the passage of time. It is "parroting" the linguistic structures of the millions of poems it has read without any underlying mental model or communicative intent.
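The "loaded die" is easy to make concrete. The sketch below samples repeatedly from a hypothetical next-token distribution; the probabilities are invented, not taken from any real model, but they show how a skew in the training data reappears, statistically, in the output.

```python
import random

# Hypothetical distribution a model might assign to the token after
# the prompt "The doctor said" — the numbers are invented.
next_token_probs = {"he": 0.55, "she": 0.25, "they": 0.15, "it": 0.05}

def sample_token(probs, rng):
    """Roll the 'loaded die': pick a token in proportion to its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed so the experiment is repeatable
samples = [sample_token(next_token_probs, rng) for _ in range(10_000)]

# Over many rolls, the output frequency converges to the training skew.
print(samples.count("he") / len(samples))  # close to 0.55
```

Each individual roll is random, which is why the same prompt can yield different answers, but the bias baked into the distribution governs the long run.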
The authors of the paper used this term to warn against the "illusion of intelligence." When we treat AI as if it truly understands us, we fall into several ethical traps. We might trust the AI to make moral or medical decisions that require human empathy and real-world understanding, which a "parrot" lacks. Because the AI is only predicting the most likely next word, it will confidently state false facts if those facts sound like they belong in a convincing sentence. And if a parrot only hears hateful language, it will repeat hateful language: since LLMs are trained on the internet, they "parrot" the biases, stereotypes, and prejudices present in human society without the ability to filter them through a moral lens. Calling an AI a "Stochastic Parrot" is a reminder that while the output looks human, the process is purely mathematical, and the "meaning" we see in the text is projected onto it by us, the human readers.

The machinery behind this is simple in principle: the model calculates the probability of word B following word A based on the trillions of examples it has read. If the vast corpus of human text contains 100,000 instances of the word "doctor" associated with male pronouns and only 10,000 associated with female pronouns, the AI’s internal coordinate system—its vector space—physically places the concept of "doctor" closer to the vector for "man."
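The geometric claim, that a biased corpus pulls "doctor" closer to "man" in the embedding space, can be checked with cosine similarity. The 2-D vectors below are invented stand-ins; real embeddings have thousands of dimensions, but the comparison works the same way.

```python
import math

# Invented 2-D word vectors, positioned the way a biased corpus would
# position them. Real embeddings are learned, not hand-written.
vectors = {
    "doctor": (0.9, 0.2),
    "man":    (0.8, 0.3),
    "woman":  (0.3, 0.9),
}

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# In this toy space, "doctor" sits measurably nearer "man" than "woman".
print(cosine(vectors["doctor"], vectors["man"]) >
      cosine(vectors["doctor"], vectors["woman"]))  # True
```

Nothing in the arithmetic is malicious; the geometry simply records the co-occurrence statistics it was given.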
This is the "Mathematics of the Status Quo." When a user asks an AI to "write a story about a brilliant surgeon," the algorithm simply follows the path of least mathematical resistance. It is not being "sexist" in the human sense of harboring ill-will; it is being "mathematically accurate" to a biased reality. The ethical danger lies in the fact that the AI takes the historical "is" and transforms it into an automated "should."
The most insidious ethical problem arises when AI doesn't just reflect society but begins to act as a distorting lens that amplifies what it reflects. This is best observed in the transition from data to action, particularly in the realm of Predictive Policing. Consider an algorithm trained on decades of historical crime data. Due to systemic biases and historical over-policing, certain low-income or minority neighborhoods have traditionally seen higher arrest rates for minor infractions, such as loitering or petty theft. The AI, looking for patterns, identifies a strong correlation: Location X = High Arrest Rate. It then suggests that the police department allocate more resources to Location X to "pre-empt" crime.
Because there are now more officers in Location X, they naturally catch more people for small crimes that would have gone unnoticed in a "wealthy" neighborhood. This creates new data—arrest records that "confirm" the AI’s original prediction was correct. The loop closes, and a community becomes trapped in an automated cycle of surveillance. This is a "Self-Fulfilling Prophecy" powered by silicon. The AI has successfully converted historical prejudice into a "data-driven" justification for future discrimination, making the bias nearly impossible to challenge because it wears the armor of "objective math."
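A minimal simulation makes the loop visible. In the sketch below, both districts (invented, as are all the numbers) have the same underlying offence rate; only the initial patrol allocation differs, yet the arrest data "confirms" that skew year after year.

```python
# Both districts have the SAME true rate of minor offences; the only
# asymmetry is the historical patrol allocation. All numbers invented.
patrols = {"district_A": 30, "district_B": 10}
arrests = {"district_A": 0, "district_B": 0}

for year in range(5):
    # More patrols -> more minor offences observed -> more arrests recorded.
    for district, n_patrols in patrols.items():
        arrests[district] += n_patrols * 5  # e.g. 5 recorded arrests per patrol per year
    # The "data-driven" model re-allocates patrols by cumulative arrest counts.
    total = sum(arrests.values())
    patrols = {d: round(40 * a / total) for d, a in arrests.items()}

# The 3:1 skew never corrects itself: the data it generated sustains it.
print(patrols, arrests)
```

Despite identical underlying behaviour, district A ends the run with three times the arrest record of district B, and the allocation model treats that record as independent confirmation.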
A common refrain from tech optimists is that we can simply "code the bias out." If we find that the AI is penalizing certain groups, why not just tweak the algorithm? This leads us to the most uncomfortable truth in AI Ethics: The Impossibility Theorem of Machine Learning. In social science and law, there are dozens of competing definitions of "fairness." Two of the most common are Anti-Classification (ensuring the model doesn't use protected traits like race or gender) and Calibration (ensuring the model’s predictions are equally accurate for all groups). Anti-Classification is the digital equivalent of "blind" justice. It operates on the principle that by stripping away protected characteristics—such as race, gender, or religion—the model becomes incapable of discrimination. In this framework, the algorithm is forbidden from using these specific data points as inputs. However, this often proves to be a superficial fix. Because our world is so deeply stratified, "proxy variables" like zip codes, shopping habits, or even grammatical syntax can allow a sufficiently powerful model to "re-construct" a user’s race or gender with startling accuracy. Thus, removing the label does not necessarily remove the bias.
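The proxy problem can be demonstrated in a few lines. Below, a "blinded" dataset omits the protected attribute, yet a trivial majority-vote lookup on an invented zip-code proxy recovers it most of the time; this lookup is a crude stand-in for the correlations a powerful model extracts from historical data.

```python
from collections import Counter

# Invented records: zip code correlates with (but does not equal) group.
people = [
    {"zip": "10001", "group": "A"}, {"zip": "10001", "group": "A"},
    {"zip": "10001", "group": "A"}, {"zip": "10001", "group": "B"},
    {"zip": "60601", "group": "B"}, {"zip": "60601", "group": "B"},
    {"zip": "60601", "group": "B"}, {"zip": "60601", "group": "A"},
]

# Anti-classification: the inputs a model is legally allowed to see.
blinded = [{"zip": p["zip"]} for p in people]

# Stand-in for what a model learns from history: the most common
# group in each zip code.
per_zip = {}
for p in people:
    per_zip.setdefault(p["zip"], Counter())[p["group"]] += 1

guesses = [per_zip[b["zip"]].most_common(1)[0][0] for b in blinded]
accuracy = sum(g == p["group"] for g, p in zip(guesses, people)) / len(people)
print(accuracy)  # 0.75 — the label was removed, yet it is largely recoverable
```

Removing the column removed the label, not the information; the bias re-enters through the proxy.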
Calibration, conversely, focuses on the mathematical reliability of the output. A "calibrated" model ensures that a 70% risk score carries the same real-world meaning for every demographic group. For instance, in healthcare, a calibrated algorithm ensures that a "high-risk" designation for a heart attack is just as accurate for a woman as it is for a man. The ethical dilemma arises because achieving perfect calibration across groups often requires the model to know the very traits (like race or gender) that Anti-Classification seeks to hide. To ensure accuracy for a minority group, the model must recognize that group’s unique data patterns. Consequently, we are forced into a paradox: to make an AI "fair," we must often choose between ignoring human identity or obsessively tracking it.
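A calibration audit is simple to state in code: among everyone the model scored at 70% risk, how often did the event actually occur, per group? The records below are invented to show a miscalibrated model; note that running the audit at all requires the very group labels anti-classification would strip away.

```python
# Invented audit data: (group, risk_score, event_occurred).
records = [
    ("men", 0.7, True), ("men", 0.7, True), ("men", 0.7, True),
    ("men", 0.7, True), ("men", 0.7, True), ("men", 0.7, True),
    ("men", 0.7, True), ("men", 0.7, False), ("men", 0.7, False),
    ("men", 0.7, False),
    ("women", 0.7, True), ("women", 0.7, True), ("women", 0.7, True),
    ("women", 0.7, True), ("women", 0.7, False), ("women", 0.7, False),
    ("women", 0.7, False), ("women", 0.7, False), ("women", 0.7, False),
    ("women", 0.7, False),
]

def observed_rate(group, score=0.7):
    """Among this group's people scored `score`, how often did the event occur?"""
    outcomes = [event for g, s, event in records if g == group and s == score]
    return sum(outcomes) / len(outcomes)

# In this toy data a "70%" score means 70% for men but only 40% for
# women: the model is miscalibrated across groups.
print(observed_rate("men"), observed_rate("women"))
```

A calibrated model would make both numbers match the stated score; fixing the gap requires modelling each group's data separately, which is exactly the tracking that blind justice forbids.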
Mathematicians have proven that in a world where different social groups have different historical starting points and data distributions, it is mathematically impossible to satisfy all definitions of fairness simultaneously. For example, if a bank wants to be "fair" by ensuring that an equal percentage of men and women receive loans (Outcome Equality), it may have to lower the credit-score threshold for one group, thereby violating the principle of "Equal Treatment for Equal Scores." If it prioritizes the "Equal Treatment" principle, it will inevitably replicate the gender wealth gap. This forces us to realize that "fairness" is not a technical problem to be solved by a better programmer; it is a political and philosophical choice. Deciding which version of fairness to prioritize requires a human value judgment—one that a machine, by definition, cannot provide.
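The bank's dilemma can be reproduced with two invented score distributions that encode a historical gap: no single threshold satisfies both fairness definitions at once, and the code can only display the trade-off, not resolve it.

```python
# Invented credit-score distributions encoding a historical gap.
scores = {
    "group_1": [620, 640, 660, 680, 700, 720, 740, 760, 780, 800],
    "group_2": [560, 580, 600, 620, 640, 660, 680, 700, 720, 740],
}

def approval_rate(group, threshold):
    """Fraction of the group whose score clears the approval threshold."""
    return sum(s >= threshold for s in scores[group]) / len(scores[group])

# Equal treatment: one threshold for everyone -> unequal outcomes.
equal_treatment = (approval_rate("group_1", 680), approval_rate("group_2", 680))

# Outcome equality: equal approval rates -> unequal thresholds.
equal_outcome = (approval_rate("group_1", 680), approval_rate("group_2", 620))

print(equal_treatment)  # (0.7, 0.4)
print(equal_outcome)    # (0.7, 0.7)
```

Choosing between the two tuples is not an engineering decision; it is the value judgment the essay describes, made before any code runs.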
Even if we could reach a global consensus on the definition of fairness, we are still faced with the Opacity Problem. Modern neural networks are "Black Boxes." They contain hundreds of billions of parameters—tiny weights that shift and change during training. Even the lead engineers who built the system cannot explain exactly why a specific input led to a specific output. They can describe the how (the calculus of backpropagation), but not the why (the logic of the specific decision).
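One of the simplest probes into a black box is perturbation: keep the model opaque, nudge one input at a time, and watch the score move. The "model" below is a stand-in function with invented weights, not a real network, but the probe itself treats it exactly as an appeal process would have to: as a sealed box that can only be queried.

```python
def black_box(zip_ok, income, years_employed):
    # Stand-in for an opaque model; pretend we cannot read this code,
    # only call it. Weights are invented for illustration.
    return 0.6 * zip_ok + 0.3 * (income / 100_000) + 0.1 * (years_employed / 10)

applicant = {"zip_ok": 0, "income": 80_000, "years_employed": 5}
baseline = black_box(**applicant)

# Perturb each feature toward a "better" value and record the shift.
shifts = {}
for feature, better in [("zip_ok", 1), ("income", 100_000), ("years_employed", 10)]:
    nudged = dict(applicant, **{feature: better})
    shifts[feature] = round(black_box(**nudged) - baseline, 3)

print(shifts)  # {'zip_ok': 0.6, 'income': 0.06, 'years_employed': 0.05}
```

In this toy case the probe reveals that the zip-code feature dominates the score, exactly the kind of hidden proxy a right to an explanation exists to surface. For models with billions of parameters, perturbation-style methods are among the few tools available, and even they yield local approximations rather than the model's true reasoning.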
This creates an existential crisis for the Right to an Explanation. In a democratic society, if a government agency denies your social security benefits, or a court uses an AI to determine your bail, you have a fundamental right to know the reasoning behind that decision so that you may contest it. "The computer said so" is an unacceptable answer; it is a form of Algorithmic Authoritarianism. Without Explainable AI (XAI), we lose our ability to appeal, to hold power accountable, and to ensure that the logic used by the machine isn't based on a glitch or a hidden prejudice. We risk moving toward a society governed by forces that are literally beyond human comprehension, where "the algorithm" becomes a digital deity whose decrees are absolute and inscrutable.