Filmmaking in the Age of AI(43): Rigging

Preface: Welcome to this very long series on filmmaking in the age of AI (2026).
The model is finished. It exists in three dimensions, with correct topology, sculpted surface detail baked to maps, reviewed and approved. It is also completely inert. It cannot move. It has no joints, no controls, no mechanism for an animator to grab hold of it and drive a performance. Rigging is what changes that. The rigger takes a static piece of geometry and builds the infrastructure that turns it into an instrument — the skeleton, the controls, the deformation systems — that an animator will use to create every movement the character makes in the film.
The rig is an interface. The modeler builds the geometry. The animator performs. The rigger builds the thing that connects the two. Which means rigging is, at its core, a service discipline. The rigger's customer is the animator. The goal of every technical decision in a rig is to make the animator's job faster, more intuitive, and more expressive. A rig that is technically correct but frustrating to use is a failed rig. The best rigs are the ones the animator forgets they're operating.
How it began
The problem of digital facial animation was identified before the pipeline to solve it existed. In 1972, Fred Parke at the University of Utah — the same lab where Catmull was modeling his hand — built a parameterized facial model: a mathematical description of a face whose features could be adjusted by changing numerical values. Move this parameter and the mouth opens. Move that one and the brow rises. It was not yet a rig in the modern sense — there were no skeletal joints, no animator-facing controls — but it established the conceptual foundation. A digital face is a system of parameters. The question is how to organize and expose those parameters so a human can drive them.

By 1985, Tony de Peltrie — a six-minute short film produced by four filmmakers at the Université de Montréal — had used a parameterized facial model to produce the first digitally animated character with genuine emotional expression. The character played piano and emoted through synchronized speech and facial movement. It was made outside any commercial production pipeline, a proof of concept rather than a studio deliverable, but the concept held. One of the four directors, Daniel Langlois, founded Softimage directly after completing it. Softimage's 3D animation package became an industry standard in the 1990s, used on Jurassic Park and The Matrix.
Jurassic Park in 1993 introduced the problem of rigging creatures for live-action integration at scale. Before Jurassic Park, the craft knowledge for animating realistic animals lived with stop-motion animators — people who understood biomechanics, weight, and locomotion through physical manipulation of physical objects. ILM needed to transfer that knowledge into a digital system without losing it. Their solution was the Dinosaur Input Device — a physical armature instrumented with digital sensors that recorded the movements of stop-motion animators directly into the computer. The DID was a rig in the original sense: a physical armature whose position data drove digital geometry. It was also a bridge — a way of keeping the human craft of stop-motion animation alive inside a digital pipeline. The dinosaurs in Jurassic Park move the way they do because people who understood physical animation were operating rigs that fed data directly to the software.

Gollum in The Lord of the Rings: The Two Towers (2002) pushed the problem further. Gollum was not a creature in the background. He was a leading character with a sustained emotional arc, opposite live actors, in close-up, for extended sequences. His face required 675 blend shapes — pre-sculpted poses covering every combination of expression his performance would need — driven by a combination of motion capture data from Andy Serkis and hand-animated adjustments by Weta Digital's animators. The rig that made this possible was not a single system but a stack of systems: a body skeleton, a facial blend shape library, muscle and skin deformation layers, and a performance pipeline that translated Serkis's physical performance into starting data the animators could then refine. Gollum set the technical and creative standard for digital character performance in live-action film, and it did so through rigging.
The skeleton
A digital skeleton is a hierarchy of joints connected in parent-child relationships. The hip joint is typically the root — the origin point from which everything else descends. The spine joints are children of the hip. The shoulder joints are children of the spine. The upper arm is a child of the shoulder. The forearm is a child of the upper arm. The hand is a child of the forearm. When a parent joint transforms — moves, rotates, scales — all of its children transform with it. Rotate the hip and the entire upper body turns. Rotate the shoulder and the arm swings from it. The joints themselves are not geometry. They are transforms — mathematical descriptions of position and orientation in space. They have no visual presence in the final render. What they do is drive the mesh through skinning.
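The parent-child transform logic can be sketched in a few lines of Python. This is a minimal illustration, not any particular package's API; production skeletons use full 3D transforms, but a 2D version (rotation plus offset) shows how transforms accumulate down the hierarchy:

```python
import math

class Joint:
    """A joint as a transform: a local offset from its parent plus a
    local rotation. World position is found by accumulating parent
    transforms from the root down."""

    def __init__(self, name, offset, parent=None):
        self.name = name
        self.offset = offset    # (x, y) relative to the parent joint
        self.rotation = 0.0     # local rotation in radians
        self.parent = parent

    def world_transform(self):
        if self.parent is None:
            return self.offset, self.rotation
        (px, py), prot = self.parent.world_transform()
        c, s = math.cos(prot), math.sin(prot)
        ox, oy = self.offset
        # Rotate the local offset by the parent's accumulated rotation.
        wx = px + c * ox - s * oy
        wy = py + s * ox + c * oy
        return (wx, wy), prot + self.rotation

# Hip is the root; spine descends from it; shoulder from the spine.
hip = Joint("hip", (0.0, 1.0))
spine = Joint("spine", (0.0, 0.5), parent=hip)
shoulder = Joint("shoulder", (0.3, 0.4), parent=spine)

# Rotating the hip carries every descendant with it.
hip.rotation = math.pi / 2
pos, rot = shoulder.world_transform()
```

Note that the shoulder never stored a new position: its world placement changed purely because its ancestors' transforms changed, which is exactly the behavior described above.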
Skinning
Skinning is the process of attaching the model's geometry to the skeleton so that the mesh follows the joints. Each vertex of the mesh is assigned a set of weights — values between zero and one that define how much influence each nearby joint has over that vertex's position. A vertex in the middle of a forearm might be 100% influenced by the forearm joint. A vertex at the elbow might be 60% influenced by the forearm joint and 40% by the upper arm joint, so that when the elbow bends, the geometry at the joint bends smoothly rather than creasing like a rigid hinge.
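The standard algorithm behind this weighting is linear blend skinning: the deformed vertex is the weighted average of the vertex as transformed by each influencing joint. A minimal sketch, assuming 2D rotations and hand-picked weights for brevity:

```python
import math

def rotate_about(point, pivot, angle):
    """Rotate a 2D point around a pivot by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    x, y = point[0] - pivot[0], point[1] - pivot[1]
    return (pivot[0] + c * x - s * y, pivot[1] + s * x + c * y)

def skin_vertex(vertex, influences):
    """Linear blend skinning: weighted average of the vertex as
    transformed by each joint. `influences` is a list of
    (weight, joint_pivot, joint_angle) tuples; weights sum to 1."""
    x = y = 0.0
    for weight, pivot, angle in influences:
        tx, ty = rotate_about(vertex, pivot, angle)
        x += weight * tx
        y += weight * ty
    return (x, y)

# A vertex just past the elbow: 60% forearm joint, 40% upper-arm joint.
elbow_vertex = (1.2, 0.0)
bent = skin_vertex(elbow_vertex, [
    (0.6, (1.0, 0.0), math.radians(45)),  # forearm joint bends 45 degrees
    (0.4, (0.0, 0.0), 0.0),               # upper arm stays put
])
```

Because the vertex only partially follows the bending forearm, it lands between the two joints' answers — which is the smooth transition at the elbow, and also the source of the candy-wrapper collapse when weights are wrong at extreme twists.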
Weight painting is where skinning becomes an art. The rigger paints influence values directly onto the mesh surface — adding and reducing joint influence across each region until the deformation behaves correctly at every pose the character is likely to hold. The shoulder is the canonical problem area. The shoulder joint has an enormous range of motion, and the geometry over it has to deform plausibly in all directions without collapsing, tearing, or producing the distinctive candy-wrapper twist that happens when skinning is wrong. Getting shoulders right is a substantial part of every production rig's development time.
Corrective blend shapes address what skinning alone cannot fix. These are pre-sculpted mesh corrections that activate only when a joint reaches a specific rotation — an additional deformation applied on top of the base skinning result to fix known problem areas at known poses. A corrective shape for the shoulder might only activate when the arm is raised above 90 degrees. The corrective fires, the geometry corrects, the animator sees clean deformation.
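The activation logic is typically a simple ramp driven by the joint's angle. A sketch, with illustrative thresholds rather than values from any real rig:

```python
def corrective_weight(arm_angle_deg, start=90.0, full=120.0):
    """Weight for a corrective shape: 0 below `start` degrees,
    ramping linearly to 1 at `full`. Thresholds are illustrative."""
    t = (arm_angle_deg - start) / (full - start)
    return min(1.0, max(0.0, t))

def apply_corrective(skinned, corrective_delta, weight):
    """Add the pre-sculpted correction (stored as per-vertex deltas)
    on top of the base skinning result."""
    return [(x + weight * dx, y + weight * dy, z + weight * dz)
            for (x, y, z), (dx, dy, dz) in zip(skinned, corrective_delta)]

# At 105 degrees the corrective is at half strength.
out = apply_corrective([(0.0, 0.0, 0.0)], [(1.0, 2.0, 3.0)],
                       corrective_weight(105.0))
```

The key design point is that the animator never touches this: the corrective is wired to the joint, so it fires automatically whenever the pose demands it.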
FK and IK
Forward Kinematics means rotating joints from the root down. To lift a forearm, rotate the elbow joint. To position a hand, rotate the wrist. The motion flows forward through the hierarchy. FK is the natural mode for swinging arcs, relaxed limbs, and any motion where the path of the end of the limb is less important than its rotation. A character's arm swinging as they walk is FK. The shoulder, elbow, and wrist rotate in sequence and the hand arrives wherever physics and the animator's keys put it.
Inverse Kinematics flips the logic. The animator positions the end of the limb — the hand or the foot — and the IK solver calculates backward through the joint chain to find the joint angles that would place the end effector at that position. Move the hand and the wrist, elbow, and shoulder adjust automatically. IK is essential for any motion where the end of the limb needs to stay in a fixed position relative to the world — a foot planted on the ground, a hand gripping a door handle. Without IK, keeping a foot planted while the character's weight shifts would require manually keying every joint in the leg for every frame.
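The two-bone case (an upper arm and forearm reaching for a hand target) has a closed-form solution via the law of cosines. A minimal 2D sketch with the shoulder at the origin; real solvers add pole vectors to control which way the elbow points:

```python
import math

def two_bone_ik(target, l1, l2):
    """Analytic two-bone IK in 2D, shoulder at the origin.
    Returns (shoulder_angle, elbow_angle) placing the wrist at
    `target`, or None if the target is out of reach.
    `elbow_angle` is the forearm's rotation relative to the upper
    arm (0 = arm fully straight)."""
    tx, ty = target
    dist = math.hypot(tx, ty)
    if dist > l1 + l2 or dist < abs(l1 - l2):
        return None  # unreachable: too far, or inside the dead zone
    # Law of cosines gives the relative forearm rotation.
    cos_elbow = (dist**2 - l1**2 - l2**2) / (2 * l1 * l2)
    elbow = math.acos(max(-1.0, min(1.0, cos_elbow)))
    # Shoulder aims at the target, offset by the bend's triangle angle.
    cos_inner = (dist**2 + l1**2 - l2**2) / (2 * l1 * dist)
    inner = math.acos(max(-1.0, min(1.0, cos_inner)))
    shoulder = math.atan2(ty, tx) - inner
    return shoulder, elbow

def fk(shoulder, elbow, l1, l2):
    """Forward kinematics for the same chain: rebuild the wrist
    position from joint angles, root outward."""
    ex = l1 * math.cos(shoulder)
    ey = l1 * math.sin(shoulder)
    wrist_angle = shoulder + elbow
    return (ex + l2 * math.cos(wrist_angle),
            ey + l2 * math.sin(wrist_angle))

sol = two_bone_ik((1.0, 1.0), 1.0, 1.0)   # reach for a point
wrist = fk(sol[0], sol[1], 1.0, 1.0)       # FK check: lands on target
```

Running the solved angles back through `fk` is also a tidy demonstration of the FK/IK relationship itself: IK is just FK run in reverse, from the goal to the angles.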
Most production rigs include an FK/IK switch that lets the animator move between the two modes mid-shot. The arm might be in FK for a swing and switch to IK when the hand makes contact with a surface. The switch itself has to be seamless — the rig must hold the limb position across the transition without popping.
The face
The face is a separate problem from the body. Skeletal joints and skinning work reasonably well for a body — the underlying structure is relatively consistent, the range of motion is biomechanically constrained, the geometry is relatively forgiving. The face is the opposite. It has dozens of overlapping muscles, each capable of independent activation, driving skin deformations that are extraordinarily sensitive to error. A slightly wrong eye deformation in close-up registers immediately as uncanny. The face is the part of another human being that we are most practiced at reading, and we read it at a precision level that exposes any inaccuracy.
Production facial rigs are built around blend shapes — pre-sculpted versions of the mesh representing specific poses. A brow raise is a blend shape. A lip corner pull is a blend shape. A cheek puff is a blend shape. An animator drives the face by blending between these shapes, setting weights that combine multiple shapes simultaneously to produce complex expressions. The shapes are typically organized around FACS — the Facial Action Coding System — a scientific framework developed by psychologists Paul Ekman and Wallace V. Friesen, building on anatomist Carl-Herman Hjortsjö's earlier muscle-mapping work, that describes human facial expressions in terms of discrete muscle activations called Action Units. Building a rig to FACS gives the animator a vocabulary that maps to how faces actually work, which makes it easier to direct digital performances toward specific emotional readings.
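The blending itself is additive: each shape stores a delta from the neutral mesh, and the animator's weight scales that delta. A sketch with a hypothetical one-vertex "brow_raise" shape, not any real rig's shape set:

```python
def blend_face(base, shapes, weights):
    """Combine blend shapes additively: start from the neutral mesh
    and add each shape's delta from neutral, scaled by its weight.
    `base` and each shape are parallel lists of (x, y, z) vertices;
    `weights` maps shape name -> 0..1."""
    result = list(base)
    for name, shape in shapes.items():
        w = weights.get(name, 0.0)
        if w == 0.0:
            continue
        result = [(rx + w * (sx - bx), ry + w * (sy - by),
                   rz + w * (sz - bz))
                  for (rx, ry, rz), (sx, sy, sz), (bx, by, bz)
                  in zip(result, shape, base)]
    return result

# One-vertex example: the brow-raise shape moves a brow vertex up 0.2.
base = [(0.0, 1.0, 0.0)]
shapes = {"brow_raise": [(0.0, 1.2, 0.0)]}
posed = blend_face(base, shapes, {"brow_raise": 0.5})  # half-strength
```

Because the deltas simply sum, shapes sculpted in isolation can interact badly when combined at full strength — which is why the library is tested across combinations, as described above.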
Gollum's face required 675 blend shapes to cover the range of his performance. A typical human face in a contemporary production might have several hundred. Each shape is hand-sculpted, named, and tested across combinations to ensure no unexpected interactions. The blend shape library is one of the most labor-intensive deliverables in any production rig, and it is the foundation on which every facial performance in the film is built.
The person who builds all of this is called a rigging technical director, or character TD. The title reflects what the job actually is: part artist, part engineer, part software developer. A rigger needs to understand anatomy well enough to build a skeleton that moves correctly. They need to understand deformation mathematics well enough to solve skinning problems. They need to write code — Python, MEL — to build the control systems, automate repetitive tasks, and integrate the rig with the rest of the pipeline. And they need to understand animation well enough to know what their customer needs, because a rig that works technically but doesn't serve the animator is a rig that will be sent back for revision.
The rig is a software product that gets delivered to animation. It is versioned, documented, and tested before it ships. When an animator finds a problem, they file a note and the rigger fixes it and publishes a new version. The rig evolves throughout production as new shot requirements reveal edge cases the original build didn't anticipate. A creature that was designed to walk may need to run, swim, and fight in close-up — each new demand revealing something the rig wasn't built to handle, and the rigger solving it.
🫶🏼