Real-time Face Tracking in Minecraft

How Minecraft Webcam transforms your face into a blocky avatar.

The Concept

Your webcam sees your face. Minecraft Webcam tracks it and renders a Minecraft character that mirrors your expressions and head movements. The result: a Minecraft version of yourself, animated in real time.

How Face Tracking Works

MediaPipe Face Mesh detects 468 facial landmarks at 30 FPS. From these landmarks, the system calculates:

  • Head pitch — Looking up/down (from nose position relative to eyes)
  • Head yaw — Looking left/right (from nose position relative to face center)
  • Head roll — Tilting (from line between eyes)
  • Blink detection — Eye aspect ratio from landmark distances
  • Mouth openness — Vertical distance between lip landmarks
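Two of these signals are easy to sketch. The snippet below shows the eye-aspect-ratio and head-roll calculations in a minimal form; the landmark indices, coordinates, and blink threshold are illustrative placeholders, not MediaPipe's real mesh numbering:

```python
import math

def eye_aspect_ratio(landmarks, top, bottom, left, right):
    """Vertical/horizontal eye-opening ratio; values near zero mean a closed eye."""
    vertical = math.dist(landmarks[top], landmarks[bottom])
    horizontal = math.dist(landmarks[left], landmarks[right])
    return vertical / horizontal

def head_roll_degrees(left_eye, right_eye):
    """Roll angle of the line between the eye corners (0 = level head)."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))

# Hypothetical landmark indices and normalized coordinates -- MediaPipe's
# real mesh uses its own numbering for the eye landmarks.
landmarks = {1: (0.50, 0.40), 2: (0.50, 0.44), 3: (0.46, 0.42), 4: (0.54, 0.42)}
ear = eye_aspect_ratio(landmarks, 1, 2, 3, 4)
blinking = ear < 0.2  # threshold is tuned empirically
```

In practice the ratio is averaged over both eyes and smoothed across frames so that noise in a single landmark doesn't register as a blink.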

Rendering the Avatar

Minecraft skins are small PNG textures: 64x64 pixels in the modern format, 64x32 in the classic one. The renderer:

  1. Parses the skin format (64x32 for classic, 64x64 for modern)
  2. Extracts head, body, arm textures
  3. Maps textures to 3D quads with perspective correction
  4. Applies head rotation as quaternion rotation
  5. Depth-sorts all quads (painter’s algorithm)
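Step 5 is the simplest to illustrate. A minimal painter's-algorithm sort, assuming each quad is a list of (x, y, z) vertices where larger z means farther from the camera:

```python
def paint_order(quads):
    """Painter's algorithm: sort quads far-to-near by mean vertex depth,
    so nearer quads are drawn last and overwrite farther ones."""
    def mean_depth(quad):
        return sum(z for _, _, z in quad) / len(quad)
    return sorted(quads, key=mean_depth, reverse=True)

# Two hypothetical quads from the avatar mesh.
far  = [(0, 0, 5.0), (1, 0, 5.0), (1, 1, 5.0), (0, 1, 5.0)]
near = [(0, 0, 1.0), (1, 0, 1.0), (1, 1, 1.0), (0, 1, 1.0)]
ordered = paint_order([near, far])  # far quad first, near quad drawn last
```

Sorting by mean depth works well here because the avatar's quads are small and rarely interpenetrate; a general renderer would use a depth buffer instead.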

Animation Storage

Here’s the clever part: Minecraft skins have unused texture space. Rows 32-40 in a standard skin aren’t rendered by the game.

Minecraft Webcam uses this space to store facial expression frames. When you blink, the renderer swaps to the appropriate animation frame.
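The frame swap can be sketched as a lookup from tracking state to a texture region. The specific frame positions below are assumptions for illustration; the source only says the frames live in the unused rows:

```python
def face_region(eye_open, mouth_open):
    """Return the top-left corner of the 8x8 face texture to sample.
    Assumed layout: the default face sits at (8, 8) in the standard skin,
    and expression frames are packed into the unused band at row 32."""
    if not eye_open:
        return (8, 32)    # blink frame (assumed position)
    if mouth_open:
        return (16, 32)   # open-mouth frame (assumed position)
    return (8, 8)         # default face from the standard head area
```

Because the frames travel inside the skin file itself, no extra assets need to be shipped alongside the PNG.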

Performance

The whole pipeline runs at 30 FPS on a typical laptop. MediaPipe does the heavy lifting on the CPU, OpenCV handles image processing, and Tkinter manages the display.

The virtual camera output (via pyvirtualcam) lets you use the avatar in any video conferencing app or streaming software.