How Rotation Works in MP4 Files
This all started as a fun challenge. We wanted to see if we could edit videos using nothing but a hex editor. No DaVinci, no Final Cut Pro, no FFmpeg—just raw hex editing. You might ask, why? Maybe it’s monomania, but the reason doesn’t matter.
What matters is whether it’s possible. And, of course, anything is possible if you know what you’re doing. But let’s start simple. Let’s rotate a video.
MP4: A Quick Primer
Before we rotate anything, here’s a 10-second spiel on how video is stored on disk. We’re ignoring other formats (MKV, AVI, MOV, whatever) and focusing purely on MP4.
MP4 stands for MPEG-4 Part 14. It’s an ISO standard whose container structure derives from Apple’s QuickTime file format. Standards, at their core, are just agreements between the encoder (muxer) and decoder (demuxer). If both sides agree on the format, the video plays.
MP4 files are made up of atoms—nested data structures that store metadata and media data. Every atom has an 8-byte header:
- First 4 bytes → Atom size
- Next 4 bytes → Atom name (a four-character code like moov, trak, tkhd)
MP4 files have multiple atoms, some containing others. Think of them like folders within folders.
For our purposes, we’re interested in how the video player knows how to render pixels on screen. That metadata is stored inside moov/trak/tkhd.
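To make the folders-within-folders picture concrete, here is a minimal Python sketch that walks the atom tree and locates every tkhd under moov/trak. The function names (iter_atoms, find_atoms) are ours, not part of any library, and the sketch assumes the whole file fits in memory as a bytes object. It also handles the two special size values the format allows: 1 (a 64-bit size follows the name) and 0 (the atom runs to the end of its parent).

```python
import struct

def iter_atoms(buf, start=0, end=None):
    """Yield (name, payload_start, atom_end) for each atom in buf[start:end]."""
    end = len(buf) if end is None else end
    pos = start
    while pos + 8 <= end:
        # 8-byte header: big-endian 32-bit size, then the four-character name.
        size, name = struct.unpack_from(">I4s", buf, pos)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the name.
            size = struct.unpack_from(">Q", buf, pos + 8)[0]
            header = 16
        elif size == 0:
            # A size of 0 means the atom runs to the end of the enclosing range.
            size = end - pos
        if size < header or pos + size > end:
            break  # malformed atom; stop rather than loop forever
        yield name, pos + header, pos + size
        pos += size

def find_atoms(buf, path, start=0, end=None):
    """Yield (payload_start, atom_end) for every atom matching a path of names."""
    for name, payload, atom_end in iter_atoms(buf, start, end):
        if name != path[0]:
            continue
        if len(path) == 1:
            yield payload, atom_end
        else:
            # Descend: the payload of a container atom is just more atoms.
            yield from find_atoms(buf, path[1:], payload, atom_end)
```

Calling `find_atoms(data, [b"moov", b"trak", b"tkhd"])` yields one hit per track, since a file with audio and video has more than one trak.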
The 36 Bytes That Control Rotation
Inside the tkhd (Track Header) atom, there are 36 bytes dedicated to storing a transformation matrix. This matrix tells the player how to scale, rotate, and position the video.
If you’ve ever done high school linear algebra, you’ve seen a 3×3 identity matrix:
$$ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} $$
This is the default: no rotation, no scaling, no movement. Each part of the matrix represents a transformation (in the row-vector layout MP4 uses):
- The upper-left 2×2 block controls scaling, rotation, and shearing.
- The first two values of the bottom row are for translation (movement on x/y axes).
- The third column is for perspective and normally stays at 0, 0, 1.
MP4 expands this to a 2D affine transformation matrix using 16.16 fixed-point numbers:
$$ \begin{bmatrix} a & b & u \\ c & d & v \\ x & y & w \end{bmatrix} $$
This is stored in tkhd as 36 bytes, nine big-endian 32-bit values in the order a, b, u, c, d, v, x, y, w:
- a, b, c, d → Scale, shear, and rotation (16.16 fixed-point values).
- u, v → Perspective correction (usually 0; stored as 2.30 fixed-point).
- x, y → Translation (positioning; 16.16 fixed-point).
- w → Always 0x40000000 (which represents 1.0 in 2.30 fixed-point).
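As a rough sketch of how those 36 bytes can be read back, the Python below decodes them into ordinary floats. The helper names are our own. The offsets assume the standard tkhd layout, where the matrix starts 40 bytes into the payload for version 0 and 52 bytes in for version 1 (which uses 64-bit timestamps and duration); "payload" here means the bytes after the 8-byte atom header.

```python
import struct

def matrix_offset(tkhd_payload):
    """Byte offset of the matrix inside the tkhd payload (bytes after the 8-byte header)."""
    version = tkhd_payload[0]
    # Version 1 widens three fields from 32 to 64 bits, pushing the matrix 12 bytes later.
    return 52 if version == 1 else 40

def decode_matrix(raw36):
    """Decode the 36 stored bytes into a 3x3 list of floats."""
    # Nine signed big-endian 32-bit integers, stored row by row.
    a, b, u, c, d, v, x, y, w = struct.unpack(">9i", raw36)
    f16 = 65536.0          # 16.16 fixed-point scale
    f30 = float(1 << 30)   # 2.30 fixed-point scale, used by the third column
    return [
        [a / f16, b / f16, u / f30],
        [c / f16, d / f16, v / f30],
        [x / f16, y / f16, w / f30],
    ]
```

Feeding it the untouched default bytes should give back the identity matrix, which is a handy sanity check before editing anything.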
What is Fixed-Point 16.16?
MP4 uses fixed-point 16.16 format to store decimal numbers efficiently. It’s a way of representing real numbers in an integer-based system.
- The first 16 bits (upper half) store the whole number.
- The last 16 bits (lower half) store the fractional part.
For example:
- 1.0 in fixed-point 16.16 → 0x00010000
- 0.5 → 0x00008000
- -1.0 → 0xFFFF0000
This format avoids floating-point precision issues and ensures fast computation on hardware that doesn’t have an FPU (Floating Point Unit).
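If you'd rather not do the conversion by hand, a small Python sketch can do it. The function names are ours; negative numbers are encoded as two's complement, which is why -1.0 comes out as 0xFFFF0000.

```python
def to_fixed_16_16(value):
    """Encode a real number as an unsigned 32-bit word holding a signed 16.16 value."""
    # Multiply by 2**16, round to the nearest integer, then wrap negatives
    # into two's complement by masking to 32 bits.
    return int(round(value * 65536)) & 0xFFFFFFFF

def from_fixed_16_16(word):
    """Decode an unsigned 32-bit word back into a float."""
    if word & 0x80000000:
        word -= 1 << 32  # top bit set means the value is negative
    return word / 65536
```

For instance, `hex(to_fixed_16_16(0.5))` gives `0x8000`, the same value as 0x00008000 without the leading zeros.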
Rotating the Video
To rotate a video by 90°, we modify the matrix. A 90-degree counterclockwise rotation swaps and negates values like this:
$$ \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} $$
Which, in fixed-point 16.16 hexadecimal, looks something like:
00000000 FFFF0000 00000000
00010000 00000000 00000000
00000000 00000000 40000000
Editing these bytes directly in a hex editor rotates the video without re-encoding.
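The same edit can be scripted. The Python sketch below mimics what we did in the hex editor: search for the text "tkhd", skip to the matrix, and overwrite exactly 36 bytes. It rests on some loud assumptions: the first "tkhd" found is the header of the track you want (in a file with audio first, it won't be), the matrix sits at the standard offset (40 bytes into the payload for version 0, 52 for version 1), and the function name is ours.

```python
# The 90-degree counterclockwise matrix from above, as raw bytes.
ROT_90_CCW = bytes.fromhex(
    "00000000 FFFF0000 00000000"
    "00010000 00000000 00000000"
    "00000000 00000000 40000000"
)

def patch_first_tkhd(data, new_matrix=ROT_90_CCW):
    """Overwrite the matrix of the first tkhd found in data (a bytearray), in place."""
    if len(new_matrix) != 36:
        raise ValueError("matrix must be exactly 36 bytes")
    name_pos = data.find(b"tkhd")
    if name_pos < 0:
        raise ValueError("no tkhd atom found")
    payload = name_pos + 4          # the payload starts right after the 4-byte name
    version = data[payload]         # first payload byte is the tkhd version
    offset = payload + (52 if version == 1 else 40)
    # Replace 36 bytes with 36 bytes, so the file length never changes.
    data[offset:offset + 36] = new_matrix
    return offset
```

A typical run would read the file into `bytearray(open("in.mp4", "rb").read())`, call `patch_first_tkhd` on it, and write the result to a new file; the file name is a placeholder.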
Let's try playing that.
Rotating to Any Angle (In Theory)
So far, we’ve only looked at 90°, 180°, and 270° rotations—angles that neatly fit into the transformation matrix with simple value swaps. But what if you want to rotate a video by 45°, 30°, or any other arbitrary angle? Mathematically, it’s not a problem. The full 2D rotation formula looks like this:
$$ \begin{bmatrix} \cos(\theta) & -\sin(\theta) & 0 \\ \sin(\theta) & \cos(\theta) & 0 \\ 0 & 0 & 1 \end{bmatrix} $$
For a 45° rotation, the matrix would be:
$$ \begin{bmatrix} 0.7071 & -0.7071 & 0 \\ 0.7071 & 0.7071 & 0 \\ 0 & 0 & 1 \end{bmatrix} $$
Converted to fixed-point 16.16 hexadecimal:
0000B504 FFFF4AFC 00000000
0000B504 0000B504 00000000
00000000 00000000 40000000
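Here is a short Python sketch that produces these hex rows for any angle, following the rotation formula above. The function name is ours. It truncates toward zero when converting to 16.16, which is what gives 0.7071 → 0000B504; rounding to nearest would give 0000B505 instead, and either is within one unit of the exact value.

```python
import math

def rotation_matrix_hex(degrees):
    """Return the three hex rows of the tkhd matrix for a rotation by the given angle."""
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)

    def fx(value):
        # 16.16 fixed-point, truncated toward zero, wrapped to an unsigned 32-bit word.
        return int(value * 65536) & 0xFFFFFFFF

    rows = [
        (fx(c), fx(-s), 0),
        (fx(s), fx(c), 0),
        (0, 0, 0x40000000),  # w = 1.0 in 2.30 fixed-point
    ]
    return [" ".join(f"{word:08X}" for word in row) for row in rows]
```

Calling it with 90 reproduces the three rows from the 90° example, which is a quick way to check the signs are the right way around.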
The problem? Most video players don’t support it.
Let's play it and see.
No luck: your browser most likely won't show the rotation. But if you open the file in ffplay, the rotation is applied.
Why? Players Got Lazy
The MP4 spec supports arbitrary rotations, but most video players only check for these cases:
- 0° → Identity matrix
- 90° → Swapped and negated values
- 180° → Negated diagonal
- 270° → Another swap and negate
Anything else? They ignore it.
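We can't see inside every player, but the behavior is consistent with a lookup along these lines. This is a hypothetical Python sketch, not code from any real player: it compares the four rotation entries against the exact quarter-turn patterns and gives up on everything else. Angles are counterclockwise, following the rotation formula used in this article.

```python
def quarter_turn(a, b, c, d):
    """Return 0, 90, 180 or 270 if (a, b, c, d) is an exact quarter turn, else None.

    The arguments are the raw 16.16 values read as signed integers.
    """
    one = 0x00010000  # 1.0 in 16.16 fixed-point
    table = {
        (one, 0, 0, one): 0,       # identity
        (0, -one, one, 0): 90,     # swapped and negated
        (-one, 0, 0, -one): 180,   # negated diagonal
        (0, one, -one, 0): 270,    # the other swap and negate
    }
    # Anything not in the table (45 degrees, scaling, shear) falls through to None.
    return table.get((a, b, c, d))
```

With this kind of check, the 45° matrix returns None, and a player that treats None as "no rotation" would show the video unrotated.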
Why? Because handling arbitrary rotations properly means real-time transformations—which requires GPU acceleration or CPU-intensive pixel remapping. It’s easier for developers to say, “Just support 90-degree increments and call it a day.” Even QuickTime Player ignores non-standard rotation matrices. VLC might try, but results vary.
Why Hex Editing Works (and When It Doesn't)
Hex editing works great if you don’t change the length of anything. But if you add or remove bytes, everything breaks.
MP4 sometimes uses Absolute Offsets
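An MP4 file has two kinds of bookkeeping that depend on byte positions:
- A size field in every atom (first 4 bytes)
- Absolute file offsets stored in some atoms (such as the chunk offset table, stco), pointing at media data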
If you change the size of an atom (e.g., making tkhd longer), you need to:
- Update the size field of every parent atom (trak, moov, etc.).
- Fix the absolute offsets that point into mdat (the actual media data), if that data moved.
Miss one update? The entire file might not play.
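To give a feel for the offset fix-up, here is a Python sketch that shifts every entry of a chunk offset table (the stco atom, which holds absolute file offsets of media chunks). The function name is ours. It assumes the 32-bit stco variant (files over 4 GB use co64 with 64-bit entries), that every chunk sits after the point where bytes were added, and that you have already extracted the atom's payload, meaning the bytes after its 8-byte header.

```python
import struct

def shift_stco(payload, delta):
    """Return a copy of an stco payload with every chunk offset moved by delta bytes.

    Layout assumed: 4 bytes version/flags, 4 bytes entry count,
    then that many big-endian 32-bit absolute file offsets.
    """
    count = struct.unpack_from(">I", payload, 4)[0]
    offsets = struct.unpack_from(f">{count}I", payload, 8)
    shifted = [offset + delta for offset in offsets]
    # Keep the first 8 bytes (version/flags and count) and re-pack the entries.
    return payload[:8] + struct.pack(f">{count}I", *shifted)
```

The output is the same length as the input, so this fix-up itself never moves anything else. It would have to be repeated for every track, on top of updating the parent sizes.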
This is why we edit only metadata, and only within the allocated space, so we don’t have to touch the rest of the file.
Final Thoughts
MP4 is a structured but fragile format. With the right hex edits, you can manipulate metadata without re-encoding. Rotation is just one example—scaling, flipping, and even repositioning videos can all be done by tweaking the transformation matrix.
But while MP4 allows any rotation, players just aren’t keeping up. The standard is capable. The software? Not so much.