H.264 is not a single video codec; it is a family of codecs with some shared shortcuts grouped into 17 sets of profiles and 16 levels of constraints. Video creators and playback software share a mutual understanding of these shortcuts, which are often accelerated by specialized chipsets. This post examines a few of the many flavors of H.264 video and their application in mobile, desktop, and Flash Player environments.
A compressed video is a series of shortcuts shared between a video creator and a viewer. A series of pictures, 30 per second on most capture devices, is analyzed and compared, collapsing each group of pictures into a single full photograph plus the variances between it and the pictures before or after its place in the series. All lossy video codecs examine a series of pictures and look for pieces that can be thrown out and replaced with shortcuts, recreating video quality with less stored data. Specialized decoders in our playback software, often assisted by chips built to execute these shortcuts quickly, decompress video using these instruction sets. Shortcuts can be patented, leading to some of the intellectual property concerns around H.264, VP8, and Theora as video playback and encoding are increasingly integrated into web browsers implementing native HTML5 <video> support.
H.264’s profiles determine which shortcuts an encoder may use, and its levels cap the resources a decoder needs. Decoding software, often backed by chips specially wired for video tasks (such as NVIDIA’s PureVideo), fills a storage buffer and tries to compute video frames more quickly than the player requests them. High-complexity profiles and levels offer the highest quality video in the smallest file size but require a larger file buffer and more computational horsepower to decompress a video quickly. High complexity works well in an overpowered desktop environment, but videos must be adjusted for simplified, battery-sipping use cases such as a mobile phone.
|Feature|Baseline|Main|High|
|---|---|---|---|
|Flexible macroblock ordering (FMO)|✓| | |
|Arbitrary slice ordering (ASO)|✓| | |
|Redundant slices (RS)|✓| | |
|Interlaced coding (PicAFF, MBAFF)| |✓|✓|
|CABAC entropy coding| |✓|✓|
|8×8 vs. 4×4 transform adaptivity| | |✓|
|Quantization scaling matrices| | |✓|
|Separate Cb and Cr QP control| | |✓|
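Profiles are only half of the story; the level caps how much work a decoder is asked to do. As a rough sketch, the snippet below estimates the lowest level a given frame size and frame rate could fit into. The limits are an assumed excerpt of the macroblock limits in the H.264 specification for the three levels mentioned in this post, and the function names are made up for illustration. Real levels also cap bitrate and decoded picture buffer size, which this sketch ignores.

```python
import math

# Assumed excerpt of per-level limits (H.264 spec, Annex A):
# level -> (max macroblocks per frame, max macroblocks per second)
LEVEL_LIMITS = {
    "3.0": (1620, 40500),
    "3.1": (3600, 108000),
    "4.1": (8192, 245760),
}


def macroblocks(width, height):
    """Number of 16x16 macroblocks needed to cover one frame."""
    return math.ceil(width / 16) * math.ceil(height / 16)


def fits_level(width, height, fps, level):
    """True if the frame size and macroblock rate stay within the level's limits."""
    max_frame_mbs, max_mbs_per_second = LEVEL_LIMITS[level]
    frame_mbs = macroblocks(width, height)
    return frame_mbs <= max_frame_mbs and frame_mbs * fps <= max_mbs_per_second


def lowest_level(width, height, fps):
    """Lowest listed level that fits, or None if none of the three does."""
    for level in sorted(LEVEL_LIMITS, key=float):
        if fits_level(width, height, fps, level):
            return level
    return None


if __name__ == "__main__":
    for size in [(640, 480), (1280, 720), (1920, 1080)]:
        print(size, lowest_level(size[0], size[1], 30))
```

Under these assumptions a 640×480 video at 30 frames per second fits Level 3.0, while 720p and 1080p at the same frame rate need Level 3.1 and Level 4.1 respectively.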
Videos are encoded with specific playback targets in mind, chosen for maximum compatibility. The iPhone 3GS supports H.264 Baseline Profile Level 3.0. The iPhone 4 and iPad support H.264 Main Profile Level 3.1. The latest netbooks with NVIDIA ION and PureVideo HD support H.264 High Profile Level 4.1. A video optimized for desktop, notebook, or netbook playback and encoded using H.264 High Profile Level 4.1 will not play back on an iPhone.
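The device ceilings above can be sketched as a simple lookup. This is an illustration only: the `DEVICES` table and `can_play` helper are hypothetical names, levels are stored as tenths (30 means Level 3.0), and ranking the profiles as Baseline < Main < High is a simplifying assumption. A Baseline stream that uses FMO, ASO, or redundant slices would not decode on a Main-only decoder.

```python
# Simplifying assumption: treat profiles as a strict ladder.
PROFILE_RANK = {"Baseline": 0, "Main": 1, "High": 2}

# Device ceilings as stated in this post: (max profile, max level in tenths).
DEVICES = {
    "iPhone 3GS": ("Baseline", 30),
    "iPhone 4": ("Main", 31),
    "iPad": ("Main", 31),
    "NVIDIA ION netbook": ("High", 41),
}


def can_play(device, profile, level):
    """True if the video's profile and level are within the device's ceiling."""
    max_profile, max_level = DEVICES[device]
    return (
        PROFILE_RANK[profile] <= PROFILE_RANK[max_profile]
        and level <= max_level
    )


if __name__ == "__main__":
    # A desktop-oriented High Profile Level 4.1 encode against each device.
    for name in DEVICES:
        print(name, can_play(name, "High", 41))
```

A publisher targeting every device in the table would have to encode to the lowest ceiling, Baseline Level 3.0, or produce a separate file per target.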
The Apple effect
Adobe has repeatedly said that Apple mobile devices cannot access “the full web” because 75% of video on the web is in Flash. What they don’t say is that almost all this video is also available in a more modern format, H.264, and viewable on iPhones, iPods and iPads.
Adobe’s Flash Player added support for H.264 video decoding in August 2007 with its Flash Player 9 Update 3 Beta 2 (9.0.115) release. Websites previously bundled a video file, a Flash video container (FLV) with an On2 VP6 or Sorenson video track, into a single Flash file for distribution and playback. The launch of H.264 support in Flash decoupled the video player from the video file, loading videos over the network when a viewer initiates playback (a much lighter payload for embeds such as YouTube). Video websites can directly expose MP4 downloads to iTunes, the QuickTime browser plugin, or search engines for download and indexing.
Decoupling the Flash video viewer from the underlying video provides direct access but does not necessarily deliver video “viewable on iPhones, iPods and iPads.” Video publishers need to dumb down their video for Apple’s low-power devices (and Flash mobile), or a video will be viewable but not playable.
YouTube exposes multiple video resolutions on its website. Each video resolution uses a slightly different version of H.264 but none of these videos delivered to desktop web browsers are compatible with an iPhone 3GS and its Baseline profile requirement. Let’s take a look at the underlying videos exposed in the default Flash version of YouTube for the latest weekly address from the White House.
Exposed YouTube web formats
- MP4, High profile level 4.1
- FLV, Main profile level 3.1
- FLV, Main profile level 3.0
The H.264 videos YouTube uses for default playback in web browsers are not compatible with portable Apple devices that are not built on an A4 processor. YouTube is creating special video files for iOS and other mobile devices.
Flash Player for mobile
On June 22 Adobe released Flash Player 10.1 for mobile, its first full Flash Player written for ARM instruction set architectures. Flash for mobile does not solve the video playback problem. Flash can draw a player area and display a preview image of the video in place of a failed plugin icon, but video playback ultimately depends on the hardware decoder behind the scenes and its ability to deliver video frames and synchronized audio to your mobile device’s screen faster than playback consumes them, within the constraints of the small file buffer and memory available on mobile. Flash for mobile renders a player and its interaction elements; video for mobile still relies on simpler sets of shortcuts targeting hardware-accelerated features and the computing resources available on mobile.
WebM and VP8
Google introduced the WebM file format on May 19 with a container based on Matroska, a VP8 video track, and Vorbis audio. Google released any patent rights it may assert over VP8 and released the source code for libvpx, a reference encoder and decoder, with 17 test vectors for implementers. The popular FFmpeg project, used by many web publishers for encoding and by Google Chrome for decoding, added native VP8 support in late June. FFmpeg’s VP8 implementation leaned heavily on video encoder and decoder shortcuts already used by H.264, opening VP8 to hardware-accelerated playback by chipsets optimized for H.264 shortcuts. If your encoder, decoder, and hardware already pay into the H.264 patent licensing pool run by MPEG LA, the shared, patent-asserted shortcuts present in VP8 can be a good thing. If you were hoping for a Freedom-loving replacement for Theora, VP8 may not be clear of patent assertions (but Mozilla seems to like it).
Web developers are excited about H.264 video and the rise of browser-native playback through HTML5 <video> markup. H.264 is a family of standards, each with its own set of shortcuts shared between a video publishing tool and a video player. The excitement over mobile video has overlooked the intricacies of H.264 profiles and levels, signaled through the codecs parameter described in RFC 4281, and the changing landscape of hardware-accelerated video on mobile. Video publishers should be aware of the differences between playback devices and either choose a lowest common denominator or specifically target the quality and file size of an intended playback device.
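Profile and level usually reach a player as a short hex string in the codecs parameter, such as `avc1.42E01E` in an HTML5 <video> type hint. Here is a minimal sketch of decoding that string, assuming the common `avc1.PPCCLL` layout (profile byte, constraint flags byte, level byte). The function name is hypothetical, only the four profile numbers listed are recognized, and special cases such as Level 1b are ignored.

```python
# profile_idc values for the profiles discussed in this post, plus Extended.
PROFILE_NAMES = {66: "Baseline", 77: "Main", 88: "Extended", 100: "High"}


def parse_avc_codec(codec):
    """Split an 'avc1.PPCCLL' codecs string into (profile name, level string)."""
    sample_entry, _, hex_part = codec.partition(".")
    if sample_entry != "avc1" or len(hex_part) != 6:
        raise ValueError("expected a string like 'avc1.42E01E'")

    profile_idc = int(hex_part[0:2], 16)  # PP: profile
    # hex_part[2:4] holds the constraint flags (CC); unused in this sketch.
    level_idc = int(hex_part[4:6], 16)    # LL: level times ten

    profile = PROFILE_NAMES.get(profile_idc, "profile_idc %d" % profile_idc)
    level = "%d.%d" % (level_idc // 10, level_idc % 10)
    return profile, level


if __name__ == "__main__":
    for codec in ["avc1.42E01E", "avc1.4D401F", "avc1.640029"]:
        print(codec, parse_avc_codec(codec))
```

The three sample strings decode to Baseline Level 3.0, Main Level 3.1, and High Level 4.1, the same targets discussed throughout this post.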