***************************************
* ISO/IEC 11172(MPEG-1)/13818(MPEG-2) *
*   MPEG Bit Stream Quick Reference   *
***************************************

- values use big endian (network) byte/bit order
- general terms: integer = signed value
- general values: byte/char/octet = 8-bit value; short/word = 16-bit value; long = 32-bit value
- fixed point values: value made up of an integer for the whole part and an unsigned value for the fraction
- binary values: base-2 long unsigned values (digits 0 and 1)
- octal values: base-8 long unsigned values (digits 0 through 7)
- decimal values: base-10 long unsigned values (digits 0 through 9)
- hexadecimal (hex) values: base-16 long unsigned values (digits 0 to 9 and A to F)

FILE INFO

System Multiplex v1 Suffixes = ".m1s", ".mpeg", ".mpg", ".mpgx", ".mpm"
System Multiplex v2 Suffixes = ".m2s", ".mpeg", ".mpg", ".mpgx", ".mpm"
Elementary Video v1 Suffixes = ".m1v", ".mpgv", ".mpv"
Elementary Video v1 15 fps Suffix = ".m15"
Elementary Video v1 7.5 fps Suffix = ".m75"
Elementary Video v2 Suffixes = ".m2v", ".mpgv", ".mpv"
Elementary Audio v1 Suffixes = ".m1a", ".mpga", ".mpa"
Elementary Audio v2 Suffixes = ".m2a", ".mpga", ".mpa"
Elementary Audio Compression layer/level Suffixes = ".mpa", ".mp2", ".mp3"
System Multiplex or Elementary Video v1-2 MIME = "video/mpeg"
Elementary Audio MIME = "audio/mpeg"
System Multiplex or Elementary Video/Audio Mac OS Type = "MPEG"
Elementary Audio Fraunhofer layer/level III Mac OS Type = "MPG3"; QuickTime Mac OS Creator = "TVOD"

Developed for compressing moving pictures and associated audio, and then combining one or more elementary streams of video and audio, as well as other data, into single or multiple streams suitable for storage or transmission. The multiplex is specified in two forms: the Program Stream and the Transport Stream.
Each is optimised for a different set of applications. Error-prone transmissions such as ATM networks, DBS (Ku-band), station feeds (C-band), subscription services (CATV bands) and local stations (UHF and VHF bands) use transport streams. Stored content such as Video CDs (White Book) and DVDs (UDF VOB) uses program streams.

ISO/IEC 11172-2 supports a maximum bit rate of 1.5 Mbits/sec (187.5 kB/sec).
ISO/IEC 13818-2 levels support bit rates from 3 Mbits/sec (375 kB/sec) to 15 Mbits/sec (1.875 MB/sec).
ISO/IEC 14496-2 (MPEG-4 video) supports scalable bit rates from 10 kbits/sec (1.25 kB/sec) to 6 Mbits/sec (750 kB/sec).

ISO/IEC 11172-1/13818-1 SYSTEM MULTIPLEX PROGRAM/TRANSPORT PACKETS/BLOCKS AND HEADER

- NOTE: a transport system multiplex retransmits header info more frequently for recovery from data loss. Program streams are stored streams so this isn't necessary.

-> 4 bytes pack header start code = long hex value of 0x000001BA
-> 64 bits v1 pack header = 8 bytes clock and multiplex pack rate bit info
OR
-> 80 bits v2 pack header = 10 bytes clock and multiplex pack rate with extension bit info
   - NOTE: pack header appears more than once

-> 4 bytes system header start code = long hex value of 0x000001BB
-> 16 bits packetized elementary stream length = short unsigned size of start code packet/block
-> variable bits system header = multiplex system bit info
   - NOTE: only one system header per file bit stream

-> 4 bytes optional stream gap identifier = long hex value of 0x00000000
-> 10 or variable bits stream gap = 1 1/4 or variable bytes gap set to zero
   - NOTE: can appear more than once or not at all

-> 4 bytes optional padding packet/block start code = long hex value of 0x000001BE
-> 16 bits packetized elementary stream length = short unsigned size of start code packet/block
-> variable bits padding = multiplex padding bits set to 1
   - NOTE: can appear more than once or not at all

-> 4 bytes optional private 1 packet/block start code = long hex value of 0x000001BD
-> 16 bits packetized elementary stream length = short unsigned size of start code packet/block
-> 96 bits MPEG-1 PES header = 12 bytes PES bit info
OR
-> 24 - 392 bits MPEG-2 PES header = short PES bit info + byte unsigned PES header size + optional PES timecode/info
-> variable bits custom data = video sub-pictures bit data for DVD

-> 4 bytes optional private 2 packet/block start code = long hex value of 0x000001BF
-> 16 bits packetized elementary stream length = short unsigned size of start code packet/block
-> variable bits custom data = audio/video navigation bit info for DVD

-> 4 bytes optional audio 0 - 31 packet/block start code = long hex value of 0x000001C0 - 0x000001DF
-> 16 bits packetized elementary stream length = short unsigned size of start code packet/block
-> 96 bits MPEG-1 PES header = 12 bytes PES bit info
OR
-> 24 - 392 bits MPEG-2 PES header = short PES bit info + byte unsigned PES header size + optional PES timecode/info
-> variable bits elementary audio payload = elementary audio packets
   - NOTE: audio packets/blocks appear more than once. DVD uses only audio 0 - 7 packets/blocks.

-> 4 bytes optional video 0 - 15 packet/block start code = long hex value of 0x000001E0 - 0x000001EF
-> 16 bits packetized elementary stream length = short unsigned size of start code packet/block
-> 96 bits MPEG-1 PES header = 12 bytes PES bit info
OR
-> 24 - 392 bits MPEG-2 PES header = short PES bit info + byte unsigned PES header size + optional PES timecode/info
-> variable bits elementary video payload = elementary video header + other video start codes
   - NOTE: video packets/blocks appear more than once. VCD/DVD uses only video 0 packets/blocks.
   - NOTE: generally one packet/block contains at least one frame of video.
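
As a rough illustration of how the start codes in this section can be told apart, the Python sketch below classifies a 32-bit start code by its last byte and scans a buffer for the 00 00 01 prefix. The names classify_start_code and scan_start_codes are hypothetical and not taken from the standards. This is a naive scan, assuming the buffer is small enough to hold in memory: payload bytes can contain the same prefix (for example elementary video start codes), so a real demultiplexer would normally skip payloads using the 16-bit length fields instead.

```python
from typing import Iterator, Tuple

# Last byte of the start code -> packet/block name used in this reference.
FIXED_CODES = {
    0xB9: "stream end",
    0xBA: "pack header",
    0xBB: "system header",
    0xBD: "private 1",
    0xBE: "padding",
    0xBF: "private 2",
}


def classify_start_code(code: int) -> str:
    """Name a 32-bit big endian start code (hypothetical helper)."""
    if code >> 8 != 0x000001:
        return "not a start code"
    stream_id = code & 0xFF
    if stream_id in FIXED_CODES:
        return FIXED_CODES[stream_id]
    if 0xC0 <= stream_id <= 0xDF:
        return f"audio {stream_id - 0xC0}"
    if 0xE0 <= stream_id <= 0xEF:
        return f"video {stream_id - 0xE0}"
    return "other"


def scan_start_codes(data: bytes) -> Iterator[Tuple[int, str]]:
    """Yield (offset, name) for every 00 00 01 xx sequence in data."""
    prefix = b"\x00\x00\x01"
    pos = data.find(prefix)
    while pos != -1 and pos + 4 <= len(data):
        code = int.from_bytes(data[pos:pos + 4], "big")
        yield pos, classify_start_code(code)
        pos = data.find(prefix, pos + 3)
```

For example, list(scan_start_codes(open(path, "rb").read())) on a small program stream file (path being any file name you supply) lists the offsets of its pack headers and audio/video packets.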

-> 4 bytes stream end start code = long hex value of 0x000001B9
   - NOTE: only one stream end to terminate demultiplexing of file bit stream

ISO/IEC 11172-2/13818-2 ELEMENTARY VIDEO HEADER

-> 4 bytes video sequence start code = long hex value of 0x000001B3
-> 12 bits horizontal samples = 1 1/2 bytes unsigned number of samples in a line
-> 12 bits vertical lines = 1 1/2 bytes unsigned number of lines in a frame
-> 4 bits samples to pixel/display aspect ratio = 1/2 byte unsigned aspect type
   - v1 pixel types are Reserved = 0; 1:1 = 1; 0.6735 = 2; 0.7031 = 3
   - v1 pixel types are 0.7615 = 4; 0.8055 = 5; 0.8437 = 6; 0.8935 = 7
   - v1 pixel types are 54:59 PAL = 8; 0.9815 = 9; 1.0255 = 10; 1.0695 = 11
   - v1 pixel types are 11:10 NTSC = 12; 1.1575 = 13; 1.2015 = 14; Reserved = 15
   - v2 display types are Reserved = 0; 1:1 = 1; 4:3 = 2; 16:9 = 3; 11:5 = 4
   - v2 display types are Reserved or v1 pixel types = 5 - 15
   - NOTES on common pixel and display aspects
   - 1:1 (1.0) pixel/display: means one sample is equivalent to one pixel
   - 54:59 (0.9153) PAL: defined for displaying the CCIR 601 625 line system in pixels
   - 11:10 (1.1) NTSC: defined for displaying the CCIR 601 525 line system in pixels
   - MPEG v2 aspects: define the aspect of the vertical lines to pixel frame width for digital displays
   - 11:5 (2.2) display: defined for a size trade-off between 4:3 conventional and 16:9 widescreen
-> 4 bits frame rate = 1/2 byte unsigned fps type
   - types are Reserved = 0; 23.976 frames/sec = 1; 24 frames/sec = 2
   - types are 25 frames/sec = 3; 29.97 frames/sec = 4; 30 frames/sec = 5
   - types are 50 frames/sec = 6; 59.94 frames/sec = 7; 60 frames/sec = 8
   - types are Reserved = 9 - 15
   - NOTE: if MPEG v2 video frames are interlaced the frame rate becomes the field rate
-> 18 bits bit rate = 2 1/4 bytes unsigned data bit rate in 400 bps units
-> 1 bit header marker flag = 1/8 byte true/false value
-> 10 bits video buffering verifier size = 1 1/4 bytes unsigned decompression buffer in 16 kbit (2,048 byte) units
-> 1 bit constrained parameter flag = 1/8 byte true/false value (set to 0/false for MPEG v2)
-> 1 bit load custom intra quantizer matrix flag = 1/8 byte true/false value
-> 64 byte encoder defined quantizer table
   - only if above is true otherwise no values
-> 1 bit load custom non-intra quantizer matrix flag = 1/8 byte true/false value
-> 64 byte encoder defined quantizer table
   - only if above is true otherwise no values
-> Continues with other MPEG video start codes and associated data like GOPs, pictures and slices
   - NOTE: only one video header per file bit stream

-> 4 bytes video user meta data start code = long hex value of 0x000001B2
   - NOTE: optional private data like an ASCII text string may be added by encoders of the video bit stream and is limited to 23 bytes

-> 4 bytes video sequence end start code = long hex value of 0x000001B7
   - NOTE: only one video sequence end to terminate decoding of video bit stream

ISO/IEC 11172-3/13818-3 ELEMENTARY AUDIO FRAME PACKET HEADER

-> 11 bits audio frame/packet sync marker = 1 3/8 bytes unsigned value of 2047
-> 2 bits audio version = 1/4 byte unsigned version type
   - types are Fraunhofer (v2.5) = 0; Reserved = 1; ISO/IEC 13818-3 (v2) = 2; ISO/IEC 11172-3 (v1) = 3
-> 2 bits audio compression layer/level = 1/4 byte unsigned level type
   - types are Reserved = 0; Fraunhofer layer III = 1; ISO/IEC layer II = 2; ISO/IEC layer I = 3
-> 1 bit CRC data protection flag = 1/8 byte true/false type
   - short unsigned CRC check value added after header = 0; not used = 1
-> 4 bits audio packet rate/size = 1/2 byte unsigned data bit rate type
   - NOTE: L1 = layer I; L2 = layer II; L3 = layer III
   - types are Free format = 0
   - types are v1/v2 L1- 32kbps or v2/2.5 L2/3- 8kbps = 1
   - types are v1 L1- 64kbps or v1 L2/v2 L1- 48kbps or v1 L3- 40kbps or v2/2.5 L2/3-16kbps = 2
   - types are v1 L1- 96kbps or v1 L2/v2 L1- 56kbps or v1 L3- 48kbps or v2/2.5 L2/3-24kbps = 3
   - types are v1 L1-128kbps or v1 L2/v2 L1- 64kbps or v1 L3- 56kbps or v2/2.5 L2/3-32kbps = 4
   - types are v1 L1-160kbps or v1 L2/v2 L1- 80kbps or v1 L3- 64kbps or v2/2.5 L2/3-40kbps = 5
   - types are v1 L1-192kbps or v1 L2/v2 L1- 96kbps or v1 L3- 80kbps or v2/2.5 L2/3-48kbps = 6
   - types are v1 L1-224kbps or v1 L2/v2 L1-112kbps or v1 L3- 96kbps or v2/2.5 L2/3-56kbps = 7
   - types are v1 L1-256kbps or v1 L2/v2 L1-128kbps or v1 L3-112kbps or v2/2.5 L2/3-64kbps = 8
   - types are v1 L1-288kbps or v1 L2-160kbps or v1 L3-128kbps or v2 L1-144kbps or v2/2.5 L2/3-80kbps = 9
   - types are v1 L1-320kbps or v1 L2-192kbps or v1 L3/v2 L1-160kbps or v2/2.5 L2/3-96kbps = 10
   - types are v1 L1-352kbps or v1 L2-224kbps or v1 L3-192kbps or v2 L1-176kbps or v2/2.5 L2/3-112kbps = 11
   - types are v1 L1-384kbps or v1 L2-256kbps or v1 L3-224kbps or v2 L1-192kbps or v2/2.5 L2/3-128kbps = 12
   - types are v1 L1-416kbps or v1 L2-320kbps or v1 L3-256kbps or v2 L1-224kbps or v2/2.5 L2/3-144kbps = 13
   - types are v1 L1-448kbps or v1 L2-384kbps or v1 L3-320kbps or v2 L1-256kbps or v2/2.5 L2/3-160kbps = 14
   - types are Reserved = 15
   - NOTES suggesting optimal bit rate quality per channel are below
   - L1: v1 32000 samples - 160 kbps ; v2 16000 samples - 80 kbps ; v2.5 8000 samples - 48 kbps
   - L1: v1 44100 samples - 192 kbps ; v2 22050 samples - 96 kbps ; v2.5 11025 samples - 48 kbps
   - L1: v1 48000 samples - 224 kbps ; v2 24000 samples - 112 kbps ; v2.5 12000 samples - 56 kbps
   - L2: v1 32000 samples - 96 kbps ; v2 16000 samples - 48 kbps ; v2.5 8000 samples - 24 kbps
   - L2: v1 44100 samples - 112 kbps ; v2 22050 samples - 56 kbps ; v2.5 11025 samples - 28 kbps
   - L2: v1 48000 samples - 128 kbps ; v2 24000 samples - 64 kbps ; v2.5 12000 samples - 32 kbps
   - L3: v1 32000 samples - 48 kbps ; v2 16000 samples - 24 kbps ; v2.5 8000 samples - 16 kbps
   - L3: v1 44100 samples - 64 kbps ; v2 22050 samples - 32 kbps ; v2.5 11025 samples - 16 kbps
   - L3: v1 48000 samples - 80 kbps ; v2 24000 samples - 40 kbps ; v2.5 12000 samples - 24 kbps
-> 2 bits audio sample rate = 1/4 byte unsigned rate type
   - types are v1 - 44100 Hz/v2 - 22050 Hz/v2.5 - 11025 Hz = 0
   - types are v1 - 48000 Hz/v2 - 24000 Hz/v2.5 - 12000 Hz = 1
   - types are v1 - 32000 Hz/v2 - 16000 Hz/v2.5 - 8000 Hz = 2
   - types are Reserved = 3
-> 1 bit frame/packet data padding space flag = 1/8 byte true/false type
   - not used = 0; one slot added at packet end (byte for L2/3, long for L1) = 1
-> 1 bit private flag = 1/8 byte true/false value
-> 2 bits audio channels = 1/4 byte unsigned channel type
   - types are Stereo = 0; Joint Stereo = 1; Dual Monaural = 2; Monaural = 3
-> 2 bits audio joint stereo mode = 1/4 byte unsigned mode type
   - types are none = 0; intensity = 1; ms = 2; intensity/ms = 3
-> 1 bit audio copyright flag = 1/8 byte true/false value
   - types are unrestricted material = 0; copyright controlled material = 1
-> 1 bit audio original flag = 1/8 byte true/false value
   - types are copy or duplicate encoding = 0; original or first encoding = 1
-> 2 bits audio emphasis (used for handling data loss during transmission) = 1/4 byte unsigned emphasis type
   - types are none = 0; 50/15 microseconds = 1; Reserved = 2; CCITT J.17 = 3
-> Continues with associated compressed data/slots and possibly more MPEG audio frame packets
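
The 32-bit audio frame packet header above can be unpacked with plain shifts and masks. The Python sketch below is one possible way to do it, not a reference implementation: the function name parse_audio_header, the table names and the dictionary keys are all hypothetical. Its bit rate tables follow ISO/IEC 11172-3 and 13818-3, where the v2/v2.5 layer II/III column ends 128, 144, 160 kbps. Bit rate type 0 is returned as None (free format).

```python
VERSIONS = {0: "Fraunhofer (v2.5)", 2: "ISO/IEC 13818-3 (v2)", 3: "ISO/IEC 11172-3 (v1)"}
LAYERS = {1: "layer III", 2: "layer II", 3: "layer I"}
CHANNELS = ("Stereo", "Joint Stereo", "Dual Monaural", "Monaural")

# Sample rates in Hz, keyed by the 2-bit version type, indexed by rate type 0-2.
SAMPLE_RATES_HZ = {
    3: (44100, 48000, 32000),
    2: (22050, 24000, 16000),
    0: (11025, 12000, 8000),
}

# Bit rates in kbps for bit rate types 1-14, keyed by (version family, layer type).
LOW_RATE = (8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160)
BITRATES_KBPS = {
    ("v1", 3): (32, 64, 96, 128, 160, 192, 224, 256, 288, 320, 352, 384, 416, 448),
    ("v1", 2): (32, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 384),
    ("v1", 1): (32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320),
    ("v2", 3): (32, 48, 56, 64, 80, 96, 112, 128, 144, 160, 176, 192, 224, 256),
    ("v2", 2): LOW_RATE,
    ("v2", 1): LOW_RATE,
}


def parse_audio_header(header: bytes) -> dict:
    """Decode the first 4 bytes of an audio frame packet (illustrative sketch)."""
    if len(header) < 4:
        raise ValueError("need at least 4 header bytes")
    word = int.from_bytes(header[:4], "big")  # big endian, as the whole format
    if word >> 21 != 0x7FF:                   # 11-bit sync marker, value 2047
        raise ValueError("sync marker not found")
    version = (word >> 19) & 0x3
    layer = (word >> 17) & 0x3
    bitrate_type = (word >> 12) & 0xF
    rate_type = (word >> 10) & 0x3
    if version == 1 or layer == 0 or bitrate_type == 15 or rate_type == 3:
        raise ValueError("reserved value in header")
    family = "v1" if version == 3 else "v2"   # v2.5 shares the v2 tables
    table = BITRATES_KBPS[(family, layer)]
    return {
        "version": VERSIONS[version],
        "layer": LAYERS[layer],
        "crc_present": ((word >> 16) & 0x1) == 0,  # 0 means a CRC follows
        "bitrate_kbps": table[bitrate_type - 1] if bitrate_type else None,
        "sample_rate_hz": SAMPLE_RATES_HZ[version][rate_type],
        "padding": bool((word >> 9) & 0x1),
        "private": bool((word >> 8) & 0x1),
        "channels": CHANNELS[(word >> 6) & 0x3],
        "joint_stereo_mode": (word >> 4) & 0x3,
        "copyright": bool((word >> 3) & 0x1),
        "original": bool((word >> 2) & 0x1),
        "emphasis": word & 0x3,
    }
```

For example, the header bytes FF FB 90 00 decode as a v1 layer III frame at 128 kbps and 44100 Hz, stereo, with no CRC.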