DirectX File Binary Format

This section details the binary version of the Microsoft® DirectX® file format as introduced with the release of DirectX 3.0. It should be read in conjunction with the DirectX File Format Architecture (xfileArchitecture.htm) section.

The binary format is a tokenized representation of the text format. Tokens can be stand-alone or accompanied by primitive data records. Stand-alone tokens give grammatical structure, and record-bearing tokens supply the data.

Note that all data is stored in little-endian format.
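
Because every multi-byte value is little-endian, a reader that may also run on big-endian hosts can assemble values byte by byte instead of casting pointers. The following is a minimal sketch; the helper names read_le16 and read_le32 are hypothetical and are not part of the format definition.

```c
#include <stdint.h>

/* Hypothetical helpers: assemble little-endian values one byte at a
   time, so the result does not depend on the host's byte order. */
static uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

A WORD is read with read_le16 and a DWORD with read_le32.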

A valid binary data stream consists of a header followed by templates and/or data objects.

This section discusses the following components of the binary file format: Header, Tokens, Token Records, Templates, and Data.

In addition, example templates are provided in Example Templates, and a binary data object is shown in Example Data.

Header

The following definitions should be used when directly reading and writing the binary header. Note that compressed data streams are not currently supported and are therefore not detailed here.

#define XOFFILE_FORMAT_MAGIC \
  ((long)'x' + ((long)'o' << 8) + ((long)'f' << 16) + ((long)' ' << 24))

#define XOFFILE_FORMAT_VERSION \
  ((long)'0' + ((long)'3' << 8) + ((long)'0' << 16) + ((long)'2' << 24))

#define XOFFILE_FORMAT_BINARY \
  ((long)'b' + ((long)'i' << 8) + ((long)'n' << 16) + ((long)' ' << 24))

#define XOFFILE_FORMAT_TEXT   \
  ((long)'t' + ((long)'x' << 8) + ((long)'t' << 16) + ((long)' ' << 24))

#define XOFFILE_FORMAT_COMPRESSED \
  ((long)'c' + ((long)'m' << 8) + ((long)'p' << 16) + ((long)' ' << 24))

#define XOFFILE_FORMAT_FLOAT_BITS_32 \
  ((long)'0' + ((long)'0' << 8) + ((long)'3' << 16) + ((long)'2' << 24))

#define XOFFILE_FORMAT_FLOAT_BITS_64 \
  ((long)'0' + ((long)'0' << 8) + ((long)'6' << 16) + ((long)'4' << 24))
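
Each definition above packs four characters with the first character in the lowest byte, so on disk each value appears as a plain four-character tag such as "xof " or "bin ". The following sketch checks a header on that basis. It rests on an assumption that the definitions do not state: that the header is 16 bytes long and consists of four 4-byte fields in the order magic, version, format type, float size (for example, the bytes "xof 0302bin 0032"). The type XofHeaderInfo and the function xof_check_header are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type; not part of the format definition. */
typedef struct {
    int is_binary;   /* 1 for "bin ", 0 for "txt " */
    int float_bits;  /* 32 or 64 */
} XofHeaderInfo;

/* Returns 0 on success, -1 if the header is too short or unrecognized.
   ASSUMPTION: four 4-byte fields in the order magic, version,
   format type, float size. */
static int xof_check_header(const unsigned char *buf, size_t len,
                            XofHeaderInfo *out)
{
    if (len < 16 || memcmp(buf, "xof ", 4) != 0)
        return -1;

    /* Bytes 4..7 hold the version tag; it is not validated here. */

    if (memcmp(buf + 8, "bin ", 4) == 0)
        out->is_binary = 1;
    else if (memcmp(buf + 8, "txt ", 4) == 0)
        out->is_binary = 0;
    else
        return -1;   /* includes "cmp ", which this sketch rejects */

    if (memcmp(buf + 12, "0032", 4) == 0)
        out->float_bits = 32;
    else if (memcmp(buf + 12, "0064", 4) == 0)
        out->float_bits = 64;
    else
        return -1;

    return 0;
}
```

Comparing bytes with memcmp avoids any dependence on the width of long or on the host's byte order, both of which affect a direct comparison against the packed constants.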

Tokens

Tokens are written as little-endian WORDs (2 bytes each). A list of token values follows. The list is divided into record-bearing and stand-alone tokens.

Record-bearing

#define TOKEN_NAME 1
#define TOKEN_STRING 2
#define TOKEN_INTEGER 3
#define TOKEN_GUID 5
#define TOKEN_INTEGER_LIST 6
#define TOKEN_FLOAT_LIST 7

Stand-alone

#define TOKEN_OBRACE 10
#define TOKEN_CBRACE 11
#define TOKEN_OPAREN 12
#define TOKEN_CPAREN 13
#define TOKEN_OBRACKET 14
#define TOKEN_CBRACKET 15
#define TOKEN_OANGLE 16
#define TOKEN_CANGLE 17
#define TOKEN_DOT 18
#define TOKEN_COMMA 19
#define TOKEN_SEMICOLON 20
#define TOKEN_TEMPLATE 31
#define TOKEN_WORD 40
#define TOKEN_DWORD 41
#define TOKEN_FLOAT 42
#define TOKEN_DOUBLE 43
#define TOKEN_CHAR 44
#define TOKEN_UCHAR 45
#define TOKEN_SWORD 46
#define TOKEN_SDWORD 47
#define TOKEN_VOID 48
#define TOKEN_LPSTR 49
#define TOKEN_UNICODE 50
#define TOKEN_CSTRING 51
#define TOKEN_ARRAY 52
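
After reading a token WORD, a reader needs to know whether a data record follows. The following is a minimal sketch of that test, using the six record-bearing values listed above; the function name token_has_record is hypothetical.

```c
#include <stdint.h>

#define TOKEN_NAME         1
#define TOKEN_STRING       2
#define TOKEN_INTEGER      3
#define TOKEN_GUID         5
#define TOKEN_INTEGER_LIST 6
#define TOKEN_FLOAT_LIST   7

/* Returns 1 if the token is followed by a data record, 0 otherwise.
   Stand-alone tokens and values not in the list both return 0. */
static int token_has_record(uint16_t token)
{
    switch (token) {
    case TOKEN_NAME:
    case TOKEN_STRING:
    case TOKEN_INTEGER:
    case TOKEN_GUID:
    case TOKEN_INTEGER_LIST:
    case TOKEN_FLOAT_LIST:
        return 1;
    default:
        return 0;
    }
}
```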

Token Records

This section describes the format of the records for each of the record-bearing tokens.

TOKEN_NAME

TOKEN_NAME is a variable-length record. The token is followed by a count value that specifies the number of bytes that follow in the name field. An ASCII name of length count completes the record.

Field   Type         Size (bytes)   Contents
token   WORD         2              TOKEN_NAME
count   DWORD        4              Length of name field, in bytes
name    BYTE array   count          ASCII name
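
The record layout above can be read as follows. This is a minimal sketch with bounds checks; the function name parse_token_name and its return convention are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOKEN_NAME 1

/* Parses a TOKEN_NAME record at buf. Copies the name into out (always
   NUL-terminated on success) and returns the number of bytes consumed,
   or 0 if the record is truncated, has the wrong token, or the name
   does not fit in out. */
static size_t parse_token_name(const unsigned char *buf, size_t len,
                               char *out, size_t out_size)
{
    uint32_t count;

    /* 2-byte token plus 4-byte count must be present. */
    if (len < 6 || (buf[0] | (buf[1] << 8)) != TOKEN_NAME)
        return 0;

    count = (uint32_t)buf[2] | ((uint32_t)buf[3] << 8)
          | ((uint32_t)buf[4] << 16) | ((uint32_t)buf[5] << 24);

    /* The name must lie inside the input and leave room for the NUL. */
    if (count > len - 6 || count >= out_size)
        return 0;

    memcpy(out, buf + 6, count);
    out[count] = '\0';
    return 6 + (size_t)count;
}
```

The name itself is not NUL-terminated in the file; the terminator is added only for the convenience of the caller.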

TOKEN_STRING

TOKEN_STRING is a variable-length record. The token is followed by a count value that specifies the number of bytes that follow in the string field. An ASCII string of length count continues the record, which is completed by a terminating token. The choice of terminator depends on the surrounding syntax; see the string_list rule under Data.

Field        Type         Size (bytes)   Contents
token        WORD         2              TOKEN_STRING
count        DWORD        4              Length of string field, in bytes
string       BYTE array   count          ASCII string
terminator   DWORD        4              TOKEN_SEMICOLON or TOKEN_COMMA

TOKEN_INTEGER

TOKEN_INTEGER is a fixed-length record. The token is followed by the integer value.

Field   Type    Size (bytes)   Contents
token   WORD    2              TOKEN_INTEGER
value   DWORD   4              Single integer

TOKEN_GUID

TOKEN_GUID is a fixed-length record. The token is followed by the four data fields as defined by the OSF DCE standard.

Field   Type         Size (bytes)   Contents
token   WORD         2              TOKEN_GUID
data1   DWORD        4              UUID data field 1
data2   WORD         2              UUID data field 2
data3   WORD         2              UUID data field 3
data4   BYTE array   8              UUID data field 4
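
Because data1, data2, and data3 are stored little-endian while data4 is a plain byte array, the bytes in the file are not in the same order as the printed UUID. The following sketch converts the 16 data bytes that follow the token into the usual text form; the function name format_guid_record is hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Formats the 16 data bytes that follow TOKEN_GUID as an 8-4-4-4-12
   lowercase hex string. out must hold at least 37 bytes. */
static void format_guid_record(const unsigned char *p, char out[37])
{
    /* data1 is a little-endian DWORD; data2 and data3 are little-endian
       WORDs; data4 is copied out byte by byte in file order. */
    uint32_t data1 = (uint32_t)p[0] | ((uint32_t)p[1] << 8)
                   | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    unsigned data2 = (unsigned)p[4] | ((unsigned)p[5] << 8);
    unsigned data3 = (unsigned)p[6] | ((unsigned)p[7] << 8);

    snprintf(out, 37,
             "%08lx-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x",
             (unsigned long)data1, data2, data3,
             (unsigned)p[8], (unsigned)p[9], (unsigned)p[10],
             (unsigned)p[11], (unsigned)p[12], (unsigned)p[13],
             (unsigned)p[14], (unsigned)p[15]);
}
```

For the UUID 55b6d780-37ec-11d0-ab39-0020af71e433, the data bytes in the file are 80 d7 b6 55, ec 37, d0 11, followed by ab 39 00 20 af 71 e4 33.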

TOKEN_INTEGER_LIST

TOKEN_INTEGER_LIST is a variable-length record. The token is followed by a count value that specifies the number of integers that follow in the list field. For efficiency, consecutive integer lists should be combined into a single list.

Field   Type          Size (bytes)   Contents
token   WORD          2              TOKEN_INTEGER_LIST
count   DWORD         4              Number of integers in list field
list    DWORD array   4 x count      Integer list
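
On the writing side, the record can be produced as sketched below. The function names put_le32 and write_integer_list are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define TOKEN_INTEGER_LIST 6

/* Stores v at p as a little-endian DWORD. */
static void put_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xff);
    p[1] = (unsigned char)((v >> 8) & 0xff);
    p[2] = (unsigned char)((v >> 16) & 0xff);
    p[3] = (unsigned char)((v >> 24) & 0xff);
}

/* Writes one TOKEN_INTEGER_LIST record holding count values into out.
   Returns the number of bytes written, or 0 if out is too small. */
static size_t write_integer_list(unsigned char *out, size_t out_size,
                                 const uint32_t *values, uint32_t count)
{
    uint32_t i;

    /* 6 bytes of token and count, then 4 bytes per integer. */
    if (out_size < 6 || count > (out_size - 6) / 4)
        return 0;

    out[0] = TOKEN_INTEGER_LIST;   /* token WORD, low byte first */
    out[1] = 0;
    put_le32(out + 2, count);
    for (i = 0; i < count; i++)
        put_le32(out + 6 + 4 * (size_t)i, values[i]);

    return 6 + 4 * (size_t)count;
}
```

A writer that follows the efficiency advice would collect all adjacent integer members into one values array and call the function once, rather than emitting one record per member.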

TOKEN_FLOAT_LIST

TOKEN_FLOAT_LIST is a variable-length record. The token is followed by a count value that specifies the number of floats or doubles that follow in the list field. The size of the floating point value (float or double) is determined by the value of float size specified in the file header. For efficiency, consecutive TOKEN_FLOAT_LISTs should be compounded into a single list.

Field   Type                 Size (bytes)       Contents
token   WORD                 2                  TOKEN_FLOAT_LIST
count   DWORD                4                  Number of floats or doubles in list field
list    float/double array   4 or 8 x count     Float or double list
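
The dependence on the header's float size can be sketched as follows. The function names are hypothetical, and the element reader assumes that the host's float type is a 32-bit IEEE 754 value, which the format definition does not state.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Total size in bytes of a TOKEN_FLOAT_LIST record: 2-byte token,
   4-byte count, then count elements of float_bits / 8 bytes each.
   float_bits is the float size from the header (32 or 64). */
static size_t float_list_record_size(uint32_t count, int float_bits)
{
    return 6 + (size_t)count * (size_t)(float_bits / 8);
}

/* Reads element index from a list of 32-bit floats. list points at the
   first element, just after the count field.
   ASSUMPTION: the host float is a 32-bit IEEE 754 value. */
static float float_list_get32(const unsigned char *list, size_t index)
{
    const unsigned char *p = list + 4 * index;
    uint32_t bits = (uint32_t)p[0] | ((uint32_t)p[1] << 8)
                  | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    float f;

    memcpy(&f, &bits, sizeof f);   /* reinterpret the bit pattern */
    return f;
}
```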

Templates

A template has the following syntax definition:

template              : TOKEN_TEMPLATE name TOKEN_OBRACE
                            class_id
                            template_parts
                            TOKEN_CBRACE

template_parts        : template_members_part TOKEN_OBRACKET
                        template_option_info
                        TOKEN_CBRACKET
                      | template_members_list

template_members_part : /* Empty */
                      | template_members_list

template_option_info  : ellipsis
                      | template_option_list
                       
template_members_list : template_members
                      | template_members_list template_members

template_members      : primitive
                      | array
                      | template_reference

primitive             : primitive_type optional_name TOKEN_SEMICOLON

array                 : TOKEN_ARRAY array_data_type name dimension_list
                        TOKEN_SEMICOLON

template_reference    : name optional_name TOKEN_SEMICOLON

primitive_type        : TOKEN_WORD
                      | TOKEN_DWORD
                      | TOKEN_FLOAT
                      | TOKEN_DOUBLE
                      | TOKEN_CHAR
                      | TOKEN_UCHAR
                      | TOKEN_SWORD
                      | TOKEN_SDWORD
                      | TOKEN_LPSTR
                      | TOKEN_UNICODE
                      | TOKEN_CSTRING

array_data_type       : primitive_type
                      | name

dimension_list        : dimension
                      | dimension_list dimension

dimension             : TOKEN_OBRACKET dimension_size TOKEN_CBRACKET

dimension_size        : TOKEN_INTEGER
                      | name

template_option_list  : template_option_part
                      | template_option_list template_option_part

template_option_part  : name optional_class_id

name                  : TOKEN_NAME

optional_name         : /* Empty */
                      | name

class_id              : TOKEN_GUID

optional_class_id     : /* Empty */
                      | class_id

ellipsis              : TOKEN_DOT TOKEN_DOT TOKEN_DOT

Data

A data object has the following syntax definition.

Note that binary files do not use TOKEN_COMMA or TOKEN_SEMICOLON in number_list and float_list data. The comma and semicolon are used in string_list data. Also note that you can only use data_reference for optional data members.

object                : identifier optional_name TOKEN_OBRACE
                            optional_class_id
                            data_parts_list
                            TOKEN_CBRACE

data_parts_list       : data_part
                      | data_parts_list data_part

data_part             : data_reference
                      | object
                      | number_list
                      | float_list
                      | string_list

number_list           : TOKEN_INTEGER_LIST

float_list            : TOKEN_FLOAT_LIST

string_list           : string_list_1 list_separator

string_list_1         : string
                      | string_list_1 list_separator string

list_separator        : TOKEN_COMMA
                      | TOKEN_SEMICOLON

string                : TOKEN_STRING

identifier            : name
                      | primitive_type

data_reference        : TOKEN_OBRACE name optional_class_id TOKEN_CBRACE

Example Templates

Two example binary template definitions follow. Note that data is stored in little-endian format, which is not shown in these examples.

The closed template RGB is identified by the UUID {55b6d780-37ec-11d0-ab39-0020af71e433} and has three members, r, g, and b, each of type float.

TOKEN_TEMPLATE, TOKEN_NAME, 3, 'R', 'G', 'B', TOKEN_OBRACE,
TOKEN_GUID, 55b6d780, 37ec, 11d0, ab, 39, 00, 20, af, 71, e4, 33,
TOKEN_FLOAT, TOKEN_NAME, 1, 'r', TOKEN_SEMICOLON,
TOKEN_FLOAT, TOKEN_NAME, 1, 'g', TOKEN_SEMICOLON,
TOKEN_FLOAT, TOKEN_NAME, 1, 'b', TOKEN_SEMICOLON,
TOKEN_CBRACE
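
Since the listing above shows logical values rather than bytes, the following sketch serializes the RGB template into its little-endian byte form. All helper names (emit_le16, emit_le32, emit_name, emit_rgb_template) are hypothetical, and the helpers assume the caller supplies a large enough buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOKEN_NAME      1
#define TOKEN_GUID      5
#define TOKEN_OBRACE    10
#define TOKEN_CBRACE    11
#define TOKEN_SEMICOLON 20
#define TOKEN_TEMPLATE  31
#define TOKEN_FLOAT     42

/* Append helpers: write at out[pos] and return the new position.
   They do no bounds checking. */
static size_t emit_le16(unsigned char *out, size_t pos, uint16_t v)
{
    out[pos]     = (unsigned char)(v & 0xff);
    out[pos + 1] = (unsigned char)((v >> 8) & 0xff);
    return pos + 2;
}

static size_t emit_le32(unsigned char *out, size_t pos, uint32_t v)
{
    pos = emit_le16(out, pos, (uint16_t)(v & 0xffff));
    return emit_le16(out, pos, (uint16_t)(v >> 16));
}

/* Writes a TOKEN_NAME record: token, byte count, then the name. */
static size_t emit_name(unsigned char *out, size_t pos, const char *name)
{
    size_t n = strlen(name);

    pos = emit_le16(out, pos, TOKEN_NAME);
    pos = emit_le32(out, pos, (uint32_t)n);
    memcpy(out + pos, name, n);
    return pos + n;
}

/* Writes the RGB template. out must hold at least 66 bytes.
   Returns the number of bytes written. */
static size_t emit_rgb_template(unsigned char *out)
{
    static const unsigned char data4[8] =
        { 0xab, 0x39, 0x00, 0x20, 0xaf, 0x71, 0xe4, 0x33 };
    static const char *const members[3] = { "r", "g", "b" };
    size_t pos = 0;
    int i;

    pos = emit_le16(out, pos, TOKEN_TEMPLATE);
    pos = emit_name(out, pos, "RGB");
    pos = emit_le16(out, pos, TOKEN_OBRACE);

    pos = emit_le16(out, pos, TOKEN_GUID);
    pos = emit_le32(out, pos, 0x55b6d780u);
    pos = emit_le16(out, pos, 0x37ec);
    pos = emit_le16(out, pos, 0x11d0);
    memcpy(out + pos, data4, sizeof data4);
    pos += sizeof data4;

    for (i = 0; i < 3; i++) {
        pos = emit_le16(out, pos, TOKEN_FLOAT);
        pos = emit_name(out, pos, members[i]);
        pos = emit_le16(out, pos, TOKEN_SEMICOLON);
    }

    return emit_le16(out, pos, TOKEN_CBRACE);
}
```

The result is 66 bytes long: 2 for TOKEN_TEMPLATE, 9 for the template name, 2 for the opening brace, 18 for the GUID record, 11 for each of the three members, and 2 for the closing brace.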

The closed template Matrix4x4 is identified by the UUID {55b6d781-37ec-11d0-ab39-0020af71e433} and has one member: a two-dimensional array named matrix of type float.

TOKEN_TEMPLATE, TOKEN_NAME, 9, 'M', 'a', 't', 'r', 'i', 'x', '4', 'x', '4', TOKEN_OBRACE,
TOKEN_GUID, 55b6d781, 37ec, 11d0, ab, 39, 00, 20, af, 71, e4, 33,
TOKEN_ARRAY, TOKEN_FLOAT, TOKEN_NAME, 6, 'm', 'a', 't', 'r', 'i', 'x',
TOKEN_OBRACKET, TOKEN_INTEGER, 4, TOKEN_CBRACKET,
TOKEN_OBRACKET, TOKEN_INTEGER, 4, TOKEN_CBRACKET, TOKEN_SEMICOLON,
TOKEN_CBRACE

Example Data

The binary data object that follows shows an instance of the RGB template defined earlier. The example object is named blue, and its three members r, g, and b have the values 0.0, 0.0, and 1.0, respectively. Note that data is stored in little-endian format, which is not shown in this example.

TOKEN_NAME, 3, 'R', 'G', 'B', TOKEN_NAME, 4, 'b', 'l', 'u', 'e', TOKEN_OBRACE,
TOKEN_FLOAT_LIST, 3, 0.0, 0.0, 1.0, TOKEN_CBRACE
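
For comparison, the same object is written out below byte by byte. This sketch assumes that the file header selects 32-bit floats and that the host's float type is a 32-bit IEEE 754 value; the names blue_object and blue_component are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The example object as it might appear in the file, assuming 32-bit
   floats. Every multi-byte value is stored low byte first. */
static const unsigned char blue_object[41] = {
    0x01, 0x00, 0x03, 0x00, 0x00, 0x00, 'R', 'G', 'B',      /* TOKEN_NAME, 3, "RGB"  */
    0x01, 0x00, 0x04, 0x00, 0x00, 0x00, 'b', 'l', 'u', 'e', /* TOKEN_NAME, 4, "blue" */
    0x0a, 0x00,                                             /* TOKEN_OBRACE          */
    0x07, 0x00, 0x03, 0x00, 0x00, 0x00,                     /* TOKEN_FLOAT_LIST, 3   */
    0x00, 0x00, 0x00, 0x00,                                 /* r = 0.0               */
    0x00, 0x00, 0x00, 0x00,                                 /* g = 0.0               */
    0x00, 0x00, 0x80, 0x3f,                                 /* b = 1.0               */
    0x0b, 0x00                                              /* TOKEN_CBRACE          */
};

/* Returns component i (0 = r, 1 = g, 2 = b) of the object above.
   The float list starts at byte offset 27.
   ASSUMPTION: the host float is a 32-bit IEEE 754 value. */
static float blue_component(size_t i)
{
    const unsigned char *p = blue_object + 27 + 4 * i;
    uint32_t bits = (uint32_t)p[0] | ((uint32_t)p[1] << 8)
                  | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    float f;

    memcpy(&f, &bits, sizeof f);   /* reinterpret the bit pattern */
    return f;
}
```

No TOKEN_COMMA or TOKEN_SEMICOLON appears between the float values, which matches the note on float_list data in the Data section.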