/*
This is the scan of the Word for DOS 5.0 file format, converted into HTML by Sean Young. Some footnotes were missing (not on the scan), but nothing essential, AFAICS. The text is unaltered.

Revision: 0.1
*/

Microsoft Word 5.0 (PC) Binary File Format

Last revision: May 19, 1989
Change log:

May 17, 1989: created (davidb)

Introduction

Word uses the same basic file format for its document files, glossary files, style sheets and autosave files. The main focus here is Word's document format, with brief sections explaining the other file types. Style sheets and autosave files are sufficiently different to be documented separately.

Word has used this same file format since its first version. This means that Word 1.0 can read Word 5.0 files and vice-versa. This compatibility was accomplished by defining all structures to be larger than they needed to be and setting all reserved fields to zero for use in future versions. Reserved pointers in the document header have been used to add entirely new document sections (such as document retrieval information and bookmark tables).

Because of the important issue of compatibility with future versions, all fields in all structures which are not currently being used MUST be filled with zeros. When a field is finally defined for a new feature, we will either make zero the default value for that field or make zero represent an uninitialized state which will be ignored.

The most important sections of a Word document are the text section and the formatting sections. The text section is straight ASCII text with the IBM extended character set. Some of the low order characters have been reserved for special use, such as forced line feeds and page breaks. No formatting "reveal" codes can be found in the text section. Instead, formatting is stored in the formatting sections and related to the text by sequential tables. This makes extracting the textual information from a Word document really simple: just read the ASCII text section and ignore the formatting sections.

File Format Differences Between Word 5.0 and Word 4.0

Definitions

It would be useful at this point to define a few concepts and structures that will be referred to throughout this document. The usage of some data structures is described here, but for precise C language definitions of these data structures, refer to Appendix A.
autosave file
A document saved with Word's autosave format. These files look just like full saved documents except they have additional sections of text and formatting tagged on the end. Since text and formatting are no longer stored in order of occurrence, a piece table (plcfpcd) is stored to tie everything together.

Autosave files can be identified by examining the fAsv bit in the document header (FIB). They are mentioned here since any program not aware of their existence would confuse them with a normal Word document. Loading such a document as if it were a normal document could cause problems, so beware. The autosave format is beyond the scope of this document.

bookmark
A contiguous sequence of characters within the text stream of a document which has a unique bookmark name. Bookmarks are a convenient way of saving a spot in a document which can be jumped to quickly. Bookmarks can also be used for cross-referencing and for importing text from another Word document.

CHP (CHaracter Properties)
The data structure describing the character properties of a run of text. See page 18.

CP (Character Position):
A four-byte integer which is the position coordinate of a character of text within the logical text stream of a document.

CPs are stored in the file least significant byte first, most significant byte last.

division
Same as a section. Word 5.0 externally refers to portions of a document delimited by section marks as divisions, but internally refers to them as sections. Both terms will be used in this document.

DOP (DOcument Properties)
The data structure describing properties of the document as a whole. See page 20.

FIB (File Information Block)
The header of a Word file. Begins at offset 0 in file. Gives the beginning offsets of the document's subsidiary data structures within the file. Also stores other file status information. See page 20.

FC (File Character position):
A four-byte integer which is the byte offset of a character (or other object) from the beginning of the file. In the current file format (Word 1.0 through Word 5.0) it is possible to map from CP to FC by adding 128. In other words, CP 0 corresponds to FC 128 since the logical text stream starts at byte 128 (PN 1) in the file, immediately following the header.

FCs are stored in the file least significant byte first, most significant byte last.
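
As a minimal sketch of the CP and FC definitions above, the following C helpers decode a four-byte value stored least significant byte first and map a CP to an FC. The names read_le32, cp_to_fc and TEXT_START_FC are illustrative and are not taken from Word's sources; the unsigned 32-bit type is an assumption made for convenience.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of the first text byte (PN 1). Hypothetical name. */
#define TEXT_START_FC 128u

/* Decode a four-byte value stored least significant byte first. */
static uint32_t read_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Map a character position to a file position (full-saved files only). */
static uint32_t cp_to_fc(uint32_t cp)
{
    return cp + TEXT_START_FC;
}
```

Decoding byte by byte keeps the helper independent of the byte order of the machine it runs on.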

FCHP, FPAP (File CHaracter or PAragraph Properties)
The compressed versions of CHPs and PAPs which are actually stored in the file. Each begins with cch (length in bytes) followed by a partial CHP or PAP which contains all fields up to the point where all additional fields are the same as the default properties.

FKP (Formatted disK Page):
A data structure that fits in one 128-byte page that encodes either the character properties or the paragraph properties of a certain portion of a Microsoft Word file. See page 22.

page (or sector):
A 128-byte segment of a Word file that begins on a 128-byte boundary (bytes 0-127 are in page 0, bytes 128-255 are in page 1, etc.). In Word data structures, an unsigned two-byte integer page number is given the acronym PN (for Page Number).

The word page is also used to describe the way a document is divided up to print on pieces of paper when it is printed or paginated.

PLCF (PLex of Cps (of FCs) stored in File):
A data structure consisting of two parallel arrays that allows a relation to be established between a certain CP position in the document text stream and an arbitrary data structure. Its generalized structure is defined on page 26 in Appendix A. It consists of a 4-byte header (int iMac, int cb), followed by an array of iMac + 1 CPs and an array of iMac instances of a particular arbitrary data structure, item. In typical usage, the nth CP of the PLCF is in one-to-one correspondence with the nth instance of item, with the (n + 1)st CP marking the limit of the nth instance's influence. The following PLCF structures may be found in a Word 5.0 document: plcfpgd (page table), plcfbkmf and plcfbkml (denoting the beginnings and ends of bookmark entries, respectively), plcfsqd (sequence descriptor table), plcfsqr (sequence reference table), and plcfrhd (running head table).
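
The PLCF layout described above can be sketched in C as follows. This is only an illustration under stated assumptions: both header integers are taken to be two bytes wide and stored least significant byte first, cb is taken to be the size in bytes of one item, and the function names (plcf_count, plcf_cp, plcf_item, plcf_find) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static unsigned plcf_le16(const unsigned char *p)
{
    return p[0] | ((unsigned)p[1] << 8);
}

static uint32_t plcf_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Number of items (iMac), read from the 4-byte header. */
static unsigned plcf_count(const unsigned char *plcf)
{
    assert(plcf != NULL);
    return plcf_le16(plcf);
}

/* The i-th CP, valid for 0 <= i <= iMac. */
static uint32_t plcf_cp(const unsigned char *plcf, unsigned i)
{
    return plcf_le32(plcf + 4 + 4u * i);
}

/* Pointer to the i-th item, valid for 0 <= i < iMac. */
static const unsigned char *plcf_item(const unsigned char *plcf, unsigned i)
{
    size_t iMac = plcf_le16(plcf);
    size_t cb = plcf_le16(plcf + 2);   /* assumed: size of one item */
    return plcf + 4 + 4 * (iMac + 1) + cb * i;
}

/* Index n with rgcp[n] <= cp < rgcp[n + 1], or -1 if cp is not covered. */
static int plcf_find(const unsigned char *plcf, uint32_t cp)
{
    unsigned n, iMac = plcf_count(plcf);

    for (n = 0; n < iMac; n++)
        if (cp >= plcf_cp(plcf, n) && cp < plcf_cp(plcf, n + 1))
            return (int)n;
    return -1;
}
```

A linear search is used for clarity; since the CPs are stored in ascending order, a binary search would work equally well.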

PAP (PAragraph Properties)
The data structure which describes the properties of a particular paragraph. See page 22.

paragraph
A contiguous sequence of characters within the text stream of a document that is delimited by a paragraph mark or a section mark (These are special characters described later in this document).

run of text
A contiguous sequence of characters within the text stream of a document that have the same character formatting properties. A single run may cross paragraph boundaries and may encompass the entire document.

section
A contiguous sequence of paragraphs within the text stream of a document that is delimited by a section mark or by the end of a document. Users frequently treat sections as the equivalent of a chapter in a book. The boundaries of sections mark locations where the layout rules for a document (margins, number of columns, text of headers and footers to use, whether page numbers should be displayed, etc.) are changed.

sequence
Sequences are used to automatically number objects in a Word document. Each sequence has a name to determine the type of object it is meant to number. For example, someone might create a sequence named "table" to number all the tables that might appear in a document. In such a sequence, a sequence marker would be used to mark each "table" in the sequence, and a sequence reference could be used to print the number of a specific table which is marked by a bookmark.

SEP (SEction Properties)
The data structure describing the properties of a particular section. See page 27.

STTB (STring Table)
A data structure used to hold an array of Pascal strings which consists simply of an integer count of strings + 1 (istrMac) followed by istrMac - 1 strings. Each string consists of a byte count of characters (cch) in the string followed by the characters (rgch[cch]). The first string is referenced by index 1, the second by index 2, etc., up to istrMac - 1. There is no string with index 0. The nth string can only be reached by looking at the cch's of all previous strings. See page 29 for a more precise C language pseudostructure.
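
To illustrate why the nth string can only be reached by walking over its predecessors, here is a small C sketch of an STTB lookup. It assumes that istrMac is a two-byte integer stored least significant byte first; the function name sttb_get is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Find string number n (1-based) in an STTB.
 * Returns a pointer to its characters and stores its length in *cch,
 * or returns NULL if n is out of range. The strings are not terminated.
 */
static const unsigned char *sttb_get(const unsigned char *sttb,
                                     unsigned n, unsigned *cch)
{
    unsigned istrMac, i;
    const unsigned char *p;

    assert(sttb != NULL && cch != NULL);
    istrMac = sttb[0] | ((unsigned)sttb[1] << 8);  /* count of strings + 1 */
    if (n == 0 || n >= istrMac)
        return NULL;                               /* there is no string 0 */
    p = sttb + 2;
    for (i = 1; i < n; i++)
        p += 1 + p[0];                 /* skip the cch byte and its characters */
    *cch = p[0];
    return p + 1;
}
```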

FSEP (File SEction Property)
The compressed version of a SEP which is actually stored in the file. It begins with a cch (length in bytes) followed by a partial SEP which contains all fields up to the point where all additional fields are the same as the default section properties.

TB (TaBle)
Word's way of storing a simple array of an arbitrary data structure (item) in a file. It consists of an integer specifying the number of items in the array, followed by another integer which has no meaning, followed by the array of items. The size of each item in the array is not specified, but can be determined from context by referring to the descriptions of the structures that are stored in the table. See page 30.

Tables are usually used to store data structures which, if they were added today, would be stored in a PLCF. In such usage, the item structure will be defined to begin with a CP. The last instance of the structure will exist only to provide a limiting CP for the previous entry; the other fields in the last item should be ignored. Examples of tables which contain CPs are fntb (footnote table), setb (section table) and bftb (buffer table).

Note: In this document, bit 0 means the low-order bit. Structures are described as they would be declared in C.

Naming Conventions

The names in Word data structures usually consist of a lower case sequence of characters followed by an optional upper case modifier. The following tags are used in the lower case parts of field names to document the data type of a field:

f used to name a flag (a variable containing a boolean value). Usually the object referred to will contain either 1 (fTrue, TRUE) or 0 (fFalse, FALSE). (eg. fWidowControl, fBold)
l used to name a 4 byte integer value (a long). (eg. lcb)
w used to name a 2 byte integer value (a word).
cp used to name a variable that contains a character position within the document. always a 4 byte quantity.
fc used to name a variable that contains an offset from the beginning of a file. always a 4 byte quantity.
xa used to name a variable that contains a width of an object imaged on screen or on hard copy that is measured in units of 1/1440 of an inch. This unit, one-twentieth of a point (1/20 * 1/72"), is called a twip in this documentation. (eg. xaPage is the width of a page).
ya used to name a variable that contains a height of an object imaged on screen or on hard copy that is measured in twips.
dxa used to name a variable that contains the horizontal distance of an object measured from some reference point expressed in twips. (eg. pap.dxaLeft is the distance of the left boundary of a paragraph measured from the left margin of the page)
dya used to name a variable that contains the vertical distance of an object measured from some reference point expressed in twips. (eg. pap.dyaAbs is the vertical distance of the top of a paragraph from a frame declared in the pap).
dxp used to name a variable that contains the horizontal distance of an object measured from some reference point expressed in printer specific units.
dyp used to name a variable that contains the vertical distance of an object measured from some reference point expressed in printer specific units.
rg prefix used to signify that the data structure being defined is an array. (eg. rgb (an array of bytes), rgcp (an array of cps), rgfc (an array of fcs), rgfoo (an array of foos)).
i prefix used to signify that an integer value is used as an index into an array. (eg. itbd is an index into rgtbd.)
c prefix used to signify that an integer value is a count of some number of objects. (eg. a cb is a count of bytes, a cl is a count of lines, ccol is a count of columns.)
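
As a small worked example of the twip unit used by the xa, ya, dxa and dya tags, the following C sketch converts between twips, points and inches. The constant and function names are illustrative only.

```c
#include <assert.h>

#define TWIPS_PER_POINT 20     /* a twip is 1/20 of a point */
#define TWIPS_PER_INCH  1440   /* 20 twips * 72 points per inch */

static double twips_to_points(long twips)
{
    return (double)twips / TWIPS_PER_POINT;
}

static double twips_to_inches(long twips)
{
    return (double)twips / TWIPS_PER_INCH;
}

/* Round a non-negative measurement in inches to the nearest twip. */
static long inches_to_twips(double inches)
{
    assert(inches >= 0.0);
    return (long)(inches * TWIPS_PER_INCH + 0.5);
}
```

For instance, an xaPage of 12240 twips corresponds to a page 8.5 inches wide.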

The following modifiers are used occasionally in this documentation:

First means that the variable marks the first of a range of objects. For example, cpFirst would mark the first character position of a range of characters in a document. fcFirst would mark the file offset of the first byte of a range of bytes stored in a file.
Lim means the variable marks the limit of a range of objects (ie. is the index of the last object in a range plus 1). For example, cpLim would be the limit CP of a range of characters in a document. fcLim would be the limit file offset of a range of bytes stored in a file.
Mac means the variable marks the limit of an entire array of objects. Whereas cpLim could be used to specify the limit of a range of text within a document, cpMac would be used to denote the number of characters in the entire document.

Overview of Word Document Contents

A Word document file consists of the Word file header (FIB), the text, the formatting information, and various other information.

What follows is a brief overview of the composition of a Word document. A more in depth description of each part will follow.

The FIB structure comes first, and text immediately follows. All other information in the document is indexed by page numbers (PNs) stored in the FIB.

FIB
Stored at the beginning of page 0 of the file. Fields from this structure will be referred to, so you might want to skip to Appendix A (structure definitions) or the more complete description on page 20 now and review its contents.

text of body, footnotes, headers
Text begins at byte 128 (page 1) of the file.

Test for existence:   Always exists
Location (PN) if present:   pn 1

Character Formatting
Contains character formatting for each run of text in the document. It begins at the first page boundary after the end of the text.

Test for existence:   Always exists
Location (PN) if present:   (fib.fcMac + 127) / 128

Paragraph Formatting
Contains paragraph formatting for each paragraph in the document.

Test for existence:   Always exists
Location (PN) if present:   fib.pnPara

Footnote Table
The fntb is a TB table which associates all footnote reference marks in a document to the corresponding footnote text. If there are no footnotes in the document, there will be no footnote table.

Test for existence:   fib.pnFntb != fib.pnBkmk
Location (PN) if present:   fib.pnFntb

Bookmark and Sequence Information
This section contains various PLCF tables to store positions of bookmarks, sequence marks and sequence references. If there are no bookmarks or sequences, this section will not exist.

Test for existence:   fib.pnBkmk != fib.pnMacBkmk && fib.pnMacBkmk != 0
Location (PN) if present:   fib.pnBkmk

Section Properties
Contains section properties that correspond with each section mark in the document. If there are no section marks, there will be no section properties stored. These section properties could be stored anywhere in the document, as long as each individual SEP (section properties structure) is pointed to by the correct entry in the section table (see below).

Test for existence:   Section table exists
Location (PN) if present:   Referenced by section table

Page Table
The page table (plcfpgd) is a PLCF table which has information about each page in the document as of the last time it was printed or paginated.

Test for existence:   fib.pnPgtb != 0 && fib.pnPgtb < fib.pnSetb
Location (PN) if present:   fib.pnPgtb

Running Head Table
The running head table (plcfrhd) is a PLCF which is present for documents which contain running heads and have been paginated by Word 5.0. The purpose of this table is to make background pagination faster and smarter about dealing with running heads.

Test for existence:   fib.pnRhtb != 0 && fib.pnSetb > fib.pnRhtb
Location (PN) if present:   fib.pnRhtb

Section Table
The section table (setb) is a TB table which has one entry for the cpLim (CP of character following the section mark) of each section. It serves to relate each section mark to the location in the file where its section properties are stored. If there are no section marks in a document, there will be no section table.

Test for existence:   fib.pnSetb != fib.pnBftb
Location (PN) if present:   fib.pnSetb

Buffer Table (Glossary Files Only)
The buffer table (bftb) is a TB table which has one entry for every glossary item.

Test for existence:   fib.pnBftb != fib.pnMac
Location (PN) if present:   fib.pnBftb

For normal Word documents (non-glossaries), fib.pnBftb serves only to mark the end of the section table.

Document Summary Information
The document summary information contains the summary sheet information that was entered when the document was first saved, or through the document retrieval feature.

The summary sheet is only available for text documents. Do not look for summary information in glossaries or style sheets.

Test for existence:   fib.pnSumd != fib.pnMac
Location (PN) if present:   fib.pnSumd
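
The existence tests listed above can be collected into a few C predicates. This is only a sketch: struct fib_pns is a hypothetical in-memory copy of the relevant FIB fields (the names follow the text, but the struct does not reflect the on-disk layout, which is given in Appendix A), and the function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory copy of the FIB fields used by the tests above. */
struct fib_pns {
    uint32_t fcMac;
    uint16_t pnPara, pnFntb, pnBkmk, pnMacBkmk;
    uint16_t pnSetb, pnBftb, pnSumd, pnMac;
    uint16_t pnPgtb, pnRhtb;
};

/* First page boundary at or after the end of the text. */
static uint16_t pn_char_formatting(const struct fib_pns *f)
{
    assert(f != NULL);
    return (uint16_t)((f->fcMac + 127) / 128);
}

static int has_footnote_table(const struct fib_pns *f)
{
    return f->pnFntb != f->pnBkmk;
}

static int has_bookmark_info(const struct fib_pns *f)
{
    return f->pnBkmk != f->pnMacBkmk && f->pnMacBkmk != 0;
}

static int has_page_table(const struct fib_pns *f)
{
    return f->pnPgtb != 0 && f->pnPgtb < f->pnSetb;
}

static int has_running_head_table(const struct fib_pns *f)
{
    return f->pnRhtb != 0 && f->pnSetb > f->pnRhtb;
}

static int has_section_table(const struct fib_pns *f)
{
    return f->pnSetb != f->pnBftb;
}

/* Meaningful for glossary files only. */
static int has_buffer_table(const struct fib_pns *f)
{
    return f->pnBftb != f->pnMac;
}

static int has_summary_info(const struct fib_pns *f)
{
    return f->pnSumd != f->pnMac;
}
```

Keeping each test in its own function mirrors the fact that no single rule decides whether a section is present.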

The rest of this document describes each of the sections listed above in detail.

FIB

The FIB contains a "magic word" and pointers to the various other parts of the file, as well as information about the length of the file. At this point, refer to the FIB's formal definition in Appendix A.

The first thing Word uses the FIB for is to determine what type of file it is loading. The wIdent, dty, wTool and fAsv fields are used for this purpose.

All files written by PCWord will have fib.wIdent = 0x6031 and fib.wTool = 0xAB00. The dty field tells Word that the file is a document (0), glossary (1), stylesheet (2), or printer driver (3). If it is a document, Word looks at the fAsv field to determine whether the document has been incrementally saved.

In short, all full-saved Word documents begin with 0x6031, 0x0000, 0xAB00 and the byte at offset 117 decimal will be zero. OEM programs dealing with Word documents can ignore all files not fitting this description.
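
A check along these lines might look as follows in C. It is a sketch under stated assumptions: the three leading values are taken to be two-byte words at offsets 0, 2 and 4, stored least significant byte first like CPs and FCs, and the names W_IDENT, W_TOOL and is_full_saved_document are hypothetical. The constants are the values quoted in the text.

```c
#include <assert.h>
#include <stddef.h>

#define W_IDENT 0x6031u   /* fib.wIdent as quoted in the text */
#define W_TOOL  0xAB00u   /* fib.wTool as quoted in the text */

static unsigned hdr_le16(const unsigned char *p)
{
    return p[0] | ((unsigned)p[1] << 8);
}

/* Returns 1 if the header looks like a full-saved Word document, else 0. */
static int is_full_saved_document(const unsigned char *hdr, size_t cb)
{
    assert(hdr != NULL);
    if (cb < 128)
        return 0;                          /* the header fills page 0 */
    return hdr_le16(hdr) == W_IDENT        /* wIdent */
        && hdr_le16(hdr + 2) == 0          /* dty: document */
        && hdr_le16(hdr + 4) == W_TOOL     /* wTool */
        && hdr[117] == 0;                  /* not an autosave file */
}
```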

The header contains the path\name of the attached stylesheet as a NULL terminated (ASCIIZ) string at offset 30, and an 8 character PRD (printer driver) name at offset 98. The PRD name is used to give a hint to other programs (such as Microsoft Write and Microsoft PageView) which printer driver Word was using when the document was last saved. The PRD name is NULL terminated only if it is less than 8 characters long.
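
Because the PRD name is terminated only when it is shorter than 8 characters, a reader has to bound the copy itself. The C sketch below shows one way to do that. The helper names are hypothetical, and the 68-byte limit on the style sheet name is an assumption derived from the two offsets given above (98 - 30).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OFF_STYLESHEET 30   /* ASCIIZ path\name of the style sheet */
#define CCH_STYLESHEET 68   /* assumed: the field runs up to the PRD name */
#define OFF_PRD        98   /* printer driver name */
#define CCH_PRD        8

/* Copy at most max bytes of a header field and always terminate out. */
static void copy_field(const unsigned char *hdr, size_t off, size_t max,
                       char *out)
{
    size_t n = 0;

    assert(hdr != NULL && out != NULL);
    while (n < max && hdr[off + n] != '\0')
        n++;
    memcpy(out, hdr + off, n);
    out[n] = '\0';
}

/* out must hold at least CCH_STYLESHEET + 1 bytes. */
static void get_stylesheet_name(const unsigned char *hdr, char *out)
{
    copy_field(hdr, OFF_STYLESHEET, CCH_STYLESHEET, out);
}

/* out must hold at least CCH_PRD + 1 bytes. */
static void get_prd_name(const unsigned char *hdr, char *out)
{
    copy_field(hdr, OFF_PRD, CCH_PRD, out);
}
```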

The DOP (document properties, described in Appendix A) at offset 108 saves some of Word's options as they were when the document was saved. If they are different when it is reloaded, some caches will have to get invalidated. The DOP structure also contains revision marking options.

Some fields are used only for autosaved files. These fields (pnNextFib, pnChar, pnPlcpcd, pnPlcphe, fAsvFormatted, fAsv, pnFilename) should be set to zero for normal Word documents. Unassigned bits and wReserved should also be set to zero.

Most of the other fields in the header are PNs, or page numbers revealing the offsets of various sections of the file. Most of the sections so referenced are optional and may not be present. Unfortunately, there is no consistent rule for determining whether a section is present.

Some of the PNs have implied sequential relationships:
(pnPgtb and pnRhtb) < pnSetb < (pnBftb and pnSumd) < pnMac
This ordering has little significance beyond the conventions for indicating whether a section exists. For example, if there is no summary information, pnSumd gets set to pnMac, not 0. Similarly, if pnSetb is equal to pnBftb, there is assumed to be no section table.

Text

The text of the file starts at byte 128 (page 1). The text in a Word document is ASCII text according to the character set defined by the IBM codepage specified in the document header. If fib.codepage is zero, codepage 437 (U.S.A) is assumed.

In this discussion, all ASCII values are given in decimal. The following ASCII definitions will apply:

<LF> is a Line Feed   10 decimal
<FF> is a Form Feed   12 decimal
<CR> is a Carriage Return   13 decimal

Some ASCII characters have special meaning to Word. The rest of this section will elaborate on what those special meanings might be.

Note: The end of a section is also the end of a paragraph. The last character of a section is a section mark which stands in place of the paragraph mark normally required to end a paragraph.

Also note that hard line, column and page breaks are only present if inserted by the user to preempt Word by causing a break sooner than it would otherwise occur. Word does not store any special codes to represent natural breaks. Rather, it calculates how many words will fit on a line each time a line is displayed or printed. Word does not maintain a separate page table to store information about where pages break.

There is some confusion about the use of ASCII 196 to represent a non-breaking hyphen. Since Word began supporting the IBM linedraw set in version 4.0, this has been a natural conflict. People were not able to print the character 196 (horizontal line) from the linedraw set because it would be interpreted as a non-breaking hyphen and printed using ASCII 45. To avoid this problem, an exception was made for fixed-pitch fonts which could be used for cursor linedrawing. If there are more than 2 ASCII 196's in a row or an ASCII 196 character is adjacent to another IBM linedraw character on either side, it will not be treated as a non-breaking hyphen. This makes virtually all cursor-linedrawings print correctly.
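
The exception described above might be approximated by the following C sketch. It is not Word's actual code and rests on two loudly labeled assumptions: "IBM linedraw character" is taken to mean the code page 437 box-drawing range 179 through 218, and the adjacency test is applied to the neighbours of the whole run of 196's rather than to each character separately. The sketch also ignores the fixed-pitch font condition. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define CH_196 196   /* non-breaking hyphen or horizontal line */

/* Assumption: linedraw means the code page 437 box-drawing range. */
static int is_linedraw(unsigned char ch)
{
    return ch >= 179 && ch <= 218;
}

/*
 * Returns 1 if the ASCII 196 at text[i] would be treated as a
 * non-breaking hyphen, 0 if it would be kept as a linedraw character.
 */
static int is_nonbreaking_hyphen(const unsigned char *text, size_t len,
                                 size_t i)
{
    size_t first = i, lim = i + 1;

    assert(text != NULL && i < len && text[i] == CH_196);
    while (first > 0 && text[first - 1] == CH_196)
        first--;                        /* start of the run of 196's */
    while (lim < len && text[lim] == CH_196)
        lim++;                          /* limit of the run of 196's */
    if (lim - first > 2)
        return 0;                       /* more than 2 in a row */
    if (first > 0 && is_linedraw(text[first - 1]))
        return 0;                       /* linedraw neighbour on the left */
    if (lim < len && is_linedraw(text[lim]))
        return 0;                       /* linedraw neighbour on the right */
    return 1;
}
```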

A second solution that works for all fonts has been introduced in Word 5.0. To get character 196, enter an em-hyphen. Word will unconditionally map this to character 196.

The following ASCII codes are treated as "special" characters when they have the character property fSpecial enabled (chp.fSpecial == 1):

1 Current page number
Prints the number of the page on which it occurs.

2 Current date (eg. July 28, 1989)
Prints the date on which the document is printed

3 Current time (eg. 5:15 PM)
Prints the time at which the document began to print

4 Autonumbered annotation reference
Automatically placed in body text and in footnote/annotation area when annotation is inserted

5 Autonumbered footnote reference
Automatically placed in body text and in footnote/annotation area when autonumbered footnote is inserted.

7 Sequence mark
Used to increment or assign a new count to a user defined sequence. For example, (table:) and (table:7) are both stored as this character. The name of the sequence and the assigned number are stored in a separate table.

8 Sequence reference mark
Used to cross reference the number of a sequence at the end of a named bookmark. For example, (page:address) might be used to print the number of the page on which the bookmark address will print. This character can also be used to cross reference using user defined sequences.

The document text stream is represented by the text beginning at byte 128 up to (but not including) fib.fcMac.
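
Extracting the text stream therefore amounts to copying the bytes from offset 128 up to fib.fcMac. The C sketch below assumes the whole file has already been read into memory and that fcMac has been decoded from the FIB by the caller (its offset is given in Appendix A); the function name copy_text is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy the text stream [128, fcMac) from an in-memory file image.
 * Returns the number of bytes copied (at most cbOut), or 0 if fcMac
 * is inconsistent with the file size.
 */
static size_t copy_text(const unsigned char *file, size_t cbFile,
                        uint32_t fcMac, unsigned char *out, size_t cbOut)
{
    size_t cch;

    assert(file != NULL && out != NULL);
    if (fcMac < 128 || fcMac > cbFile)
        return 0;
    cch = fcMac - 128;
    if (cch > cbOut)
        cch = cbOut;
    memcpy(out, file + 128, cch);
    return cch;
}
```

The copied bytes still contain Word's special characters, so a caller that wants plain text would filter those afterwards.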

Character And Paragraph Formatting Properties

In Word documents, both the character and paragraph sections are structured as a set of disk pages (FKP). The structure of these FKPs is similar for paragraph and character formatting, except that paragraph properties are stored in FPAP structures and character properties are stored in FCHP structures. The generic structure FPROP can be used to represent an FPAP or an FCHP.

The fundamental unit of text for which character formatting information is kept is the run of text, a contiguous sequence of characters stored on disk that all have the same character properties. Each run would have an entry recorded in a FCHP FKP. If a user never changed the character formatting used in his document, the entire document stream would be one large run of text and one FCHP would suffice to describe the character properties of the entire document.

The fundamental unit of text for which paragraph properties are recorded is the paragraph. Every paragraph has an entry recorded in a FPAP FKP.

An FKP (see table below) is a 128-byte data structure that is stored in one page of a Word file. At offset 127 is a 1-byte count named cRun, which is a count of runs of text in FCHP FKPs and a count of paragraphs in FPAP FKPs. Beginning at offset 0 of the FKP is the FC for the first character covered by this disk page. This is followed by an array of RUN structures which provide the FC for the start of the next run (fcLim) and the byte offset of the actual stored properties (bProp). This offset is from byte 4 of the FKP (beginning of array of RUNs). If its value is -1, the properties are assumed to be default and are not stored.

As of Word 5.0, paragraph heights are stored in the FPAP FKPs in addition to formatting information. This information is for Word's use only and should be ignored by other software. Programs writing out Microsoft Word 5.0 files should set fib.version to 3 (version5_0OEM) so that Word will not try to access this information.

byte offset      number of bytes    field
    meaning

0                4                  fcFirst
    FC of first character covered by this page of formatting.

4                6 * cRun           rgRun
    Array of RUN structures. Each 6 byte RUN structure consists of an fcLim (FC for the beginning of the next RUN) and bProp (byte offset to actual properties from beginning of rgRun array). If bProp is -1, the run is taken to be attached to the Paragraph Standard or Character Standard style.

4 + 6 * cRun     6 * cRun           rgPhe (only stored in paragraph properties)
    Optional array of PHE structures (paragraph height) used to enhance speed of future pagination and layout. Only versions of Word whose internal version code matches fib.version for this document will reference this information.

(varies)         (varies)           Character or Paragraph Properties (FPROPs) go here

127              1                  cRun
    Number of runs of paragraph or character formatting stored on this page.
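
As an illustration of the layout above, the following C sketch looks up the FPROP that covers a given FC within one FKP. It assumes each RUN holds a four-byte fcLim followed by a two-byte bProp, both stored least significant byte first, and that -1 appears on disk as 0xFFFF. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t fkp_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Find the properties covering file position fc in a 128-byte FKP.
 * Returns 1 if fc is covered and sets *fprop to the stored FPROP, or
 * to NULL when the default properties apply. Returns 0 if fc is not
 * covered by this page or the page is malformed.
 */
static int fkp_find(const unsigned char *fkp, uint32_t fc,
                    const unsigned char **fprop)
{
    unsigned cRun, i;
    uint32_t fcFirst;

    assert(fkp != NULL && fprop != NULL);
    cRun = fkp[127];
    if (4 + 6 * cRun > 127)
        return 0;                           /* rgRun would overlap cRun */
    fcFirst = fkp_le32(fkp);
    for (i = 0; i < cRun; i++) {
        const unsigned char *run = fkp + 4 + 6 * i;
        uint32_t fcLim = fkp_le32(run);
        unsigned bProp = run[4] | ((unsigned)run[5] << 8);

        if (fc >= fcFirst && fc < fcLim) {
            if (bProp == 0xFFFF) {          /* -1: default properties */
                *fprop = NULL;
                return 1;
            }
            if (4 + bProp >= 127)
                return 0;                   /* offset outside the page */
            *fprop = fkp + 4 + bProp;       /* offset counts from byte 4 */
            return 1;
        }
        fcFirst = fcLim;                    /* next run starts here */
    }
    return 0;
}
```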

An FPROP is an abbreviated set of properties. It consists of a single byte count of the characters (cch) followed by the first part of a CHP or PAP structure which is to be copied over the first cch bytes of the default CHP or PAP structure. For example, if a character run were bold, but in all other respects just like the default, a cch of 2 would be stored, followed by the first two bytes of the CHP (footnote: Two bytes are necessary since the fBold bit happens to be in the second byte of the CHP structure.). To recover the full character properties, Word would start with the default character properties, then replace the first two bytes with those from the file.
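
The recovery step described above can be sketched in C as a copy of the defaults followed by a partial overwrite. The property structure is treated here as an opaque byte buffer because the CHP and PAP layouts are defined in Appendix A; the function name fprop_expand is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Expand an FPROP into a full property structure.
 *   prop   receives the result (cbProp bytes)
 *   def    default CHP or PAP (cbProp bytes)
 *   fprop  cch byte followed by cch bytes, or NULL for pure defaults
 */
static void fprop_expand(unsigned char *prop, const unsigned char *def,
                         size_t cbProp, const unsigned char *fprop)
{
    size_t cch;

    assert(prop != NULL && def != NULL);
    memcpy(prop, def, cbProp);        /* start from the defaults */
    if (fprop == NULL)
        return;
    cch = fprop[0];
    if (cch > cbProp)
        cch = cbProp;                 /* never write past the structure */
    memcpy(prop, fprop + 1, cch);     /* overwrite the first cch bytes */
}
```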

Since the offsets stored in the RUN entries (bProp) allow random access throughout the FKP, space within an FKP can be conserved by storing the offset of the same physical FPROP in several RUN entries for paragraph or character runs that happen to have the same properties. Word uses this optimization only if consecutive paragraphs have the same properties.

Footnotes

The text in a Word document is broken into two sections: the body text, which is followed by the footnote (footnote: The footnote section of the document can actually contain annotations as well. Annotations are just special footnotes, so the word footnote will be used to describe them too. One difference between footnotes and annotations is that annotations are referred to by annotation reference (ASCII -- unreadable on scan -- )) text. If there are no footnotes, all the text is body text. If there are footnotes, a group of paragraphs at the very end of the document is designated to hold the footnote text.

The text of a footnote is anchored to a particular position within the document's body text, the location of its footnote reference. There is a structure referenced by the FIB, the FNTB (footnote table), which relates the location of each footnote reference to the corresponding footnote. The FNTB is a table of footnote descriptors (FNDs). Each FND contains a cpRef which points to a footnote reference mark, and a cpFtn which points to the beginning of the corresponding footnote text in the document's footnote section. The footnote text extends until the character immediately preceding the one indexed by cpFtn in the next FND.

The CP of the point in the document where body text ends and footnote text begins can be determined by examining the cpFtn stored in the 0th FND of the FNTB. If there is no FNTB, all the text in the document is body text.

The last two characters of footnote text for a footnote (ie. the characters at limit CP minus 2 and limit CP minus 1) are always <CR> <LF> (a paragraph mark). Word will not allow this paragraph mark to be deleted since each footnote must begin at the start of a new paragraph.

When there are n footnotes, the FNTB structure consists of n + 1 FNDs. The final FND serves only to provide the CP limit of footnote text for the previous FND. This is, of course, the end of the document, so cpFtn and cpRef for this FND are both set to the document's cpMac.

The footnote reference is whatever character in the body text is pointed to by an FND's cpRef. If it happens to be an autonumbered footnote reference (ASCII 5), or an autonumbered annotation reference (ASCII 4), Word will automatically generate a footnote number based on the number of footnote references which precede it in its division. Thus the first footnote in a division will automatically be numbered 1, the second 2, etc.
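
The relationship between the FNDs and the footnote text can be sketched in C as follows. The struct is a hypothetical in-memory form of an FND holding only the two fields named in the text; it does not describe the on-disk layout, and the function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory footnote descriptor. */
struct fnd {
    uint32_t cpRef;   /* CP of the footnote reference mark */
    uint32_t cpFtn;   /* CP of the start of the footnote text */
};

/* CP at which body text ends; cpMac when there is no footnote table. */
static uint32_t cp_body_lim(const struct fnd *rgfnd, size_t cfnd,
                            uint32_t cpMac)
{
    return (rgfnd == NULL || cfnd == 0) ? cpMac : rgfnd[0].cpFtn;
}

/*
 * Text range [*cpFirst, *cpLim) of footnote i, where cfnd counts all
 * FNDs including the final limiting one. Returns 0 if i is out of range.
 */
static int footnote_range(const struct fnd *rgfnd, size_t cfnd, size_t i,
                          uint32_t *cpFirst, uint32_t *cpLim)
{
    assert(cpFirst != NULL && cpLim != NULL);
    if (rgfnd == NULL || cfnd < 2 || i >= cfnd - 1)
        return 0;
    *cpFirst = rgfnd[i].cpFtn;
    *cpLim = rgfnd[i + 1].cpFtn;   /* the next FND supplies the limit */
    return 1;
}
```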

Bookmark and Sequence Information

This section contains several related structures containing information about bookmarks, sequence marks, and sequence references. The first word (bytes 0 - 1) contains bit flags indicating which of these structures are present. If the bit is set, the corresponding table is present.

bit 0: Bookmarks exist, so plcfbkmkf and plcfbkmkl are present
bit 1: User defined sequences exist, so plcfsqd is present
bit 2: Sequences are cross-referenced, so plcfsqr is present
bit 3: Bookmarks or user defined sequences exist, so there is a string table (sttbNames) to store their names.

As their names imply, plcfbkmkf, plcfbkmkl, plcfsqd and plcfsqr are all PLCF tables and are accessed as such (see page 4). The tables are saved in immediate sequential order starting at byte 2 in the disk page in the following order: plcfbkmkf, plcfbkmkl, plcfsqd, plcfsqr, sttbNames.
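
Since the tables follow one another without any directory, locating a later table means skipping the earlier ones. The C sketch below finds sttbNames this way. It assumes that absent tables occupy no bytes, that the whole section is contiguous in memory, that the flag word and PLCF headers are two-byte values stored least significant byte first, and that cb in a PLCF header is the size of one item. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum {
    F_BKMK  = 1 << 0,   /* plcfbkmkf and plcfbkmkl present */
    F_SQD   = 1 << 1,   /* plcfsqd present */
    F_SQR   = 1 << 2,   /* plcfsqr present */
    F_NAMES = 1 << 3    /* sttbNames present */
};

/* Total size in bytes of a PLCF: header, iMac + 1 CPs, iMac items. */
static size_t plcf_size(const unsigned char *plcf)
{
    size_t iMac = plcf[0] | ((size_t)plcf[1] << 8);
    size_t cb = plcf[2] | ((size_t)plcf[3] << 8);
    return 4 + 4 * (iMac + 1) + cb * iMac;
}

/* Offset of sttbNames from the start of the section, or 0 if absent. */
static size_t sttb_names_offset(const unsigned char *sect)
{
    unsigned flags;
    size_t off = 2;                       /* tables start at byte 2 */

    assert(sect != NULL);
    flags = sect[0] | ((unsigned)sect[1] << 8);
    if (!(flags & F_NAMES))
        return 0;
    if (flags & F_BKMK) {
        off += plcf_size(sect + off);     /* plcfbkmkf */
        off += plcf_size(sect + off);     /* plcfbkmkl */
    }
    if (flags & F_SQD)
        off += plcf_size(sect + off);     /* plcfsqd */
    if (flags & F_SQR)
        off += plcf_size(sect + off);     /* plcfsqr */
    return off;
}
```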

plcfbkmkf associates the name of each bookmark to the beginning CP of the text marked by that bookmark. Each cp points to the beginning of a bookmark except the last one which points to the end (cpMac) of the document. Each 4 byte BKMK structure (see page 18) contains the index of the bookmark's name in sttbNames (iName) followed by the index of the bookmark's cpLim in plcfbkmkl.

plcfbkmkl contains the cpLim (CP of the character after the end of each bookmark) for each bookmark. This PLCF contains only CPs, so its cb should be set to 0.

plcfsqd associates the location of each sequence marker to the name of its sequence. Each cp points to a sequence marker except the last one which points to the end (cpMac) of the document. Each 4 byte SQD (see page 30) structure contains the index of its sequence's name in sttbNames (iName), followed by a 13 bit cache to its current position in its sequence (value) and a 3 bit code (action). The 3 bit action can be 0 (Increment sequence value and print new value. For example "(table:)"), 1 (Increment sequence value and don't print new value. For example "(table::)"), or 2 (Set sequence value to constant stored in value. For example "(table:0)").

plcfsqr associates the location of each sequence reference mark to the name of its sequence and the name of the bookmark being cross-referenced. Each cp points to a sequence cross reference mark except the last one which points to the end (cpMac) of the document. Each 4 byte SQR structure stores the index of the sequence name (iName) and bookmark name (iNameBkmk) into the sttbNames string table.

sttbNames is an STTB table (see page 5) which holds names for all user defined bookmarks and sequences. It is referenced in the same manner as all STTBs.
REMEMBER: the index to the first string is 1, not 0.

Section Properties

This area contains any necessary section properties in the form of SEP structures (see page 27). The exact storage order is unimportant since all the SEP structures are referenced by file location in the Section Table (see page 16).

Page Table

The plcfpgd, referenced by pnPgtb in the FIB, gives the location of page breaks within a Word document and may optionally be saved in a Word binary file. It is stored and accessed as a standard PLCF table (see page 4). If there are n page breaks calculated for a document, the plcfpgd would consist of n + 1 CP entries followed by n PGD structures (see page 25). The final (n + 1st) CP entry should contain cpMac for the document.

Third-party creators of Word files should not attempt to create a plcfpgd. It can only be created properly using Word's layout routines. If a Word document is edited in any way, the plcfpgd should be deleted by setting fib.pnPgtb to 0.

Third-party readers of Word files will similarly have a tough time interpreting the plcfpgd. The CPs stored in the table are NOT always the real CPs on which the respective pages begin. The real page breaks are located by cp, cl pairs, which translate to "This page starts cl lines AFTER cp." If pgd.cl is non-zero, the page break was calculated using paragraph heights (PHE) where the heights of all lines in the paragraph were the same (!phe.fDiffLines). All Word knows in this case is that the -- this part is missing on the scan! -- knowledge of Word's formatting and layout rules and the exact widths of all characters as reported in Word's PRD drivers.

Running Head Table

The plcfrhd, referenced by pnRhtb in the FIB, gives the location and type of running heads within a Word document. The purpose of this table is to make pagination and layout more efficient with respect to running heads. It should only exist if the page table exists.

This is a standard PLCF structure (see page 4) which contains an array of RHD structures (see page 27). The plcfrhd is to be written and read the same way as other PLCF structures.

Only Word should read and write this table.

Section Table

The setb, referenced by pnSetb in the FIB, associates section properties with each section mark in the document. As the name implies, it is a standard TB table (see page 5) which contains an array of SEDF (see page 27) structures. Each SEDF consists of a cpLim (the CP for the character following the section mark) for its section, an fn (ignore this field), and an fcSep. If fcSep is 0xFFFFFFFF, the section mark is taken to be attached to the Division Standard style. Otherwise, it points to the full SEP structure (defined on page 27) which is stored in the Section Properties area.
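
The lookup described above could be sketched as follows. This is only an illustration: it assumes the setb has been read into a byte buffer, that the TB header is two 2 byte integers (iMac, iMax) followed directly by the 10 byte SEDF records, and that all values are stored least significant byte first. The function name setb_find_section is hypothetical.

```c
#define fcSepDefault 0xFFFFFFFFUL  /* section is attached to Division Standard */

static unsigned rd16(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

static unsigned long rd32(const unsigned char *p)
{
    return (unsigned long)p[0] | ((unsigned long)p[1] << 8) |
           ((unsigned long)p[2] << 16) | ((unsigned long)p[3] << 24);
}

/*
 * Finds the section containing cp, i.e. the first SEDF whose cpLim is
 * greater than cp. Stores that section's fcSep in *pfcSep and returns the
 * section index, or -1 if cp lies at or beyond the last cpLim.
 */
int setb_find_section(const unsigned char *setb, long cp, unsigned long *pfcSep)
{
    unsigned iMac = rd16(setb);
    const unsigned char *sedf = setb + 4;   /* skip iMac and iMax */
    unsigned i;

    for (i = 0; i < iMac; i++, sedf += 10) {
        long cpLim = (long)rd32(sedf);      /* SEDF.cp */
        if (cp < cpLim) {
            *pfcSep = rd32(sedf + 6);       /* SEDF.fc follows the 2 byte fn */
            return (int)i;
        }
    }
    return -1;
}
```

A caller would compare the returned fcSep against fcSepDefault before seeking to a SEP structure in the Section Properties area.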

Buffer Table

The bftb, referenced by pnBftb in the FIB, is present only in glossary files.

Document Summary Information

The summary information is referenced by pnSumd in the FIB. It consists of 9 integer pointers which point to various pieces of information in the file relative to the beginning of the summary information section. The actual information is stored after the pointers. All strings are stored as NULL terminated ASCIIZ strings.

byte offset points to:
0 - 1 szTitle (document title - up to 40 chars)
2 - 3 szAuthor (document's author - up to 40 chars)
4 - 5 szOperator (operator - up to 40 chars)
6 - 7 szKeyword (keywords - up to 80 chars)
8 - 9 szComments (comment - up to 220 chars)
10 - 11 szVersion (version - up to 10 chars)
12 - 13 rgchRevDate (revision date - stored in 8 character field, possibly padded with spaces)
14 - 15 rgchCreateDate (creation date - stored in 8 character field, possibly padded with spaces)
16 - 17 lcchDoc (number of characters in document - stored as long (32 bit) integer with least significant byte stored first)
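
A reader for this section might look like the sketch below. It is an illustration under stated assumptions: the whole summary section is in a byte buffer, the pointers are stored least significant byte first, and a pointer below 18 (inside the pointer table itself) is treated as "field not present", which the text does not specify. The names sumd_string, sumd_cch_doc and the isumd constants are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Indices of the nine pointers at the start of the summary section. */
enum {
    isumdTitle, isumdAuthor, isumdOperator, isumdKeyword, isumdComments,
    isumdVersion, isumdRevDate, isumdCreateDate, isumdCchDoc
};

static unsigned rd16(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

/*
 * Returns a pointer to the ASCIIZ string for one of the six sz fields,
 * or NULL if the field index or the stored offset is not usable.
 * sumd points at the start of the summary section, cb is its length.
 */
const char *sumd_string(const unsigned char *sumd, size_t cb, int isumd)
{
    size_t off;

    if (isumd < isumdTitle || isumd > isumdVersion || cb < 18)
        return NULL;
    off = rd16(sumd + 2 * isumd);
    if (off < 18 || off >= cb)
        return NULL;
    if (memchr(sumd + off, '\0', cb - off) == NULL)
        return NULL;                 /* string is not terminated in range */
    return (const char *)(sumd + off);
}

/* Returns lcchDoc, or -1 if the stored offset is out of range. */
long sumd_cch_doc(const unsigned char *sumd, size_t cb)
{
    size_t off;

    if (cb < 18)
        return -1;
    off = rd16(sumd + 2 * isumdCchDoc);
    if (off < 18 || off + 4 > cb)
        return -1;
    return (long)((unsigned long)sumd[off] |
                  ((unsigned long)sumd[off + 1] << 8) |
                  ((unsigned long)sumd[off + 2] << 16) |
                  ((unsigned long)sumd[off + 3] << 24));
}
```

The two date fields are left out on purpose: they are fixed 8 character fields rather than ASCIIZ strings, so they need a bounded copy instead of a string pointer.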

Appendix A: Structure Definitions

The following structures will be given in C language format with comments to make figuring offsets an easier process.

The following definitions are necessary in order to compile structures defined below:

Some of these structures will compile only if your compiler supports unnamed unions. If yours doesn't, you'll need to provide union names.

typedef unsigned BF;	 /* Bit field */
typedef long FC;	 /* FC = Byte Offset into File */
typedef unsigned PN;	 /* Page Number = Page Offset into File */

#define cchMaxFIBFile 66 /* Maximum length of PRD path in FIB */
#define itbdMax	      20 /* Maximum number of tabs (19) plus one */

BKMK (BooKMarK - stored in plcfbkmkf)

struct BKMK
	{
	int iName;	 /* index of name into string table */
	int ibkmkl;	 /* index of end anchor in plcfbkmkl */
/* SIZE = 4 bytes */
	};

CHP (Character Properties)

struct CHP		 /* Character properties */
	{
	BF	fStyled : 1;	/* if true, ignore all fields in this CHP
				   structure except stc, fSpecial and fNew,
				   and get properties from the style 
				   sheet instead */
	BF	stc : 7;	/* if fStyled, this will hold a style 
				   code from 0 to 29 - see below */
	BF	fBold : 1;	/* bold */
	BF	fItalic : 1;	/* italic */
	BF	ftc : 6;	/* font code from 0 to 63 -- see below */
/* BYTE 2 */
	BF	hps : 8;	/* size in half pts */
	BF	fUnline : 1;	/* underline */
	BF	fStrike : 1;	/* strikethrough */
	BF	fDline : 1;	/* double underline */
	BF	fNew : 1;	/* insert while revision marking was
				   enabled. Will be cleared when revision
				   is "Accepted" */
	BF	csm : 2;	/* case modifier - see below */
	BF	fSpecial : 1;	/* character should take on alternate 
				   definition. For example, character 5
				   with fSpecial is a footnote reference mark 
				   see page 11. */
	BF	fHidden : 1;	/* hidden text */
/* BYTE 4 */
	BF	bUnused : 8;
	BF	hpsPos : 8;	/* superscript/subscript */
/* BYTE 6 */
	BF	clr : 3;	/* font color - see below */
	BF	: 13;
/* SIZE = 8 BYTES */
	};

/* Definitions for stc (style code in CHP) */
#define stcNormal	0	/* Character Standard */
#define stcFtnRef	13	/* Footnote Reference */
#define stcFolio	19	/* Page Number */
#define stcAnnRef	26	/* Annotation Reference */
#define stcLineDraw	27	/* Line Draw */
#define stcSummaryInfo	28	/* Summary Info */
#define stcLNum		29	/* Line Number */

stc 1 through 12 represent character variants 1 through 12 respectively
stc 14 through 18 represent character variants 13 through 17 respectively
stc 20 through 25 represent character variants 18 through 23 respectively

/* definitions for ftc (font code) */
ftc 0 through 15 represent (modern a) through (modern p) respectively
ftc 16 through 31 represent (roman a) through (roman p) respectively
ftc 32 through 39 represent (script a) through (script h) respectively
ftc 40 through 47 represent (foreign a) through (foreign h) respectively
ftc 48 through 55 represent (decor a) through (decor h) respectively
ftc 56 through 63 represent (symbol a) through (symbol h) respectively

If a printer driver is loaded, the font codes will receive printer specific
font names in addition to the generic font names above. For example,
for the HP LaserJet, ftc 0 translates to "Courier (modern a)".
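
The generic font name can be derived from the ftc ranges listed above. The sketch below shows one way to do it; the function name ftc_generic_name is hypothetical, and printer specific names are not handled since they come from the PRD driver.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Writes the generic font name for ftc (for example "modern a") into buf.
 * Returns 0 on success, -1 if ftc is out of range or buf is too small.
 */
int ftc_generic_name(unsigned ftc, char *buf, size_t cb)
{
    static const char *const rgszFamily[] = {
        "modern", "roman", "script", "foreign", "decor", "symbol"
    };
    unsigned ifamily, iletter;
    int n;

    if (ftc > 63)
        return -1;
    if (ftc < 32) {             /* modern and roman have 16 fonts each */
        ifamily = ftc / 16;
        iletter = ftc % 16;
    } else {                    /* the other families have 8 fonts each */
        ifamily = 2 + (ftc - 32) / 8;
        iletter = (ftc - 32) % 8;
    }
    n = snprintf(buf, cb, "%s %c", rgszFamily[ifamily], (int)('a' + iletter));
    return (n < 0 || (size_t)n >= cb) ? -1 : 0;
}
```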

/* definitions for csm (case modifier) */
#define csmNormal	0	/* don't do any case transformation */
#define csmUpper	1	/* print characters in uppercase */
#define csmSmallCaps	2	/* print characters in small caps */

/* definitions for hpsPos (half point size position) */
#define hpsNormal	0	/* Neither superscript nor subscript */
#define hpsSuperscript	12	/* superscript */
#define hpsSubscript	244	/* subscript (8 bit integer = -12) */

Word's file format was designed to allow degrees of superscript 
and subscript represented in half point units. This is currently
not supported, but Word does respect the following mapping of
8 bit integers: hpsPos 1 through 127 represent superscript, 
hpsPos -1 through -128 represent subscript
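
Because hpsPos is stored as an unsigned 8 bit field but interpreted as a signed value, a reader has to sign-extend it first. The sketch below follows the #define values given above (12 for superscript and 244, that is -12, for subscript); the names PosKind and classify_hps_pos are hypothetical.

```c
enum PosKind { posNormal, posSuperscript, posSubscript };

/* Interprets the stored byte as a signed 8 bit integer and classifies it. */
enum PosKind classify_hps_pos(unsigned char hpsPos)
{
    int v = (hpsPos < 128) ? (int)hpsPos : (int)hpsPos - 256;

    if (v == 0)
        return posNormal;
    return (v > 0) ? posSuperscript : posSubscript;
}
```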

/* definitions for clr (font color) */
#define clrBlack	0 
#define clrRed		1
#define clrGreen	2
#define clrBlue		3
#define clrViolet	4	/* Dark gray for monochrome printers */
#define clrMagenta	5	/* Medium gray for monochrome printers */
#define clrYellow	6	/* Light gray for monochrome printers */
#define clrWhite	7
#define cclrMac		8

DOP: Document Properties

/* This structure is included in the File Information Block. 
   Only fMarkRev and revText are true document properties.
   The rest of the fields are used to validate options set and
   style sheet and printer driver loaded when the document was
   saved vs. conditions when it will be loaded. If the conditions
   don't match, paragraph heights will be ignored and the page
   table dirtied. */

struct DOP	/* Current Pagination State (Document Properties) */
	{
	BF	fMarkRev : 1;		/* mark revisions? */
	BF	revText : 3;		/* revision text modification */
	BF	revBar : 2;		/* revision bar position */
	BF	fWidowControl : 1;	/* widow/orphan control? */
	BF	fSeeHidden : 1;		/* print hidden text? */
	BF	fOutLine : 1;		/* print in outline mode */
	BF	wReserved : 7;
	int	prid;			/* printer driver checksum */
	int	dxaTab;			/* default tab width */
	int	sshtsum;		/* style sheet checksum
					   set to zero if no style sheet */
/* SIZE = 8 BYTES */
	};

/* Definitions necessary to interpret DOP structure */

/* possible values for revText field 
	- special format for text inserted while marking enabled */
#define revTextUline	0	/* inserted text displayed as underline */
#define revTextUpper	1	/* inserted text displayed as uppercase */
#define revTextNormal	2	/* inserted text displayed as formatted */
#define revTextBold	3	/* inserted text displayed as bold */
#define revTextDline	6	/* inserted text double underlined */

/* NOTE: When creating a Word document from scratch, just fill in the
   DOP structure with zeros */

FIB: File Information Block

struct FIB
	{
	int	wIdent;		/* must contain 0xBE31 */
	int	dty;		/* document type - see below */
	int	wTool;		/* must contain 0xAB00 */

/* NOTE: The following 4 pn's are only for incrementally saved
Word documents. They should be set to zero for normal documents. */

	PN	pnNextFib;	/* pointer to incremental save FIB */
	PN	pnChar;		/* pointer to Char formatting,
				   only valid in second or greater
				   FIBs in an incremental save file */
	PN	pnPlcpcd;	/* pointer to piece table */
	PN	pnPlcphe;	/* paragraph height table */

/* BYTE 14 */
	FC	fcMac;		/* FC of end of section */
	PN	pnPara;		/* location of paragraph formatting */
/* BYTE 20 */
	PN	pnFntb;		/* location of footnote table */
	PN	pnBkmk;		/* bookmarks and sequence information */
	PN	pnSetb;		/* section table */
	PN	pnBftb;		/* buffer table (glossaries only) */
	PN	pnSumd;		/* summary information */
/* BYTE 30 */
	char	szSsht[cchMaxFIBFile];
				/* style sheet path\name ASCIIZ -
				   up to 65 characters followed by '\0' */
	int	wReserved;	/* Windows write uses it */
/* BYTE 98 */
	char	rgchPrtNm[8];	/* 8 character PRD name without path */
				/* not an ASCIIZ - NULL terminated only 
				   if name is less than 8 characters */
	PN	pnMac;		/* one PN past end of document */
/* BYTE 108 */
	struct DOP dop;		/* document properties */
/* BYTE 116 */
	BF	Version : 8;	/* word version - see below */
	BF	fAsvFormatted : 1;	/* was this a formatted file 
					   before autosave? */
	BF	fAsv : 1;	/* autosave */
	BF	: 6;
/* BYTE 118 */
	PN	pnPgtb;		/* Word 5.0 page table */
	PN	pnMacBkmk;	/* Mac of Bkmk stuff,
				    0 for pre word 5.0 */
	PN	pnFilename;	/* only used for Asv files */
	PN	pnRhtb;		/* running head table */
	PN	codepage;	/* if known, else 0 */
				/* codepage is usually 437 (U.S.),
				   or 850 (International). Other
				   localized character sets exist.
				   see your MS DOS or MS OS/2
				   documentation */
/* SIZE = 128 BYTES = 1 PAGE */
	};

/* Definitions to interpret FIB structure */
/* dty - document type */
#define dtyNormal	0	/* normal word document (e.g. typos.doc) */
#define dtyBuffer	1	/* glossary file (e.g. normal.gly) */
#define dtySsht		2	/* style sheet (e.g. normal.sty) */
#define dtyPrd		3	/* printer driver (e.g. epsonfx.prd) */

/* version - Word version */
#define versionOld	0	/* Written by Word 4.0 or earlier version */
#define version5_0OEM	3	/* Written in Word 5.0 format by any program
				   other than Word 5.0. This will cause 
				   Word to ignore paragraph height and
				   treat the page table with a grain of
				   salt. */
#define version5_0MS	4	/* Written by Microsoft Word 5.0. If that
				   is not the name of your program, don't
				   set version = 4! */
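
When the FIB is read as a raw 128 byte page rather than through the struct, fields can be picked out by the byte offsets marked in the declaration. The sketch below assumes least significant byte first storage, that a PN counts 128 byte pages from the start of the file, and that the 8 bit Version field occupies the low byte of the word at byte 116 (which holds for compilers that allocate bit fields from the low-order bit). The helper names are hypothetical.

```c
#define cbFibPage 128L   /* one disk page, also the size of the FIB */

/* Byte offsets of a few FIB fields, taken from the BYTE markers. */
enum {
    offPnSetb  = 24,
    offPnSumd  = 28,
    offVersion = 116,
    offPnPgtb  = 118,
    offPnRhtb  = 124
};

/* Reads a 2 byte field stored least significant byte first. */
unsigned fib_word(const unsigned char *fib, int off)
{
    return (unsigned)fib[off] | ((unsigned)fib[off + 1] << 8);
}

/* Converts a page number to a byte offset into the file. */
long pn_to_fc(unsigned pn)
{
    return (long)pn * cbFibPage;
}

/* Returns the Version field (0, 3 or 4 per the definitions). */
unsigned fib_version(const unsigned char *fib)
{
    return fib[offVersion];
}

/* A page table is present only if pnPgtb is non-zero. */
int fib_has_page_table(const unsigned char *fib)
{
    return fib_word(fib, offPnPgtb) != 0;
}
```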

FKP (Formatted disK Page - contains paragraph or character formatting)

struct FKP 		/* Formatted disK Page */
	{
	FC	fcFirst;	/* first fc which has formatting info here */
	char	rgb[123];	/* contains rgRun, rgPhe and FPROPS */
/* BYTE 127 */
	char	crun;		/* number of runs */
/* SIZE = 128 BYTES = 1 PAGE */	
	};

PAP (Paragraph Properties)

struct PAP		/* Paragraph properties */
	{
	BF	fStyled : 1;	/* If true, ignore all fields in this PAP
				   structure except stc, rhc,
				   fUseDivMarg, level, and fHidden.
				   Get the rest from the style sheet */
	BF	stc : 7;	/* if fStyled, this will hold a style 
				   code from 30 through 104 - see below */
	BF	jc : 2;		/* paragraph alignment - see below */
	BF	fKeep : 1;	/* keep together */
	BF	fKeepFollow : 1;/* keep follow */
	BF	fSideBySide : 1;/* side by side paragraph */
	BF	fUseDivMarg : 1;/* running head alignment -
				   0 = edge of paper, 1 = left margin */
	BF	: 2;
/* BYTE 2 */
	BF	stcNormChp : 7;	/* Style code from 30 through 104. If
				   paragraph is styled, this should be the 
				   same as stc. Otherwise, this should be
				   the paragraph variant to use to get 
				   formatting for characters attached to
				   Character Standard. */
	BF	: 1; 
	BF	level : 7;	/* outline level -
				   0 for body text, else just store the
				   outline level as displayed on the
				   status line */
	BF	fHidden : 1;	/* true if paragraph would be 
				   collapsed if displayed in
				   outline mode */
/* BYTE 4 */
	unsigned dxaRight;	/* right indent */
	unsigned dxaLeft;	/* left indent */
	int dxaLeft1;		/* first line indent */
/* BYTE 10 */
	unsigned dyaLine;	/* line spacing - stored in twips,
				   but -80 means Auto */
	unsigned dyaBefore;	/* space before */
	unsigned dyaAfter; 	/* space after */
/* BYTE 16 */
	unsigned rhc : 4;	/* running head code - see below */
	unsigned btc : 2;	/* border type code - see below */
	unsigned bsc : 2;	/* border style code - see below */
	unsigned fBorderLeft : 1;	/* which borders to show? */
	unsigned fBorderRight : 1;	/* these flags are referred to */
	unsigned fBorderAbove : 1;	/* only when btc == btcLines */
	unsigned fBorderBelow : 1;
	unsigned bclr : 3;	/* border color -
					see clr definitions under CHP */
	unsigned wUnused1 : 1;
/* BYTE 18 */
	BF	shade : 7;	/* paragraph shading - 0 to 100% */
	BF	bUnused : 1;
/* BYTE 19 */
	BF	pcVert : 2;	/* vertical position - see below */
	BF	pcHorz : 2;	/* horizontal position - see below */
	BF	sclr : 3;	/* shading color
				   see clr definitions under CHP */
	BF	wUnused2 : 1;
/* BYTE 20 */
	int	dxaFromText;	/* distance from text in twips */
/* BYTE 22: Array for up to 19 tabs plus NULL entry */
	struct TBD rgtbd[itbdMax];	/* itbdMax is 20 */
/* This is the end of Word 4.0's PAP structure */

/* BYTE 102: Format Position measurements */
	/* Format Position -- absolute position properties */
	int	dxaAbs;		/* horizontal position - see below */
	int	dyaAbs;		/* vertical position - see below */
	int	dxaWidth;	/* paragraph frame width - see below */
/* SIZE = 108 BYTES */
	};

/* definitions for stc and stcNormChp (style codes) in PAP */
#define stcParaMin	30	/* Paragraph Standard */
#define stcFtnText	39	/* Footnote Text */ 
#define stcAnnText	87	/* Annotation Text */
#define stcHeadLev1	88	/* Heading Level 1 */
#define stcHeadLev2	89	/* Heading Level 2 */
#define stcHeadLev3	90	/* Heading Level 3 */
#define stcHeadLev4	91	/* Heading Level 4 */
#define stcHeadLev5	92	/* Heading Level 5 */
#define stcHeadLev6	93	/* Heading Level 6 */
#define stcHeadLev7	94	/* Heading Level 7 */
#define stcIndLev1	95	/* Index Level 1 */
#define stcIndLev2	96	/* Index Level 2 */
#define stcIndLev3	97	/* Index Level 3 */
#define stcIndLev4	98	/* Index Level 4 */
#define stcTabLev1	99	/* Table Level 1 */
#define stcTabLev2	100	/* Table Level 2 */
#define stcTabLev3	101	/* Table Level 3 */
#define stcTabLev4	102	/* Table Level 4 */
#define stcRunningHead	103	/* Running Head */

stc 31 through 38 represent paragraph variants 1 through 8 respectively
stc 40 through 86 represent paragraph variants 9 through 55 respectively

/* Definitions for jc (paragraph justification code) in PAP */
#define jcLeft		0	/* left alignment */
#define jcCenter	1	/* centered */
#define jcRight		2	/* right alignment */
#define jcBoth		3	/* justified */

/* Definitions for rhc (running head code) in PAP
   For complete rhc definition, add up relevant bits */
#define RHC_fBottom	1	/* bit 0: bottom running head, not top */
#define RHC_fOdd	2	/* bit 1: print on odd pages */
#define RHC_fEven	4	/* bit 2: print on even pages */
#define RHC_fFirst	8	/* bit 3: print on first page */
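
As a small illustration of "adding up relevant bits", the sketch below builds and tests rhc values. The helper names rhc_make and rhc_is_footer are hypothetical; the bit values repeat the definitions above so the fragment compiles on its own.

```c
#define RHC_fBottom	1	/* bit 0: bottom running head, not top */
#define RHC_fOdd	2	/* bit 1: print on odd pages */
#define RHC_fEven	4	/* bit 2: print on even pages */
#define RHC_fFirst	8	/* bit 3: print on first page */

/* Combines the four flags into the 4 bit rhc value stored in the PAP. */
unsigned rhc_make(int fBottom, int fOdd, int fEven, int fFirst)
{
    unsigned rhc = 0;

    if (fBottom) rhc |= RHC_fBottom;
    if (fOdd)    rhc |= RHC_fOdd;
    if (fEven)   rhc |= RHC_fEven;
    if (fFirst)  rhc |= RHC_fFirst;
    return rhc;
}

/* True if the running head is positioned at the bottom of the page. */
int rhc_is_footer(unsigned rhc)
{
    return (rhc & RHC_fBottom) != 0;
}
```

For example, a footer printed on odd and even pages but not on the first page would be stored as 1 + 2 + 4 = 7.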

/* Definitions for btc (border type code) in PAP */
#define btcNone		0	/* no paragraph borders */
#define btcBox		1	/* all four paragraph borders */
#define btcLines	2	/* refer to fBorderRight, fBorderLeft,
				   fBorderAbove, and fBorderBelow */

/* Definitions for bsc (border style code) in PAP */
#define bscNormal	0	/* use single line drawing characters */
#define bscBold		1	/* use bold font for line drawing */
#define bscDouble	2	/* use double line drawing characters */
#define bscThick	3	/* use block characters for the borders */

/* Definitions for pcVert (Position "relative to" in vertical direction) */
#define pcVMargin	0	/* relative to margins */
#define pcVPage		1	/* relative to page */

/* Definitions for pcHorz (Position "relative to" in horizontal direction) */
#define pcHColumn	0	/* relative to column */
#define pcHMargin	1	/* relative to margins */
#define pcHPage		2	/* relative to page */

/* Definitions for dxaAbs (horizontal frame position) */
#define dxaAbsLeft	(-80)	/* Left */
#define dxaAbsCenter	(-81)	/* Centered */
#define dxaAbsRight	(-82)	/* Right */
#define dxaAbsInside	(-83)	/* Inside */
#define dxaAbsOutside 	(-83)	/* Outside */

/* Definitions for dyaAbs (vertical frame position) */
#define dyaAbsInline	(-70)	/* In line */
#define dyaAbsTop	(-71)	/* Top */
#define dyaAbsCenter	(-72)	/* Centered */
#define dyaAbsBottom	(-73)	/* Bottom */

/* Definitions for dxaWidth (frame width) */
#define dxaAbsWidthCol	(-90)	/* Single Column */
#define dxaAbsWidth2Col (-91)	/* Double Column */
#define dxaAbsWidthMarg	(-92)	/* Between Margins */
#define dxaAbsGrWidth	(-93)	/* Width of Graphic */

PGD (Page Descriptor - stored in plcfpgd)

struct PGD
	{
	BF	fUnk : 1;	/* if fTrue, redo this page next pagination */
	BF	: 1;
	BF	fPending : 1;	/* this page has footnotes which need to 
			 	   be continued to next page */
	BF	fSimpleLbs : 1;	/* this page has no APOs, is single column,
				   and running heads don't push the text.
				   This means it can be displayed in Show
				   Layout mode without doing much work. */
	BF	fPgnrestart : 1;/* next page should be page 1 again */
	BF	fEmptyPage : 1;	/* this page is left totally empty in order to
				   satisfy the section break code -
			 	   that is, if section must start on even or
				   odd page */
	BF	fAllFtn : 1;	/* this page contains only footnotes and
				   annotations */
	BF	fContinue : 1;	/* same as fPending */
	BF	bkc : 8;	/* section break code (see SEP) - not looked
				   at by anyone */
/* BYTE 2 */
	unsigned lnn;	/* line number of 1st line, -1 if no line numbering */
	int	cl;	/* count of lines into paragraph for 1st line */
	int 	pgn;		/* page number as printed */
	int	dcpDepend;	/* number of cp's on next page which might
				   affect THIS page if edited */
/* SIZE = 10 BYTES */
	};

PHE (Paragraph HEight - stored in FPAP FKP)

/* NOTE - dypLine, dypHeight are stored in printer specific units. dyp is
   usually measured in 1440ths of an inch (twips), or 1/300ths of an inch for
   certain laser printers. */
/* This information is valid only with a certain printer driver, style sheet,
   version of Word and set of options. Word stores necessary information in
   the FIB and DOP to know whether it can safely use this information when a
   file is reloaded. */

struct PHE
	{
	BF	fUnk : 1;	/* phe entry is invalid */
	BF	: 7;
	BF	clMac : 8;	/* number of lines in para, if known */
	BF	fDiffLines : 1;	/* total height is known, 
				   lines are different */
	BF	dxaCol  : 15;	/* width of column when height was 
				   calculated */
	union
		{
		unsigned dypLine;   /* height of each line if !fDiffLines */
		unsigned dypHeight; /* height of entire paragraph if 
				       fDiffLines */
		};
/* SIZE = 6 BYTES */
	};
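
The following sketch decodes a raw 6 byte PHE and derives a total paragraph height from it. Several points are assumptions rather than statements from this document: bit fields are taken to be allocated from the low-order bit of each 16 bit word (as Microsoft's C compilers do), values are stored least significant byte first, and the total height for equal lines is taken to be clMac times the per-line height. The names PheInfo, phe_decode and phe_height are hypothetical.

```c
struct PheInfo {
    int      fUnk;        /* entry is invalid */
    unsigned clMac;       /* number of lines in the paragraph */
    int      fDiffLines;  /* lines have different heights */
    unsigned dxaCol;      /* column width when height was calculated */
    unsigned dyp;         /* per-line height, or total height if fDiffLines */
};

static unsigned rd16(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

/* Unpacks the three 16 bit words of a PHE. */
void phe_decode(const unsigned char *raw, struct PheInfo *out)
{
    unsigned w0 = rd16(raw);
    unsigned w1 = rd16(raw + 2);

    out->fUnk = (int)(w0 & 1);
    out->clMac = (w0 >> 8) & 0xFF;
    out->fDiffLines = (int)(w1 & 1);
    out->dxaCol = w1 >> 1;
    out->dyp = rd16(raw + 4);
}

/* Returns the paragraph height in printer units, or -1 if unknown. */
long phe_height(const struct PheInfo *phe)
{
    if (phe->fUnk)
        return -1;
    if (phe->fDiffLines)
        return (long)phe->dyp;                 /* stored total height */
    return (long)phe->clMac * (long)phe->dyp;  /* assumed: lines x line height */
}
```

Remember that the result is only meaningful when the FIB and DOP checks indicate that the stored heights still apply.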

PLCF (PLex of Cps stored in File)

/* This is an abstract structure which can't be compiled as declared 
   since iMac is not a constant, and ITEM is an arbitrary structure
   whose size happens to be cb. */

struct PLCF
	{
	int		iMac;		/* number of items stored */
	int		cb;		/* size of struct ITEM in bytes */
	CP		rgcp[iMac + 1];	/* there are iMac + 1 CPs */
	struct ITEM 	rgItem[iMac];	/* there are iMac items */
/* SIZE = 4 + 4 * (iMac + 1) + iMac * cb */
	};
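
Since the PLCF cannot be declared directly in C, it is usually walked as a byte buffer. The sketch below applies the SIZE formula above to locate CPs and items; it assumes least significant byte first storage and 4 byte CPs, and the function names (plcf_size, plcf_cp, plcf_item, plcf_find) are hypothetical.

```c
#include <stddef.h>

static unsigned rd16(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

/* Total size in bytes: 4 + 4 * (iMac + 1) + iMac * cb. */
size_t plcf_size(const unsigned char *plcf)
{
    size_t iMac = rd16(plcf), cb = rd16(plcf + 2);

    return 4 + 4 * (iMac + 1) + iMac * cb;
}

/* Returns rgcp[i]; valid for i from 0 through iMac. */
long plcf_cp(const unsigned char *plcf, unsigned i)
{
    const unsigned char *p = plcf + 4 + 4 * (size_t)i;

    return (long)((unsigned long)p[0] | ((unsigned long)p[1] << 8) |
                  ((unsigned long)p[2] << 16) | ((unsigned long)p[3] << 24));
}

/* Returns a pointer to rgItem[i], or NULL if there is no such item. */
const unsigned char *plcf_item(const unsigned char *plcf, unsigned i)
{
    size_t iMac = rd16(plcf), cb = rd16(plcf + 2);

    if (cb == 0 || i >= iMac)
        return NULL;
    return plcf + 4 + 4 * (iMac + 1) + (size_t)i * cb;
}

/* Returns the index i with rgcp[i] <= cp < rgcp[i + 1], or -1 if none. */
int plcf_find(const unsigned char *plcf, long cp)
{
    unsigned iMac = rd16(plcf), i;

    for (i = 0; i < iMac; i++)
        if (plcf_cp(plcf, i) <= cp && cp < plcf_cp(plcf, i + 1))
            return (int)i;
    return -1;
}
```

A linear search keeps the sketch short; because the CPs are in ascending order, a binary search would work equally well for large tables.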

RHD (Running Head Descriptor - stored in plcfrhd)

struct RHD
	{
	BF	fUnk : 8;	/* fTrue if running head has been edited since
				   last repag. If so, all applicable pages 
				   will have to be repaginated */
	BF	rhc : 8;	/* running head code at last pagination
				   (see definitions under PAP)
				   this rhc is used in conjunction with current
				   rhc (in PAP) to decide which pages to redo */
	unsigned dcpDepend;	/* number of bytes in paragraph - used by 
				   editing code to quickly decide whether to 
				   set fUnk */
/* size = 4 BYTES */
	};

SEDF (Section Descriptor - stored in setb)

struct SEDF	/* Section descriptor */
	{
	CP	cp;	/* cpLim of section */
	int	fn;	/* file number of properties -
			   ignore me: you're looking at the right file now */
	FC	fc;	/* location of properties in file or 
			   0xFFFFFFFF for default section properties */
/* SIZE = 10 BYTES */
	};

SEP (Section Properties)

struct SEP	/* Section properties */
	{
	BF	fStyled : 1;	/* if true, ignore all fields in this SEP
				   except stc and get properties from the
				   style sheet instead */
	BF	stc : 7;	/* if fStyled, this will hold a style 
				   code from 105 to 127 - see below */
	BF	bkc : 3;	/* page break code - see below */
	BF	nfcPgn : 3;	/* page number format code - see below */
	BF	lnc : 2;	/* line number restart code - see below */
/* BYTE 2 */
	unsigned yaPage;	/* page height */
	unsigned xaPage;	/* page width */
	unsigned pgnStart;	/* starting page number -
				   0 to continue page numbers from 
				   previous section */
	unsigned yaTop;		/* top margin */
/* BYTE 10 */
	unsigned dyaText;	/* height of text */
	unsigned xaLeft;	/* left margin */
	unsigned dxaText;	/* width of column */
	BF	rhc : 4;	/* set to zero */
	BF	fMirrorMargins : 1;	/* flip margins on even pages */ 
	BF	fLnn : 1;	/* print line numbers */
	BF	fAutoPgn : 1;	/* print page number on each page */
	BF	fEndFtns : 1;	/* put footnotes at end of section */
	BF	cColumns : 8;	/* number of columns */
	unsigned yaRH1;		/* header position from top of page */
/* BYTE 20 */
	unsigned yaRH2;		/* footer position from top of page */
	unsigned dxaColumns;	/* space between columns */
	unsigned dxaGutter;	/* gutter margin */
	unsigned yaPgn;		/* page number pos. from top of page */
	unsigned xaPgn;		/* page number pos. from left of page */
/* BYTE 30 */
	unsigned dxaLnn;	/* line number distance from text */
	char	nLnnMod;	/* line number increment -
				   example: if nLnnMod = 5, only
				   print numbers for lines whose number
				   would be divisible by 5. */
	BF	fHardMargTop:1; /* allow header to overlap text
				   if it goes too long. this is 
				   represented in the menu by putting
			 	   a hyphen before the top margin
				   measurement */
	BF	fHardMargBottom:1; /* allow footer to overlap text if
				   it is positioned too high. This is
				   represented in the menu by putting 
				   a hyphen before the bottom margin
				   measurement */
	BF	wReserved : 5;
/* SIZE = 34 BYTES */
	};

/* definitions for stc (style code) in SEP */
#define stcSectMin	105	/* Division Standard */

stc 106 through 127 represent division variants 1 through 22 respectively

/* definitions for bkc (page break code) in SEP */
#define bkcNoBreak	0	/* continue on same page 
				   as last division */
#define bkcNewColumn	1	/* start a new column */
#define bkcNewPage	2	/* start a new page */
#define bkcOddPage	3	/* make sure this division starts
				   on an odd page */
#define bkcEvenPage	4	/* make sure this division starts
				   on an even page */

/* definitions for nfcPgn (number format code for page numbers) in SEP */
#define nfcArabic	0	/* standard number format */
#define	nfcUCRoman	1	/* uppercase roman numerals */
#define nfcLCRoman	2	/* lowercase roman numerals */
#define nfcUCLetter	3	/* uppercase letters in
				   alphabetical order */
#define nfcLCLetter	4	/* lowercase letters in	
				   alphabetical order */

/* definitions for lnc (line number continuation code) in SEP */
#define lncPerPage	0	/* start from line 1 on each page */
#define lncRestart	1	/* start at line 1 at the beginning of
				   this section, but don't restart
				   at the top of each page */
#define lncContinue	2	/* do not restart line numbers at the
				   beginning of this section or at
				   the beginning of each page */

RUN (RUN of properties - stored in FKP)

struct RUN /* Char or para run descriptor */
	{
	FC	fcLim;	/* last fc of run */
	int	bProps;	/* byte offset from rgb; if -1, standard props */
/* SIZE = 6 BYTES */
	};
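
The sketch below shows how a reader might locate the run covering a given fc within one FKP page. It rests on assumptions that this section does not spell out: the RUN array is taken to start at the first byte of rgb (byte 4 of the page), fcLim is treated as the first fc after the run, following the usual Lim naming, and values are stored least significant byte first. The function name fkp_find_run is hypothetical.

```c
static long rd32(const unsigned char *p)
{
    return (long)((unsigned long)p[0] | ((unsigned long)p[1] << 8) |
                  ((unsigned long)p[2] << 16) | ((unsigned long)p[3] << 24));
}

/* Reads a signed 16 bit value, so that 0xFFFF comes back as -1. */
static int rd16s(const unsigned char *p)
{
    int v = (int)p[0] | ((int)p[1] << 8);

    return (v >= 0x8000) ? v - 0x10000 : v;
}

/*
 * fkp points at one 128 byte FKP page. Returns the index of the run that
 * covers fc and stores its bProps in *pbProps (-1 means standard
 * properties). Returns -1 if fc is not covered by this page.
 */
int fkp_find_run(const unsigned char *fkp, long fc, int *pbProps)
{
    unsigned crun = fkp[127];
    const unsigned char *run = fkp + 4;   /* assumed start of the RUN array */
    unsigned i;

    if (crun * 6 > 123 || fc < rd32(fkp)) /* rd32(fkp) is fcFirst */
        return -1;
    for (i = 0; i < crun; i++, run += 6) {
        if (fc < rd32(run)) {             /* RUN.fcLim */
            *pbProps = rd16s(run + 4);    /* RUN.bProps */
            return (int)i;
        }
    }
    return -1;
}
```

When bProps is not -1, the properties would be found at that byte offset counted from the start of rgb.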

ST (pascal STring - Used in STTB)

/* This is an abstract structure which can't be compiled as declared
   since cch is not a constant */

typedef struct 
	{
	char	cch;
	char	rgch[cch];
	} ST;
/* SIZE = cch + 1 BYTES */

STTB (STring TaBle)

/* This is an abstract structure which can't be compiled as declared
   since istMac is not a constant and STs vary in size */

struct STTB
	{
	int	istMac;
	ST	grpst[istMac - 1];  /* NOT a true array since STs do not
				       have a constant size. Also, the
				       strings are indexed from 1 to istMac
				       - 1. There is no string indexed by 0 */
	};
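
Because the STs vary in size, a string can only be reached by stepping over the ones before it. The sketch below does that with the 1-based indexing described in the comment above; it assumes istMac is stored least significant byte first, and the function name sttb_get is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copies string number ist (1 through istMac - 1) from the STTB into buf
 * as a NUL terminated C string. sttb points at istMac, cb is the number
 * of bytes available. Returns the string length, or -1 on any error.
 */
int sttb_get(const unsigned char *sttb, size_t cb, unsigned ist,
             char *buf, size_t cbBuf)
{
    unsigned istMac, i;
    size_t off = 2;              /* first ST follows the 2 byte istMac */
    size_t cch;

    if (cb < 2)
        return -1;
    istMac = (unsigned)sttb[0] | ((unsigned)sttb[1] << 8);
    if (ist < 1 || ist >= istMac)
        return -1;               /* there is no string 0 */
    for (i = 1; i < ist; i++) {  /* skip the strings before ist */
        if (off >= cb)
            return -1;
        off += 1 + (size_t)sttb[off];
    }
    if (off >= cb)
        return -1;
    cch = sttb[off];
    if (off + 1 + cch > cb || cch + 1 > cbBuf)
        return -1;
    memcpy(buf, sttb + off + 1, cch);
    buf[cch] = '\0';
    return (int)cch;
}
```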

SQD (SeQuence Descriptor - stored in plcfsqd)

struct SQD
	{
	int	iName;		/* index into string table for this sequence
				   name */
	BF	value : 13;	/* value of sequence at this point - only
				   valid while printing the document */
	BF	action : 3;	/* action code - see below */
	};

/* definitions for action code in SQD */
#define sqdIncIns	0	/* increment sequence and print new value */
#define sqdInc		1	/* increment sequence but don't print value */
#define sqdSet		2	/* set sequence value to sqd.value
				   (don't print value) */
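
The three action codes can be applied to a running counter as sketched below. The bit positions are an assumption: value is taken to occupy the low 13 bits and action the high 3 bits of the second 16 bit word, which is how compilers that allocate bit fields from the low-order bit would lay out the struct. The names SqdInfo, sqd_decode and sqd_apply are hypothetical; the action constants repeat the definitions above as an enum so the fragment compiles on its own.

```c
enum { sqdIncIns = 0, sqdInc = 1, sqdSet = 2 };

struct SqdInfo {
    unsigned iName;   /* index of the sequence name in sttbNames */
    unsigned value;   /* 13 bit value */
    unsigned action;  /* 3 bit action code */
};

/* Unpacks a raw 4 byte SQD stored least significant byte first. */
void sqd_decode(const unsigned char *raw, struct SqdInfo *out)
{
    unsigned w = (unsigned)raw[2] | ((unsigned)raw[3] << 8);

    out->iName = (unsigned)raw[0] | ((unsigned)raw[1] << 8);
    out->value = w & 0x1FFF;
    out->action = w >> 13;
}

/*
 * Updates the counter of the sequence for one marker. Returns 1 if the
 * new value should be printed, 0 if not, -1 for an unknown action code.
 */
int sqd_apply(const struct SqdInfo *sqd, unsigned *pcounter)
{
    switch (sqd->action) {
    case sqdIncIns:
        ++*pcounter;
        return 1;
    case sqdInc:
        ++*pcounter;
        return 0;
    case sqdSet:
        *pcounter = sqd->value;
        return 0;
    default:
        return -1;
    }
}
```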

SQR (SeQuence Reference descriptor - stored in plcfsqr)

struct SQR
	{
	int 	iName;		/* index into string table for this sequence 	
				   name */
	int	iBkmkName;	/* index into string table for Bookmark Name */
	};

TB (TaBle)

/* This is an abstract structure which can't be compiled as declared 
   since iMac is not a constant, and ITEM is an arbitrary structure
   whose size happens to be cb. */

struct TB
	{
	int		iMac;		/* number of items stored */
	int		iMax;		/* number of items allocated 
					   in memory. this field is
					   meaningless when stored in
					   file. */
	struct ITEM	rgItem[iMac];	/* there are iMac items */
	};