The style description is stored in an STD structure as follows:
// STD: STyle Definition
// The STD contains the entire definition of a style.
// It has two parts, a fixed-length base (cbSTDBase bytes long)
// and a variable length remainder holding the name, and the upx and upe
// arrays (a upx and upe for each type stored in the style, std.cupx)
// Note that new fields can be added to the BASE of the STD without
// invalidating the file format, because the STSHI contains the length
// that is stored in the file. When reading STDs from an older version,
// new fields will be zero.
typedef struct _STD
{
// Base part of STD:
ushort sti : 12; /* invariant style identifier */
ushort fScratch : 1; /* spare field for any temporary use,
always reset back to zero! */
ushort fInvalHeight : 1; /* PHEs of all text with this style are wrong */
ushort fHasUpe : 1; /* UPEs have been generated */
ushort fMassCopy : 1; /* std has been mass-copied; if unused at
save time, style should be deleted */
ushort sgc : 4; /* style type code */
ushort istdBase : 12; /* base style */
ushort cupx : 4; /* # of UPXs (and UPEs) */
ushort istdNext : 12; /* next style */
ushort bchUpe; /* offset to end of upx's, start of upe's */
ushort fAutoRedef : 1; /* auto redefine style when appropriate */
ushort fHidden : 1; /* hidden from UI? */
ushort : 14; /* unused bits */
// Variable length part of STD:
XCHAR xstzName[2]; /* sub-names are separated by chDelimStyle */
/* char grupx[]; */
/* the UPEs are not stored on the file; they are a cache of the based-on
chain */
/* char grupe[]; */
} STD;
The cb preceding each STD is the length of the data, which includes all of the STD except the grupe array (which is derived after the file is read in, by building each UPE from the base style UPE plus the exceptions in the UPX). A cb of zero indicates an empty slot in the style array, i.e. no style has that istd. Note that the STD structure declared above may be longer or shorter than the one stored in the file; stshi.cbSTDBaseInFile indicates the length of the base of the STD (up to xstzName) as stored in the file. The stylesheet reader routine has to take this into account.
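As a minimal sketch of a reader that honors stshi.cbSTDBaseInFile, the routine below decodes the fixed base from raw bytes and lets any fields missing from an older, shorter base read as zero. It assumes little-endian storage and that bitfields are allocated starting from the least significant bit of each 16-bit word (as Microsoft compilers do). StdBase, rd16, parse_std_base and CB_STD_BASE are hypothetical names introduced for illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical unpacked form of the fixed STD base; field names follow STD. */
typedef struct {
    unsigned sti, fScratch, fInvalHeight, fHasUpe, fMassCopy;
    unsigned sgc, istdBase;
    unsigned cupx, istdNext;
    unsigned bchUpe;
    unsigned fAutoRedef, fHidden;
} StdBase;

enum { CB_STD_BASE = 10 };   /* five 16-bit words in the declaration above */

/* Read one little-endian 16-bit word. */
static unsigned rd16(const uint8_t *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

/* data/cb: the STD record and the cb that preceded it in the file.
   cbBaseInFile: plays the role of stshi.cbSTDBaseInFile.
   Returns 0 on success, -1 for an empty slot or a truncated record. */
static int parse_std_base(const uint8_t *data, size_t cb,
                          size_t cbBaseInFile, StdBase *out)
{
    uint8_t base[CB_STD_BASE] = {0};   /* fields absent from the file stay zero */
    size_t n = cbBaseInFile < (size_t)CB_STD_BASE ? cbBaseInFile
                                                  : (size_t)CB_STD_BASE;
    unsigned w0, w1, w2, w4;

    if (cb == 0 || cbBaseInFile > cb)
        return -1;
    memcpy(base, data, n);             /* ignore base bytes this reader does not know */

    w0 = rd16(base + 0);
    w1 = rd16(base + 2);
    w2 = rd16(base + 4);
    w4 = rd16(base + 8);

    out->sti          = w0 & 0x0fff;
    out->fScratch     = (w0 >> 12) & 1;
    out->fInvalHeight = (w0 >> 13) & 1;
    out->fHasUpe      = (w0 >> 14) & 1;
    out->fMassCopy    = (w0 >> 15) & 1;
    out->sgc          = w1 & 0x000f;
    out->istdBase     = (w1 >> 4) & 0x0fff;
    out->cupx         = w2 & 0x000f;
    out->istdNext     = (w2 >> 4) & 0x0fff;
    out->bchUpe       = rd16(base + 6);
    out->fAutoRedef   = w4 & 1;
    out->fHidden      = (w4 >> 1) & 1;
    return 0;
}
```

Copying into a zeroed, fixed-size buffer handles both directions at once: a shorter base in the file leaves the newer fields zero, and a longer base is simply cut off at the part this reader understands.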
The variable-length part of the STD actually has three variable-length subparts, the xstzName, the grupx, and the grupe. Since this doesn't fit well into a C structure declaration, some processing is needed to figure out where one part stops and the next part begins. An important note is that all variable-length parts and subparts of the STD begin on EVEN-BYTE OFFSETS within the STD, even if the length of the preceding variable-length part was odd.
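The even-byte rule can be sketched as a small helper. The function names are hypothetical, and cbName is assumed to be the total byte length of the stored xstzName, including its length prefix and terminator.

```c
#include <stddef.h>

/* Round an offset within the STD up to the next even value. */
static size_t next_even(size_t off)
{
    return off + (off & 1);
}

/* Offset of grupx within the STD: it follows the base and the name,
   moved up to an even offset if the name ended on an odd one. */
static size_t grupx_offset(size_t cbBaseInFile, size_t cbName)
{
    return next_even(cbBaseInFile + cbName);
}
```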
std.sti: The sti identifies which built-in style this is, or is stiUser for a user-defined style. An sti is intended to be permanent through versions of Word, although new sti's may be added in new versions. The sti definitions are:
// standard sti codes - these are invariant identifiers for built-in styles
// and must remain the same (i.e. don't renumber them, or old files will be
// messed up.)
// NOTE: sti and istd are the same for Normal and level styles
// If you want to define a new built-in style:
// 1) Decide if you really need one--it will exist in all future versions!
// 2) Add a new sti below. You can take the first available slot.
// 3) Change stiMax, and stiPapMax or stiChpMax
// 4) Add entry to _dnsti, and the two ids's in strman.pp
// 5) Add case in GetDefaultUpdForSti
// 6) Change cstiMaxBuiltinDependents if necessary
// If you want to change the definition of a built-in style
// 1) In order to make WinWord 2 documents that use the style look like
// they did in WinWord 2, add a case in GetDefaultUpdForSti to handle
// fOldDef. This definition will be used when converting WinWord 2
// stylesheets.
// 2) If you change the name of a built-in style, increment nVerBuiltInNames
#define stiNormal 0 // 0x0000
#define stiLev1 1 // 0x0001
#define stiLev2 2 // 0x0002
#define stiLev3 3 // 0x0003
#define stiLev4 4 // 0x0004
#define stiLev5 5 // 0x0005
#define stiLev6 6 // 0x0006
#define stiLev7 7 // 0x0007
#define stiLev8 8 // 0x0008
#define stiLev9 9 // 0x0009
#define stiLevFirst stiLev1
#define stiLevLast stiLev9
#define stiIndex1 10 // 0x000A
#define stiIndex2 11 // 0x000B
#define stiIndex3 12 // 0x000C
#define stiIndex4 13 // 0x000D
#define stiIndex5 14 // 0x000E
#define stiIndex6 15 // 0x000F
#define stiIndex7 16 // 0x0010
#define stiIndex8 17 // 0x0011
#define stiIndex9 18 // 0x0012
#define stiIndexFirst stiIndex1
#define stiIndexLast stiIndex9
#define stiToc1 19 // 0x0013
#define stiToc2 20 // 0x0014
#define stiToc3 21 // 0x0015
#define stiToc4 22 // 0x0016
#define stiToc5 23 // 0x0017
#define stiToc6 24 // 0x0018
#define stiToc7 25 // 0x0019
#define stiToc8 26 // 0x001A
#define stiToc9 27 // 0x001B
#define stiTocFirst stiToc1
#define stiTocLast stiToc9
#define stiNormIndent 28 // 0x001C
#define stiFtnText 29 // 0x001D
#define stiAtnText 30 // 0x001E
#define stiHeader 31 // 0x001F
#define stiFooter 32 // 0x0020
#define stiIndexHeading 33 // 0x0021
#define stiCaption 34 // 0x0022
#define stiToCaption 35 // 0x0023
#define stiEnvAddr 36 // 0x0024
#define stiEnvRet 37 // 0x0025
#define stiFtnRef 38 // 0x0026 char style
#define stiAtnRef 39 // 0x0027 char style
#define stiLnn 40 // 0x0028 char style
#define stiPgn 41 // 0x0029 char style
#define stiEdnRef 42 // 0x002A char style
#define stiEdnText 43 // 0x002B
#define stiToa 44 // 0x002C
#define stiMacro 45 // 0x002D
#define stiToaHeading 46 // 0x002E
#define stiList 47 // 0x002F
#define stiListBullet 48 // 0x0030
#define stiListNumber 49 // 0x0031
#define stiList2 50 // 0x0032
#define stiList3 51 // 0x0033
#define stiList4 52 // 0x0034
#define stiList5 53 // 0x0035
#define stiListBullet2 54 // 0x0036
#define stiListBullet3 55 // 0x0037
#define stiListBullet4 56 // 0x0038
#define stiListBullet5 57 // 0x0039
#define stiListNumber2 58 // 0x003A
#define stiListNumber3 59 // 0x003B
#define stiListNumber4 60 // 0x003C
#define stiListNumber5 61 // 0x003D
#define stiTitle 62 // 0x003E
#define stiClosing 63 // 0x003F
#define stiSignature 64 // 0x0040
#define stiNormalChar 65 // 0x0041 char style
#define stiBodyText 66 // 0x0042
#define stiBodyTextInd1 67 // 0x0043
#define stiListCont 68 // 0x0044
#define stiListCont2 69 // 0x0045
#define stiListCont3 70 // 0x0046
#define stiListCont4 71 // 0x0047
#define stiListCont5 72 // 0x0048
#define stiMsgHeader 73 // 0x0049
#define stiSubtitle 74 // 0x004A
#define stiSalutation 75 // 0x004B
#define stiDate 76 // 0x004C
#define stiBodyText1I 77 // 0x004D
#define stiBodyText1I2 78 // 0x004E
#define stiNoteHeading 79 // 0x004F
#define stiBodyText2 80 // 0x0050
#define stiBodyText3 81 // 0x0051
#define stiBodyTextInd2 82 // 0x0052
#define stiBodyTextInd3 83 // 0x0053
#define stiBlockQuote 84 // 0x0054
#define stiHyperlink 85 // 0x0055 char style
#define stiHyperlinkFollowed 86 // 0x0056 char style
#define stiStrong 87 // 0x0057 char style
#define stiEmphasis 88 // 0x0058 char style
#define stiNavPane 89 // 0x0059 char style
#define stiPlainText 90 // 0x005A
#define stiMax 91 // number of defined sti's
#define stiUser 0x0ffe // user styles are distinguished by name
#define stiNil 0x0fff // max for 12 bits
See below for the names of these styles.
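A reader might classify an sti along these lines. This is a sketch only: the constants repeat values from the table above, while the two function names are hypothetical.

```c
/* Values copied from the sti table above. */
enum {
    stiLevFirst = 1,        /* stiLev1 */
    stiLevLast  = 9,        /* stiLev9 */
    stiMax      = 91,       /* number of defined sti's */
    stiUser     = 0x0ffe,
    stiNil      = 0x0fff
};

/* Nonzero if sti names one of the built-in styles defined above. */
static int sti_is_builtin(unsigned sti)
{
    return sti < (unsigned)stiMax;
}

/* Heading level 1..9 for stiLev1..stiLev9, or 0 for any other sti. */
static int sti_heading_level(unsigned sti)
{
    if (sti >= (unsigned)stiLevFirst && sti <= (unsigned)stiLevLast)
        return (int)(sti - (unsigned)stiLevFirst) + 1;
    return 0;
}
```

Because new sti's may be added in later versions, a value that is at least stiMax but is neither stiUser nor stiNil is probably best treated as an unknown built-in rather than as an error.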
std.sgc: The type of each style is indicated by std.sgc. The two types currently in use are:
sgcPara 1 // A paragraph style
sgcChp 2 // A character style
More style types may exist in the future, so styles of an unknown type should be discarded.
std.istdBase: The style that this style is based on. A style is always based on another style or the null style (istdNil). Following a "chain" of based-on styles will always end at the null style, because a based-on chain cannot have a loop in it. A style can have up to 11 "ancestors" in its based-on chain, including the null style. A style's definition is built up from the style that it is based on. See std.cupx, std.grupx, std.grupe.
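A defensive reader may want to verify the based-on chain before building a style. The sketch below follows istdBase links until the null style is reached and rejects chains that leave the array or exceed the ancestor limit, which also catches loops in damaged files. It assumes istdNil is 0x0fff (the largest 12-bit value); that value is not given in this section. The function name and the array-of-istdBase representation are hypothetical.

```c
#include <stddef.h>

enum {
    ISTD_NIL      = 0x0fff,   /* assumed value of istdNil */
    MAX_ANCESTORS = 11        /* limit stated above, null style included */
};

/* istdBase[i] holds std.istdBase of style i; cstd is the number of styles.
   Returns the number of ancestors of istd (counting the null style),
   or -1 if the chain is broken, too long, or contains a loop. */
static int count_ancestors(const unsigned short *istdBase, size_t cstd,
                           unsigned istd)
{
    int depth = 0;

    while (istd != (unsigned)ISTD_NIL) {
        if (istd >= cstd)
            return -1;               /* points outside the style array */
        istd = istdBase[istd];
        if (++depth > MAX_ANCESTORS)
            return -1;               /* too deep, or a loop */
    }
    return depth;
}
```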
std.istdNext: The style that should be applied after this one. For a paragraph style, this is the style that is applied when Enter is pressed at the end of a paragraph. For a character style, the next style is essentially ignored, but should be the same as the current style.
std.xstzName: The name of the style, including aliases. The name is stored as an xstz (preceded by a length byte, followed by a null-terminator.) A style name can contain multiple "aliases", separated by commas. Aliases are alternate names for the same style (e.g. a style named "a,b,c" has three aliases, and can be referred to by "a", "b", or "c", or any combination.) WinWord 2.x did not have aliases, but MacWord 5.x did. If a style is a built-in style, the built-in stylename is always stored first.
All names (and aliases) must be unique within a stylesheet (e.g. styles "a,b" and "b,c" should not exist in the same stylesheet, as "b" matches multiple stylenames.)
A stylename (including all its aliases and comma separators) can be up to 253 characters long. So the xstz format of that name can be up to 255 characters. Stylenames are case sensitive.
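Looking a style up by one of its aliases could be sketched as follows. For simplicity the sketch assumes the xstz has already been converted to a plain NUL-terminated C string; the function name is hypothetical. The comparison is case sensitive, as stated above.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if alias is exactly one of the comma-separated names in name. */
static int style_name_has_alias(const char *name, const char *alias)
{
    size_t n = strlen(alias);
    const char *p = name;

    if (n == 0)
        return 0;
    while (*p != '\0') {
        const char *end = strchr(p, ',');
        size_t len = end ? (size_t)(end - p) : strlen(p);

        if (len == n && memcmp(p, alias, n) == 0)
            return 1;
        if (end == NULL)
            break;
        p = end + 1;                 /* continue after the comma */
    }
    return 0;
}
```

The same helper can be used to enforce uniqueness: before adding a style, each of its aliases would be checked against every existing stylename.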
The built-in stylenames (corresponding to each sti above) are defined for each language version of Word. For the USA, the names are:
// These are the names of the built-in styles as we want to present them
// to the user.
Normal
Heading 1
Heading 2
Heading 3
Heading 4
Heading 5
Heading 6
Heading 7
Heading 8
Heading 9
Index 1
Index 2
Index 3
Index 4
Index 5
Index 6
Index 7
Index 8
Index 9
TOC 1
TOC 2
TOC 3
TOC 4
TOC 5
TOC 6
TOC 7
TOC 8
TOC 9
Normal Indent
Footnote Text
Annotation Text
Header
Footer
Index Heading
Caption
Table of Figures
Envelope Address
Envelope Return
Footnote Reference
Annotation Reference
Line Number
Page Number
Endnote Reference
Endnote Text
Table of Authorities
Macro Text
TOA Heading
List
List Bullet
List Number
List 2
List 3
List 4
List 5
List Bullet 2
List Bullet 3
List Bullet 4
List Bullet 5
List Number 2
List Number 3
List Number 4
List Number 5
Title
Closing
Signature
Default Paragraph Font
Body Text
Body Text Indent
List Continue
List Continue 2
List Continue 3
List Continue 4
List Continue 5
Message Header
Subtitle
Salutation
Date
Body Text First Indent
Body Text First Indent 2
Note Heading
Body Text 2
Body Text 3
Body Text Indent 2
Body Text Indent 3
Block Text
Hyperlink
Followed Hyperlink
Strong
Emphasis
Document Map
Plain Text
std.cupx: This is the number of UPXs in the std.grupx array. See below.
std.grupx: This is an array of variable-length UPXs, with std.cupx UPXs in the array. This array begins after the variable-length xstzName field, at the next even-byte offset within the STD. A UPX (Universal Property eXception) describes the difference in formatting of this style as compared to its based-on style. The UPX structure looks like this:
typedef union _UPX
{
struct
{
uchar grpprl[cbMaxGrpprlStyleChpx];
} chpx;
struct
{
ushort istd;
uchar grpprl[cbMaxGrpprlStylePapx];
} papx;
uchar rgb[1];
} UPX;
Each UPX stored in a file is not a complete UPX, rather it is a UPX with all trailing zero bytes lopped off, and preceded by a ushort length field. So it is stored like:
Field   Size      Comment
cbUPX   2 bytes   size of the following UPX structure
UPX     (cbUPX)   nonzero prefix of a UPX structure
Each UPX begins on an even-byte offset within the STD, even if the length of the previous UPX (cbUPX) was odd.
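Walking the stored grupx might look like the sketch below. It assumes cbUPX is little-endian and that the caller already knows the offset of the first cbUPX within the STD. UpxRef and read_upxs are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Points at one stored UPX (the nonzero prefix) inside the STD buffer. */
typedef struct {
    const uint8_t *data;
    size_t cb;
} UpxRef;

/* std/cbStd: the STD record; off: offset of the first cbUPX.
   Fills out[] with up to maxOut UPXs and returns the number stored there,
   or -1 if the record is truncated. */
static int read_upxs(const uint8_t *std, size_t cbStd, size_t off,
                     unsigned cupx, UpxRef *out, unsigned maxOut)
{
    unsigned i;

    for (i = 0; i < cupx; i++) {
        size_t cbUpx;

        off += off & 1;                       /* each UPX starts on an even offset */
        if (off > cbStd || cbStd - off < 2)
            return -1;
        cbUpx = (size_t)std[off] | ((size_t)std[off + 1] << 8);
        off += 2;
        if (cbUpx > cbStd - off)
            return -1;
        if (i < maxOut) {
            out[i].data = std + off;
            out[i].cb = cbUpx;
        }
        off += cbUpx;                         /* may be odd; fixed on the next pass */
    }
    return (int)(cupx < maxOut ? cupx : maxOut);
}
```

Since trailing zero bytes were lopped off before writing, a consumer would copy each UpxRef into a zero-initialized UPX to recover the full structure.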
The meaning of each UPX depends on the style type (std.sgc). For a paragraph style, std.cupx is 2. The first UPX is a paragraph UPX (UPX.papx) and the second UPX is a character UPX (UPX.chpx). For a character style, std.cupx is 1, and that UPX is a character UPX (UPX.chpx). Note that new UPXs may be added in the future, so std.cupx might be larger than expected. Any UPXs past those expected should be discarded.
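The rule for how many UPXs to interpret can be sketched like this. The function names are hypothetical. A return of zero from expected_cupx marks a style type this reader does not know, in which case the style would be discarded.

```c
enum { sgcPara = 1, sgcChp = 2 };

/* Number of UPXs this reader understands for a style type; 0 if unknown. */
static unsigned expected_cupx(unsigned sgc)
{
    switch (sgc) {
    case sgcPara: return 2;   /* UPX.papx, then UPX.chpx */
    case sgcChp:  return 1;   /* UPX.chpx */
    default:      return 0;   /* unknown style type */
    }
}

/* Number of leading UPXs to interpret; any extras are skipped. */
static unsigned upx_count_to_process(unsigned sgc, unsigned cupx)
{
    unsigned want = expected_cupx(sgc);
    return cupx < want ? cupx : want;
}
```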
The grpprl within each UPX contains the differences of this property type for this style from the UPE of that property type for the based on style. For example, if two paragraph styles, A and B, were identical except that B was bold where A was not, and B was based on A, B would have two UPXs, where the paragraph UPX would have an empty grpprl, and the character UPX would have a bold sprm in the grpprl. Thus B looks just like A (since B is based on A), with the exception that B is bold.
std.grupe: This is an array (group) of variable-length UPEs. These are not stored in the file! Rather, they are constructed using the std.istdBase and std.grupx fields. A UPE (Universal Property Expansion) describes the "end-result" of the property formatting, i.e. what the style looks like. The UPE structure is the non-zero prefix of a UPD structure. The UPD structure looks like this:
typedef union _UPD
{
PAP pap;
CHP chp;
struct
{
ushort istd;
uchar cbGrpprl;
uchar grpprl[cbMaxGrpprlStyleChpx];
} chpx;
} UPD;
The std.grupe and std.grupx arrays are similar: there is one UPE for each UPX, and internally they are stored similarly (a length ushort followed by a non-zero prefix), though remember that the UPEs are not stored in the file. The meaning of each UPE depends on the style type (std.sgc). For a paragraph style, the first UPE is a PAP (UPE.pap). The second UPE is a CHP (UPE.chp). For a character style, the first UPE is a CHPX (UPE.chpx).
The UPEs for a style are constructed by taking the UPEs from the based-on style, and applying the UPXs to them. Obviously, if the UPEs for the based-on style haven't yet been constructed, that style's UPE needs to be constructed first. Eventually by following the based-on chain, a style will be based on the null style (istdNil). The UPEs for the null style are predefined:
The UPE.pap for the null style is all zeros, except fWidowControl which is 1, dyaLine which is 240, and fMultLinespace which is 1.
The UPE.chp for the null style is all zeros, except istd which is 10 (istdNormalChar), hps which is 20, lid which is 0x0400, and ftc which is set to the STSHI.ftcStandardChpStsh.
The UPE.chpx for the null style has an istd of zero, a cbGrpprl of zero (and an empty grpprl).
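The three predefined null-style UPEs could be initialized as in the sketch below. The real PAP and CHP structures are much larger; the reduced structs here are hypothetical stand-ins that carry only the fields named above, and ftcStandardChpStsh is passed in as it would be read from the STSHI.

```c
#include <string.h>

/* Hypothetical reduced stand-ins for PAP, CHP and the chpx form of a UPE. */
typedef struct {
    unsigned short istd;
    unsigned char  fWidowControl;
    unsigned char  fMultLinespace;
    short          dyaLine;
} PapSketch;

typedef struct {
    unsigned short istd;
    unsigned short hps;
    unsigned short lid;
    unsigned short ftc;
} ChpSketch;

typedef struct {
    unsigned short istd;
    unsigned char  cbGrpprl;
} ChpxSketch;

enum { istdNormalChar = 10 };

static void null_style_pap(PapSketch *pap)
{
    memset(pap, 0, sizeof *pap);       /* all zeros, except: */
    pap->fWidowControl  = 1;
    pap->dyaLine        = 240;
    pap->fMultLinespace = 1;
}

static void null_style_chp(ChpSketch *chp, unsigned short ftcStandardChpStsh)
{
    memset(chp, 0, sizeof *chp);       /* all zeros, except: */
    chp->istd = istdNormalChar;
    chp->hps  = 20;
    chp->lid  = 0x0400;
    chp->ftc  = ftcStandardChpStsh;
}

static void null_style_chpx(ChpxSketch *chpx)
{
    memset(chpx, 0, sizeof *chpx);     /* istd 0, cbGrpprl 0, empty grpprl */
}
```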
So, for a paragraph style, the first UPE is a UPE.pap. It can be constructed by starting with the first UPE from the based-on style (std.istdBase), and then applying the first UPX (UPX.papx) in std.grupx to that UPE. To apply a UPX.papx to a UPE.pap, set UPE.pap.istd equal to UPX.papx.istd, and then apply the UPX.papx.grpprl to UPE.pap. Similarly, the second UPE is a UPE.chp. It can be constructed by starting with the second UPE from the based-on style, and then applying the second UPX (UPX.chpx) in std.grupx to that UPE. To apply a UPX.chpx to a UPE.chp, apply the UPX.chpx.grpprl to UPE.chp. Note that a UPE.chp for a paragraph style should always have UPE.chp.istd == istdNormalChar.
For a character style, the first (and only) UPE (a UPE.chpx) can be constructed by starting with the first UPE from the based-on style (std.istdBase), and then applying the first UPX (UPX.chpx) in std.grupx to that UPE. To apply a UPX.chpx to a UPE.chpx, take the grpprl in UPE.chpx.grpprl (which has a length of UPE.chpx.cbGrpprl) and merge the grpprl in UPX.chpx.grpprl into it. Merging grpprls is a tricky business, but for character styles it is easy because no prls in character style grpprls should interact with each other. Each prl from the source (the UPX.chpx.grpprl) should be inserted into the destination (the UPE.chpx.grpprl) so that the sprms of the prls are in increasing order, and any prl in the destination that has the same sprm is replaced by the prl from the source. UPE.chpx.cbGrpprl is then set to the length of the resulting grpprl, and UPE.chpx.istd is set to the style's istd.
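The merge described above can be sketched on an already-parsed representation. This is an illustration under assumptions, not the byte-level routine: each prl is assumed to have been decoded into a hypothetical Prl record (sprm plus a single operand value), and the destination array is assumed to be sorted by sprm already. Working on the raw grpprl bytes would additionally require knowing each sprm's operand size.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical decoded prl: a sprm and its operand. */
typedef struct {
    unsigned short sprm;
    long operand;
} Prl;

/* Merge src[0..nsrc) into dst[0..ndst), which is sorted by sprm and has
   room for cap entries. A source prl replaces a destination prl with the
   same sprm; otherwise it is inserted so that sprms stay in increasing
   order. Returns the new destination count, or (size_t)-1 if dst is full. */
static size_t merge_prls(Prl *dst, size_t ndst, size_t cap,
                         const Prl *src, size_t nsrc)
{
    size_t j;

    for (j = 0; j < nsrc; j++) {
        size_t pos = 0;

        while (pos < ndst && dst[pos].sprm < src[j].sprm)
            pos++;
        if (pos < ndst && dst[pos].sprm == src[j].sprm) {
            dst[pos] = src[j];                       /* source wins */
        } else {
            if (ndst == cap)
                return (size_t)-1;
            memmove(&dst[pos + 1], &dst[pos], (ndst - pos) * sizeof dst[0]);
            dst[pos] = src[j];
            ndst++;
        }
    }
    return ndst;
}
```

After the merge, the caller would re-encode the records to obtain the new grpprl, set UPE.chpx.cbGrpprl to its length, and set UPE.chpx.istd to the style's istd.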