Outlook MSG file format
---------------------------------------------
Date:  31.03.2003
Email: peter.fiskerstrand [at] netcom.no
Feel free to send me comments about this doc.
---------------------------------------------

MSG files are Outlook messages saved as files. They are saved as
"COM structured storage OLE2 compound documents", also known as "DocFile",
which is the same technique used by Word, Excel and many others.

To view the contents of a "DocFile", you can use DFVIEW.EXE
(shipped with MS Visual C++ 6.0), or you can decode it yourself.
This text does not help you do that; it assumes you already
know about compound files. More info at http://www.wotsit.org

If you open a .msg file you will see many streams. Here's an example:

  __nameid_version1.0
    __substg1.0_00020102
    __substg1.0_00030102
    __substg1.0_00040102
    __substg1.0_10100102
  __substg1.0_001A001E
  __substg1.0_0037001E
  __substg1.0_10090102
  __substg1.0_3FF8001E
  __substg1.0_3FF90102
  __properties_version1.0
  __recip_version1.0_#00000000
    ...
  __attach_version1.0_#00000000
    ...
  __attach_version1.0_#00000001
    ...

Nameid, recip and attach are "folders" which contain substgs.
Properties contains some binary data.
Attach contains an attachment (a file, a picture, a new mail message etc.).
Recip contains information about a recipient.
The #00000000 is just a running number, starting at zero. If you have
five attachments, they are numbered 0-4.

Each substg contains one piece of information. Its name ends in eight
hex digits. The first four tell you what kind of information this is
(the property). The last four tell you the type (binary, ASCII,
Unicode etc.).

If you open a substg that has 001E at the end, you will see plain ASCII
text inside. 0102 means binary information. There are other types, but
they are rarely used.

Back to the first four digits, the properties:

001A means Message Class. It can contain one of the following:

  IPM.Note        <- a regular e-mail
  IPM.Contact     <- a contact (name, address, phone etc.)
  IPM.Post        <- a post in a folder
  IPM.StickyNote  <- a post-it note
  IPM.Activity    <- a journal entry
  IPM.Appointment <- a calendar event
  IPM.Task        <- a task

or special cases like:

  IPM.Note.Rules.OofTemplate.Microsoft
  IPM.TaskRequest.Accept

Here are some other properties (entries marked (?) are guesswork):

0x001A: Message class
0x0037: Subject
0x003D: Subject prefix
0x0040: Received by name
0x0042: Sent repr name
0x0044: Rcvd repr name
0x004D: Org author name
0x0050: Reply rcipnt names
0x005A: Org sender name
0x0064: Sent repr adrtype
0x0065: Sent repr email
0x0070: Topic
0x0075: Rcvd by adrtype
0x0076: Rcvd by email
0x0077: Repr adrtype
0x0078: Repr email
0x007D: Message header
0x0C1A: Sender name
0x0C1E: Sender adr type
0x0C1F: Sender email
0x0E02: Display BCC
0x0E03: Display CC
0x0E04: Display To
0x0E1D: Subject (normalized)
0x0E28: Recvd account1(?)
0x0E29: Recvd account2(?)
0x1000: Message body         <- This is the message body
0x1008: RTF sync body tag
0x1035: Message ID (?)
0x1046: Sender email(?)
0x3001: Display name
0x3002: Address type
0x3003: Email address
0x39FE: 7-bit email (?)
0x39FF: 7-bit display name

//Attachments (37xx):
0x3701: Attachment data      <- This is the binary attachment
0x3703: Attach extension
0x3704: Attach filename
0x3707: Attach long filenm
0x370E: Attach mime tag
0x3712: Attach ID (?)

//Address book (3Axx):
0x3A00: Account
0x3A02: Callback phone no
0x3A05: Generation
0x3A06: Given name
0x3A08: Business phone
0x3A09: Home phone
0x3A0A: Initials
0x3A0B: Keyword
0x3A0C: Language
0x3A0D: Location
0x3A11: Surname
0x3A15: Postal address
0x3A16: Company name
0x3A17: Title
0x3A18: Department
0x3A19: Office location
0x3A1A: Primary phone
0x3A1B: Business phone 2
0x3A1C: Mobile phone
0x3A1D: Radio phone no
0x3A1E: Car phone no
0x3A1F: Other phone
0x3A20: Transmit dispname
0x3A21: Pager
0x3A22: User certificate
0x3A23: Primary Fax
0x3A24: Business Fax
0x3A25: Home Fax
0x3A26: Country
0x3A27: Locality
0x3A28: State/Province
0x3A29: Street address
0x3A2A: Postal Code
0x3A2B: Post Office Box
0x3A2C: Telex
0x3A2D: ISDN
0x3A2E: Assistant phone
0x3A2F: Home phone 2
0x3A30: Assistant
0x3A44: Middle name
0x3A45: Dispname prefix
0x3A46: Profession
0x3A48: Spouse name
0x3A4B: TTY/TDD phone
0x3A4C: FTP site
0x3A4E: Manager name
0x3A4F: Nickname
0x3A51: Business homepage
0x3A57: Company main phone
0x3A58: Children's names
0x3A59: Home City
0x3A5A: Home Country
0x3A5B: Home Postal Code
0x3A5C: Home State/Provnce
0x3A5D: Home Street
0x3A5F: Other adr City
0x3A60: Other adr Country
0x3A61: Other adr PostCode
0x3A62: Other adr Province
0x3A63: Other adr Street
0x3A64: Other adr PO box

0x3FF7: Server (?)
0x3FF8: Creator1 (?)
0x3FFA: Creator2 (?)
0x3FFC: To email (?)
0x403D: To adrtype(?)
0x403E: To email (?)
0x5FF6: To (?)
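
To illustrate the substg naming scheme described earlier, here is a small
Python sketch that splits a stream name into its property and type digits
and looks the property up in a few entries copied from the lists above.
The function names (parse_substg_name, decode_value, describe) and the
choice of latin-1 for 001E text are assumptions made for this example, not
part of the format. Getting the streams out of the compound file is left
to whatever DocFile reader you use.

```python
import re

# A few property IDs copied from the lists above (not a complete table).
PROPERTY_NAMES = {
    0x001A: "Message class",
    0x0037: "Subject",
    0x0C1A: "Sender name",
    0x1000: "Message body",
    0x3701: "Attachment data",
    0x3704: "Attach filename",
}

TYPE_ASCII = 0x001E   # plain 8-bit text
TYPE_BINARY = 0x0102  # raw binary data

# "__substg1.0_" followed by 4 hex digits (property) and 4 hex digits (type).
SUBSTG_RE = re.compile(r"^__substg1\.0_([0-9A-Fa-f]{4})([0-9A-Fa-f]{4})$")


def parse_substg_name(name):
    """Return (property id, type) for a substg stream name, else None."""
    match = SUBSTG_RE.match(name)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


def decode_value(prop_type, raw):
    """Turn the raw bytes of a substg stream into a Python value."""
    if prop_type == TYPE_ASCII:
        # Assumption: the real code page is unknown here, so latin-1 is
        # used because it never fails. Trailing NUL bytes are dropped.
        return raw.decode("latin-1").rstrip("\x00")
    # 0x0102 and any type this sketch does not know: keep the bytes as-is.
    return raw


def describe(name):
    """Give a one-line description of a stream name."""
    parsed = parse_substg_name(name)
    if parsed is None:
        return name + ": not a substg stream"
    prop_id, prop_type = parsed
    label = PROPERTY_NAMES.get(prop_id, "unknown")
    return "0x%04X (%s), type 0x%04X" % (prop_id, label, prop_type)
```

For example, describe("__substg1.0_0037001E") gives
"0x0037 (Subject), type 0x001E", while the folder and properties streams
(names that do not start with __substg1.0_) are reported as not being
substg streams.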