HIREx Data Specification - Transfer Files

RSA Hayward, JA Hogeterp, M Bosnjakovic

Overview

Information contained in HIREx databooks may need to be transferred to or from other databooks, other databases or other software products. To facilitate this, we have defined a file format to support the movement of information. This specification details the type and content of HIREx electronic transfer files and explains rules for interpretation of data transfer files.

File Type

A single computer file is used to transfer one or more records in and out of HIREx databooks. The file must have the following properties:

File name: To ensure compatibility with 16-bit versions of Microsoft Windows (Windows 3.1, Windows for Workgroups), file names must follow the MS-DOS naming convention: no more than 8 characters in the name root and no more than 3 characters in the file extension. By default, HIREx looks for and saves to text files with the ".txt" extension.
Example: "transfer.txt"
File type: Transfer files must be saved as ASCII text files. Carriage returns are significant, so transfer files must not be saved with a default line length that inserts automatic carriage returns. For most programs and text editors, this means that any line-wrap feature should be turned off.
File size: There is no practical limit to file size. Transfer files may be many megabytes in size.
Allowed characters: The tilde (~, ASCII 126), pipe (|, ASCII 124) and carriage return (ASCII 13,10) have significance as delimiters and should be avoided.
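
The file name rule above can be checked mechanically. The following is a minimal sketch in Python, not part of HIREx; the function name is_dos_compatible is hypothetical, and the check covers only the 8 + 3 length limits and a few obviously unsuitable characters, not the complete set of characters that MS-DOS reserves.

```python
import re

# Hypothetical helper: a name root of 1-8 characters, optionally followed by
# a dot and an extension of 1-3 characters. Dots, path separators, colons
# and whitespace are rejected inside either part (a simplifying assumption).
_DOS_NAME = re.compile(r"[^.\\/:\s]{1,8}(\.[^.\\/:\s]{1,3})?")


def is_dos_compatible(filename: str) -> bool:
    """Return True if filename fits the 8 + 3 length limits described above."""
    return _DOS_NAME.fullmatch(filename) is not None
```

For example, is_dos_compatible("transfer.txt") returns True, while is_dos_compatible("transferfile.txt") returns False because the name root is longer than 8 characters.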


File Format

A "tagged" format is used for delimiting and describing information in a HIREx data transfer file. The "tags" within a file indicate in which database of a HIREx databook a record should be placed and which fields should be filled within that record. Records, tags and fields are delimited from one another as follows:

Start of file: The very first line of a HIREx transfer file is used to identify and describe the file. This line must not contain actual data; it is ignored during data imports to HIREx. A typical file header might look like this:

ENTITY~HIREx Transfer from HIRU databook on March 17, 1996, 1345, by Hayward~

A file header can be up to 255 characters in length. It must be terminated by a tilde + carriage return (ASCII 126,13,10). The first word of the header line is delimited from the rest of the line by a tilde (ASCII 126); this word is the tag for the transfer file type. Supported file types currently include:

  • ENTITY
    These transfer files are used to import or export information about HIREx entities and any associated custom notes.
  • PRODUCT
    These transfer files are used to import or export information about HIREx products and any associated custom notes.

The file type name is case insensitive. Other types will be supported in the future (e.g., RELATIONSHIP, FOLDER).
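
As an illustration of the header rules, the following Python sketch splits a header line into its file type and description. It is not HIREx code: the function name parse_header is hypothetical, and the sketch assumes that the 255-character limit excludes the line ending and that a description is always present between the two tildes.

```python
SUPPORTED_TYPES = {"ENTITY", "PRODUCT"}  # the two types listed above


def parse_header(line: str) -> tuple[str, str]:
    """Split a header line into (file_type, description)."""
    text = line.rstrip("\r\n")
    if len(text) > 255:
        raise ValueError("header exceeds 255 characters")
    if not text.endswith("~"):
        raise ValueError("header must end with a tilde")
    # The first tilde separates the file type tag from the description.
    file_type, sep, description = text[:-1].partition("~")
    if not sep:
        raise ValueError("header has no tilde after the file type")
    file_type = file_type.strip().upper()  # the type name is case insensitive
    if file_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported file type: {file_type}")
    return file_type, description
```

Applied to the sample header shown earlier, the sketch returns "ENTITY" as the file type and the remaining text, without the closing tilde, as the description.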

Start of field: Every field is preceded by a tag. Tags consist of one or more characters (case insensitive) and may contain spaces but may not contain delimiters (ASCII 10, 13, 124, 126). The tag begins with the very first character after the end-of-field delimiter and ends with the character preceding the first occurrence of a tilde (ASCII 126) on the same line. There are two types of tags:
  • Core dataset tags correspond to actual field names in a HIREx database or table. They are case-insensitive but must match HIREx field names in all other respects. "LASTNAME" is a HIREx core tag in the "address" table. These names are listed in other HIREx object descriptions (e.g., Entity Description).
  • Custom dataset tags are recognized for Entity and Product object types. They may correspond to existing "notes" types already defined for an entity or product. The "notes" fields contain information linked to a particular entity or product. HIREx allows the user to define an unlimited number of fields of this type. If, during data import, HIREx encounters a tag that does not match a core field or an existing custom field, it will create a new custom field (note type) with that name and store the data there.

For example:

FIRSTNAME~Robert~
LASTNAME~Hayward~
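
The tag rules can be sketched in Python as follows. The function name split_field is hypothetical; it expects one field with its end-of-field delimiter already removed, and it folds the tag to upper case as one possible way of making comparisons case insensitive (an assumption, since the specification does not say how HIREx normalizes tags).

```python
def split_field(field: str) -> tuple[str, str]:
    """Split one field (end-of-field delimiter removed) into (tag, content)."""
    # The tag ends just before the first tilde; later tildes stay in the content.
    tag, sep, content = field.partition("~")
    if not sep:
        raise ValueError("field has no tilde after the tag")
    # Tags may contain spaces but no delimiter characters.
    if not tag or any(ch in tag for ch in "\r\n|"):
        raise ValueError(f"invalid tag: {tag!r}")
    return tag.upper(), content
```

For example, split_field("Updated By~Robert Hayward") gives the tag "UPDATED BY" and the content "Robert Hayward".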

Tag delimiter: Tags are delimited from field content by the first tilde (ASCII 126) appearing on the same line as the tag.
Field content: Fields may contain any ASCII character except the following combinations:
  • 126,13,10 (tilde + carriage return)
  • 126,13,10,124,13,10 (tilde + carriage return + pipe + carriage return)

The following ASCII characters are discouraged because they display unpredictably on the Internet and may confuse other database products and import/export routines:

  • 126 (tilde)
  • 124 (pipe)
  • 34 (double quote)
  • 38 (single quote)
  • 94 (chevron)
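
A program that prepares data for import could flag discouraged characters before writing a file. The Python sketch below is illustrative only; the name find_discouraged is hypothetical, and the set is built from the character names in the list above. Both the ampersand (ASCII 38) and the single quote (ASCII 39) are included, as an assumption, because the list pairs code 38 with the name "single quote".

```python
# Assumption: tilde, pipe, double quote, ampersand, single quote and "^".
DISCOURAGED = {"~", "|", '"', "&", "'", "^"}


def find_discouraged(content: str) -> list[tuple[int, str]]:
    """Return (position, character) pairs for each discouraged character."""
    return [(i, ch) for i, ch in enumerate(content) if ch in DISCOURAGED]
```

An empty result means the content uses none of the discouraged characters.
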
Field order: Fields may appear in any order.
Missing fields: A transfer file need not contain all possible fields for a HIREx data object. During import, HIREx adds content only to those fields represented in the transfer file. During export, HIREx generates tags and content only for those fields in a HIREx data object that are non-null.
Field repeats: If any tag (and therefore field) is repeated before an end-of-record delimiter is reached, HIREx appends the content of the repeated field to the existing content of that field, separated by a semicolon and space (ASCII 59,32). This feature can be useful for building a field of keywords, for example. HIREx does not generate repeated tags in an export file.
Field size: Core fields may have a set field length (see data object specifications). If, during import, a field is larger than the fixed space allowed for it in HIREx, HIREx accepts only as many characters as it can accommodate.
Field delimiter: As soon as a tilde + carriage return combination (ASCII 126,13,10) is encountered, the next character is examined. If it is not a pipe (ASCII 124), it is assumed to be the first character of the tag for the next field.
Blank lines: Blank lines (ASCII 13,10,13,10) are captured if they are part of the content of a field. Blank lines appearing immediately after a field delimiter or a record delimiter are ignored.
Record delimiter: A tilde, carriage return, pipe and carriage return (ASCII 126,13,10,124,13,10) delimits one record from another.
End of file: A tilde, carriage return, pipe and carriage return (ASCII 126,13,10,124,13,10) that is not followed by a field tag is considered to be the end of a transfer file.
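
Taken together, the delimiter, blank-line, repeat and field-size rules suggest a simple line-oriented reader. The Python sketch below is one possible reading of those rules, not the HIREx import routine. The name parse_transfer and the field_sizes parameter are hypothetical. The sketch also assumes that line endings may be normalized to a single line feed, that tags are compared in upper case, that an empty field is written as TAG~~, and that a final record without a closing pipe is still accepted.

```python
from typing import Optional


def parse_transfer(text: str, field_sizes: Optional[dict[str, int]] = None):
    """Parse transfer-file text into (header_line, list_of_records)."""
    sizes = field_sizes or {}
    lines = text.replace("\r\n", "\n").split("\n")
    header = lines[0]                      # first line describes the file; not data
    records: list[dict[str, str]] = []
    record: dict[str, str] = {}
    pending: list[str] = []                # lines of the field currently being read

    for line in lines[1:]:
        if not pending:
            if line == "":
                continue                   # blank line after a delimiter: ignored
            if line == "|":                # pipe on its own line ends the record
                if record:
                    records.append(record)
                record = {}
                continue
        pending.append(line)
        if not line.endswith("~"):
            continue                       # content continues on the next line
        field = "\n".join(pending)[:-1]    # drop the closing tilde
        pending = []
        tag, sep, content = field.partition("~")
        if not sep or not tag or "\n" in tag:
            raise ValueError(f"malformed field: {field!r}")
        key = tag.upper()
        if key in record:                  # repeated tag: append with "; "
            content = record[key] + "; " + content
        limit = sizes.get(key)             # optional fixed field length
        record[key] = content if limit is None else content[:limit]

    if pending:
        raise ValueError("unterminated field at end of file")
    if record:                             # tolerate a missing final pipe
        records.append(record)
    return header, records
```

Each record is returned as a dictionary keyed by upper-case tag. Blank lines inside field content are preserved, while blank lines between fields and records are skipped.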


Examples

HIREx Entity Export
ENTITY~HIREx export from HAYWARD databook on 19/03/96 at 1454~

Prefix~Dr.~

LastName~Hayward~

FirstName~Robert~

SuppName~RSA~

Suffix~MD, MPH~

Assistant~Lori Houghton~

Organization~McMaster University Health Sciences Centre~

Department~Department of Clinical Epidemiology and Biostatistics~

Position~Health Information Research~

Address.Type~PERSON~

Keyword2~General~

E_Mail~haywardr@fhs.mcmaster.ca~

Role~Assistant Professor~

URL~http://hiru.mcmaster.ca/hiru/people/rob.htm~

First1~1200 Main Street West~

Second1~Room 3H7c~

City1~Hamilton~

Region1~Ontario~

Country1~Canada~

Code1~L8N 3Z5~

First3~40 Bond Street South~

City3~Hamilton~

Region3~Ontario~

Country3~Canada~

Code3~L8S 1S7~

Ph1~(905) 525-9140x22060~

Ph2~(905) 525-9140x23297~

Ph3~(905) 546-0401~

Ph4~2311~

Ph5~(905) 527-2372~

Ph6~(905) 527-4358~

QuickPhone~-1~

Private~0~

MasterData~0~

Updated By~Robert Hayward~

Updated On~12/12/95~

SoundexL~H630~

SoundexF~R163~

SoundexO~M252~

UI~256~

|

HIREx PRODUCT Import
PRODUCT~Reference Manager HIREx export file format~

REF ID~1912~

REF TYPE~Article~

AUTHORS1~Tunis SR, Hayward RSA, Wilson MC, Rubin HR, Bass EB, Johnston M, Steinberg EP.~

TITLE1~Internists' attitudes about clinical practice guidelines.~

PUBDATE1~1994~

SOURCE~Tunis SR, Hayward RSA, Wilson MC, Rubin HR, Bass EB, Johnston M, Steinberg EP. Internists' attitudes about clinical practice guidelines. Ann Intern Med 120. 956-63 (1994).~

NOTES1~Comment in: Ann Intern Med 1994 Jun 1;120(11):966-8, Comment in: Ann Intern Med 1994 Nov 1;121(9):725-6 OBJECTIVE: To assess internists' familiarity with, confidence in, and attitudes about practice guidelines issued by various organizations. DESIGN: Cross-sectional, self-administered survey. PARTICIPANTS: Questionnaires were mailed to a stratified random sample of 2600 members of the American College of Physicians (ACP) in 1992. Of the 2513 internists who met our eligibility criteria, 1513 responded (60%). MEASUREMENTS AND RESULTS: Familiarity with guidelines varied from 11% of responders for the ACP guideline on exercise treadmill testing to 59% of responders for the National Cholesterol Education Program guideline. Confidence was reported in ACP guidelines by 82% of responders but by only 6% for Blue Cross and Blue Shield guidelines. Subspecialists had greatest confidence in guidelines developed by their own subspecialty organizations. It was thought that guidelines would improve the quality of health care by 70% of responders, increase health care costs by 43%, be used to discipline physicians by 68%, and make practice less satisfying by 34%. More favorable attitudes were held by internists who were paid a fixed salary, saw patients for less than 20 hours per week, had recently graduated from medical school, or were not in private practice. CONCLUSIONS: Although most ACP members studied recognized the potential benefits of practice guidelines, many were concerned about possible effects on clinical autonomy, health care costs, and satisfaction with clinical practice~

KEYWORDS1~INTERNIST; Attitude; Questionnaires; Practice Guidelines; Survey; clinical; CV; Attitude of Health Personnel; cost-benefit analysis; Cross-Sectional Studies; Human; Internal Medicine/sn [Statistics & Numerical Data]; Knowledge,Attitudes,Practice; physician's practice patterns; Physicians,Family/sn [Statistics & Numerical Data]; Quality of Health Care; Support,Non-U.S.Gov't; Support,U.S.Gov't,P.H.S. confidence; Organizations; DESIGN; American College of Physicians; Physicians; Criteria; RESULTS; Exercise; Cholesterol; education; blue cross; blue shield; Quality; Health; health care; CARE; health care costs; Patients; schools; private practice~

REPRINT~In File~

UI~94226435~

Datasource~Johns Hopkins University, Baltimore, Maryland~

MISC2~Journal Article; 5~

|
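
Files in the layout of the examples above can be produced with a few lines of code. The Python sketch below is an illustration under stated assumptions, not the HIREx export routine: the name format_transfer is hypothetical, empty strings are treated like null values, and the optional blank lines between fields are omitted because they are ignored on import.

```python
def format_transfer(file_type: str, description: str, records: list[dict]) -> str:
    """Build transfer-file text: a header line, then tagged fields per record."""
    eol = "\r\n"
    out = [f"{file_type.upper()}~{description}~{eol}"]
    for record in records:
        for tag, value in record.items():
            if value is None or value == "":
                continue                   # only non-null fields are written
            text = str(value)
            # Content must not contain the end-of-field combination.
            if "~\r\n" in text or "~\n" in text:
                raise ValueError(f"field {tag!r} contains a field delimiter")
            out.append(f"{tag}~{text}~{eol}")
        out.append(f"|{eol}")              # pipe on its own line ends the record
    return "".join(out)
```

Because each record is a dictionary, a tag can appear only once per record, which matches the rule that exports do not contain repeated tags.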


Protocol

Export

HIREx data exports in transfer file format take the form specified above. Users may select one or more entities and/or products to export. At present only the core data fields are exported. Support for export of custom data fields is under development. When implemented, this will happen as follows:

Import

It is possible that data imported to a HIREx databook will duplicate information already contained in HIREx records. Accordingly, both entity and product import routines use duplicate resolution functions. At present these simply check for entities with the same last name and products with the same unique identifier. More sophisticated duplicate detection algorithms will be implemented in the future. A HIREx import can be "supervised" so that all possible import data is reviewed in its HIREx form before it is rejected or accepted to the HIREx databook.
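
The simple duplicate check described above amounts to matching one key field between incoming and existing records. The Python sketch below illustrates the idea only; the name find_duplicates is hypothetical, records are assumed to be dictionaries keyed by upper-case tag, and the case-insensitive comparison is an assumption. Which field holds the unique identifier of a product is not stated here, so the key is left as a parameter.

```python
def find_duplicates(incoming: list[dict], existing: list[dict], key: str) -> list[tuple[dict, dict]]:
    """Return (incoming_record, existing_record) pairs that share the key field."""
    index: dict[str, list[dict]] = {}
    for rec in existing:
        value = str(rec.get(key, "")).strip().casefold()
        if value:                          # records without the key never match
            index.setdefault(value, []).append(rec)
    pairs = []
    for rec in incoming:
        value = str(rec.get(key, "")).strip().casefold()
        for match in index.get(value, []):
            pairs.append((rec, match))
    return pairs
```

For entities, key would be "LASTNAME". In a supervised import, each returned pair could be shown to the user for acceptance or rejection.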


Copyright © 1996 Health Information Research Unit; Last modified: April 9, 1996.