The Java™ Virtual Machine Specification
CHAPTER 4
This chapter describes the Java Virtual Machine class file format. Each class file contains one Java type, either a class or an interface. Compliant Java Virtual Machine implementations must be capable of dealing with all class files that conform to the specification provided by this book.
A class file consists of a stream of 8-bit bytes. All 16-bit, 32-bit, and 64-bit quantities are constructed by reading in two, four, and eight consecutive 8-bit bytes, respectively. Multibyte data items are always stored in big-endian order, where the high bytes come first. In Java, this format is supported by interfaces java.io.DataInput and java.io.DataOutput and classes such as java.io.DataInputStream and java.io.DataOutputStream.
This chapter defines its own set of data types representing Java class file data: the types u1, u2, and u4 represent an unsigned one-, two-, or four-byte quantity, respectively. In Java, these types may be read by methods such as readUnsignedByte, readUnsignedShort, and readInt of the interface java.io.DataInput. Note that readInt returns a signed int, so a u4 whose high bit is set must be widened and masked to recover its unsigned value.
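As a minimal sketch of the reading methods named above, the following Java fragment reads one u1, one u2, and one u4 from a byte array through java.io.DataInputStream. The class name UnsignedReads and its method are illustrative assumptions, not part of this specification.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class UnsignedReads {
    /** Reads a u1, a u2, and a u4 (in that order) and returns them widened to long. */
    static long[] readU1U2U4(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            long u1 = in.readUnsignedByte();      // one byte, 0..255
            long u2 = in.readUnsignedShort();     // two bytes, big-endian, 0..65535
            long u4 = in.readInt() & 0xFFFFFFFFL; // four bytes; mask removes the sign extension
            return new long[] {u1, u2, u4};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The mask on the u4 matters only when the high bit is set, as it is in the magic number 0xCAFEBABE.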
The Java class file format is presented using pseudostructures written in a C-like structure notation. To avoid confusion with the fields of Java Virtual Machine classes and class instances, the contents of the structures describing the Java class file format are referred to as items. Unlike the fields of a C structure, successive items are stored in the Java class file sequentially, without padding or alignment.
Variable-sized tables, consisting of variable-sized items, are used in several class file structures. Although we will use C-like array syntax to refer to table items, the fact that tables are streams of varying-sized structures means that it is not possible to directly translate a table index into a byte offset into the table.
Where we refer to a data structure as an array, it is literally an array.
A class file contains a single ClassFile structure:
    ClassFile {
        u4 magic;
        u2 minor_version;
        u2 major_version;
        u2 constant_pool_count;
        cp_info constant_pool[constant_pool_count-1];
        u2 access_flags;
        u2 this_class;
        u2 super_class;
        u2 interfaces_count;
        u2 interfaces[interfaces_count];
        u2 fields_count;
        field_info fields[fields_count];
        u2 methods_count;
        method_info methods[methods_count];
        u2 attributes_count;
        attribute_info attributes[attributes_count];
    }
The items in the ClassFile structure are as follows:
magic
The magic item supplies the magic number identifying the class file format; it has the value 0xCAFEBABE.
minor_version, major_version
The values of the minor_version and major_version items are the minor and major version numbers of the compiler that produced this class file. An implementation of the Java Virtual Machine normally supports class files having a given major version number and minor version numbers 0 through some particular minor_version.
If an implementation of the Java Virtual Machine supports some range of minor version numbers and a class file of the same major version but a higher minor version is encountered, the Java Virtual Machine must not attempt to run the newer code. However, unless the major version number differs, it will be feasible to implement a new Java Virtual Machine that can run code of minor versions up to and including that of the newer code.
A Java Virtual Machine must not attempt to run code with a different major version. A change of the major version number indicates a major incompatible change, one that requires a fundamentally different Java Virtual Machine.
In Sun's Java Developer's Kit (JDK) 1.0.2 release, documented by this book, the value of major_version is 45. The value of minor_version is 3. Only Sun may define the meaning of new class file version numbers.
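The first three items can be checked as soon as eight bytes are available. The following sketch, under the assumption that the whole class file is held in a byte array, verifies magic and then applies the version rule stated above (same major version, minor version no higher than the highest supported one). The names ClassFileHeader, readVersion, and isSupported are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class ClassFileHeader {
    static final int MAGIC = 0xCAFEBABE;

    /** Returns {minor_version, major_version}; rejects input whose magic item is wrong. */
    static int[] readVersion(byte[] classBytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            if (in.readInt() != MAGIC) {
                throw new IllegalArgumentException("not a class file: bad magic number");
            }
            int minor = in.readUnsignedShort(); // minor_version precedes major_version
            int major = in.readUnsignedShort();
            return new int[] {minor, major};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True if the major version matches and the minor version is within the supported range. */
    static boolean isSupported(int[] version, int supportedMajor, int maxSupportedMinor) {
        return version[1] == supportedMajor && version[0] <= maxSupportedMinor;
    }
}
```

For a file produced by JDK 1.0.2, readVersion would yield minor version 3 and major version 45.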
constant_pool_count
The value of the constant_pool_count item must be greater than zero. It gives the number of entries in the constant_pool table of the class file, where the constant_pool entry at index zero is included in the count but is not present in the constant_pool table of the class file. A constant_pool index is considered valid if it is greater than zero and less than constant_pool_count.
constant_pool[]
The constant_pool is a table of variable-length structures (§4.4) representing various string constants, class names, field names, and other constants that are referred to within the ClassFile structure and its substructures.
The first entry of the constant_pool table, constant_pool[0], is reserved for internal use by a Java Virtual Machine implementation. That entry is not present in the class file. The first entry in the class file is constant_pool[1].
Each of the constant_pool table entries at indices 1 through constant_pool_count-1 is a variable-length structure (§4.4) whose format is indicated by its first "tag" byte.
access_flags
The value of the access_flags item is a mask of modifiers used with class and interface declarations. The access_flags modifiers are shown in Table 4.1.
An interface is distinguished by its ACC_INTERFACE flag being set. If ACC_INTERFACE is not set, this class file defines a class, not an interface.
Interfaces may only use flags indicated in Table 4.1 as used by interfaces. Classes may only use flags indicated in Table 4.1 as used by classes. An interface is implicitly abstract (§2.13.1); its ACC_ABSTRACT flag must be set. An interface cannot be final; its implementation could never be completed (§2.13.1) if it were, so it could not have its ACC_FINAL flag set.
The flags ACC_FINAL and ACC_ABSTRACT cannot both be set for a class; the implementation of such a class could never be completed (§2.8.2).
The setting of the ACC_SUPER flag directs the Java Virtual Machine which of two alternative semantics for its invokespecial instruction to express; it exists for backward compatibility for code compiled by Sun's older Java compilers. All new implementations of the Java Virtual Machine should implement the semantics for invokespecial documented in Chapter 6, "Java Virtual Machine Instruction Set." All new compilers to the Java Virtual Machine's instruction set should set the ACC_SUPER flag. Sun's older Java compilers generate ClassFile flags with ACC_SUPER unset. Sun's older Java Virtual Machine implementations ignore the flag if it is set.
All unused bits of the access_flags item, including those not assigned in Table 4.1, are reserved for future use. They should be set to zero in generated class files and should be ignored by Java Virtual Machine implementations.
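The consistency rules for ACC_INTERFACE, ACC_ABSTRACT, and ACC_FINAL can be expressed as a small predicate. The sketch below is illustrative only: Table 4.1 is not reproduced in this text, so the numeric flag values are an assumption taken from that table, and the class name ClassAccessFlags is hypothetical. It checks only the rules stated in the paragraphs above.

```java
final class ClassAccessFlags {
    // Flag values assumed from Table 4.1, which is not reproduced in this text.
    static final int ACC_FINAL = 0x0010;
    static final int ACC_INTERFACE = 0x0200;
    static final int ACC_ABSTRACT = 0x0400;

    /** Checks the abstract/final rules for a ClassFile access_flags value. */
    static boolean isConsistent(int accessFlags) {
        boolean isFinal = (accessFlags & ACC_FINAL) != 0;
        boolean isAbstract = (accessFlags & ACC_ABSTRACT) != 0;
        if ((accessFlags & ACC_INTERFACE) != 0) {
            // An interface must have ACC_ABSTRACT set and cannot have ACC_FINAL set.
            return isAbstract && !isFinal;
        }
        // A class cannot have both ACC_FINAL and ACC_ABSTRACT set.
        return !(isFinal && isAbstract);
    }
}
```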
this_class
The value of the this_class item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Class_info (§4.4.1) structure representing the class or interface defined by this class file.
super_class
For a class, the value of the super_class item either must be zero or must be a valid index into the constant_pool table. If the value of the super_class item is nonzero, the constant_pool entry at that index must be a CONSTANT_Class_info (§4.4.1) structure representing the superclass of the class defined by this class file. Neither the superclass nor any of its superclasses may be a final class.
If the value of super_class is zero, then this class file must represent the class java.lang.Object, the only class or interface without a superclass.
For an interface, the value of super_class must always be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Class_info structure representing the class java.lang.Object.
interfaces_count
The value of the interfaces_count item gives the number of direct superinterfaces of this class or interface type.
interfaces[]
Each value in the interfaces array must be a valid index into the constant_pool table. The constant_pool entry at each value of interfaces[i], where 0 ≤ i < interfaces_count, must be a CONSTANT_Class_info (§4.4.1) structure representing an interface which is a direct superinterface of this class or interface type, in the left-to-right order given in the source for the type.
fields_count
The value of the fields_count item gives the number of field_info structures in the fields table. The field_info (§4.5) structures represent all fields, both class variables and instance variables, declared by this class or interface type.
fields[]
Each value in the fields table must be a variable-length field_info (§4.5) structure giving a complete description of a field in the class or interface type. The fields table includes only those fields that are declared by this class or interface. It does not include items representing fields that are inherited from superclasses or superinterfaces.
methods_count
The value of the methods_count item gives the number of method_info structures in the methods table.
methods[]
Each value in the methods table must be a variable-length method_info (§4.6) structure giving a complete description of and Java Virtual Machine code for a method in the class or interface.
The method_info structures represent all methods, both instance methods and, for classes, class (static) methods, declared by this class or interface type. The methods table only includes those methods that are explicitly declared by this class. Interfaces have only the single method <clinit>, the interface initialization method (§3.8). The methods table does not include items representing methods that are inherited from superclasses or superinterfaces.
attributes_count
The value of the attributes_count item gives the number of attributes (§4.7) in the attributes table of this class.
attributes[]
Each value of the attributes table must be a variable-length attribute structure. A ClassFile structure can have any number of attributes (§4.7) associated with it.
The only attribute defined by this specification for the attributes table of a ClassFile structure is the SourceFile attribute (§4.7.2).
A Java Virtual Machine implementation is required to silently ignore any or all attributes in the attributes table of a ClassFile structure that it does not recognize. Attributes not defined in this specification are not allowed to affect the semantics of the class file, but only to provide additional descriptive information (§4.7.1).
Class names that appear in class file structures are always represented in a fully qualified form (§2.7.9). These class names are always represented as CONSTANT_Utf8_info (§4.4.7) structures, and they are referenced from those CONSTANT_NameAndType_info (§4.4.6) structures that have class names as part of their descriptor (§4.3), as well as from all CONSTANT_Class_info (§4.4.1) structures.
For historical reasons the exact syntax of fully qualified class names that appear in class file structures differs from the familiar Java fully qualified class name documented in §2.7.9. In the internal form, the ASCII periods ('.') that normally separate the identifiers (§2.2) that make up the fully qualified name are replaced by ASCII forward slashes ('/'). For example, the normal fully qualified name of class Thread is java.lang.Thread. In the form used in descriptors in class files, a reference to the name of class Thread is implemented using a CONSTANT_Utf8_info structure representing the string "java/lang/Thread".
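The conversion between the two forms is a single character substitution. A minimal sketch follows; the class name InternalNames and its method names are hypothetical.

```java
final class InternalNames {
    /** Converts a fully qualified name such as java.lang.Thread to the internal form. */
    static String toInternalForm(String fullyQualifiedName) {
        return fullyQualifiedName.replace('.', '/');
    }

    /** Converts an internal-form name such as java/lang/Thread back to the familiar form. */
    static String toFullyQualifiedName(String internalName) {
        return internalName.replace('/', '.');
    }
}
```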
    FieldDescriptor:
        FieldType
    ComponentType:
        FieldType
    FieldType:
        BaseType
        ObjectType
        ArrayType
    BaseType:
        B
        C
        D
        F
        I
        J
        S
        Z
    ObjectType:
        L <classname> ;
    ArrayType:
        [ ComponentType
The characters of BaseType, the L and ; of ObjectType, and the [ of ArrayType are all ASCII characters. The <classname> represents a fully qualified class name, for instance, java.lang.Thread. For historical reasons it is stored in a class file in a modified internal form (§4.2).
The meaning of the field types is as follows:
    BaseType character | Type | Interpretation
    B | byte | signed byte
    C | char | character
    D | double | double-precision IEEE 754 float
    F | float | single-precision IEEE 754 float
    I | int | integer
    J | long | long integer
    L<classname>; | ... | an instance of the class
    S | short | signed short
    Z | boolean | true or false
    [ | ... | one array dimension
For example, the descriptor of an int instance variable is simply I. The descriptor of an instance variable of type Object is Ljava/lang/Object;. Note that the internal form of the fully qualified class name for class Object is used. The descriptor of an instance variable that is a multidimensional double array,
    double d[][][];
is
    [[[D
A parameter descriptor represents a parameter passed to a method:
    ParameterDescriptor:
        FieldType
A method descriptor represents the parameters that the method takes and the value that it returns:
    MethodDescriptor:
        ( ParameterDescriptor* ) ReturnDescriptor
A return descriptor represents the return value from a method. It is a series of characters generated by the grammar:
    ReturnDescriptor:
        FieldType
        V
The character V indicates that the method returns no value (its return type is void). Otherwise, the descriptor indicates the type of the return value.
A valid Java method descriptor must represent 255 or fewer words of method parameters, where that limit includes the word for this in the case of instance method invocations. The limit is on the number of words of method parameters and not on the number of parameters themselves; parameters of type long and double each use two words.
For example, the method descriptor for the method
    Object mymethod(int i, double d, Thread t)
is
    (IDLjava/lang/Thread;)Ljava/lang/Object;
Note that internal forms of the fully qualified class names of Thread and Object are used in the method descriptor.
The method descriptor for mymethod is the same whether mymethod is static or is an instance method. Although an instance method is passed this, a reference to the current class instance, in addition to its intended parameters, that fact is not reflected in the method descriptor. (A reference to this is not passed to a static method.) The reference to this is passed implicitly by the method invocation instructions of the Java Virtual Machine used to invoke instance methods.
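The 255-word limit can be checked by walking the parameter part of a method descriptor. The following is a sketch under stated assumptions: the class DescriptorWords and its method are hypothetical names, array and object parameters are counted as one word each because they are references, and the return descriptor is not validated.

```java
final class DescriptorWords {
    /** Counts the words of parameters for a method descriptor, adding one for this if not static. */
    static int parameterWords(String descriptor, boolean isStatic) {
        if (descriptor.isEmpty() || descriptor.charAt(0) != '(') {
            throw new IllegalArgumentException("method descriptor must start with '('");
        }
        int words = isStatic ? 0 : 1; // instance methods receive this as an extra word
        int i = 1;
        while (i < descriptor.length() && descriptor.charAt(i) != ')') {
            boolean isArray = false;
            while (i < descriptor.length() && descriptor.charAt(i) == '[') {
                isArray = true; // each '[' is one array dimension
                i++;
            }
            if (i >= descriptor.length()) {
                throw new IllegalArgumentException("array dimension without a component type");
            }
            char c = descriptor.charAt(i);
            if (c == 'L') {
                int semicolon = descriptor.indexOf(';', i);
                if (semicolon < 0) {
                    throw new IllegalArgumentException("unterminated class name");
                }
                i = semicolon + 1;
            } else if ("BCDFIJSZ".indexOf(c) >= 0) {
                i++;
            } else {
                throw new IllegalArgumentException("bad type character: " + c);
            }
            // long and double take two words unless they are only the element type of an array.
            words += (!isArray && (c == 'J' || c == 'D')) ? 2 : 1;
        }
        if (i >= descriptor.length()) {
            throw new IllegalArgumentException("missing ')'");
        }
        return words;
    }
}
```

For the mymethod descriptor above, the count is four words if the method is static (one for the int, two for the double, one for the Thread reference) and five if it is an instance method.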
All constant_pool table entries have the following general format:
    cp_info {
        u1 tag;
        u1 info[];
    }
Each item in the constant_pool table must begin with a 1-byte tag indicating the kind of cp_info entry. The contents of the info array vary with the value of tag. The valid tags and their values are listed in Table 4.2.
The CONSTANT_Class_info structure is used to represent a class or an interface:
    CONSTANT_Class_info {
        u1 tag;
        u2 name_index;
    }
The items of the CONSTANT_Class_info structure are the following:
tag
The tag item has the value CONSTANT_Class (7).
name_index
The value of the name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing a valid fully qualified Java class name (§2.8.1) that has been converted to the class file's internal form (§4.2).
Because arrays are objects, the opcodes anewarray and multianewarray can reference array "classes" via CONSTANT_Class_info (§4.4.1) structures in the constant_pool table. In this case, the name of the class is the descriptor of the array type. For example, the class name representing a two-dimensional int array type,
    int[][]
is
    [[I
The class name representing the type array of class Thread,
    Thread[]
is
    [Ljava/lang/Thread;
A valid Java array type descriptor must have 255 or fewer array dimensions.
Fields, methods, and interface methods are represented by similar structures:
    CONSTANT_Fieldref_info {
        u1 tag;
        u2 class_index;
        u2 name_and_type_index;
    }
    CONSTANT_Methodref_info {
        u1 tag;
        u2 class_index;
        u2 name_and_type_index;
    }
    CONSTANT_InterfaceMethodref_info {
        u1 tag;
        u2 class_index;
        u2 name_and_type_index;
    }
The items of these structures are as follows:
tag
The tag item of a CONSTANT_Fieldref_info structure has the value CONSTANT_Fieldref (9).
The tag item of a CONSTANT_Methodref_info structure has the value CONSTANT_Methodref (10).
The tag item of a CONSTANT_InterfaceMethodref_info structure has the value CONSTANT_InterfaceMethodref (11).
class_index
The value of the class_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Class_info (§4.4.1) structure representing the class or interface type that contains the declaration of the field or method.
The class_index item of a CONSTANT_Fieldref_info or a CONSTANT_Methodref_info structure must be a class type, not an interface type. The class_index item of a CONSTANT_InterfaceMethodref_info structure must be an interface type that declares the given method.
name_and_type_index
The value of the name_and_type_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_NameAndType_info (§4.4.6) structure. This constant_pool entry indicates the name and descriptor of the field or method.
If the name of the method of a CONSTANT_Methodref_info or CONSTANT_InterfaceMethodref_info begins with a '<' ('\u003c'), then the name must be one of the special internal methods (§3.8), either <init> or <clinit>. In this case, the method must return no value.
The CONSTANT_String_info structure is used to represent constant objects of the type java.lang.String:
    CONSTANT_String_info {
        u1 tag;
        u2 string_index;
    }
The items of the CONSTANT_String_info structure are as follows:
tag
The tag item of the CONSTANT_String_info structure has the value CONSTANT_String (8).
string_index
The value of the string_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the sequence of characters to which the java.lang.String object is to be initialized.
The CONSTANT_Integer_info and CONSTANT_Float_info structures represent four-byte numeric (int and float) constants:
    CONSTANT_Integer_info {
        u1 tag;
        u4 bytes;
    }
    CONSTANT_Float_info {
        u1 tag;
        u4 bytes;
    }
The items of these structures are as follows:
tag
The tag item of the CONSTANT_Integer_info structure has the value CONSTANT_Integer (3).
The tag item of the CONSTANT_Float_info structure has the value CONSTANT_Float (4).
bytes
The bytes item of the CONSTANT_Integer_info structure contains the value of the int constant. The bytes of the value are stored in big-endian (high byte first) order.
The bytes item of the CONSTANT_Float_info structure contains the value of the float constant in IEEE 754 floating-point "single format" bit layout. The bytes of the value are stored in big-endian (high byte first) order, and are first converted into an int argument. Then:
- If the argument is 0x7f800000, the float value will be positive infinity.
- If the argument is 0xff800000, the float value will be negative infinity.
- If the argument is in the range 0x7f800001 through 0x7fffffff or in the range 0xff800001 through 0xffffffff, the float value will be NaN.
- In all other cases, let s, e, and m be three values that might be computed by
    int s = ((bytes >> 31) == 0) ? 1 : -1;
    int e = ((bytes >> 23) & 0xff);
    int m = (e == 0) ?
            (bytes & 0x7fffff) << 1 :
            (bytes & 0x7fffff) | 0x800000;
Then the float value equals the result of the mathematical expression $s \cdot m \cdot 2^{e-150}$.
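The case analysis above can be followed step by step in Java. The sketch below is an illustration, not a recommended implementation: in practice Float.intBitsToFloat performs this conversion directly. The class name FloatConstant is hypothetical, and Math.scalb is used to multiply by the power of two exactly.

```java
final class FloatConstant {
    /** Decodes the bytes item of a CONSTANT_Float_info structure, following the cases in the text. */
    static float decode(int bytes) {
        if (bytes == 0x7f800000) return Float.POSITIVE_INFINITY;
        if (bytes == 0xff800000) return Float.NEGATIVE_INFINITY;
        // Both NaN ranges have all exponent bits set and a nonzero fraction.
        if ((bytes & 0x7fffffff) > 0x7f800000) return Float.NaN;
        int s = ((bytes >> 31) == 0) ? 1 : -1;
        int e = ((bytes >> 23) & 0xff);
        int m = (e == 0) ?
                (bytes & 0x7fffff) << 1 :
                (bytes & 0x7fffff) | 0x800000;
        // s * m * 2^(e-150); computing in double keeps the sign of a negative zero.
        return (float) (s * Math.scalb((double) m, e - 150));
    }
}
```

For every int argument, the result should agree with Float.intBitsToFloat up to the choice of NaN bit pattern.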
The CONSTANT_Long_info and CONSTANT_Double_info structures represent eight-byte numeric (long and double) constants:
    CONSTANT_Long_info {
        u1 tag;
        u4 high_bytes;
        u4 low_bytes;
    }
    CONSTANT_Double_info {
        u1 tag;
        u4 high_bytes;
        u4 low_bytes;
    }
All eight-byte constants take up two entries in the constant_pool table of the class file, as well as in the in-memory version of the constant pool that is constructed when a class file is read. If a CONSTANT_Long_info or CONSTANT_Double_info structure is the item in the constant_pool table at index n, then the next valid item in the pool is located at index n+2. The constant_pool index n+1 must be considered invalid and must not be used.
The items of these structures are as follows:
tag
The tag item of the CONSTANT_Long_info structure has the value CONSTANT_Long (5).
The tag item of the CONSTANT_Double_info structure has the value CONSTANT_Double (6).
high_bytes, low_bytes
The unsigned high_bytes and low_bytes items of the CONSTANT_Long_info structure together contain the value of the long constant ((long)high_bytes << 32) + low_bytes, where the bytes of each of high_bytes and low_bytes are stored in big-endian (high byte first) order.
The high_bytes and low_bytes items of the CONSTANT_Double_info structure contain the double value in IEEE 754 floating-point "double format" bit layout. The bytes of each item are stored in big-endian (high byte first) order. The high_bytes and low_bytes items are first converted into a long argument, called bits below. Then:
- If the argument is 0x7ff0000000000000L, the double value will be positive infinity.
- If the argument is 0xfff0000000000000L, the double value will be negative infinity.
- If the argument is in the range 0x7ff0000000000001L through 0x7fffffffffffffffL or in the range 0xfff0000000000001L through 0xffffffffffffffffL, the double value will be NaN.
- In all other cases, let s, e, and m be three values that might be computed from the argument:
    int s = ((bits >> 63) == 0) ? 1 : -1;
    int e = (int)((bits >> 52) & 0x7ffL);
    long m = (e == 0) ?
            (bits & 0xfffffffffffffL) << 1 :
            (bits & 0xfffffffffffffL) | 0x10000000000000L;
Then the floating-point value equals the double value of the mathematical expression $s \cdot m \cdot 2^{e-1075}$.
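The same steps can be sketched for the eight-byte case. The class name DoubleConstant is hypothetical, and Double.longBitsToDouble is the direct way to perform the conversion in practice. Note that low_bytes is an unsigned quantity: a Java int holding it must be masked before it is combined with high_bytes, otherwise sign extension would corrupt the upper half.

```java
final class DoubleConstant {
    /** Decodes high_bytes and low_bytes of a CONSTANT_Double_info structure, following the text. */
    static double decode(int highBytes, int lowBytes) {
        // Mask lowBytes so that it is treated as an unsigned 32-bit quantity.
        long bits = ((long) highBytes << 32) | (lowBytes & 0xFFFFFFFFL);
        if (bits == 0x7ff0000000000000L) return Double.POSITIVE_INFINITY;
        if (bits == 0xfff0000000000000L) return Double.NEGATIVE_INFINITY;
        // Both NaN ranges have all exponent bits set and a nonzero fraction.
        if ((bits & 0x7fffffffffffffffL) > 0x7ff0000000000000L) return Double.NaN;
        int s = ((bits >> 63) == 0) ? 1 : -1;
        int e = (int) ((bits >> 52) & 0x7ffL);
        long m = (e == 0) ?
                (bits & 0xfffffffffffffL) << 1 :
                (bits & 0xfffffffffffffL) | 0x10000000000000L;
        // s * m * 2^(e-1075); Math.scalb avoids forming 2^(-1075), which underflows on its own.
        return s * Math.scalb((double) m, e - 1075);
    }
}
```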
The CONSTANT_NameAndType_info structure is used to represent a field or method, without indicating which class or interface type it belongs to:
    CONSTANT_NameAndType_info {
        u1 tag;
        u2 name_index;
        u2 descriptor_index;
    }
The items of the CONSTANT_NameAndType_info structure are as follows:
tag
The tag item of the CONSTANT_NameAndType_info structure has the value CONSTANT_NameAndType (12).
name_index
The value of the name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing a valid Java field name or method name (§2.7) stored as a simple (not fully qualified) name (§2.7.1), that is, as a Java identifier.
descriptor_index
The value of the descriptor_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing a valid Java field descriptor (§4.3.2) or method descriptor (§4.3.3).
The CONSTANT_Utf8_info structure is used to represent constant string values.
UTF-8 strings are encoded so that character sequences that contain only non-null ASCII characters can be represented using only one byte per character, but characters of up to 16 bits can be represented. All characters in the range '\u0001' to '\u007F' are represented by a single byte:
    0 | bits 0-6 |
The seven bits of data in the byte give the value of the character represented. The null character ('\u0000') and characters in the range '\u0080' to '\u07FF' are represented by a pair of bytes x and y:
    x: | 1 | 1 | 0 | bits 6-10 |
    y: | 1 | 0 | bits 0-5 |
The bytes represent the character with the value ((x & 0x1f) << 6) + (y & 0x3f).
Characters in the range '\u0800' to '\uFFFF' are represented by three bytes x, y, and z:
    x: | 1 | 1 | 1 | 0 | bits 12-15 |
    y: | 1 | 0 | bits 6-11 |
    z: | 1 | 0 | bits 0-5 |
The character with the value ((x & 0xf) << 12) + ((y & 0x3f) << 6) + (z & 0x3f) is represented by the bytes.
The bytes of multibyte characters are stored in the class file in big-endian (high byte first) order.
There are two differences between this format and the "standard" UTF-8 format. First, the null byte (byte)0 is encoded using the two-byte format rather than the one-byte format, so that Java Virtual Machine UTF-8 strings never have embedded nulls. Second, only the one-byte, two-byte, and three-byte formats are used. The Java Virtual Machine does not recognize the longer UTF-8 formats.
For more information regarding the UTF-8 format, see File System Safe UCS Transformation Format (FSS_UTF), X/Open Preliminary Specification, X/Open Company Ltd., Document Number: P316. This information also appears in ISO/IEC 10646, Annex P.
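The three byte layouts described above translate directly into an encoder. The following is a minimal sketch with a hypothetical class name, ModifiedUtf8; it produces only the bytes array, without the length item. The methods writeUTF and readUTF of java.io.DataOutput and java.io.DataInput use the same encoding, prefixed by a two-byte length.

```java
import java.io.ByteArrayOutputStream;

final class ModifiedUtf8 {
    /** Encodes each 16-bit character of s using the one-, two-, or three-byte format. */
    static byte[] encode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= '\u0001' && c <= '\u007F') {
                out.write(c);                         // 0xxxxxxx
            } else if (c <= '\u07FF') {
                // Covers '\u0080'..'\u07FF' and also the null character, which
                // is deliberately given the two-byte form 0xC0 0x80.
                out.write(0xC0 | (c >> 6));           // 110xxxxx, bits 6-10
                out.write(0x80 | (c & 0x3F));         // 10xxxxxx, bits 0-5
            } else {
                out.write(0xE0 | (c >> 12));          // 1110xxxx, bits 12-15
                out.write(0x80 | ((c >> 6) & 0x3F));  // 10xxxxxx, bits 6-11
                out.write(0x80 | (c & 0x3F));         // 10xxxxxx, bits 0-5
            }
        }
        return out.toByteArray();
    }
}
```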
The CONSTANT_Utf8_info structure is
    CONSTANT_Utf8_info {
        u1 tag;
        u2 length;
        u1 bytes[length];
    }
The items of the CONSTANT_Utf8_info structure are the following:
tag
The tag item of the CONSTANT_Utf8_info structure has the value CONSTANT_Utf8 (1).
length
The value of the length item gives the number of bytes in the bytes array (not the length of the resulting string). The strings in the CONSTANT_Utf8_info structure are not null-terminated.
bytes[]
The bytes array contains the bytes of the string. No byte may have the value (byte)0 or (byte)0xf0-(byte)0xff.
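With the layouts of all constant_pool entry kinds now given, the table can be walked from its first entry. The sketch below, using hypothetical names, records only the tag found at each index; it relies on the tag values and item sizes stated in the sections above and advances the index by two after a CONSTANT_Long_info or CONSTANT_Double_info entry. It assumes the input begins at the constant_pool_count item.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class ConstantPoolTags {
    /** Returns the tag at each constant_pool index; index 0 and unusable indices hold 0. */
    static int[] readTags(byte[] fromPoolCount) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(fromPoolCount))) {
            int count = in.readUnsignedShort();
            int[] tags = new int[count];
            for (int i = 1; i < count; i++) {
                int tag = in.readUnsignedByte();
                tags[i] = tag;
                switch (tag) {
                    case 1: skip(in, in.readUnsignedShort()); break;         // Utf8: length, bytes
                    case 3: case 4: skip(in, 4); break;                       // Integer, Float
                    case 5: case 6: skip(in, 8); i++; break;                  // Long, Double: two entries
                    case 7: case 8: skip(in, 2); break;                       // Class, String
                    case 9: case 10: case 11: case 12: skip(in, 4); break;    // refs, NameAndType
                    default:
                        throw new IllegalArgumentException("unknown tag " + tag + " at index " + i);
                }
            }
            return tags;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void skip(DataInputStream in, int n) throws IOException {
        in.readFully(new byte[n]); // unlike skipBytes, fails if fewer than n bytes remain
    }
}
```

Because entries are variable-length, there is no way to reach entry n without reading entries 1 through n-1 first, which is the point made earlier about table indices and byte offsets.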
Each field is described by a variable-length field_info structure. The format of this structure is
    field_info {
        u2 access_flags;
        u2 name_index;
        u2 descriptor_index;
        u2 attributes_count;
        attribute_info attributes[attributes_count];
    }
The items of the field_info structure are as follows:
access_flags
The value of the access_flags item is a mask of modifiers used to describe access permission to and properties of a field. The access_flags modifiers are shown in Table 4.3.
Fields of interfaces may only use flags indicated in Table 4.3 as used by any field. Fields of classes may use any of the flags in Table 4.3.
All unused bits of the access_flags item, including those not assigned in Table 4.3, are reserved for future use. They should be set to zero in generated class files and should be ignored by Java Virtual Machine implementations.
Class fields may have at most one of flags ACC_PUBLIC, ACC_PROTECTED, and ACC_PRIVATE set (§2.7.8). A class field may not have both ACC_FINAL and ACC_VOLATILE set (§2.9.1).
Each interface field is implicitly static and final (§2.13.4) and must have both its ACC_STATIC and ACC_FINAL flags set. Each interface field is implicitly public (§2.13.4) and must have its ACC_PUBLIC flag set.
name_index
The value of the name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure which must represent a valid Java field name (§2.7) stored as a simple (not fully qualified) name (§2.7.1), that is, as a Java identifier.
descriptor_index
The value of the descriptor_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure which must represent a valid Java field descriptor (§4.3.2).
attributes_count
The value of the attributes_count item indicates the number of additional attributes (§4.7) of this field.
attributes[]
Each value of the attributes table must be a variable-length attribute structure. A field can have any number of attributes (§4.7) associated with it.
The only attribute defined for the attributes table of a field_info structure by this specification is the ConstantValue attribute (§4.7.3).
A Java Virtual Machine implementation must recognize ConstantValue attributes in the attributes table of a field_info structure. A Java Virtual Machine implementation is required to silently ignore any or all other attributes in the attributes table that it does not recognize. Attributes not defined in this specification are not allowed to affect the semantics of the class file, but only to provide additional descriptive information (§4.7.1).
Each method, and each instance initialization method <init>, is described by a variable-length method_info structure. The structure has the following format:
    method_info {
        u2 access_flags;
        u2 name_index;
        u2 descriptor_index;
        u2 attributes_count;
        attribute_info attributes[attributes_count];
    }
The items of the method_info structure are as follows:
access_flags
The value of the access_flags item is a mask of modifiers used to describe access permission to and properties of a method or instance initialization method (§3.8). The access_flags modifiers are shown in Table 4.4.
Methods in interfaces may only use flags indicated in Table 4.4 as used by any method. Class and instance methods (§2.10.3) may use any of the flags in Table 4.4. Instance initialization methods (§3.8) may only use ACC_PUBLIC, ACC_PROTECTED, and ACC_PRIVATE.
All unused bits of the access_flags item, including those not assigned in Table 4.4, are reserved for future use. They should be set to zero in generated class files and should be ignored by Java Virtual Machine implementations.
At most one of the flags ACC_PUBLIC, ACC_PROTECTED, and ACC_PRIVATE may be set for any method. Class and instance methods may not use ACC_ABSTRACT together with ACC_FINAL, ACC_NATIVE, or ACC_SYNCHRONIZED (that is, native and synchronized methods require an implementation). A class or instance method may not use ACC_PRIVATE with ACC_ABSTRACT (that is, a private method cannot be overridden, so such a method could never be implemented or used). A class or instance method may not use ACC_STATIC with ACC_ABSTRACT (that is, a static method is implicitly final and thus cannot be overridden, so such a method could never be implemented or used).
Class and interface initialization methods (§3.8), that is, methods named <clinit>, are called implicitly by the Java Virtual Machine; the value of their access_flags item is ignored.
Each interface method is implicitly abstract, and so must have its ACC_ABSTRACT flag set. Each interface method is implicitly public (§2.13.5), and so must have its ACC_PUBLIC flag set.
name_index
The value of the name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing either one of the special internal method names (§3.8), either <init> or <clinit>, or a valid Java method name (§2.7), stored as a simple (not fully qualified) name (§2.7.1).
descriptor_index
The value of the descriptor_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing a valid Java method descriptor (§4.3.3).
attributes_count
The value of the attributes_count item indicates the number of additional attributes (§4.7) of this method.
attributes[]
Each value of the attributes table must be a variable-length attribute structure. A method can have any number of optional attributes (§4.7) associated with it.
The only attributes defined by this specification for the attributes table of a method_info structure are the Code (§4.7.4) and Exceptions (§4.7.5) attributes.
A Java Virtual Machine implementation must recognize Code (§4.7.4) and Exceptions (§4.7.5) attributes. A Java Virtual Machine implementation is required to silently ignore any or all other attributes in the attributes table of a method_info structure that it does not recognize. Attributes not defined in this specification are not allowed to affect the semantics of the class file, but only to provide additional descriptive information (§4.7.1).
Attributes are used in the ClassFile (§4.1), field_info (§4.5), method_info (§4.6), and Code_attribute (§4.7.4) structures of the class file format. All attributes have the following general format:
    attribute_info {
        u2 attribute_name_index;
        u4 attribute_length;
        u1 info[attribute_length];
    }
For all attributes, the attribute_name_index must be a valid unsigned 16-bit index into the constant pool of the class. The constant_pool entry at attribute_name_index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the name of the attribute. The value of the attribute_length item indicates the length of the subsequent information in bytes. The length does not include the initial six bytes that contain the attribute_name_index and attribute_length items.
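Because every attribute carries its own length, a reader can step over attributes it does not understand without knowing their internal layout. The following sketch, with hypothetical names, walks an attributes table that begins at its attributes_count item and returns the attribute_name_index of each entry while skipping the info bytes. It assumes, for simplicity, that no attribute_length exceeds the maximum Java array size.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class AttributeWalker {
    /** Returns the attribute_name_index of each attribute, skipping every info array. */
    static int[] nameIndexes(byte[] fromAttributesCount) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(fromAttributesCount))) {
            int count = in.readUnsignedShort();
            int[] names = new int[count];
            for (int i = 0; i < count; i++) {
                names[i] = in.readUnsignedShort();
                long length = in.readInt() & 0xFFFFFFFFL; // attribute_length is an unsigned u4
                if (length > Integer.MAX_VALUE) {
                    throw new IllegalArgumentException("attribute too large for this sketch");
                }
                in.readFully(new byte[(int) length]);      // step over info[attribute_length]
            }
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```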
Certain attributes are predefined as part of the class file specification. The predefined attributes are the SourceFile (§4.7.2), ConstantValue (§4.7.3), Code (§4.7.4), Exceptions (§4.7.5), LineNumberTable (§4.7.6), and LocalVariableTable (§4.7.7) attributes. Within the context of their use in this specification, that is, in the attributes tables of the class file structures in which they appear, the names of these predefined attributes are reserved.
Of the predefined attributes, the Code, ConstantValue, and Exceptions attributes must be recognized and correctly read by a class file reader for correct interpretation of the class file by a Java Virtual Machine. Use of the remaining predefined attributes is optional; a class file reader may use the information they contain, and otherwise must silently ignore those attributes.
Compilers for Java source code are permitted to define and emit class files containing new attributes in the attributes tables of class file structures. Java Virtual Machine implementations are permitted to recognize and use new attributes found in the attributes tables of class file structures. However, all attributes not defined as part of this Java Virtual Machine specification must not affect the semantics of class or interface types. Java Virtual Machine implementations are required to silently ignore attributes they do not recognize.
For instance, defining a new attribute to support vendor-specific debugging is permitted. Because Java Virtual Machine implementations are required to ignore attributes they do not recognize, class files intended for that particular Java Virtual Machine implementation will be usable by other implementations even if those implementations cannot make use of the additional debugging information that the class files contain.
Java Virtual Machine implementations are specifically prohibited from throwing an exception or otherwise refusing to use class files simply because of the presence of some new attribute. Of course, tools operating on class files may not run correctly if given class files that do not contain all the attributes they require.
Two attributes that are intended to be distinct, but that happen to use the same attribute name and are of the same length, will conflict on implementations that recognize either attribute. Attributes defined other than by Sun must have names chosen according to the package naming convention defined by The Java Language Specification. For instance, a new attribute defined by Netscape might have the name "COM.Netscape.new-attribute".
Sun may define additional attributes in future versions of this class file specification.
The SourceFile
attribute is an optional fixed-length attribute in the attributes
table of
the ClassFile
(§4.1) structure. There can be no more than one SourceFile
attribute in
the attributes
table of a given ClassFile
structure.
The SourceFile
attribute has the format
SourceFile_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 sourcefile_index;
}
The items of the SourceFile_attribute structure are as follows:
attribute_name_index
    The value of the attribute_name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the string "SourceFile".
attribute_length
    The value of the attribute_length item of a SourceFile_attribute structure must be 2.
sourcefile_index
    The value of the sourcefile_index item must be a valid index into the constant_pool table. The constant pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the string giving the name of the source file from which this class file was compiled.
    Only the name of the source file is given by the SourceFile attribute. It never represents the name of a directory containing the file or an absolute path name for the file. For instance, the SourceFile attribute might contain the file name foo.java but not the UNIX pathname /home/lindholm/foo.java.
The ConstantValue
attribute is a fixed-length attribute used in the attributes
table of the
field_info
(§4.5) structures. A ConstantValue
attribute represents the value of a constant
field that must be (explicitly or implicitly) static
; that is, the ACC_STATIC
bit (Table 4.3) in the flags
item of the field_info
structure must be set. The field is not required
to be final
. There can be no more than one ConstantValue
attribute in the attributes
table of a given field_info
structure. The constant field represented by the field_info
structure is assigned the value referenced by its ConstantValue
attribute as part of its
initialization (§2.16.4).
Every Java Virtual Machine implementation must recognize ConstantValue
attributes.
The ConstantValue
attribute has the format
ConstantValue_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 constantvalue_index;
}
The items of the ConstantValue_attribute structure are as follows:
attribute_name_index
    The value of the attribute_name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the string "ConstantValue".
attribute_length
    The value of the attribute_length item of a ConstantValue_attribute structure must be 2.
constantvalue_index
    The value of the constantvalue_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must give the constant value represented by this attribute.
    The constant_pool entry must be of a type appropriate to the field, as shown by Table 4.5.

Field Type                         Entry Type
long                               CONSTANT_Long
float                              CONSTANT_Float
double                             CONSTANT_Double
int, short, char, byte, boolean    CONSTANT_Integer
java.lang.String                   CONSTANT_String
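The mapping in Table 4.5 can be expressed as a lookup keyed by the field's descriptor (§4.3.2). This is a sketch only; the class and method names are hypothetical, and it returns the entry type name as a string rather than a tag value.

```java
// Hypothetical helper mirroring Table 4.5; not part of any real library.
final class ConstantValueTypes {

    /**
     * Returns the constant pool entry type that a ConstantValue attribute
     * must reference for a field with the given descriptor, or null when
     * Table 4.5 lists no entry type for that field type.
     */
    static String entryTypeFor(String fieldDescriptor) {
        switch (fieldDescriptor) {
            case "J":
                return "CONSTANT_Long";
            case "F":
                return "CONSTANT_Float";
            case "D":
                return "CONSTANT_Double";
            case "I": // int
            case "S": // short
            case "C": // char
            case "B": // byte
            case "Z": // boolean
                return "CONSTANT_Integer";
            case "Ljava/lang/String;":
                return "CONSTANT_String";
            default:
                return null; // e.g. arrays and other reference types
        }
    }
}
```

All five integral and boolean field types share CONSTANT_Integer, which matches the single combined row of the table.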
The Code
attribute is a variable-length attribute used in the attributes
table of
method_info
structures. A Code
attribute
contains the Java Virtual Machine instructions and auxiliary
information for a single Java method, instance initialization method (§3.8), or class or interface initialization method (§3.8). Every Java Virtual
Machine implementation must recognize Code
attributes. There must be exactly one
Code
attribute in each method_info
structure.
The Code
attribute has the format
Code_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 max_stack;
    u2 max_locals;
    u4 code_length;
    u1 code[code_length];
    u2 exception_table_length;
    {   u2 start_pc;
        u2 end_pc;
        u2 handler_pc;
        u2 catch_type;
    } exception_table[exception_table_length];
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}
The items of the Code_attribute structure are as follows:
attribute_name_index
    The value of the attribute_name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the string "Code".
attribute_length
    The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes.
max_stack
    The value of the max_stack item gives the maximum number of words on the operand stack at any point during execution of this method.
max_locals
    The value of the max_locals item gives the number of local variables used by this method, including the parameters passed to the method on invocation. The index of the first local variable is 0. The greatest local variable index for a one-word value is max_locals-1. The greatest local variable index for a two-word value is max_locals-2.
code_length
    The value of the code_length item gives the number of bytes in the code array for this method. The value of code_length must be greater than zero; the code array must not be empty.
code[]
    The code array gives the actual bytes of Java Virtual Machine code that implement the method.
    When the code array is read into memory on a byte addressable machine, if the first byte of the array is aligned on a 4-byte boundary, the tableswitch and lookupswitch 32-bit offsets will be 4-byte aligned; refer to the descriptions of those instructions for more information on the consequences of code array alignment.
    The detailed constraints on the contents of the code array are extensive and are given in a separate section (§4.8).
exception_table_length
    The value of the exception_table_length item gives the number of entries in the exception_table table.
exception_table[]
    Each entry in the exception_table array describes one exception handler in the code array. Each exception_table entry contains the following items:
    start_pc, end_pc
        The values of the two items start_pc and end_pc indicate the ranges in the code array at which the exception handler is active. The value of start_pc must be a valid index into the code array of the opcode of an instruction. The value of end_pc either must be a valid index into the code array of the opcode of an instruction, or must be equal to code_length, the length of the code array. The value of start_pc must be less than the value of end_pc.
        The start_pc is inclusive and end_pc is exclusive; that is, the exception handler must be active while the program counter is within the interval [start_pc, end_pc). [2]
    handler_pc
        The value of the handler_pc item indicates the start of the exception handler. The value of the item must be a valid index into the code array, must be the index of the opcode of an instruction, and must be less than the value of the code_length item.
    catch_type
        If the value of the catch_type item is nonzero, it must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Class_info (§4.4.1) structure representing a class of exceptions that this exception handler is designated to catch. This class must be the class Throwable or one of its subclasses. The exception handler will be called only if the thrown exception is an instance of the given class or one of its subclasses.
        If the value of the catch_type item is zero, this exception handler is called for all exceptions. This is used to implement finally (see Section 7.13, "Compiling finally").
attributes_count
    The value of the attributes_count item indicates the number of attributes of the Code attribute.
attributes[]
    Each value of the attributes table must be a variable-length attribute structure. A Code attribute can have any number of optional attributes associated with it.
    Currently, the LineNumberTable (§4.7.6) and LocalVariableTable (§4.7.7) attributes, both of which contain debugging information, are defined and used with the Code attribute.
    A Java Virtual Machine implementation is permitted to silently ignore any or all attributes in the attributes table of a Code attribute. Attributes not defined in this specification are not allowed to affect the semantics of the class file, but only to provide additional descriptive information (§4.7.1).
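The half-open interval [start_pc, end_pc) is easy to get wrong by one, so a small sketch of the range test may help. The names below are hypothetical. The sketch ignores catch_type matching entirely, and returning the first covering entry in table order is an assumption of this example rather than something stated in this section.

```java
// Hypothetical names throughout; a sketch, not a Java Virtual Machine's lookup code.
final class ExceptionTableSketch {

    /** One exception_table entry; a catchType of 0 stands for "all exceptions". */
    static final class Entry {
        final int startPc;
        final int endPc;
        final int handlerPc;
        final int catchType;

        Entry(int startPc, int endPc, int handlerPc, int catchType) {
            if (startPc >= endPc) {
                throw new IllegalArgumentException("start_pc must be less than end_pc");
            }
            this.startPc = startPc;
            this.endPc = endPc;
            this.handlerPc = handlerPc;
            this.catchType = catchType;
        }

        /** True while pc lies in [start_pc, end_pc): inclusive start, exclusive end. */
        boolean covers(int pc) {
            return pc >= startPc && pc < endPc;
        }
    }

    /**
     * Returns handler_pc of the first entry (in table order, an assumption of
     * this sketch) whose range covers pc, or -1 if no entry covers it.
     */
    static int firstHandlerPc(Entry[] table, int pc) {
        for (Entry e : table) {
            if (e.covers(pc)) {
                return e.handlerPc;
            }
        }
        return -1;
    }
}
```

With entries covering [0, 8) and [4, 12), a program counter of 8 is outside the first range and inside the second, which is exactly the boundary case the exclusive end_pc defines.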
The Exceptions
attribute is a variable-length attribute used in the attributes
table of a
method_info
(§4.6) structure. The Exceptions
attribute indicates which checked exceptions a method may throw. There must be exactly one Exceptions
attribute in each
method_info
structure.
The Exceptions
attribute has the format
Exceptions_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 number_of_exceptions;
    u2 exception_index_table[number_of_exceptions];
}
The items of the Exceptions_attribute structure are as follows:
attribute_name_index
    The value of the attribute_name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be the CONSTANT_Utf8_info (§4.4.7) structure representing the string "Exceptions".
attribute_length
    The value of the attribute_length item indicates the attribute length, excluding the initial six bytes.
number_of_exceptions
    The value of the number_of_exceptions item indicates the number of entries in the exception_index_table.
exception_index_table[]
    Each nonzero value in the exception_index_table array must be a valid index into the constant_pool table. For each table item, if exception_index_table[i] != 0, where 0 ≤ i < number_of_exceptions, then the constant_pool entry at index exception_index_table[i] must be a CONSTANT_Class_info (§4.4.1) structure representing a class type that this method is declared to throw.
A method should only throw an exception if at least one of the following three criteria is met:
    The exception is an instance of RuntimeException or one of its subclasses.
    The exception is an instance of Error or one of its subclasses.
    The exception is an instance of one of the exception classes specified in the exception_index_table above, or one of their subclasses.
throws
clauses when classes
are verified.
The LineNumberTable
attribute is an optional variable-length attribute in the attributes
table of a Code
(§4.7.4) attribute. It may be used by debuggers to determine which
part of the Java Virtual Machine code
array corresponds to a given line number in the
original Java source file. If LineNumberTable
attributes are present in the attributes
table of a given Code
attribute, then they may appear in any order. Furthermore, multiple LineNumberTable
attributes may together represent a given line of a Java source
file; that is, LineNumberTable
attributes need not be one-to-one with source lines. [3]
The LineNumberTable
attribute has the format
LineNumberTable_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 line_number_table_length;
    {   u2 start_pc;
        u2 line_number;
    } line_number_table[line_number_table_length];
}
The items of the LineNumberTable_attribute structure are as follows:
attribute_name_index
    The value of the attribute_name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the string "LineNumberTable".
attribute_length
    The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes.
line_number_table_length
    The value of the line_number_table_length item indicates the number of entries in the line_number_table array.
line_number_table[]
    Each entry in the line_number_table array indicates that the line number in the original Java source file changes at a given point in the code array. Each entry must contain the following items:
    start_pc
        The value of the start_pc item must indicate the index into the code array at which the code for a new line in the original Java source file begins. The value of start_pc must be less than the value of the code_length item of the Code attribute of which this LineNumberTable is an attribute.
    line_number
        The value of the line_number item must give the corresponding line number in the original Java source file.
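A debugger typically needs the reverse lookup: given an index into the code array, which source line does it belong to? One plausible reading, sketched below, is to take the entry with the greatest start_pc that does not exceed the index. The class name, the int[][] representation of the table, and the -1 result for "no entry" are all assumptions of this example.

```java
// Hypothetical helper; each table row is {start_pc, line_number}.
final class LineNumberLookup {

    /**
     * Returns the line_number of the entry with the greatest start_pc that is
     * less than or equal to pc, or -1 if no entry starts at or before pc.
     * The table need not be sorted, so every entry is examined.
     */
    static int lineFor(int[][] table, int pc) {
        int bestStart = -1;
        int bestLine = -1;
        for (int[] entry : table) {
            int startPc = entry[0];
            int line = entry[1];
            if (startPc <= pc && startPc > bestStart) {
                bestStart = startPc;
                bestLine = line;
            }
        }
        return bestLine;
    }
}
```

The linear scan also works when the entries come from several LineNumberTable attributes concatenated in arbitrary order, which the text above permits.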
The LocalVariableTable
attribute is an optional variable-length attribute of a Code
(§4.7.4) attribute. It may be used by debuggers to determine the value of a given
local variable during the execution of a method. If LocalVariableTable
attributes are
present in the attributes
table of a given Code
attribute, then they may appear in any
order. There may be no more than one LocalVariableTable
attribute per local variable
in the Code
attribute.
The LocalVariableTable
attribute has the format
LocalVariableTable_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 local_variable_table_length;
    {   u2 start_pc;
        u2 length;
        u2 name_index;
        u2 descriptor_index;
        u2 index;
    } local_variable_table[local_variable_table_length];
}
The items of the LocalVariableTable_attribute structure are as follows:
attribute_name_index
    The value of the attribute_name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the string "LocalVariableTable".
attribute_length
    The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes.
local_variable_table_length
    The value of the local_variable_table_length item indicates the number of entries in the local_variable_table array.
local_variable_table[]
    Each entry in the local_variable_table array indicates a range of code array offsets within which a local variable has a value. It also indicates the index into the local variables of the current frame at which that local variable can be found. Each entry must contain the following items:
    start_pc, length
        The given local variable must have a value at indices into the code array in the interval [start_pc, start_pc+length], that is, between start_pc and start_pc+length inclusive. The value of start_pc must be a valid index into the code array of this Code attribute of the opcode of an instruction. The value of start_pc+length must be either a valid index into the code array of this Code attribute of the opcode of an instruction, or the first index beyond the end of that code array.
    name_index, descriptor_index
        The value of the name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must contain a CONSTANT_Utf8_info (§4.4.7) structure representing a valid Java local variable name stored as a simple name (§2.7.1).
        The value of the descriptor_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must contain a CONSTANT_Utf8_info (§4.4.7) structure representing a valid descriptor for a Java local variable. Java local variable descriptors have the same form as field descriptors (§4.3.2).
    index
        The given local variable must be at index in its method's local variables. If the local variable at index is a two-word type (double or long), it occupies both index and index+1.
code
array of the
Code
attribute of a method_info
structure of a class
file. This section describes the constraints associated with the contents of the Code_attribute
structure.
class
file are those defining the well-formedness of the
file. With the exception of the static constraints on the Java Virtual Machine code of
the class
file, these constraints have been given in the previous section. The static
constraints on the Java Virtual Machine code in a class
file specify how Java Virtual
Machine instructions must be laid out in the code
array, and what the operands of
individual instructions must be.
The static constraints on the instructions in the code
array are as follows:
The code array must not be empty, so the code_length item cannot have the value 0.
code
array begins at index 0
.
code
array. Instances of instructions using the reserved opcodes (§6.2), the _quick opcodes documented in Chapter 9, "An Optimization," or any opcodes not documented in this specification may not appear in the code
array.
code
array except the
last, the index of the opcode of the next instruction equals the index
of the opcode of the current instruction plus the length of that
instruction, including all its operands. The wide instruction is treated like any other instruction for these purposes; the opcode specifying the operation that a wide instruction is to modify is treated as one of the operands of that wide instruction. That opcode must never be directly reachable by the computation.
code
array must be the byte at index code_length-1
.
code
array are as follows:
constant_pool
table. The constant pool entry referenced by that index must be of type CONSTANT_Integer
, CONSTANT_Float
, or CONSTANT_String
.
constant_pool
table. The constant pool entry referenced by that index must be of type CONSTANT_Long
or CONSTANT_Double
.
In addition, the subsequent constant pool index must also be a valid
index into the constant pool, and the constant pool entry at that index
must not be used.
constant_pool
table. The constant pool entry referenced by that index must be of type CONSTANT_Fieldref
.
constant_pool
table. The constant pool entry referenced by that index must be of type CONSTANT_Methodref
.
<init>
, the instance initialization method (§3.8). No other method whose name begins with the character '<'
('\u003c'
) may be called by the method invocation instructions. In particular, the class initialization method <clinit>
is never called explicitly from Java Virtual Machine instructions, but only implicitly by the Java Virtual Machine itself.
constant_pool
table. The constant pool entry referenced by that index must be of type CONSTANT_InterfaceMethodref
. The value of the nargs operand of each invokeinterface instruction must be the same as the number of argument words implied by the descriptor of the CONSTANT_NameAndType_info
structure referenced by the CONSTANT_InterfaceMethodref
constant pool entry. The fourth operand byte of each invokeinterface instruction must have the value zero.
constant_pool
table. The constant pool entry referenced by that index must be of type CONSTANT_Class
.
CONSTANT_Class
constant_pool
table entry representing an array class. The new instruction cannot be used to create an array. The new instruction also cannot be used to create an interface or an instance of an abstract
class, but those checks are performed at link time.
CONSTANT_Class
operand, it must not attempt to create more dimensions than are in the array type. The dimensions operand of each multianewarray instruction must not be zero.
T_BOOLEAN (4), T_CHAR (5), T_FLOAT (6), T_DOUBLE (7), T_BYTE (8), T_SHORT (9), T_INT (10), or T_LONG (11).
max_locals-1
.
max_locals-1
.
max_locals-2
.
max_locals-2
.
code
array specify constraints on relationships
between Java Virtual Machine instructions. The structural constraints are as follows:
int
is also permitted to operate on values of type byte
, char
, and short
. (As noted in §3.11.1, the Java Virtual Machine internally converts values of types byte
, char
, and short
to type int
.)
long
or double
) be reversed or split up. At no point can the words of a two-word type be operated on individually.
max_stack
words.
<init>
, a method in this
, a private
method, or a method in a superclass of this
.
<init>
is invoked, an uninitialized class instance must be in an appropriate position on the operand stack. The <init>
method must never be invoked on an initialized class instance.
finally
clause. However, an uninitialized class instance may be on the operand stack in code protected by an exception handler or a finally
clause. When an exception is thrown, the contents of the operand stack are discarded.
Object
, must call either another instance initialization method of this
or an instance initialization method of its immediate superclass super
before its instance members are accessed. However, this is not necessary in the case of class Object
, which does not have a superclass (§2.4.6).
abstract
method must never be invoked.
byte
, char
, short
, or int
, only the ireturn instruction may be used. If the method returns a float
, long
, or double
, only an freturn, lreturn, or dreturn instruction, respectively, may be used. If the method returns a reference
type, it must do so using an areturn instruction, and the returned value must be assignment compatible (§2.6.6) with the return descriptor (§4.3.3) of the method. All instance initialization methods, static initializers, and methods declared to return void
must only use the return instruction.
protected
field of a superclass, then the type of the class instance being
accessed must be the same as or a subclass of the current class. If invokevirtual is used to access a protected
method of a superclass, then the type of the class instance being
accessed must be the same as or a subclass of the current class.
byte
, char
, short
, or int
, then the value must be an int
. If the descriptor type is float
, long
, or double
, then the value must be a float
, long
, or double
, respectively. If the descriptor type is a reference
type, then the value must be of a type that is assignment compatible (§2.6.6) with the descriptor type.
reference
by an aastore instruction must be assignment compatible (§2.6.6) with the component type of the array.
Throwable
or of subclasses of Throwable
.
code
array.
returnAddress
) may be loaded from a local variable.
try
-finally
constructs from within a finally
clause. For more information on Java Virtual Machine subroutines, see §4.9.6.)
returnAddress
can be returned to at most once. If a ret instruction returns to a point in the subroutine call chain above the ret instruction corresponding to a given instance of type returnAddress
, then that instance can never be used as a return address.
class
files. The HotJava browser needs to determine whether the class
file
was produced by a trustworthy Java compiler or by an adversary attempting to
exploit the interpreter.
An additional problem with compile-time checking is version skew. A user may have successfully compiled a class, say PurchaseStockOptions
, to be a subclass of TradingClass
. But the definition of TradingClass
might have changed in a way that is not compatible with preexisting
binaries since the time the class was compiled. Methods might have been
deleted, or had their return types or modifiers changed. Fields might
have changed types or changed from instance variables to class
variables. The access modifiers of a method or variable may have
changed from public
to private
. For a discussion of these issues, see Chapter 13, "Binary Compatibility," in The Java Language Specification.
Because of these potential problems, the Java Virtual Machine needs to
verify for itself that the desired constraints hold on the class
files it attempts to incorporate. A well-written Java Virtual Machine emulator could reject poorly formed instructions when a class
file is loaded. Other constraints could be checked at run time. For
example, a Java Virtual Machine implementation could tag runtime data
and have each instruction check that its operands are of the right type.
Instead, Sun's Java Virtual Machine implementation verifies that each class
file it considers untrustworthy satisfies the necessary constraints at linking time (§2.16.3). Structural constraints on the Java Virtual Machine code are checked using a simple theorem prover.
Linking-time verification enhances the performance of the interpreter. Expensive checks that would otherwise have to be performed to verify constraints at run time for each interpreted instruction can be eliminated. The Java Virtual Machine can assume that these checks have already been performed. For example, the Java Virtual Machine will already know the following:
class
file verifier
is independent of any Java compiler. It should certify all code
generated by Sun's current Java compiler; it should also certify code
that other compilers can generate, as well as code that the current
compiler could not possibly generate. Any class
file that satisfies the structural criteria and static constraints will be certified by the verifier.
The class
file verifier is also independent of the Java language. Other languages can be compiled into the class
format, but will only pass verification if they satisfy the same constraints as a class
file compiled from Java source.
class
file verifier operates in four passes:
Pass 1: When a prospective class
file is loaded (§2.16.2) by the Java Virtual
Machine, the Java Virtual Machine first ensures that the file has the basic format of a
Java class
file. The first four bytes must contain the right magic number. All recognized attributes must be of the proper length. The class
file must not be truncated or
have extra bytes at the end. The constant pool must not contain any superficially
unrecognizable information.
While class
file verification properly occurs during class linking (§2.16.3), this check for basic class
file integrity is necessary for any interpretation of the class
file contents and can be considered to be logically part of the verification process.
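The first of the Pass 1 checks is simple enough to sketch directly. The example below tests only the magic number, which §4.1 defines as 0xCAFEBABE; the class and method names are hypothetical, and none of the other Pass 1 checks (attribute lengths, truncation, constant pool sanity) are attempted.

```java
import java.nio.ByteBuffer;

// Hypothetical helper; covers only the magic number part of Pass 1.
final class MagicCheck {

    /** The magic item of the ClassFile structure (§4.1). */
    static final int MAGIC = 0xCAFEBABE;

    /** True if the data is at least four bytes long and starts with the magic number. */
    static boolean hasClassFileMagic(byte[] bytes) {
        if (bytes.length < 4) {
            return false; // too short to hold the magic item at all
        }
        // ByteBuffer reads big-endian by default, matching the class file format.
        return ByteBuffer.wrap(bytes).getInt() == MAGIC;
    }
}
```

Reading the four bytes as one big-endian int follows the rule from the start of this chapter that multibyte items are stored high byte first.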
Pass 2: When the class
file is linked, the verifier performs all additional verification
that can be done without looking at the code
array of the Code
attribute (§4.7.4). The
checks performed by this pass include the following:
final
classes are not subclassed, and that final
methods are not overridden.
Object)
has a superclass.
CONSTANT_Utf8
string reference in the constant pool.
Pass 3: Still during linking, the verifier checks the code
array of the Code
attribute for
each method of the class
file by performing data-flow analysis on each method. The
verifier ensures that at any given point in the program, no matter what code path is
taken to reach that point:
Pass 4: For efficiency reasons, certain tests that could in principle be performed in
Pass 3 are delayed until the first time the code for the method is actually invoked. In
so doing, Pass 3 of the verifier avoids loading class
files unless it has to.
For example, if a method invokes another method that returns an instance of class A
, and that instance is only assigned to a field of the same type, the verifier does not bother to check if the class A
actually exists. However, if it is assigned to a field of the type B
, the definitions of both A
and B
must be loaded in to ensure that A
is a subclass of B
.
Pass 4 is a virtual pass whose checking is done by the appropriate Java Virtual Machine instructions. The first time an instruction that references a type is executed, the executing instruction does the following:
LinkageError
to be thrown.
A Java Virtual Machine is allowed to perform any or all of the Pass 4 steps, except for class or interface initialization, as part of Pass 3; see §2.16.1, "Virtual Machine Start-up," for an example and more discussion.
In Sun's Java Virtual Machine implementation, after the verification
has been performed, the instruction in the Java Virtual Machine code is
replaced with an alternative form of the instruction (see Chapter 9, "An Optimization"). For example, the opcode new
is replaced with new_quick
.
This alternative instruction indicates that the verification needed by
this instruction has taken place and does not need to be performed
again. Subsequent invocations of the method will thus be faster. It is
illegal for these alternative instruction forms to appear in class
files, and they should never be encountered by the verifier.
class
file verification. This section looks at the verification of Java Virtual Machine code in more detail.
The code for each method is verified independently. First, the bytes
that make up the code are broken up into a sequence of instructions,
and the index into the code
array of the start of each instruction is placed in an array. The
verifier then goes through the code a second time and parses the
instructions. During this pass a data structure is built to hold
information about each Java Virtual Machine instruction in the method.
The operands, if any, of each instruction are checked to make sure they
are valid. For instance:
code
array for the method.
int
or float
, or for instances of class String
; the instruction getfield must reference a field.
byte
, short
, char
) when determining the value types on the operand stack. Next, a data-flow analyzer is initialized. For the first instruction of the method, the local variables which represent parameters initially contain values of the types indicated by the method's type descriptor; the operand stack is empty. All other local variables contain an illegal value. For the other instructions, which have not been examined yet, no information is available regarding the operand stack or local variables.
Finally, the data-flow analyzer is run. For each instruction, a "changed" bit indicates whether this instruction needs to be looked at. Initially, the "changed" bit is only set for the first instruction. The data-flow analyzer executes the following loop:
reference
values may appear at corresponding places on the two stacks. In this case, the merged operand stack contains a reference
to an instance of the first common superclass or common superinterface
of the two types. Such a reference type always exists because the type Object
is a supertype of all class and interface types. If the operand stacks cannot be merged, verification of the method fails.
To merge two local variable states, corresponding pairs of local
variables are compared. If the two types are not identical, then unless
both contain reference
values, the verifier records that the local variable contains an
unusable value. If both of the pair of local variables contain reference
values, the merged state contains a reference
to an instance of the first common superclass of the two types.
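The "first common superclass" used when merging two reference types can be illustrated with Java reflection. This is only a sketch under a loud assumption: it operates on already loaded Class objects, whereas a real verifier works from class file data and tries to avoid loading classes. The class and method names are hypothetical, and the sketch considers superclasses only, not superinterfaces.

```java
// Hypothetical illustration using reflection; a real verifier would not
// have loaded Class objects available for the types it is merging.
final class MergeSketch {

    /**
     * Walks up the superclass chain of a and returns the first class that
     * b can be assigned to. Falls back to Object, which is a supertype of
     * all class and interface types.
     */
    static Class<?> firstCommonSuperclass(Class<?> a, Class<?> b) {
        for (Class<?> c = a; c != null; c = c.getSuperclass()) {
            if (c.isAssignableFrom(b)) {
                return c;
            }
        }
        return Object.class;
    }
}
```

For example, merging Integer and Long yields Number, while merging String and Integer yields Object because the two share no closer superclass.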
If the data-flow analyzer runs on a method without reporting a
verification failure, then the method has been successfully verified by
Pass 3 of the class
file verifier.
Certain instructions and data types complicate the data-flow analyzer. We now examine each of these in more detail.
long
and double
types each take two consecutive words on the operand
stack and in the local variables.
Whenever a long
or double
is moved into a local variable, the subsequent local variable is marked as containing the second half of a long
or double
. This special value indicates that all references to the long
or double
must be through the index of the lower-numbered local variable.
Whenever any value is moved to a local variable, the preceding local
variable is examined to see if it contains the first word of a long
or a double
. If so, that preceding local variable is changed to indicate that it now contains an unusable value. Since half of the long
or double
has been overwritten, the other half must no longer be used.
Dealing with 64-bit quantities on the operand stack is simpler; the verifier treats them as single units on the stack. For example, the verification code for the dadd opcode (add two double values) checks that the top two items on the stack are both of type double. When calculating operand stack length, values of type long and double have length two.
Untyped instructions that manipulate the operand stack must treat values of type double and long as atomic. For example, the verifier reports a failure if the top value on the stack is a double and it encounters an instruction such as pop or dup. The instructions pop2 or dup2 must be used instead.
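A small model of the operand stack makes the atomicity rule concrete. The sketch below is an assumption-laden illustration: StackModel, Kind, and the method names are hypothetical, and only pop and pop2 are modeled.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class StackModel {
    enum Kind {
        INT, FLOAT, REFERENCE, LONG, DOUBLE;

        int words() {
            return (this == LONG || this == DOUBLE) ? 2 : 1;
        }
    }

    private final Deque<Kind> stack = new ArrayDeque<>();

    void push(Kind kind) {
        stack.push(kind);
    }

    /** Stack length in words: long and double each count as two. */
    int lengthInWords() {
        int total = 0;
        for (Kind kind : stack) {
            total += kind.words();
        }
        return total;
    }

    /** Models pop: legal only when the top value occupies one word. */
    void pop() {
        Kind top = stack.peek();
        if (top == null) {
            throw new IllegalStateException("stack underflow");
        }
        if (top.words() != 1) {
            throw new IllegalStateException("pop would split a " + top);
        }
        stack.pop();
    }

    /** Models pop2: removes one two-word value or two one-word values. */
    void pop2() {
        Kind top = stack.peek();
        if (top == null) {
            throw new IllegalStateException("stack underflow");
        }
        stack.pop();
        if (top.words() == 2) {
            return;
        }
        Kind next = stack.peek();
        if (next == null || next.words() != 1) {
            throw new IllegalStateException("pop2 needs two one-word values");
        }
        stack.pop();
    }
}
```

Each long or double is stored as a single element, so the model cannot represent half of a 64-bit value; the word count is derived only when the stack length is needed.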
The Java statement

    ...
    new myClass(i, j, k);
    ...

can be implemented by the following:

    ...
    new #1                        // Allocate uninitialized space for myClass
    dup                           // Duplicate object on the operand stack
    iload_1                       // Push i
    iload_2                       // Push j
    iload_3                       // Push k
    invokespecial myClass.<init>  // Initialize object
    ...

This instruction sequence leaves the newly created and initialized object on top of the operand stack. (More examples of compiling Java code to the instruction set of the Java Virtual Machine are given in Chapter 7, "Compiling for the Java Virtual Machine.")
The instance initialization method <init> for class myClass sees the new uninitialized object as its this argument in local variable 0. It must either invoke an alternative instance initialization method for class myClass or invoke the initialization method of a superclass on the this object before it is allowed to do anything else with this.
When doing dataflow analysis on instance methods, the verifier initializes local variable 0 to contain an object of the current class. For instance initialization methods, local variable 0 instead contains a special type indicating an uninitialized object. After an appropriate initialization method is invoked (from the current class or the current superclass) on this object, all occurrences of this special type on the verifier's model of the operand stack and in the local variables are replaced by the current class type. The verifier rejects code that uses the new object before it has been initialized or that initializes the object twice. In addition, it ensures that every normal return of the method has invoked an initialization method either in the class of this method or in the direct superclass.
Similarly, a special type is created and pushed on the verifier's model of the operand stack as the result of the Java Virtual Machine instruction new. The special type indicates the instruction by which the class instance was created and the type of the uninitialized class instance created. When an initialization method is invoked on that class instance, all occurrences of the special type are replaced by the intended type of the class instance. This change in type may propagate to subsequent instructions as the dataflow analysis proceeds.
The instruction number needs to be stored as part of the special type, as there may be multiple not-yet-initialized instances of a class in existence on the operand stack at one time. For example, the Java Virtual Machine instruction sequence that implements
new InputStream(new Foo(), new InputStream("foo"))
may have two uninitialized instances of InputStream on the operand stack at once. When an initialization method is invoked on a class instance, only those occurrences of the special type on the operand stack or in the local variables that are the same object as the class instance are replaced.
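The role of the instruction number can be sketched as follows. The model tags each uninitialized type with the offset of the new instruction that created it, so that initializing one instance leaves the other untouched. The names UninitTracking, VType, and markInitialized, and the offsets 0 and 12, are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class UninitTracking {
    /** A verifier type: initialized, or uninitialized and tagged by its new offset. */
    static final class VType {
        final String className;
        final int newOffset; // offset of the creating new instruction, or -1 if initialized

        private VType(String className, int newOffset) {
            this.className = className;
            this.newOffset = newOffset;
        }

        static VType initialized(String className) {
            return new VType(className, -1);
        }

        static VType uninitialized(String className, int newOffset) {
            return new VType(className, newOffset);
        }

        boolean isUninitialized() {
            return newOffset >= 0;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof VType
                    && ((VType) o).className.equals(className)
                    && ((VType) o).newOffset == newOffset;
        }

        @Override
        public int hashCode() {
            return className.hashCode() * 31 + newOffset;
        }
    }

    /** Replaces every occurrence of target in the frame by its initialized type. */
    static void markInitialized(List<VType> frame, VType target) {
        if (!target.isUninitialized()) {
            throw new IllegalStateException("object is already initialized");
        }
        VType replacement = VType.initialized(target.className);
        for (int i = 0; i < frame.size(); i++) {
            if (frame.get(i).equals(target)) {
                frame.set(i, replacement);
            }
        }
    }

    public static void main(String[] args) {
        VType outer = VType.uninitialized("InputStream", 0);
        VType inner = VType.uninitialized("InputStream", 12);
        List<VType> stack = new ArrayList<>();
        stack.add(outer);
        stack.add(outer); // copy made by dup
        stack.add(VType.initialized("Foo"));
        stack.add(inner);
        stack.add(inner); // copy made by dup
        markInitialized(stack, inner);
        for (VType t : stack) {
            System.out.println(t.className + (t.isUninitialized() ? " (uninitialized)" : ""));
        }
    }
}
```

Without the offset, the two uninitialized InputStream types would compare equal, and initializing the inner instance would wrongly mark the outer one as initialized too.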
A valid instruction sequence must not have an uninitialized object on the operand stack or in a local variable during a backwards branch, or in a local variable in code protected by an exception handler or a finally clause. Otherwise, a devious piece of code might fool the verifier into thinking it had initialized a class instance when it had, in fact, initialized a class instance created in a previous pass through the loop.
class file verifier since they do not pose a threat to the integrity of the Java Virtual Machine. As long as every nonexceptional path to the exception handler causes there to be a single object on the operand stack, and as long as all other criteria of the verifier are met, the verifier will pass the code.
finally
Given the fragment of Java code

    ...
    try {
        startFaucet();
        waterLawn();
    } finally {
        stopFaucet();
    }
    ...

the Java language guarantees that stopFaucet is invoked (the faucet is turned off) whether we finish watering the lawn or whether an exception occurs while starting the faucet or watering the lawn. That is, the finally clause is guaranteed to be executed whether its try clause completes normally, or completes abruptly by throwing an exception.
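The guarantee can be observed with a small runnable program. The method names startFaucet, waterLawn, and stopFaucet come from the fragment above; the class FaucetDemo, the log list, and the fail parameter are assumptions added so that both the normal and the exceptional path can be exercised.

```java
import java.util.ArrayList;
import java.util.List;

final class FaucetDemo {
    static final List<String> log = new ArrayList<>();

    static void startFaucet() {
        log.add("start");
    }

    static void waterLawn(boolean fail) {
        if (fail) {
            throw new IllegalStateException("hose burst");
        }
        log.add("water");
    }

    static void stopFaucet() {
        log.add("stop");
    }

    static void run(boolean fail) {
        try {
            startFaucet();
            waterLawn(fail);
        } finally {
            stopFaucet(); // runs on normal and on abrupt completion
        }
    }

    public static void main(String[] args) {
        run(false);
        try {
            run(true);
        } catch (IllegalStateException e) {
            log.add("caught");
        }
        System.out.println(log); // [start, water, stop, start, stop, caught]
    }
}
```

In the failing run, "stop" is logged before "caught": the finally clause executes before the exception propagates to the caller.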
To implement the try-finally construct, the Java compiler uses the exception-handling facilities together with two special instructions jsr ("jump to subroutine") and ret ("return from subroutine"). The finally clause is compiled as a subroutine within the Java Virtual Machine code for its method, much like the code for an exception handler. When a jsr instruction that invokes the subroutine is executed, it pushes its return address, the address of the instruction after the jsr that is being executed, onto the operand stack as a value of type returnAddress. The code for the subroutine stores the return address in a local variable. At the end of the subroutine, a ret instruction fetches the return address from the local variable and transfers control to the instruction at the return address.
Control can be transferred to the finally clause (the finally subroutine can be invoked) in several different ways. If the try clause completes normally, the finally subroutine is invoked via a jsr instruction before evaluating the next Java expression. A break or continue inside the try clause that transfers control outside the try clause executes a jsr to the code for the finally clause first. If the try clause executes a return, the compiled code does the following:
1. Saves the return value (if any) in a local variable.
2. Executes a jsr to the code for the finally clause.
3. Upon return from the finally clause, returns the value saved in the local variable.
The compiler sets up a special exception handler that catches any exception thrown by the try clause. If an exception is thrown in the try clause, this exception handler does the following:
1. Saves the exception in a local variable.
2. Executes a jsr to the finally clause.
3. Upon return from the finally clause, rethrows the exception.
For more information about the implementation of Java's try-finally construct, see Section 7.13, "Compiling finally."
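One visible consequence of saving the return value before running the finally code is that assignments made in the finally clause do not change what is returned. The sketch below shows this at the Java source level; the class and method names are hypothetical, and spelledOut is a hand-written approximation of the steps, not actual compiler output.

```java
final class ReturnThroughFinally {
    /** The finally clause assigns x after the return value was already saved. */
    static int withTryFinally() {
        int x = 1;
        try {
            return x;
        } finally {
            x = 2;
        }
    }

    /** The same behavior written out as explicit steps. */
    static int spelledOut() {
        int x = 1;
        int saved = x; // save the return value in a local variable
        x = 2;         // body of the finally clause (reached via jsr in compiled code)
        return saved;  // return the saved value
    }

    public static void main(String[] args) {
        System.out.println(withTryFinally()); // 1
        System.out.println(spelledOut());     // 1
    }
}
```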
The code for the finally clause presents a special problem to the verifier. Usually, if a particular instruction can be reached via multiple paths and a particular local variable contains incompatible values through those multiple paths, then the local variable becomes unusable. However, a finally clause might be called from several different places, yielding several different circumstances:
• The invocation from the exception handler may have a certain local variable that contains an exception.
• The invocation to implement return may have some local variable that contains the return value.
• The invocation from the bottom of the try clause may have an indeterminate value in that same local variable.
The code for the finally clause itself might pass verification, but after updating all the successors of the ret instruction, the verifier would note that the local variable that the exception handler expects to hold an exception, or that the return code expects to hold a return value, now contains an indeterminate value.
Verifying code that contains a finally clause is complicated. The basic idea is the following:
finally clause, it is of length one. For multiply nested finally code (extremely rare!), it may be longer than one.
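Limitations of the Java Virtual Machine and class File Format
The Java Virtual Machine and class file format impose the following limitations:
• The per-class constant pool is limited to 65535 entries by the 16-bit constant_pool_count field of the ClassFile structure (§4.1). This acts as an internal limit on the total complexity of a single class.
• The amount of code per method is limited to 65535 bytes by the sizes of the indices in the exception_table of the Code attribute (§4.7.4), in the LineNumberTable attribute (§4.7.6), and in the LocalVariableTable attribute (§4.7.7).
• The number of local variables in a method is limited to 65535 by the size of the max_locals item of the Code_attribute structure (§4.7.4). (Recall that values of type long and double are considered to occupy two local variables.)
• The number of fields of a class is limited to 65535 by the size of the fields_count item of the ClassFile structure (§4.1).
• The number of methods of a class is limited to 65535 by the size of the methods_count item of the ClassFile structure (§4.1).
• The size of an operand stack is limited to 65535 words by the max_stack field of the Code_attribute structure (§4.7.4).
• A valid Java method descriptor must require 255 or fewer words of method arguments, where that limit includes the word for this in the case of instance method invocations. Note that the limit is on the number of words of method arguments, and not on the number of arguments themselves. Arguments of type long and double are two words long; arguments of all other types are one word long.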
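The word count for method arguments can be computed directly from a method descriptor. The following sketch assumes the standard descriptor syntax (for example, "(IJD)V"); the class ArgumentWords and its methods are hypothetical helpers, not part of any Java API.

```java
final class ArgumentWords {
    /** Counts argument words; an instance method adds one word for this. */
    static int argumentWords(String descriptor, boolean isInstanceMethod) {
        if (descriptor.isEmpty() || descriptor.charAt(0) != '(') {
            throw new IllegalArgumentException("bad descriptor: " + descriptor);
        }
        int words = isInstanceMethod ? 1 : 0;
        int i = 1;
        while (i < descriptor.length() && descriptor.charAt(i) != ')') {
            boolean isArray = false;
            while (i < descriptor.length() && descriptor.charAt(i) == '[') {
                isArray = true; // any array is a one-word reference
                i++;
            }
            if (i >= descriptor.length()) {
                break;
            }
            char c = descriptor.charAt(i);
            if (c == 'L') {
                int semicolon = descriptor.indexOf(';', i);
                if (semicolon < 0) {
                    throw new IllegalArgumentException("bad descriptor: " + descriptor);
                }
                i = semicolon + 1;
                words += 1;
            } else if (c == 'J' || c == 'D') {
                i++;
                words += isArray ? 1 : 2; // long and double take two words
            } else if ("BCFISZ".indexOf(c) >= 0) {
                i++;
                words += 1;
            } else {
                throw new IllegalArgumentException("bad descriptor: " + descriptor);
            }
        }
        if (i >= descriptor.length()) {
            throw new IllegalArgumentException("missing ')': " + descriptor);
        }
        return words;
    }

    static boolean withinLimit(String descriptor, boolean isInstanceMethod) {
        return argumentWords(descriptor, isInstanceMethod) <= 255;
    }

    public static void main(String[] args) {
        System.out.println(argumentWords("(IJD)V", false)); // 5
        System.out.println(argumentWords("(IJD)V", true));  // 6
    }
}
```

A static method taking 128 long arguments would therefore exceed the limit (256 words), even though it declares far fewer than 255 arguments.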
2 The fact that end_pc is exclusive is an historical mistake in the Java Virtual Machine: if the Java Virtual Machine code for a method is exactly 65535 bytes long and ends with an instruction that is one byte long, then that instruction cannot be protected by an exception handler. A compiler writer can work around this bug by limiting the maximum size of the generated Java Virtual Machine code for any method, instance initialization method, or static initializer (the size of any code array) to 65534 bytes.
3 The javac compiler in Sun's JDK 1.0.2 release can in fact generate LineNumberTable attributes which are not in line number order and which are not one-to-one with source lines. This is unfortunate, as we would prefer to specify a one-to-one, ordered mapping of LineNumberTable attributes to source lines, but must yield to backward compatibility.
Java Virtual Machine Specification
Copyright © 1996, 1997 Sun Microsystems, Inc.
All rights reserved