The Java™ Virtual Machine Specification
CHAPTER 4
This chapter describes the Java Virtual Machine class file format. Each class file
contains one Java type, either a class or an interface. Compliant Java Virtual
Machine implementations must be capable of dealing with all class files that conform to the specification provided by this book.
A class file consists of a stream of 8-bit bytes. All 16-bit, 32-bit, and 64-bit quantities are constructed by reading in two, four, and eight consecutive 8-bit bytes, respectively. Multibyte data items are always stored in big-endian order, where the high bytes come first. In Java, this format is supported by the interfaces java.io.DataInput and java.io.DataOutput and by classes such as java.io.DataInputStream and java.io.DataOutputStream.
This chapter defines its own set of data types representing Java class file data: The types u1, u2, and u4 represent an unsigned one-, two-, or four-byte quantity, respectively. In Java, these types may be read by methods such as readUnsignedByte, readUnsignedShort, and readInt of the interface java.io.DataInput.
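As an illustration of the three item types, the following minimal sketch reads one u4, one u2, and one u1 from a byte array through java.io.DataInputStream. The class name UnsignedItems and the helper readSample are hypothetical names chosen for this example. Because readInt returns a signed int, the sketch widens the result to long and masks it to obtain the unsigned u4 value.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper class; not part of the specification.
final class UnsignedItems {
    /** Reads a u1 item as a value in the range 0..255. */
    static int u1(DataInput in) throws IOException {
        return in.readUnsignedByte();
    }

    /** Reads a u2 item as a value in the range 0..65535 (big-endian). */
    static int u2(DataInput in) throws IOException {
        return in.readUnsignedShort();
    }

    /** Reads a u4 item; readInt is signed, so widen and mask to get the unsigned value. */
    static long u4(DataInput in) throws IOException {
        return in.readInt() & 0xFFFFFFFFL;
    }

    /** Demo helper: reads one u4, then one u2, then one u1 from the start of data. */
    static long[] readSample(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return new long[] { u4(in), u2(in), u1(in) };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Items are read strictly in sequence, which matches the statement that successive items are stored without padding or alignment.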
The Java class file format is presented using pseudostructures written in a C-like structure notation. To avoid confusion with the fields of Java Virtual Machine classes and class instances, the contents of the structures describing the Java class file format are referred to as items. Unlike the fields of a C structure, successive items are stored in the Java class file sequentially, without padding or alignment.
Variable-sized tables, consisting of variable-sized items, are used in several class file structures. Although we will use C-like array syntax to refer to table items, the fact that tables are streams of varying-sized structures means that it is not possible to directly translate a table index into a byte offset into the table.
Where we refer to a data structure as an array, it is literally an array.
A class file contains a single ClassFile structure:
ClassFile {
    u4 magic;
    u2 minor_version;
    u2 major_version;
    u2 constant_pool_count;
    cp_info constant_pool[constant_pool_count-1];
    u2 access_flags;
    u2 this_class;
    u2 super_class;
    u2 interfaces_count;
    u2 interfaces[interfaces_count];
    u2 fields_count;
    field_info fields[fields_count];
    u2 methods_count;
    method_info methods[methods_count];
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}
The items in the ClassFile structure are as follows:
magic
The magic item supplies the magic number identifying the class file format; it has the value 0xCAFEBABE.
minor_version, major_version
The values of the minor_version and major_version items are the minor and major version numbers of the compiler that produced this class file. An implementation of the Java Virtual Machine normally supports class files having a given major version number and minor version numbers 0 through some particular minor_version.
If an implementation of the Java Virtual Machine supports some range of minor version numbers and a class file of the same major version but a higher minor version is encountered, the Java Virtual Machine must not attempt to run the newer code. However, unless the major version number differs, it will be feasible to implement a new Java Virtual Machine that can run code of minor versions up to and including that of the newer code.
A Java Virtual Machine must not attempt to run code with a different major version. A change of the major version number indicates a major incompatible change, one that requires a fundamentally different Java Virtual Machine.
In Sun's Java Developer's Kit (JDK) 1.0.2 release, documented by this book, the value of major_version is 45. The value of minor_version is 3. Only Sun may define the meaning of new class file version numbers.
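The version rules above can be sketched as a single acceptance test. This is only an illustration: the class name VersionCheck and the parameters supportedMajor and maxSupportedMinor are assumptions made for the example, standing for the one major version and the highest minor version that a given implementation supports.

```java
// Hypothetical helper class; not part of the specification.
final class VersionCheck {
    /** The magic number, held in a long so that it compares as an unsigned u4. */
    static final long MAGIC = 0xCAFEBABEL;

    /**
     * Returns true if the header is acceptable to an implementation that supports
     * exactly supportedMajor with minor versions 0 through maxSupportedMinor.
     */
    static boolean isSupported(long magic, int minorVersion, int majorVersion,
                               int supportedMajor, int maxSupportedMinor) {
        return magic == MAGIC
                && majorVersion == supportedMajor      // a different major version is never run
                && minorVersion <= maxSupportedMinor;  // a newer minor version is not run either
    }
}
```

For the JDK 1.0.2 values quoted above, isSupported(0xCAFEBABEL, 3, 45, 45, 3) is true, while a class file with minor version 4 or major version 46 is rejected by the same call.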
constant_pool_count
The value of the constant_pool_count item must be greater than zero. It gives the number of entries in the constant_pool table of the class file, where the constant_pool entry at index zero is included in the count but is not present in the constant_pool table of the class file. A constant_pool index is considered valid if it is greater than zero and less than constant_pool_count.
constant_pool[]
The constant_pool is a table of variable-length structures (§4.4) representing various string constants, class names, field names, and other constants that are referred to within the ClassFile structure and its substructures.
The first entry of the constant_pool table, constant_pool[0], is reserved for internal use by a Java Virtual Machine implementation. That entry is not present in the class file. The first entry in the class file is constant_pool[1].
Each of the constant_pool table entries at indices 1 through constant_pool_count-1 is a variable-length structure (§4.4) whose format is indicated by its first "tag" byte.
access_flags
The value of the access_flags item is a mask of modifiers used with class and interface declarations. The access_flags modifiers are shown in Table 4.1.
An interface is distinguished by its ACC_INTERFACE flag being set. If ACC_INTERFACE is not set, this class file defines a class, not an interface.
Interfaces may only use flags indicated in Table 4.1 as used by interfaces. Classes may only use flags indicated in Table 4.1 as used by classes. An interface is implicitly abstract (§2.13.1); its ACC_ABSTRACT flag must be set. An interface cannot be final; its implementation could never be completed (§2.13.1) if it were, so it could not have its ACC_FINAL flag set.
The flags ACC_FINAL and ACC_ABSTRACT cannot both be set for a class; the implementation of such a class could never be completed (§2.8.2).
The setting of the ACC_SUPER flag directs the Java Virtual Machine which of two alternative semantics for its invokespecial instruction to express; it exists for backward compatibility for code compiled by Sun's older Java compilers. All new implementations of the Java Virtual Machine should implement the semantics for invokespecial documented in Chapter 6, "Java Virtual Machine Instruction Set." All new compilers to the Java Virtual Machine's instruction set should set the ACC_SUPER flag. Sun's older Java compilers generate ClassFile flags with ACC_SUPER unset. Sun's older Java Virtual Machine implementations ignore the flag if it is set.
All unused bits of the access_flags item, including those not assigned in Table 4.1, are reserved for future use. They should be set to zero in generated class files and should be ignored by Java Virtual Machine implementations.
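The consistency rules for class and interface flags can be expressed as a small check. The sketch below is illustrative only. The numeric flag values are those assigned in Table 4.1 of the specification, which is not reproduced in this text, and the class name ClassAccessFlags is hypothetical.

```java
// Hypothetical helper class; not part of the specification.
final class ClassAccessFlags {
    // Flag values as assigned in Table 4.1 (the table is not reproduced in this text).
    static final int ACC_FINAL = 0x0010;
    static final int ACC_INTERFACE = 0x0200;
    static final int ACC_ABSTRACT = 0x0400;

    /** Checks only the FINAL/ABSTRACT/INTERFACE rules stated in the text; other bits are ignored. */
    static boolean isConsistent(int accessFlags) {
        boolean isInterface = (accessFlags & ACC_INTERFACE) != 0;
        boolean isFinal = (accessFlags & ACC_FINAL) != 0;
        boolean isAbstract = (accessFlags & ACC_ABSTRACT) != 0;
        if (isInterface) {
            // An interface must be abstract and cannot be final.
            return isAbstract && !isFinal;
        }
        // A class cannot be both final and abstract.
        return !(isFinal && isAbstract);
    }
}
```

Ignoring the remaining bits follows the rule that unused bits are reserved and should be ignored by implementations.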
this_class
The value of the this_class item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Class_info (§4.4.1) structure representing the class or interface defined by this class file.
super_class
For a class, the value of the super_class item either must be zero or must be a valid index into the constant_pool table. If the value of the super_class item is nonzero, the constant_pool entry at that index must be a CONSTANT_Class_info (§4.4.1) structure representing the superclass of the class defined by this class file. Neither the superclass nor any of its superclasses may be a final class.
If the value of super_class is zero, then this class file must represent the class java.lang.Object, the only class or interface without a superclass.
For an interface, the value of super_class must always be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Class_info structure representing the class java.lang.Object.
interfaces_count
The value of the interfaces_count item gives the number of direct superinterfaces of this class or interface type.
interfaces[]
Each value in the interfaces array must be a valid index into the constant_pool table. The constant_pool entry at each value of interfaces[i], where 0 ≤ i < interfaces_count, must be a CONSTANT_Class_info (§4.4.1) structure representing an interface which is a direct superinterface of this class or interface type, in the left-to-right order given in the source for the type.
fields_count
The value of the fields_count item gives the number of field_info structures in the fields table. The field_info (§4.5) structures represent all fields, both class variables and instance variables, declared by this class or interface type.
fields[]
Each value in the fields table must be a variable-length field_info (§4.5) structure giving a complete description of a field in the class or interface type. The fields table includes only those fields that are declared by this class or interface. It does not include items representing fields that are inherited from superclasses or superinterfaces.
methods_count
The value of the methods_count item gives the number of method_info structures in the methods table.
methods[]
Each value in the methods table must be a variable-length method_info (§4.6) structure giving a complete description of and Java Virtual Machine code for a method in the class or interface.
The method_info structures represent all methods, both instance methods and, for classes, class (static) methods, declared by this class or interface type. The methods table only includes those methods that are explicitly declared by this class. Interfaces have only the single method <clinit>, the interface initialization method (§3.8). The methods table does not include items representing methods that are inherited from superclasses or superinterfaces.
attributes_count
The value of the attributes_count item gives the number of attributes (§4.7) in the attributes table of this class.
attributes[]
Each value of the attributes table must be a variable-length attribute structure. A ClassFile structure can have any number of attributes (§4.7) associated with it.
The only attribute defined by this specification for the attributes table of a ClassFile structure is the SourceFile attribute (§4.7.2).
A Java Virtual Machine implementation is required to silently ignore any or all attributes in the attributes table of a ClassFile structure that it does not recognize. Attributes not defined in this specification are not allowed to affect the semantics of the class file, but only to provide additional descriptive information (§4.7.1).
Class names that appear in class file structures are always represented in a fully qualified form (§2.7.9). These class names are always represented as CONSTANT_Utf8_info (§4.4.7) structures, and they are referenced from those CONSTANT_NameAndType_info (§4.4.6) structures that have class names as part of their descriptor (§4.3), as well as from all CONSTANT_Class_info (§4.4.1) structures.
For historical reasons the exact syntax of fully qualified class names that appear in class file structures differs from the familiar Java fully qualified class name documented in §2.7.9. In the internal form, the ASCII periods ('.') that normally separate the identifiers (§2.2) that make up the fully qualified name are replaced by ASCII forward slashes ('/'). For example, the normal fully qualified name of class Thread is java.lang.Thread. In the form used in descriptors in class files, a reference to the name of class Thread is implemented using a CONSTANT_Utf8_info structure representing the string "java/lang/Thread".
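The conversion between the two forms is a plain character substitution, as the following sketch shows. The class and method names are hypothetical and chosen only for this example.

```java
// Hypothetical helper class; not part of the specification.
final class InternalNames {
    /** Converts a fully qualified name such as java.lang.Thread to the internal form. */
    static String toInternalForm(String qualifiedName) {
        return qualifiedName.replace('.', '/');
    }

    /** Converts an internal-form name such as java/lang/Thread back to the familiar form. */
    static String fromInternalForm(String internalName) {
        return internalName.replace('/', '.');
    }
}
```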
FieldDescriptor:
    FieldType
ComponentType:
    FieldType
FieldType:
    BaseType
    ObjectType
    ArrayType
BaseType:
    B
    C
    D
    F
    I
    J
    S
    Z
ObjectType:
    L <classname> ;
ArrayType:
    [ ComponentType
The characters of BaseType, the L and ; of ObjectType, and the [ of ArrayType are all ASCII characters. The <classname> represents a fully qualified class name, for instance, java.lang.Thread. For historical reasons it is stored in a class file in a modified internal form (§4.2).
The meaning of the field types is as follows:
| BaseType character | Type | Interpretation |
| B | byte | signed byte |
| C | char | character |
| D | double | double-precision IEEE 754 float |
| F | float | single-precision IEEE 754 float |
| I | int | integer |
| J | long | long integer |
| L<classname>; | ... | an instance of the class |
| S | short | signed short |
| Z | boolean | true or false |
| [ | ... | one array dimension |
For example, the descriptor of an int instance variable is simply I. The descriptor of an instance variable of type Object is Ljava/lang/Object;. Note that the internal form of the fully qualified class name for class Object is used. The descriptor of an instance variable that is a multidimensional double array,
double d[][][];
is
[[[D
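To make the field descriptor notation concrete, the sketch below turns a descriptor back into a Java type name. It is an illustration rather than a validator: the class name FieldDescriptors is hypothetical, and malformed class names inside L...; are not checked.

```java
// Hypothetical helper class; not part of the specification.
final class FieldDescriptors {
    /** Converts a field descriptor such as [[[D into a Java type name such as double[][][]. */
    static String toJavaType(String descriptor) {
        // Each leading '[' contributes one array dimension.
        int dims = 0;
        while (dims < descriptor.length() && descriptor.charAt(dims) == '[') {
            dims++;
        }
        String element = descriptor.substring(dims);
        String name;
        if (element.length() > 2 && element.startsWith("L") && element.endsWith(";")) {
            // ObjectType: strip the L and ; and undo the internal form.
            name = element.substring(1, element.length() - 1).replace('/', '.');
        } else if (element.length() == 1) {
            name = baseTypeName(element.charAt(0));
        } else {
            throw new IllegalArgumentException("Malformed field descriptor: " + descriptor);
        }
        StringBuilder result = new StringBuilder(name);
        for (int i = 0; i < dims; i++) {
            result.append("[]");
        }
        return result.toString();
    }

    private static String baseTypeName(char c) {
        switch (c) {
            case 'B': return "byte";
            case 'C': return "char";
            case 'D': return "double";
            case 'F': return "float";
            case 'I': return "int";
            case 'J': return "long";
            case 'S': return "short";
            case 'Z': return "boolean";
            default: throw new IllegalArgumentException("Unknown base type: " + c);
        }
    }
}
```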
ParameterDescriptor:
    FieldType
A method descriptor represents the parameters that the method takes and the value that it returns:
MethodDescriptor:
    ( ParameterDescriptor* ) ReturnDescriptor
A return descriptor represents the return value from a method. It is a series of characters generated by the grammar:
ReturnDescriptor:
    FieldType
    V
The character V indicates that the method returns no value (its return type is void). Otherwise, the descriptor indicates the type of the return value.
A valid Java method descriptor must represent 255 or fewer words of method parameters, where that limit includes the word for this in the case of instance method invocations. The limit is on the number of words of method parameters and not on the number of parameters themselves; parameters of type long and double each use two words.
For example, the method descriptor for the method
Object mymethod(int i, double d, Thread t)
is
(IDLjava/lang/Thread;)Ljava/lang/Object;
Note that internal forms of the fully qualified class names of Thread and Object are used in the method descriptor.
The method descriptor for mymethod is the same whether mymethod is static or is an instance method. Although an instance method is passed this, a reference to the current class instance, in addition to its intended parameters, that fact is not reflected in the method descriptor. (A reference to this is not passed to a static method.) The reference to this is passed implicitly by the method invocation instructions of the Java Virtual Machine used to invoke instance methods.
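The 255-word limit can be checked by walking the parameter part of a method descriptor. The following sketch counts parameter words under the rules stated above: long and double take two words, everything else (including any array type) takes one, and an instance method adds one word for this. The class name MethodDescriptors is hypothetical, and the parser assumes a well-formed parameter list.

```java
// Hypothetical helper class; not part of the specification.
final class MethodDescriptors {
    /** Counts the words of parameters for a method descriptor such as (IDLjava/lang/Thread;)V. */
    static int parameterWords(String descriptor, boolean isStatic) {
        if (descriptor.isEmpty() || descriptor.charAt(0) != '(') {
            throw new IllegalArgumentException("Not a method descriptor: " + descriptor);
        }
        int words = isStatic ? 0 : 1;   // instance methods receive this implicitly
        int i = 1;
        while (i < descriptor.length() && descriptor.charAt(i) != ')') {
            char c = descriptor.charAt(i);
            boolean isArray = false;
            while (c == '[') {          // skip array dimensions
                isArray = true;
                i++;
                c = descriptor.charAt(i);
            }
            if (c == 'L') {             // ObjectType runs up to the next ';'
                int end = descriptor.indexOf(';', i);
                if (end < 0) {
                    throw new IllegalArgumentException("Unterminated class name: " + descriptor);
                }
                i = end + 1;
            } else {
                i++;
            }
            // Only a non-array long or double occupies two words.
            words += (!isArray && (c == 'J' || c == 'D')) ? 2 : 1;
        }
        if (i >= descriptor.length()) {
            throw new IllegalArgumentException("Missing ')' in descriptor: " + descriptor);
        }
        return words;
    }
}
```

For the mymethod example, the count is 4 words when the method is static (int, two for double, Thread) and 5 when it is an instance method.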
All constant_pool table entries have the following general format:
cp_info {
    u1 tag;
    u1 info[];
}
Each item in the constant_pool table must begin with a 1-byte tag indicating the kind of cp_info entry. The contents of the info array varies with the value of tag. The valid tags and their values are listed in Table 4.2.
The CONSTANT_Class_info structure is used to represent a class or an interface:
CONSTANT_Class_info {
    u1 tag;
    u2 name_index;
}
The items of the CONSTANT_Class_info structure are the following:
tag
The tag item has the value CONSTANT_Class (7).
name_index
The value of the name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing a valid fully qualified Java class name (§2.8.1) that has been converted to the class file's internal form (§4.2).
Because arrays are objects, the opcodes anewarray and multianewarray can reference array "classes" via CONSTANT_Class_info (§4.4.1) structures in the constant_pool table. In this case, the name of the class is the descriptor of the array type. For example, the class name representing a two-dimensional int array type;
int[][]
is
[[I
The class name representing the type array of class Thread;
Thread[]
is
[Ljava/lang/Thread;
A valid Java array type descriptor must have 255 or fewer array dimensions.
CONSTANT_Fieldref_info {
    u1 tag;
    u2 class_index;
    u2 name_and_type_index;
}
CONSTANT_Methodref_info {
    u1 tag;
    u2 class_index;
    u2 name_and_type_index;
}
CONSTANT_InterfaceMethodref_info {
    u1 tag;
    u2 class_index;
    u2 name_and_type_index;
}
The items of these structures are as follows:
tag
The tag item of a CONSTANT_Fieldref_info structure has the value CONSTANT_Fieldref (9).
The tag item of a CONSTANT_Methodref_info structure has the value CONSTANT_Methodref (10).
The tag item of a CONSTANT_InterfaceMethodref_info structure has the value CONSTANT_InterfaceMethodref (11).
class_index
The value of the class_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Class_info (§4.4.1) structure representing the class or interface type that contains the declaration of the field or method.
The class_index item of a CONSTANT_Fieldref_info or a CONSTANT_Methodref_info structure must be a class type, not an interface type. The class_index item of a CONSTANT_InterfaceMethodref_info structure must be an interface type that declares the given method.
name_and_type_index
The value of the name_and_type_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_NameAndType_info (§4.4.6) structure. This constant_pool entry indicates the name and descriptor of the field or method.
If the name of the method of a CONSTANT_Methodref_info or CONSTANT_InterfaceMethodref_info begins with a '<' ('\u003c'), then the name must be one of the special internal methods (§3.8), either <init> or <clinit>. In this case, the method must return no value.
The CONSTANT_String_info structure is used to represent constant objects of the type java.lang.String:
CONSTANT_String_info {
    u1 tag;
    u2 string_index;
}
The items of the CONSTANT_String_info structure are as follows:
tag
The tag item of the CONSTANT_String_info structure has the value CONSTANT_String (8).
string_index
The value of the string_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the sequence of characters to which the java.lang.String object is to be initialized.
The CONSTANT_Integer_info and CONSTANT_Float_info structures represent four-byte numeric (int and float) constants:
CONSTANT_Integer_info {
    u1 tag;
    u4 bytes;
}
CONSTANT_Float_info {
    u1 tag;
    u4 bytes;
}
The items of these structures are as follows:
tag
The tag item of the CONSTANT_Integer_info structure has the value CONSTANT_Integer (3).
The tag item of the CONSTANT_Float_info structure has the value CONSTANT_Float (4).
bytes
The bytes item of the CONSTANT_Integer_info structure contains the value of the int constant. The bytes of the value are stored in big-endian (high byte first) order.
The bytes item of the CONSTANT_Float_info structure contains the value of the float constant in IEEE 754 floating-point "single format" bit layout. The bytes of the value are stored in big-endian (high byte first) order, and are first converted into an int argument. Then:
- If the argument is 0x7f800000, the float value will be positive infinity.
- If the argument is 0xff800000, the float value will be negative infinity.
- If the argument is in the range 0x7f800001 through 0x7fffffff or in the range 0xff800001 through 0xffffffff, the float value will be NaN.
- In all other cases, let s, e, and m be three values that might be computed by
    int s = ((bytes >> 31) == 0) ? 1 : -1;
    int e = ((bytes >> 23) & 0xff);
    int m = (e == 0) ?
            (bytes & 0x7fffff) << 1 :
            (bytes & 0x7fffff) | 0x800000;
Then the float value equals the result of the mathematical expression $s \cdot m \cdot 2^{e-150}$.
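The decoding rules for a CONSTANT_Float_info value can be followed step by step in code. The sketch below is an illustration, with a hypothetical class name FloatConstant. It applies the infinity and NaN cases first and then evaluates s · m · 2^(e-150), using Math.scalb for the power of two so that the scaling is exact. In practice Float.intBitsToFloat performs the same conversion directly.

```java
// Hypothetical helper class; not part of the specification.
final class FloatConstant {
    /** Decodes the bytes item of a CONSTANT_Float_info structure, given as an int. */
    static float decode(int bits) {
        if (bits == 0x7f800000) {
            return Float.POSITIVE_INFINITY;
        }
        if (bits == 0xff800000) {
            return Float.NEGATIVE_INFINITY;
        }
        // Exponent field all ones with a nonzero fraction: both NaN ranges.
        if ((bits & 0x7f800000) == 0x7f800000) {
            return Float.NaN;
        }
        int s = ((bits >> 31) == 0) ? 1 : -1;
        int e = (bits >> 23) & 0xff;
        int m = (e == 0)
                ? (bits & 0x7fffff) << 1
                : (bits & 0x7fffff) | 0x800000;
        // Multiply in double so that a negative sign with m == 0 yields negative zero.
        return (float) Math.scalb((double) s * m, e - 150);
    }
}
```

Comparing Float.floatToRawIntBits(FloatConstant.decode(bits)) with bits for non-NaN inputs is one way to confirm that the formula and the library conversion agree.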
The CONSTANT_Long_info and CONSTANT_Double_info structures represent eight-byte numeric (long and double) constants:
CONSTANT_Long_info {
    u1 tag;
    u4 high_bytes;
    u4 low_bytes;
}
CONSTANT_Double_info {
    u1 tag;
    u4 high_bytes;
    u4 low_bytes;
}
All eight-byte constants take up two entries in the constant_pool table of the class file, as well as in the in-memory version of the constant pool that is constructed when a class file is read. If a CONSTANT_Long_info or CONSTANT_Double_info structure is the item in the constant_pool table at index n, then the next valid item in the pool is located at index n+2. The constant_pool index n+1 must be considered invalid and must not be used.
The items of these structures are as follows:
tag
The tag item of the CONSTANT_Long_info structure has the value CONSTANT_Long (5).
The tag item of the CONSTANT_Double_info structure has the value CONSTANT_Double (6).
high_bytes, low_bytes
The unsigned high_bytes and low_bytes items of the CONSTANT_Long_info structure together contain the value of the long constant ((long) high_bytes << 32) + low_bytes, where the bytes of each of high_bytes and low_bytes are stored in big-endian (high byte first) order.
The high_bytes and low_bytes items of the CONSTANT_Double_info structure contain the double value in IEEE 754 floating-point "double format" bit layout. The bytes of each item are stored in big-endian (high byte first) order. The high_bytes and low_bytes items are first converted into a long argument, called bits below. Then:
- If the argument is 0x7ff0000000000000L, the double value will be positive infinity.
- If the argument is 0xfff0000000000000L, the double value will be negative infinity.
- If the argument is in the range 0x7ff0000000000001L through 0x7fffffffffffffffL or in the range 0xfff0000000000001L through 0xffffffffffffffffL, the double value will be NaN.
- In all other cases, let s, e, and m be three values that might be computed from the argument:
    int s = ((bits >> 63) == 0) ? 1 : -1;
    int e = (int)((bits >> 52) & 0x7ffL);
    long m = (e == 0) ?
            (bits & 0xfffffffffffffL) << 1 :
            (bits & 0xfffffffffffffL) | 0x10000000000000L;
Then the floating-point value equals the double value of the mathematical expression $s \cdot m \cdot 2^{e-1075}$.
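A sketch of how the two u4 items combine into one eight-byte value may help here. It assumes high_bytes and low_bytes have been read with readInt and are therefore held in signed ints, so low_bytes must be masked to its unsigned value before it is added. The class name EightByteConstants is hypothetical. The infinity tests use the IEEE 754 double format patterns (exponent field all ones, fraction zero), and Double.longBitsToDouble offers the same conversion directly.

```java
// Hypothetical helper class; not part of the specification.
final class EightByteConstants {
    /** Combines high_bytes and low_bytes into ((long) high_bytes << 32) + low_bytes. */
    static long toLong(int highBytes, int lowBytes) {
        // lowBytes is an unsigned u4 held in a signed int, so mask after widening.
        return ((long) highBytes << 32) + (lowBytes & 0xFFFFFFFFL);
    }

    /** Decodes a CONSTANT_Double_info value by following the rules step by step. */
    static double decodeDouble(int highBytes, int lowBytes) {
        long bits = toLong(highBytes, lowBytes);
        if (bits == 0x7ff0000000000000L) {
            return Double.POSITIVE_INFINITY;
        }
        if (bits == 0xfff0000000000000L) {
            return Double.NEGATIVE_INFINITY;
        }
        // Exponent field all ones with a nonzero fraction: both NaN ranges.
        if ((bits & 0x7ff0000000000000L) == 0x7ff0000000000000L) {
            return Double.NaN;
        }
        int s = ((bits >> 63) == 0) ? 1 : -1;
        int e = (int) ((bits >> 52) & 0x7ffL);
        long m = (e == 0)
                ? (bits & 0xfffffffffffffL) << 1
                : (bits & 0xfffffffffffffL) | 0x10000000000000L;
        // m is below 2^53, so the product is exact; scalb applies 2^(e-1075).
        return Math.scalb((double) s * m, e - 1075);
    }
}
```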
The CONSTANT_NameAndType_info structure is used to represent a field or method, without indicating which class or interface type it belongs to:
CONSTANT_NameAndType_info {
    u1 tag;
    u2 name_index;
    u2 descriptor_index;
}
The items of the CONSTANT_NameAndType_info structure are as follows:
tag
The tag item of the CONSTANT_NameAndType_info structure has the value CONSTANT_NameAndType (12).
name_index
The value of the name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing a valid Java field name or method name (§2.7) stored as a simple (not fully qualified) name (§2.7.1), that is, as a Java identifier.
descriptor_index
The value of the descriptor_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing a valid Java field descriptor (§4.3.2) or method descriptor (§4.3.3).
The CONSTANT_Utf8_info structure is used to represent constant string values.
UTF-8 strings are encoded so that character sequences that contain only non-null ASCII characters can be represented using only one byte per character, but characters of up to 16 bits can be represented. All characters in the range '\u0001' to '\u007F' are represented by a single byte:
| 0 | bits 0-6 |
The seven bits of data in the byte give the value of the character represented. The null character ('\u0000') and characters in the range '\u0080' to '\u07FF' are represented by a pair of bytes x and y:
| x: | 1 | 1 | 0 | bits 6-10 | y: | 1 | 0 | bits 0-5 |
The bytes represent the character with the value ((x & 0x1f) << 6) + (y & 0x3f).
Characters in the range '\u0800' to '\uFFFF' are represented by three bytes x, y, and z:
| x: | 1 | 1 | 1 | 0 | bits 12-15 | y: | 1 | 0 | bits 6-11 | z: | 1 | 0 | bits 0-5 |
The character with the value ((x & 0xf) << 12) + ((y & 0x3f) << 6) + (z & 0x3f) is represented by the bytes.
The bytes of multibyte characters are stored in the class file in big-endian (high byte first) order.
There are two differences between this format and the "standard" UTF-8 format. First, the null byte (byte)0 is encoded using the two-byte format rather than the one-byte format, so that Java Virtual Machine UTF-8 strings never have embedded nulls. Second, only the one-byte, two-byte, and three-byte formats are used. The Java Virtual Machine does not recognize the longer UTF-8 formats.
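The three encoding forms and the special treatment of the null character can be written out as a short encoder. This is a sketch under the rules given above, with a hypothetical class name ModifiedUtf8. It produces only the encoded bytes, without the length item that precedes them in a CONSTANT_Utf8_info structure.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper class; not part of the specification.
final class ModifiedUtf8 {
    /** Encodes each 16-bit char of s using the one-, two-, or three-byte form. */
    static byte[] encode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x0001 && c <= 0x007F) {
                // One byte: 0, then bits 0-6.
                out.write(c);
            } else if (c <= 0x07FF) {
                // Two bytes; this branch also handles the null character.
                out.write(0xC0 | (c >> 6));
                out.write(0x80 | (c & 0x3F));
            } else {
                // Three bytes for characters from U+0800 through U+FFFF.
                out.write(0xE0 | (c >> 12));
                out.write(0x80 | ((c >> 6) & 0x3F));
                out.write(0x80 | (c & 0x3F));
            }
        }
        return out.toByteArray();
    }
}
```

The method java.io.DataOutputStream.writeUTF writes the same byte encoding, preceded by a two-byte length, so it can serve as a cross-check for this sketch.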
For more information regarding the UTF-8 format, see File System Safe UCS Transformation Format (FSS_UTF), X/Open Preliminary Specification, X/Open Company Ltd., Document Number: P316. This information also appears in ISO/IEC 10646, Annex P.
The CONSTANT_Utf8_info structure is
CONSTANT_Utf8_info {
    u1 tag;
    u2 length;
    u1 bytes[length];
}
The items of the CONSTANT_Utf8_info structure are the following:
tag
The tag item of the CONSTANT_Utf8_info structure has the value CONSTANT_Utf8 (1).
length
The value of the length item gives the number of bytes in the bytes array (not the length of the resulting string). The strings in the CONSTANT_Utf8_info structure are not null-terminated.
bytes[]
The bytes array contains the bytes of the string. No byte may have the value (byte)0 or (byte)0xf0-(byte)0xff.
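Because constant_pool entries vary in size, a reader has to walk the table entry by entry, using each tag to decide how many bytes follow. The sketch below does this for the tag values given in this chapter and records the tag found at each index. The class name ConstantPoolScan is hypothetical, and the input is assumed to start at the constant_pool_count item.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper class; not part of the specification.
final class ConstantPoolScan {
    /**
     * Returns an array where element i is the tag of constant_pool[i]. Element 0 and
     * the unusable index after each long or double constant are left as 0.
     */
    static int[] scanTags(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int count = in.readUnsignedShort();            // constant_pool_count
            int[] tags = new int[count];
            for (int i = 1; i < count; i++) {
                int tag = in.readUnsignedByte();
                tags[i] = tag;
                switch (tag) {
                    case 1:                                // Utf8: u2 length, then bytes
                        skip(in, in.readUnsignedShort());
                        break;
                    case 3: case 4:                        // Integer, Float: u4
                        skip(in, 4);
                        break;
                    case 5: case 6:                        // Long, Double: two u4 items
                        skip(in, 8);
                        i++;                               // index n+1 is not usable
                        break;
                    case 7: case 8:                        // Class, String: one u2 index
                        skip(in, 2);
                        break;
                    case 9: case 10: case 11: case 12:     // the ref kinds and NameAndType: two u2 items
                        skip(in, 4);
                        break;
                    default:
                        throw new IllegalArgumentException(
                                "Unknown constant pool tag " + tag + " at index " + i);
                }
            }
            return tags;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Skips exactly n bytes, failing with EOFException if the input is truncated. */
    private static void skip(DataInputStream in, int n) throws IOException {
        in.readFully(new byte[n]);
    }
}
```

The loop starts at index 1 because constant_pool[0] is not present in the class file, and it advances by two after an eight-byte constant.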
Each field is described by a variable-length field_info structure. The format of this structure is
field_info {
    u2 access_flags;
    u2 name_index;
    u2 descriptor_index;
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}
The items of the field_info structure are as follows:
access_flags
The value of the access_flags item is a mask of modifiers used to describe access permission to and properties of a field. The access_flags modifiers are shown in Table 4.3.
Fields of interfaces may only use flags indicated in Table 4.3 as used by any field. Fields of classes may use any of the flags in Table 4.3.
All unused bits of the access_flags item, including those not assigned in Table 4.3, are reserved for future use. They should be set to zero in generated class files and should be ignored by Java Virtual Machine implementations.
Class fields may have at most one of flags ACC_PUBLIC, ACC_PROTECTED, and ACC_PRIVATE set (§2.7.8). A class field may not have both ACC_FINAL and ACC_VOLATILE set (§2.9.1).
Each interface field is implicitly static and final (§2.13.4) and must have both its ACC_STATIC and ACC_FINAL flags set. Each interface field is implicitly public (§2.13.4) and must have its ACC_PUBLIC flag set.
name_index
The value of the name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure which must represent a valid Java field name (§2.7) stored as a simple (not fully qualified) name (§2.7.1), that is, as a Java identifier.
descriptor_index
The value of the descriptor_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure which must represent a valid Java field descriptor (§4.3.2).
attributes_count
The value of the attributes_count item indicates the number of additional attributes (§4.7) of this field.
attributes[]
Each value of the attributes table must be a variable-length attribute structure. A field can have any number of attributes (§4.7) associated with it.
The only attribute defined for the attributes table of a field_info structure by this specification is the ConstantValue attribute (§4.7.3).
A Java Virtual Machine implementation must recognize ConstantValue attributes in the attributes table of a field_info structure. A Java Virtual Machine implementation is required to silently ignore any or all other attributes in the attributes table that it does not recognize. Attributes not defined in this specification are not allowed to affect the semantics of the class file, but only to provide additional descriptive information (§4.7.1).
Each method, and each instance initialization method <init>, is described by a variable-length method_info structure. The structure has the following format:
method_info {
    u2 access_flags;
    u2 name_index;
    u2 descriptor_index;
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}
The items of the method_info structure are as follows:
access_flags
The value of the access_flags item is a mask of modifiers used to describe access permission to and properties of a method or instance initialization method (§3.8). The access_flags modifiers are shown in Table 4.4.
Methods in interfaces may only use flags indicated in Table 4.4 as used by any method. Class and instance methods (§2.10.3) may use any of the flags in Table 4.4. Instance initialization methods (§3.8) may only use ACC_PUBLIC, ACC_PROTECTED, and ACC_PRIVATE.
All unused bits of the access_flags item, including those not assigned in Table 4.4, are reserved for future use. They should be set to zero in generated class files and should be ignored by Java Virtual Machine implementations.
At most one of the flags ACC_PUBLIC, ACC_PROTECTED, and ACC_PRIVATE may be set for any method. Class and instance methods may not use ACC_ABSTRACT together with ACC_FINAL, ACC_NATIVE, or ACC_SYNCHRONIZED (that is, native and synchronized methods require an implementation). A class or instance method may not use ACC_PRIVATE with ACC_ABSTRACT (that is, a private method cannot be overridden, so such a method could never be implemented or used). A class or instance method may not use ACC_STATIC with ACC_ABSTRACT (that is, a static method is implicitly final and thus cannot be overridden, so such a method could never be implemented or used).
Class and interface initialization methods (§3.8), that is, methods named <clinit>, are called implicitly by the Java Virtual Machine; the value of their access_flags item is ignored.
Each interface method is implicitly abstract, and so must have its ACC_ABSTRACT flag set. Each interface method is implicitly public (§2.13.5), and so must have its ACC_PUBLIC flag set.
name_index
The value of the name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing either one of the special internal method names (§3.8), either <init> or <clinit>, or a valid Java method name (§2.7), stored as a simple (not fully qualified) name (§2.7.1).
descriptor_index
The value of the descriptor_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing a valid Java method descriptor (§4.3.3).
attributes_count
The value of the attributes_count item indicates the number of additional attributes (§4.7) of this method.
attributes[]
Each value of the attributes table must be a variable-length attribute structure. A method can have any number of optional attributes (§4.7) associated with it.
The only attributes defined by this specification for the attributes table of a method_info structure are the Code (§4.7.4) and Exceptions (§4.7.5) attributes.
A Java Virtual Machine implementation must recognize Code (§4.7.4) and Exceptions (§4.7.5) attributes. A Java Virtual Machine implementation is required to silently ignore any or all other attributes in the attributes table of a method_info structure that it does not recognize. Attributes not defined in this specification are not allowed to affect the semantics of the class file, but only to provide additional descriptive information (§4.7.1).
Attributes are used in the ClassFile (§4.1), field_info (§4.5), method_info (§4.6), and Code_attribute (§4.7.4) structures of the class file format. All attributes have the following general format:
attribute_info {
    u2 attribute_name_index;
    u4 attribute_length;
    u1 info[attribute_length];
}
For all attributes, the attribute_name_index must be a valid unsigned 16-bit index into the constant pool of the class. The constant_pool entry at attribute_name_index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the name of the attribute. The value of the attribute_length item indicates the length of the subsequent information in bytes. The length does not include the initial six bytes that contain the attribute_name_index and attribute_length items.
Certain attributes are predefined as part of the class file specification. The predefined attributes are the SourceFile (§4.7.2), ConstantValue (§4.7.3), Code (§4.7.4), Exceptions (§4.7.5), LineNumberTable (§4.7.6), and LocalVariableTable (§4.7.7) attributes. Within the context of their use in this specification, that is, in the attributes tables of the class file structures in which they appear, the names of these predefined attributes are reserved.
Of the predefined attributes, the Code, ConstantValue, and Exceptions attributes must be recognized and correctly read by a class file reader for correct interpretation of the class file by a Java Virtual Machine. Use of the remaining predefined attributes is optional; a class file reader may use the information they contain, and otherwise must silently ignore those attributes.
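The attribute_length item is what lets a reader step over an attribute it does not understand. The sketch below reads an attributes table and skips every info array, recording only each attribute's name index and length. The class name AttributeScan is hypothetical, and the input is assumed to start at an attributes_count item.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper class; not part of the specification.
final class AttributeScan {
    /** Returns one {attribute_name_index, attribute_length} pair per attribute in the table. */
    static long[][] scan(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int count = in.readUnsignedShort();               // attributes_count
            long[][] result = new long[count][];
            for (int i = 0; i < count; i++) {
                int nameIndex = in.readUnsignedShort();       // attribute_name_index
                long length = in.readInt() & 0xFFFFFFFFL;     // attribute_length as unsigned u4
                // Skip the info array in bounded chunks; the six header bytes are not counted in length.
                long remaining = length;
                while (remaining > 0) {
                    int chunk = (int) Math.min(remaining, 8192);
                    in.readFully(new byte[chunk]);
                    remaining -= chunk;
                }
                result[i] = new long[] { nameIndex, length };
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A real reader would look up each name index in the constant pool and interpret the attributes it recognizes instead of skipping all of them; the skipping path shown here is the one taken for unrecognized attributes.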
Compilers for Java source code are permitted to define and emit class files containing new attributes in the attributes tables of class file structures. Java Virtual Machine implementations are permitted to recognize and use new attributes found in the attributes tables of class file structures. However, all attributes not defined as part of this Java Virtual Machine specification must not affect the semantics of class or interface types. Java Virtual Machine implementations are required to silently ignore attributes they do not recognize.
For instance, defining a new attribute to support vendor-specific debugging is permitted. Because Java Virtual Machine implementations are required to ignore attributes they do not recognize, class files intended for that particular Java Virtual Machine implementation will be usable by other implementations even if those implementations cannot make use of the additional debugging information that the class files contain.
Java Virtual Machine implementations are specifically prohibited from throwing an exception or otherwise refusing to use class files simply because of the presence of some new attribute. Of course, tools operating on class files may not run correctly if given class files that do not contain all the attributes they require.
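The rule that unrecognized attributes are silently ignored can be illustrated with a minimal Java sketch. It relies only on the fact that every attribute begins with a u2 attribute_name_index and a u4 attribute_length, so a reader can consume the attribute without understanding it. The class AttributeSkipper, the RECOGNIZED set, and the utf8At callback (standing in for a constant pool lookup) are hypothetical names introduced for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.Set;
import java.util.function.IntFunction;

// Hypothetical helper for illustration; not part of any real class file library.
final class AttributeSkipper {
    // Attribute names this sketch chooses to interpret (an assumption).
    static final Set<String> RECOGNIZED = Set.of("Code", "ConstantValue", "Exceptions");

    // Reads one attribute_info structure. Returns its info bytes when the name
    // is recognized, or null after silently consuming an unrecognized attribute.
    static byte[] readOrSkip(DataInputStream in, IntFunction<String> utf8At)
            throws IOException {
        int nameIndex = in.readUnsignedShort();      // u2 attribute_name_index
        long length = in.readInt() & 0xFFFFFFFFL;    // u4 attribute_length, read as unsigned
        if (length > Integer.MAX_VALUE) {
            throw new IOException("attribute too large for this sketch");
        }
        byte[] info = new byte[(int) length];
        in.readFully(info);                          // consume exactly attribute_length bytes
        return RECOGNIZED.contains(utf8At.apply(nameIndex)) ? info : null;
    }

    public static void main(String[] args) throws IOException {
        // One made-up attribute: name index 1, length 2, two payload bytes.
        byte[] data = {0, 1, 0, 0, 0, 2, 0x12, 0x34};
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        byte[] result = readOrSkip(in, index -> "VendorDebug");
        System.out.println(result == null ? "skipped" : "kept");
    }
}
```

Because the length is known up front, the reader never throws or rejects the class file merely because the attribute name is unfamiliar; it simply moves on to the next attribute.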
Two attributes that are intended to be distinct, but that happen to use the same attribute name and are of the same length, will conflict on implementations that recognize either attribute. Attributes defined other than by Sun must have names chosen according to the package naming convention defined by The Java Language Specification. For instance, a new attribute defined by Netscape might have the name "COM.Netscape.new-attribute".
Sun may define additional attributes in future versions of this class file specification.
The SourceFile attribute is an optional fixed-length attribute in the attributes table of
the ClassFile (§4.1) structure. There can be no more than one SourceFile attribute in
the attributes table of a given ClassFile structure.
The SourceFile attribute has the format
    SourceFile_attribute {
        u2 attribute_name_index;
        u4 attribute_length;
        u2 sourcefile_index;
    }
The items of the SourceFile_attribute structure are as follows:
attribute_name_index
    The value of the attribute_name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the string "SourceFile".
attribute_length
    The value of the attribute_length item of a SourceFile_attribute structure must be 2.
sourcefile_index
    The value of the sourcefile_index item must be a valid index into the constant_pool table. The constant pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the string giving the name of the source file from which this class file was compiled.
    Only the name of the source file is given by the SourceFile attribute. It never represents the name of a directory containing the file or an absolute path name for the file. For instance, the SourceFile attribute might contain the file name foo.java but not the UNIX pathname /home/lindholm/foo.java.
The ConstantValue attribute is a fixed-length attribute used in the attributes table of the
field_info (§4.5) structures. A ConstantValue attribute represents the value of a constant
field that must be (explicitly or implicitly) static; that is, the ACC_STATIC bit (Table 4.3)
in the flags item of the field_info structure must be set. The field is not required
to be final. There can be no more than one ConstantValue attribute in the attributes
table of a given field_info structure. The constant field represented by the field_info
structure is assigned the value referenced by its ConstantValue attribute as part of its
initialization (§2.16.4).
Every Java Virtual Machine implementation must recognize ConstantValue attributes.
The ConstantValue attribute has the format
    ConstantValue_attribute {
        u2 attribute_name_index;
        u4 attribute_length;
        u2 constantvalue_index;
    }
The items of the ConstantValue_attribute structure are as follows:
attribute_name_index
    The value of the attribute_name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the string "ConstantValue".
attribute_length
    The value of the attribute_length item of a ConstantValue_attribute structure must be 2.
constantvalue_index
    The value of the constantvalue_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must give the constant value represented by this attribute.
    The constant_pool entry must be of a type appropriate to the field, as shown by Table 4.5.

    Field Type                          Entry Type
    long                                CONSTANT_Long
    float                               CONSTANT_Float
    double                              CONSTANT_Double
    int, short, char, byte, boolean     CONSTANT_Integer
    java.lang.String                    CONSTANT_String
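As an illustration of Table 4.5, the following Java sketch maps a field descriptor to the kind of constant pool entry a ConstantValue attribute may reference. The class and method names are hypothetical; the single-character descriptors (J, F, D, I, S, C, B, Z) are the field descriptor forms of §4.3.2, and the entry kinds are returned as plain strings purely for readability.

```java
// Hypothetical helper mirroring Table 4.5; not part of any real library.
final class ConstantValueTypes {
    // Returns the constant pool entry type permitted for a field with the
    // given descriptor, or throws if the field type cannot carry a ConstantValue.
    static String entryTypeFor(String fieldDescriptor) {
        switch (fieldDescriptor) {
            case "J":
                return "CONSTANT_Long";
            case "F":
                return "CONSTANT_Float";
            case "D":
                return "CONSTANT_Double";
            case "I": case "S": case "C": case "B": case "Z":
                // int, short, char, byte, and boolean all share one entry type.
                return "CONSTANT_Integer";
            case "Ljava/lang/String;":
                return "CONSTANT_String";
            default:
                throw new IllegalArgumentException(
                        "no ConstantValue entry type for " + fieldDescriptor);
        }
    }

    public static void main(String[] args) {
        System.out.println(entryTypeFor("Z"));
    }
}
```

A reader could use such a check to reject a field_info whose ConstantValue attribute points at an entry of the wrong kind.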
The Code attribute is a variable-length attribute used in the attributes table of
method_info structures. A Code attribute contains the Java Virtual Machine instructions and auxiliary information for a single Java method, instance initialization
method (§3.8), or class or interface initialization method (§3.8). Every Java Virtual
Machine implementation must recognize Code attributes. There must be exactly one
Code attribute in each method_info structure.
The Code attribute has the format
    Code_attribute {
        u2 attribute_name_index;
        u4 attribute_length;
        u2 max_stack;
        u2 max_locals;
        u4 code_length;
        u1 code[code_length];
        u2 exception_table_length;
        {   u2 start_pc;
            u2 end_pc;
            u2 handler_pc;
            u2 catch_type;
        } exception_table[exception_table_length];
        u2 attributes_count;
        attribute_info attributes[attributes_count];
    }
The items of the Code_attribute structure are as follows:
attribute_name_index
    The value of the attribute_name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the string "Code".
attribute_length
    The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes.
max_stack
    The value of the max_stack item gives the maximum number of words on the operand stack at any point during execution of this method.
max_locals
    The value of the max_locals item gives the number of local variables used by this method, including the parameters passed to the method on invocation. The index of the first local variable is 0. The greatest local variable index for a one-word value is max_locals-1. The greatest local variable index for a two-word value is max_locals-2.
code_length
    The value of the code_length item gives the number of bytes in the code array for this method. The value of code_length must be greater than zero; the code array must not be empty.
code[]
    The code array gives the actual bytes of Java Virtual Machine code that implement the method.
    When the code array is read into memory on a byte addressable machine, if the first byte of the array is aligned on a 4-byte boundary, the tableswitch and lookupswitch 32-bit offsets will be 4-byte aligned; refer to the descriptions of those instructions for more information on the consequences of code array alignment.
    The detailed constraints on the contents of the code array are extensive and are given in a separate section (§4.8).
exception_table_length
    The value of the exception_table_length item gives the number of entries in the exception_table table.
exception_table[]
    Each entry in the exception_table array describes one exception handler in the code array. Each exception_table entry contains the following items:
    start_pc, end_pc
        The values of the two items start_pc and end_pc indicate the ranges in the code array at which the exception handler is active. The value of start_pc must be a valid index into the code array of the opcode of an instruction. The value of end_pc either must be a valid index into the code array of the opcode of an instruction, or must be equal to code_length, the length of the code array. The value of start_pc must be less than the value of end_pc.
        The start_pc is inclusive and end_pc is exclusive; that is, the exception handler must be active while the program counter is within the interval [start_pc, end_pc).2
    handler_pc
        The value of the handler_pc item indicates the start of the exception handler. The value of the item must be a valid index into the code array, must be the index of the opcode of an instruction, and must be less than the value of the code_length item.
    catch_type
        If the value of the catch_type item is nonzero, it must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Class_info (§4.4.1) structure representing a class of exceptions that this exception handler is designated to catch. This class must be the class Throwable or one of its subclasses. The exception handler will be called only if the thrown exception is an instance of the given class or one of its subclasses.
        If the value of the catch_type item is zero, this exception handler is called for all exceptions. This is used to implement finally (see Section 7.13, "Compiling finally").
attributes_count
    The value of the attributes_count item indicates the number of attributes of the Code attribute.
attributes[]
    Each value of the attributes table must be a variable-length attribute structure. A Code attribute can have any number of optional attributes associated with it.
    Currently, the LineNumberTable (§4.7.6) and LocalVariableTable (§4.7.7) attributes, both of which contain debugging information, are defined and used with the Code attribute.
    A Java Virtual Machine implementation is permitted to silently ignore any or all attributes in the attributes table of a Code attribute. Attributes not defined in this specification are not allowed to affect the semantics of the class file, but only to provide additional descriptive information (§4.7.1).
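The exception_table items described above can be illustrated with a small Java sketch that selects a handler for a given program counter. It applies the half-open interval [start_pc, end_pc) and treats a catch_type of zero as matching every exception. The class names and the matchesCatchType callback (which stands in for resolving the CONSTANT_Class_info entry and testing the thrown object against it) are hypothetical, and searching the entries in table order is an assumption of this sketch rather than something stated in this section.

```java
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical model of exception_table lookup, for illustration only.
final class ExceptionTableLookup {
    // One exception_table entry; field names follow the items in the text.
    static final class Entry {
        final int startPc, endPc, handlerPc, catchType;

        Entry(int startPc, int endPc, int handlerPc, int catchType) {
            this.startPc = startPc;
            this.endPc = endPc;
            this.handlerPc = handlerPc;
            this.catchType = catchType;
        }
    }

    // Returns the handler_pc of the first entry that is active at pc and
    // catches the thrown exception, or -1 if no entry applies.
    static int findHandler(List<Entry> table, int pc, IntPredicate matchesCatchType) {
        for (Entry e : table) {
            // start_pc is inclusive, end_pc is exclusive.
            boolean active = e.startPc <= pc && pc < e.endPc;
            // catch_type 0 means the handler is called for all exceptions.
            boolean catches = e.catchType == 0 || matchesCatchType.test(e.catchType);
            if (active && catches) {
                return e.handlerPc;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Entry> table = List.of(new Entry(0, 4, 10, 0));
        System.out.println(findHandler(table, 3, type -> false));
    }
}
```

Note that a pc equal to end_pc is outside the protected range, which is why end_pc may legally equal code_length.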
The Exceptions attribute is a variable-length attribute used in the attributes table of a
method_info (§4.6) structure. The Exceptions attribute indicates which checked exceptions a method may throw. There must be exactly one Exceptions attribute in each
method_info structure.
The Exceptions attribute has the format
    Exceptions_attribute {
        u2 attribute_name_index;
        u4 attribute_length;
        u2 number_of_exceptions;
        u2 exception_index_table[number_of_exceptions];
    }
The items of the Exceptions_attribute structure are as follows:
attribute_name_index
    The value of the attribute_name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be the CONSTANT_Utf8_info (§4.4.7) structure representing the string "Exceptions".
attribute_length
    The value of the attribute_length item indicates the attribute length, excluding the initial six bytes.
number_of_exceptions
    The value of the number_of_exceptions item indicates the number of entries in the exception_index_table.
exception_index_table[]
    Each nonzero value in the exception_index_table array must be a valid index into the constant_pool table. For each table item, if exception_index_table[i] != 0, where 0 ≤ i < number_of_exceptions, then the constant_pool entry at index exception_index_table[i] must be a CONSTANT_Class_info (§4.4.1) structure representing a class type that this method is declared to throw.
A method should only throw an exception if at least one of the following three criteria is met:
RuntimeException or one of its subclasses.
Error or one of its subclasses.
exception_index_table above, or one of their subclasses.
throws clauses when classes
are verified.
The LineNumberTable attribute is an optional variable-length attribute in the attributes
table of a Code (§4.7.4) attribute. It may be used by debuggers to determine which
part of the Java Virtual Machine code array corresponds to a given line number in the
original Java source file. If LineNumberTable attributes are present in the attributes
table of a given Code attribute, then they may appear in any order. Furthermore, multiple LineNumberTable attributes may together represent a given line of a Java source
file; that is, LineNumberTable attributes need not be one-to-one with source lines.3
The LineNumberTable attribute has the format
    LineNumberTable_attribute {
        u2 attribute_name_index;
        u4 attribute_length;
        u2 line_number_table_length;
        {   u2 start_pc;
            u2 line_number;
        } line_number_table[line_number_table_length];
    }
The items of the LineNumberTable_attribute structure are as follows:
attribute_name_index
    The value of the attribute_name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the string "LineNumberTable".
attribute_length
    The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes.
line_number_table_length
    The value of the line_number_table_length item indicates the number of entries in the line_number_table array.
line_number_table[]
    Each entry in the line_number_table array indicates that the line number in the original Java source file changes at a given point in the code array. Each entry must contain the following items:
    start_pc
        The value of the start_pc item must indicate the index into the code array at which the code for a new line in the original Java source file begins. The value of start_pc must be less than the value of the code_length item of the Code attribute of which this LineNumberTable is an attribute.
    line_number
        The value of the line_number item must give the corresponding line number in the original Java source file.
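To show how a debugger might use these entries, the following Java sketch maps a code array index back to a source line. Since each entry marks the point at which the line number changes, the sketch picks the entry with the greatest start_pc that does not exceed the given pc; it makes no assumption about the order of the entries. The class name, the method name, and the representation of each entry as a two-element int array {start_pc, line_number} are hypothetical choices made for brevity.

```java
// Hypothetical lookup over line_number_table entries, for illustration only.
final class LineNumbers {
    // Each row is {start_pc, line_number}. Returns the line number in effect
    // at pc, or -1 if no entry starts at or before pc.
    static int lineFor(int[][] table, int pc) {
        int bestStart = -1;
        int bestLine = -1;
        for (int[] entry : table) {
            int startPc = entry[0];
            // Keep the closest entry that begins at or before pc.
            if (startPc <= pc && startPc > bestStart) {
                bestStart = startPc;
                bestLine = entry[1];
            }
        }
        return bestLine;
    }

    public static void main(String[] args) {
        int[][] table = {{0, 10}, {5, 11}, {9, 13}};
        System.out.println(lineFor(table, 6));
    }
}
```

A linear scan keeps the sketch independent of entry order; a tool that sorted the entries once could use a binary search instead.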
The LocalVariableTable attribute is an optional variable-length attribute of a Code
(§4.7.4) attribute. It may be used by debuggers to determine the value of a given
local variable during the execution of a method. If LocalVariableTable attributes are
present in the attributes table of a given Code attribute, then they may appear in any
order. There may be no more than one LocalVariableTable attribute per local variable
in the Code attribute.
The LocalVariableTable attribute has the format
    LocalVariableTable_attribute {
        u2 attribute_name_index;
        u4 attribute_length;
        u2 local_variable_table_length;
        {   u2 start_pc;
            u2 length;
            u2 name_index;
            u2 descriptor_index;
            u2 index;
        } local_variable_table[local_variable_table_length];
    }
The items of the LocalVariableTable_attribute structure are as follows:
attribute_name_index
    The value of the attribute_name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the string "LocalVariableTable".
attribute_length
    The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes.
local_variable_table_length
    The value of the local_variable_table_length item indicates the number of entries in the local_variable_table array.
local_variable_table[]
    Each entry in the local_variable_table array indicates a range of code array offsets within which a local variable has a value. It also indicates the index into the local variables of the current frame at which that local variable can be found. Each entry must contain the following items:
    start_pc, length
        The given local variable must have a value at indices into the code array in the interval [start_pc, start_pc+length], that is, between start_pc and start_pc+length inclusive. The value of start_pc must be a valid index into the code array of this Code attribute of the opcode of an instruction. The value of start_pc+length must be either a valid index into the code array of this Code attribute of the opcode of an instruction, or the first index beyond the end of that code array.
    name_index, descriptor_index
        The value of the name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must contain a CONSTANT_Utf8_info (§4.4.7) structure representing a valid Java local variable name stored as a simple name (§2.7.1).
        The value of the descriptor_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must contain a CONSTANT_Utf8_info (§4.4.7) structure representing a valid descriptor for a Java local variable. Java local variable descriptors have the same form as field descriptors (§4.3.2).
    index
        The given local variable must be at index in its method's local variables. If the local variable at index is a two-word type (double or long), it occupies both index and index+1.
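The entries described above can be queried as in the following Java sketch, which finds the name of the variable occupying a given local variable index at a given code offset. It uses the inclusive interval [start_pc, start_pc+length] exactly as this section states it. The class names are hypothetical, and the name and descriptor are stored as already-resolved strings rather than constant pool indices, which is a simplification made for this illustration.

```java
import java.util.List;

// Hypothetical lookup over local_variable_table entries, for illustration only.
final class LocalVariableLookup {
    static final class Entry {
        final int startPc, length, index;
        final String name, descriptor;   // resolved from name_index / descriptor_index

        Entry(int startPc, int length, String name, String descriptor, int index) {
            this.startPc = startPc;
            this.length = length;
            this.name = name;
            this.descriptor = descriptor;
            this.index = index;
        }
    }

    // Returns the name of the variable stored at local variable index `slot`
    // while the program counter is at pc, or null if no entry covers it.
    static String nameAt(List<Entry> table, int pc, int slot) {
        for (Entry e : table) {
            // Both ends of the range are inclusive, as stated in the text.
            boolean inRange = e.startPc <= pc && pc <= e.startPc + e.length;
            if (inRange && e.index == slot) {
                return e.name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Entry> table = List.of(new Entry(2, 6, "count", "I", 1));
        System.out.println(nameAt(table, 4, 1));
    }
}
```

Because the same local variable index can be reused by different variables over disjoint ranges, both the pc and the index are needed to identify a variable.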
code array of the
Code attribute of a method_info structure of a class file. This section describes the constraints associated with the contents of the Code_attribute structure.
class file are those defining the well-formedness of the
file. With the exception of the static constraints on the Java Virtual Machine code of
the class file, these constraints have been given in the previous section. The static
constraints on the Java Virtual Machine code in a class file specify how Java Virtual
Machine instructions must be laid out in the code array, and what the operands of
individual instructions must be.
The static constraints on the instructions in the code array are as follows:
code array must not be empty, so the code_length attribute cannot have the value 0.
code array begins at index 0.
code array. Instances of instructions using the reserved opcodes (§6.2), the _quick opcodes documented in Chapter 9, "An Optimization," or any opcodes not documented in this specification may not appear in the code array.
code array except the last, the index of the opcode of the next instruction equals the index of the opcode of the current instruction plus the length of that instruction, including all its operands. The wide instruction is treated like any other instruction for these purposes; the opcode specifying the operation that a wide instruction is to modify is treated as one of the operands of that wide instruction. That opcode must never be directly reachable by the computation.
code array must be the byte at index code_length-1.
code array are as follows:
constant_pool table. The constant pool entry referenced by that index must be of type CONSTANT_Integer, CONSTANT_Float, or CONSTANT_String.
constant_pool table. The constant pool entry referenced by that index must be of type CONSTANT_Long or CONSTANT_Double. In addition, the subsequent constant pool index must also be a valid index into the constant pool, and the constant pool entry at that index must not be used.
constant_pool table. The constant pool entry referenced by that index must be of type CONSTANT_Fieldref.
constant_pool table. The constant pool entry referenced by that index must be of type CONSTANT_Methodref.
<init>, the instance initialization method (§3.8). No other method whose name begins with the character '<' ('\u003c') may be called by the method invocation instructions. In particular, the class initialization method <clinit> is never called explicitly from Java Virtual Machine instructions, but only implicitly by the Java Virtual Machine itself.
constant_pool table. The constant pool entry referenced by that index must be of type CONSTANT_InterfaceMethodref. The value of the nargs operand of each invokeinterface instruction must be the same as the number of argument words implied by the descriptor of the CONSTANT_NameAndType_info structure referenced by the CONSTANT_InterfaceMethodref constant pool entry. The fourth operand byte of each invokeinterface instruction must have the value zero.
constant_pool table. The constant pool entry referenced by that index must be of type CONSTANT_Class.
CONSTANT_Class constant_pool table entry representing an array class. The new instruction cannot be used to create an array. The new instruction also cannot be used to create an interface or an instance of an abstract class, but those checks are performed at link time.
CONSTANT_Class operand, it must not attempt to create more dimensions than are in the array type. The dimensions operand of each multianewarray instruction must not be zero.
T_BOOLEAN (4), T_CHAR (5), T_FLOAT (6), T_DOUBLE (7), T_BYTE (8), T_SHORT (9), T_INT (10), or T_LONG (11).
max_locals-1.
max_locals-1.
max_locals-2.
max_locals-2.
code array specify constraints on relationships
between Java Virtual Machine instructions. The structural constraints are as follows:
int is also permitted to operate on values of type byte, char, and short. (As noted in §3.11.1, the Java Virtual Machine internally converts values of types byte, char, and short to type int.)
long or double) be reversed or split up. At no point can the words of a two-word type be operated on individually.
max_stack words.
<init>, a method in this, a private method, or a method in a superclass of this.
<init> is invoked, an uninitialized class instance must be in an appropriate position on the operand stack. The <init> method must never be invoked on an initialized class instance.
finally clause. However, an uninitialized class instance may be on the operand stack in code protected by an exception handler or a finally clause. When an exception is thrown, the contents of the operand stack are discarded.
Object, must call either another instance initialization method of this or an instance initialization method of its immediate superclass super before its instance members are accessed. However, this is not necessary in the case of class Object, which does not have a superclass (§2.4.6).
abstract method must never be invoked.
byte, char, short, or int, only the ireturn instruction may be used. If the method returns a float, long, or double, only an freturn, lreturn, or dreturn instruction, respectively, may be used. If the method returns a reference type, it must do so using an areturn instruction, and the returned value must be assignment compatible (§2.6.6) with the return descriptor (§4.3.3) of the method. All instance initialization methods, static initializers, and methods declared to return void must only use the return instruction.
protected field of a superclass, then the type of the class instance being accessed must be the same as or a subclass of the current class. If invokevirtual is used to access a protected method of a superclass, then the type of the class instance being accessed must be the same as or a subclass of the current class.
byte, char, short, or int, then the value must be an int. If the descriptor type is float, long, or double, then the value must be a float, long, or double, respectively. If the descriptor type is a reference type, then the value must be of a type that is assignment compatible (§2.6.6) with the descriptor type.
reference by an aastore instruction must be assignment compatible (§2.6.6) with the component type of the array.
Throwable or of subclasses of Throwable.
code array.
returnAddress) may be loaded from a local variable.
try-finally constructs from within a finally clause. For more information on Java Virtual Machine subroutines, see §4.9.6.)
returnAddress can be returned to at most once. If a ret instruction returns to a point in the subroutine call chain above the ret instruction corresponding to a given instance of type returnAddress, then that instance can never be used as a return address.
class files. The HotJava browser needs to determine whether the class file
was produced by a trustworthy Java compiler or by an adversary attempting to
exploit the interpreter.
An additional problem with compile-time checking is version skew. A user may have successfully compiled a class, say PurchaseStockOptions, to be a subclass of TradingClass. But the definition of TradingClass might have changed in a way that is not compatible with preexisting binaries since the time the class was compiled. Methods might have been deleted, or had their return types or modifiers changed. Fields might have changed types or changed from instance variables to class variables. The access modifiers of a method or variable may have changed from public to private. For a discussion of these issues, see Chapter 13, "Binary Compatibility," in The Java Language Specification.
Because of these potential problems, the Java Virtual Machine needs to verify for itself that the desired constraints hold on the class files it attempts to incorporate. A well-written Java Virtual Machine emulator could reject poorly formed instructions when a class file is loaded. Other constraints could be checked at run time. For example, a Java Virtual Machine implementation could tag runtime data and have each instruction check that its operands are of the right type.
Instead, Sun's Java Virtual Machine implementation verifies that each class file it considers untrustworthy satisfies the necessary constraints at linking time (§2.16.3). Structural constraints on the Java Virtual Machine code are checked using a simple theorem prover.
Linking-time verification enhances the performance of the interpreter. Expensive checks that would otherwise have to be performed to verify constraints at run time for each interpreted instruction can be eliminated. The Java Virtual Machine can assume that these checks have already been performed. For example, the Java Virtual Machine will already know the following:
The class file verifier is independent of any Java compiler. It should certify all code generated by Sun's current Java compiler; it should also certify code that other compilers can generate, as well as code that the current compiler could not possibly generate. Any class file that satisfies the structural criteria and static constraints will be certified by the verifier.
The class file verifier is also independent of the Java language. Other languages can be compiled into the class format, but will only pass verification if they satisfy the same constraints as a class file compiled from Java source.
The class file verifier operates in four passes:
Pass 1: When a prospective class file is loaded (§2.16.2) by the Java Virtual
Machine, the Java Virtual Machine first ensures that the file has the basic format of a
Java class file. The first four bytes must contain the right magic number. All recognized attributes must be of the proper length. The class file must not be truncated or
have extra bytes at the end. The constant pool must not contain any superficially
unrecognizable information.
While class file verification properly occurs during class linking (§2.16.3), this check for basic class file integrity is necessary for any interpretation of the class file contents and can be considered to be logically part of the verification process.
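As one small example of a Pass 1 check, the following Java sketch tests whether a byte array begins with the class file magic number. The value 0xCAFEBABE is taken from the definition of the ClassFile structure (§4.1), not from this passage, and the class and method names are hypothetical. java.nio.ByteBuffer reads big-endian by default, which matches the byte order of the class file format.

```java
import java.nio.ByteBuffer;

// Hypothetical Pass 1 helper, for illustration only.
final class MagicCheck {
    // Magic number of the ClassFile structure (§4.1).
    static final int MAGIC = 0xCAFEBABE;

    // Returns true if the first four bytes, read in big-endian order,
    // equal the class file magic number.
    static boolean hasClassFileMagic(byte[] bytes) {
        return bytes.length >= 4 && ByteBuffer.wrap(bytes).getInt() == MAGIC;
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};
        System.out.println(hasClassFileMagic(header));
    }
}
```

A complete Pass 1 would go on to check attribute lengths, truncation, trailing bytes, and the constant pool; this sketch covers only the first of the checks listed.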
Pass 2: When the class file is linked, the verifier performs all additional verification
that can be done without looking at the code array of the Code attribute (§4.7.4). The
checks performed by this pass include the following:
final classes are not subclassed, and that final methods are not overridden.
Object) has a superclass.
CONSTANT_Utf8 string reference in the constant pool.
Pass 3: Still during linking, the verifier checks the code array of the Code attribute for
each method of the class file by performing data-flow analysis on each method. The
verifier ensures that at any given point in the program, no matter what code path is
taken to reach that point:
Pass 4: For efficiency reasons, certain tests that could in principle be performed in
Pass 3 are delayed until the first time the code for the method is actually invoked. In
so doing, Pass 3 of the verifier avoids loading class files unless it has to.
For example, if a method invokes another method that returns an instance of class A, and that instance is only assigned to a field of the same type, the verifier does not bother to check if the class A actually exists. However, if it is assigned to a field of the type B, the definitions of both A and B must be loaded in to ensure that A is a subclass of B.
Pass 4 is a virtual pass whose checking is done by the appropriate Java Virtual Machine instructions. The first time an instruction that references a type is executed, the executing instruction does the following:
LinkageError to be thrown.
A Java Virtual Machine is allowed to perform any or all of the Pass 4 steps, except for class or interface initialization, as part of Pass 3; see §2.16.1, "Virtual Machine Start-up," for an example and more discussion.
In Sun's Java Virtual Machine implementation, after the verification has been performed, the instruction in the Java Virtual Machine code is replaced with an alternative form of the instruction (see Chapter 9, "An Optimization"). For example, the opcode new is replaced with new_quick. This alternative instruction indicates that the verification needed by this instruction has taken place and does not need to be performed again. Subsequent invocations of the method will thus be faster. It is illegal for these alternative instruction forms to appear in class files, and they should never be encountered by the verifier.
class file verification. This section looks at the verification of Java Virtual Machine code in more detail.
The code for each method is verified independently. First, the bytes that make up the code are broken up into a sequence of instructions, and the index into the code array of the start of each instruction is placed in an array. The verifier then goes through the code a second time and parses the instructions. During this pass a data structure is built to hold information about each Java Virtual Machine instruction in the method. The operands, if any, of each instruction are checked to make sure they are valid. For instance:
code array for the method.
int or float, or for instances of class String; the instruction getfield must reference a field.
byte, short, char) when determining the value types on the operand stack. Next, a data-flow analyzer is initialized. For the first instruction of the method, the local variables which represent parameters initially contain values of the types indicated by the method's type descriptor; the operand stack is empty. All other local variables contain an illegal value. For the other instructions, which have not been examined yet, no information is available regarding the operand stack or local variables.
Finally, the data-flow analyzer is run. For each instruction, a "changed" bit indicates whether this instruction needs to be looked at. Initially, the "changed" bit is only set for the first instruction. The data-flow analyzer executes the following loop:
reference values may appear at corresponding places on the two stacks. In this case, the merged operand stack contains a reference to an instance of the first common superclass or common superinterface of the two types. Such a reference type always exists because the type Object is a supertype of all class and interface types. If the operand stacks cannot be merged, verification of the method fails.
To merge two local variable states, corresponding pairs of local variables are compared. If the two types are not identical, then unless both contain reference values, the verifier records that the local variable contains an unusable value. If both of the pair of local variables contain reference values, the merged state contains a reference to an instance of the first common superclass of the two types.
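The merge of two reference types into their first common superclass can be sketched in Java as follows. This is only an illustration under a loud assumption: it uses java.lang.Class reflection on already loaded classes, whereas a verifier works on its own model of types and tries to avoid loading classes. The class and method names are hypothetical.

```java
// Hypothetical illustration of the reference-type merge; a real verifier
// does not operate on loaded java.lang.Class objects like this.
final class TypeMerge {
    // Walks up the superclass chain of `a` and returns the first class that
    // is also a supertype of `b`. Falls back to Object, which is a supertype
    // of all class and interface types.
    static Class<?> firstCommonSuperclass(Class<?> a, Class<?> b) {
        for (Class<?> c = a; c != null; c = c.getSuperclass()) {
            if (c.isAssignableFrom(b)) {
                return c;
            }
        }
        return Object.class;
    }

    public static void main(String[] args) {
        System.out.println(firstCommonSuperclass(Integer.class, Long.class));
    }
}
```

When the two types are identical the loop returns that type on its first iteration, so identical states merge to themselves.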
If the data-flow analyzer runs on a method without reporting a verification failure, then the method has been successfully verified by Pass 3 of the class file verifier.
Certain instructions and data types complicate the data-flow analyzer. We now examine each of these in more detail.
Values of the long and double types each take two consecutive words on the operand
stack and in the local variables.
Whenever a long or double is moved into a local variable, the subsequent local variable is marked as containing the second half of a long or double. This special value indicates that all references to the long or double must be through the index of the lower-numbered local variable.
Whenever any value is moved to a local variable, the preceding local variable is examined to see if it contains the first word of a long or a double. If so, that preceding local variable is changed to indicate that it now contains an unusable value. Since half of the long or double has been overwritten, the other half must no longer be used.
Dealing with 64-bit quantities on the operand stack is simpler; the verifier treats them as single units on the stack. For example, the verification code for the dadd opcode (add two double values) checks that the top two items on the stack are both of type double. When calculating operand stack length, values of type long and double have length two.
Untyped instructions that manipulate the operand stack must treat values of type double and long as atomic. For example, the verifier reports a failure if the top value on the stack is a double and it encounters an instruction such as pop or dup. The instructions pop2 or dup2 must be used instead.
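The local variable bookkeeping described above for long and double values can be sketched in Java as follows. The Slot enum and the store method are a hypothetical, much reduced model of a verifier's local variable state; only the two rules given in the text are implemented, and the caller is assumed to pass an index that leaves room for the second word of a two-word value.

```java
// Hypothetical, reduced model of the verifier's local variable state.
final class LocalSlots {
    enum Slot { INT, FLOAT, REFERENCE, LONG, DOUBLE, SECOND_HALF, UNUSABLE }

    static boolean isTwoWord(Slot s) {
        return s == Slot.LONG || s == Slot.DOUBLE;
    }

    // Models moving a value of the given type into locals[index].
    static void store(Slot[] locals, int index, Slot type) {
        // If the preceding variable held the first word of a long or double,
        // its second word is being overwritten, so it becomes unusable.
        if (index > 0 && isTwoWord(locals[index - 1])) {
            locals[index - 1] = Slot.UNUSABLE;
        }
        locals[index] = type;
        // A long or double also claims the subsequent local variable.
        if (isTwoWord(type)) {
            locals[index + 1] = Slot.SECOND_HALF;
        }
    }

    public static void main(String[] args) {
        Slot[] locals = {Slot.UNUSABLE, Slot.UNUSABLE, Slot.UNUSABLE};
        store(locals, 0, Slot.DOUBLE);
        store(locals, 1, Slot.INT);
        System.out.println(java.util.Arrays.toString(locals));
    }
}
```

Marking the second word with its own value lets the model reject any load that names the higher-numbered index directly.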
    ...
    new myClass(i, j, k);
    ...
can be implemented by the following:
    ...
    new #1                        // Allocate uninitialized space for myClass
    dup                           // Duplicate object on the operand stack
    iload_1                       // Push i
    iload_2                       // Push j
    iload_3                       // Push k
    invokespecial myClass.<init>  // Initialize object
    ...
This instruction sequence leaves the newly created and initialized object on top of the operand stack. (More examples of compiling Java code to the instruction set of the Java Virtual Machine are given in Chapter 7, "Compiling for the Java Virtual Machine.")
The instance initialization method <init> for class myClass sees the new uninitialized object as its this argument in local variable 0. It must either invoke an alternative instance initialization method for class myClass or invoke the initialization method of a superclass on the this object before it is allowed to do anything else with this.
When doing dataflow analysis on instance methods, the verifier initializes local variable 0 to contain an object of the current class, or, for instance initialization methods, local variable 0 contains a special type indicating an uninitialized object. After an appropriate initialization method is invoked (from the current class or the current superclass) on this object, all occurrences of this special type on the verifier's model of the operand stack and in the local variables are replaced by the current class type. The verifier rejects code that uses the new object before it has been initialized or that initializes the object twice. In addition, it ensures that every normal return of the method is preceded by the invocation of an initialization method, either in the class of this method or in its direct superclass.
Similarly, a special type is created and pushed on the verifier's model of the operand stack as the result of the Java Virtual Machine instruction new. The special type indicates the instruction by which the class instance was created and the type of the uninitialized class instance created. When an initialization method is invoked on that class instance, all occurrences of the special type are replaced by the intended type of the class instance. This change in type may propagate to subsequent instructions as the dataflow analysis proceeds.
The instruction number needs to be stored as part of the special type, as there may be multiple not-yet-initialized instances of a class in existence on the operand stack at one time. For example, the Java Virtual Machine instruction sequence that implements
new InputStream(new Foo(), new InputStream("foo"))
may have two uninitialized instances of InputStream on the operand stack at once.
When an initialization method is invoked on a class instance, only those occurrences of the special type on the operand stack or in the local variables that refer to the same object as the class instance are replaced.
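The role of the instruction number in the special type can be sketched as follows. This is an illustrative model only: the class UninitModel, the nested VType class, and the offsets 0 and 11 used in main are hypothetical, and the verifier's operand stack and local variables are flattened into a single list of slots.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Hypothetical model of uninitialized-object tracking in a verifier. */
class UninitModel {
    /** A class type, optionally tagged with the offset of the new instruction that created it. */
    static final class VType {
        final String className;
        final int newOffset;   // offset of the creating new instruction, or -1 once initialized

        private VType(String className, int newOffset) {
            this.className = className;
            this.newOffset = newOffset;
        }

        static VType initialized(String className) {
            return new VType(className, -1);
        }

        static VType uninitialized(String className, int newOffset) {
            if (newOffset < 0) throw new IllegalArgumentException("offset must be non-negative");
            return new VType(className, newOffset);
        }

        boolean isUninitialized() {
            return newOffset >= 0;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof VType)) return false;
            VType other = (VType) o;
            return newOffset == other.newOffset && className.equals(other.className);
        }

        @Override
        public int hashCode() {
            return Objects.hash(className, newOffset);
        }

        @Override
        public String toString() {
            return isUninitialized() ? "uninit(" + className + ")@" + newOffset : className;
        }
    }

    /** Models invoking an initialization method on target: only matching slots change. */
    static void initialize(List<VType> slots, VType target) {
        if (!target.isUninitialized()) {
            throw new VerifyError("object initialized twice: " + target);
        }
        VType done = VType.initialized(target.className);
        slots.replaceAll(t -> t.equals(target) ? done : t);
    }

    public static void main(String[] args) {
        VType outer = VType.uninitialized("InputStream", 0);
        VType inner = VType.uninitialized("InputStream", 11);
        List<VType> slots = new ArrayList<>(List.of(outer, outer, inner, inner));
        initialize(slots, inner);
        System.out.println(slots);
    }
}
```

Because the two InputStream instances carry different offsets, initializing the inner one leaves the outer one marked as uninitialized.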
A valid instruction sequence must not have an uninitialized object on the operand stack or in a local variable during a backwards branch, or in a local variable in code protected by an exception handler or a finally clause. Otherwise, a devious piece of code might fool the verifier into thinking it had initialized a class instance when it had, in fact, initialized a class instance created in a previous pass through the loop.
class file verifier since they do not pose a threat to the integrity of the Java Virtual Machine. As long as every nonexceptional path to the exception handler causes there to be a single object on the operand stack, and as long as all other criteria of the verifier are met, the verifier will pass the code.
Exceptions and finally
Given the fragment of Java code
    ...
    try {
        startFaucet();
        waterLawn();
    } finally {
        stopFaucet();
    }
    ...
the Java language guarantees that stopFaucet is invoked (the faucet is turned off) whether we finish watering the lawn or whether an exception occurs while starting the faucet or watering the lawn. That is, the finally clause is guaranteed to be executed whether its try clause completes normally, or completes abruptly by throwing an exception.
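The guarantee can be observed with a small runnable program. This is a sketch for illustration: the class FaucetDemo, the log entries, and the failWhileWatering flag are names introduced here, with the faucet methods replaced by log entries so that the order of events is visible.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative demo: the finally clause runs on both normal and abrupt completion. */
class FaucetDemo {
    static List<String> run(boolean failWhileWatering) {
        List<String> log = new ArrayList<>();
        try {
            try {
                log.add("startFaucet");
                if (failWhileWatering) {
                    throw new IllegalStateException("hose burst");
                }
                log.add("waterLawn");
            } finally {
                log.add("stopFaucet");   // runs whether or not an exception was thrown
            }
        } catch (IllegalStateException e) {
            log.add("caught");           // the exception still propagates after the finally clause
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(false));
        System.out.println(run(true));
    }
}
```

In both runs stopFaucet is recorded. In the failing run it is recorded before the exception reaches the outer handler.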
To implement the try-finally construct, the Java compiler uses the exception-handling facilities together with two special instructions jsr ("jump to subroutine") and ret ("return from subroutine"). The finally clause is compiled as a subroutine within the Java Virtual Machine code for its method, much like the code for an exception handler. When a jsr instruction that invokes the subroutine is executed, it pushes its return address, the address of the instruction after the jsr that is being executed, onto the operand stack as a value of type returnAddress. The code for the subroutine stores the return address in a local variable. At the end of the subroutine, a ret instruction fetches the return address from the local variable and transfers control to the instruction at the return address.
Control can be transferred to the finally clause (the finally subroutine can be invoked) in several different ways. If the try clause completes normally, the finally subroutine is invoked via a jsr instruction before evaluating the next Java expression. A break or continue inside the try clause that transfers control outside the try clause executes a jsr to the code for the finally clause first. If the try clause executes a return, the compiled code does the following:
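Saves the return value (if any) in a local variable.
Executes a jsr to the code for the finally clause.
Upon return from the finally clause, returns the value saved in the local variable.
The compiler sets up a special exception handler that catches any exception thrown by the try clause. If an exception is thrown in the try clause, this exception handler does the following:
Saves the exception in a local variable.
Executes a jsr to the finally clause.
Upon return from the finally clause, rethrows the exception.
For more information about the implementation of Java's try-finally construct, see Section 7.13, "Compiling finally."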
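The effect of saving the return value or the exception before the finally code runs is visible at the Java source level, regardless of how a particular compiler implements the finally clause. The following sketch uses hypothetical names (ReturnInTry, value, order) to show both cases.

```java
/** Illustrative demo of return and rethrow ordering around a finally clause. */
class ReturnInTry {
    /** The return value is saved before the finally clause runs. */
    static int value() {
        int x = 1;
        try {
            return x;    // the value 1 is saved at this point
        } finally {
            x = 2;       // changes x, but not the value that was already saved
        }
    }

    /** The exception is held while the finally clause runs, then rethrown. */
    static String order() {
        StringBuilder sb = new StringBuilder();
        try {
            try {
                throw new RuntimeException("boom");
            } finally {
                sb.append("finally;");
            }
        } catch (RuntimeException e) {
            sb.append("caught:").append(e.getMessage());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(value());   // prints 1
        System.out.println(order());   // prints finally;caught:boom
    }
}
```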
The code for the finally clause presents a special problem to the verifier. Usually, if a particular instruction can be reached via multiple paths and a particular local variable contains incompatible values through those multiple paths, then the local variable becomes unusable. However, a finally clause might be called from several different places, yielding several different circumstances:
The invocation from the exception handler may have a certain local variable that contains an exception.
The invocation to implement return may have some local variable that contains the return value.
The invocation from the bottom of the try clause may have an indeterminate value in that same local variable.
The code for the finally clause itself might pass verification, but after updating all the successors of the ret instruction, the verifier would note that the local variable that the exception handler expects to hold an exception, or that the return code expects to hold a return value, now contains an indeterminate value.
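The problem can be made concrete with a simplified merge of local variable states. In this sketch types are plain strings, and any two different types are treated as incompatible, which is a deliberate simplification. The class MergeModel, the UNUSABLE marker, and the example types are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical, simplified merge of local variable types at a join point. */
class MergeModel {
    static final String UNUSABLE = "unusable";

    /** Merges two states: a variable stays usable only if both paths agree on its type. */
    static String[] merge(String[] a, String[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("states must cover the same local variables");
        }
        String[] out = new String[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = a[i].equals(b[i]) ? a[i] : UNUSABLE;
        }
        return out;
    }

    public static void main(String[] args) {
        // Local variable 1 as seen by three different callers of the finally code.
        String[] fromHandler = {"int", "Throwable"};   // holds the saved exception
        String[] fromReturn  = {"int", "int"};         // holds the saved return value
        String[] fromTryEnd  = {"int", UNUSABLE};      // holds nothing in particular
        String[] merged = merge(merge(fromHandler, fromReturn), fromTryEnd);
        System.out.println(Arrays.toString(merged));
    }
}
```

With a naive merge like this one, local variable 1 is unusable after the finally code for every caller, which is why the verifier needs the more elaborate treatment of jsr and ret.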
Verifying code that contains a finally clause is complicated. The basic idea is the following:
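Each instruction keeps track of the list of jsr targets needed to reach that instruction. For most code, this list is empty. For instructions inside code for the finally clause, it is of length one. For multiply nested finally code (extremely rare!), it may be longer than one.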
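Limitations of the Java Virtual Machine and class File Format
The Java Virtual Machine class file format imposes the following limitations on compiled Java Virtual Machine classes and interfaces:
The per-class constant pool is limited to 65535 entries by the 16-bit constant_pool_count field of the ClassFile structure (§4.1). This acts as an internal limit on the total complexity of a single class.
The amount of code per method is limited to 65535 bytes by the sizes of the indices in the exception_table of the Code attribute (§4.7.4), in the LineNumberTable attribute (§4.7.6), and in the LocalVariableTable attribute (§4.7.7).
The number of local variables in a method is limited to 65535 by the two-byte value of the max_locals item of the Code attribute (§4.7.4). (Recall that values of type long and double are considered to occupy two local variables.)
The number of fields of a class is limited to 65535 by the size of the fields_count item of the ClassFile structure (§4.1).
The number of methods of a class is limited to 65535 by the size of the methods_count item of the ClassFile structure (§4.1).
The size of an operand stack is limited to 65535 words by the max_stack field of the Code_attribute structure (§4.7.4).
A valid Java method descriptor (§4.3.3) must require 255 or fewer words of method arguments, where that limit includes the word for this in the case of instance method invocations. Note that the limit is on the number of words of method arguments, and not on the number of arguments themselves. Arguments of type long and double are two words long; arguments of all other types are one word long.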
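Counting argument words rather than arguments can be sketched as a small descriptor scan. This is an illustration under stated assumptions: the class ArgWords and its methods are hypothetical names, the limit is taken to be 255 words, and only the parameter part of a method descriptor is examined.

```java
/** Hypothetical helper that counts the argument words of a method descriptor. */
class ArgWords {
    static final int MAX_ARGUMENT_WORDS = 255;   // assumed limit, including the word for this

    /** Returns the number of argument words, adding one word for this on instance methods. */
    static int argumentWords(String descriptor, boolean isInstanceMethod) {
        int n = descriptor.length();
        if (n == 0 || descriptor.charAt(0) != '(') {
            throw new IllegalArgumentException("descriptor must start with '('");
        }
        int words = isInstanceMethod ? 1 : 0;
        int i = 1;
        while (i < n && descriptor.charAt(i) != ')') {
            boolean isArray = false;
            while (i < n && descriptor.charAt(i) == '[') {   // skip array dimensions
                isArray = true;
                i++;
            }
            if (i >= n) throw new IllegalArgumentException("malformed descriptor");
            char c = descriptor.charAt(i);
            if (c == 'L') {
                int semi = descriptor.indexOf(';', i);
                if (semi < 0) throw new IllegalArgumentException("unterminated class name");
                i = semi + 1;
                words += 1;                      // object references are one word
            } else if (c == 'J' || c == 'D') {
                i++;
                words += isArray ? 1 : 2;        // an array of long or double is a reference
            } else if ("BCFISZ".indexOf(c) >= 0) {
                i++;
                words += 1;
            } else {
                throw new IllegalArgumentException("unknown type character: " + c);
            }
        }
        if (i >= n) throw new IllegalArgumentException("missing ')'");
        return words;
    }

    static boolean withinLimit(String descriptor, boolean isInstanceMethod) {
        return argumentWords(descriptor, isInstanceMethod) <= MAX_ARGUMENT_WORDS;
    }

    public static void main(String[] args) {
        // this (1) + int (1) + long (2) + double[] (1) + String (1)
        System.out.println(argumentWords("(IJ[DLjava/lang/String;)V", true));   // prints 6
    }
}
```

A static method taking 128 long arguments needs 256 words and exceeds the limit, even though it declares far fewer than 255 arguments.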
2
The fact that end_pc is exclusive is an historical mistake in the Java Virtual Machine: if the Java Virtual Machine code for a method is exactly 65535 bytes long and ends with an instruction that is one byte long, then that instruction cannot be protected by an exception handler. A compiler writer can work around this bug by limiting the maximum size of the generated Java Virtual Machine code for any method, instance initialization method, or static initializer (the size of any code array) to 65534 bytes.
3
The javac compiler in Sun's JDK 1.0.2 release can in fact generate LineNumberTable attributes which are not in line number order and which are not one-to-one with source lines. This is unfortunate, as we would prefer to specify a one-to-one, ordered mapping of LineNumberTable attributes to source lines, but must yield to backward compatibility.
Copyright © 1996, 1997 Sun Microsystems, Inc.
All rights reserved
Please send any comments or corrections to jvm@java.sun.com