Hessian 2.0
From Resin 3.0
Latest revision as of 21:37, 23 June 2006
The current draft grammar is at Hessian 2.0 Grammar
Hessian 2.0 is in the very early stages. Feedback is welcome. Some data on efficiency vs Java serialization is at forum.caucho.com/.
Snapshots with draft implementations will be available in the Resin 3.0 snapshot at http://www.caucho.com/download.
Non-Goals for Hessian 2.0
We are not planning on any semantic additions or changes for Hessian 2.0. The current datatype and object model is intended to remain the same.
The only changes planned are extra compact encodings for better serialization and performance.
Goals for Hessian 2.0
Hessian 2.0 will be interoperable with Hessian 1.0
A Hessian 1.0 client can talk to any Hessian 2.0 server and receive a Hessian 1.0 response.
A Hessian 2.0 client can use Hessian 1.0 encoding to a server, but indicate that it can upgrade to Hessian 2.0 encoding.
Small number compression
In Hessian 1.0, all 32-bit integers are encoded in 5 bytes: 'I' b3 b2 b1 b0.
Most integers in actual data tend to be small. "0" is the most common integer value and "1" is the next most common value.
Small integers will be encoded in the single lead-byte, e.g. 0x90 might represent integer 0.
Bytes can be encoded in two bytes, e.g. x51 b0. Shorts can be encoded in three bytes, e.g. x53 b1 b0.
Similarly, small longs will have short encodings, and integer-valued doubles also have short encodings.
Short string compression
In Hessian 1.0, strings have a 3-byte overhead, 'S' b1 b0 data.
Hessian 2.0 will encode small strings with only a 1-byte overhead, e.g.
x25 hello
Object definition and instance
Hessian 1.0 encodes objects as associative arrays, where the keys correspond to fields, e.g.
M t x00 x08 test.Car
  S x00 x05 model
  S x00 x05 Honda
  S x00 x04 make
  S x00 x05 Civic
  S x00 x05 color
  S x00 x03 red
  z
When multiple Car objects are serialized, Hessian 1.0 incurs the unnecessary overhead of repeating the "test.Car", "model", "make", and "color" strings for every object, even though those strings are identical for all Cars.
Hessian 2.0 will have an Object definition/instance, which is equivalent to the above map:
O x98 test.Car   -- code and type/class
  x93            -- number of fields encoded as an integer
  xd5 model      -- short string
  xd4 make
  xd5 color
  xd5 Honda      -- data for first object follows immediately
  xd5 Civic
  xd3 red
A following car would look like:
o x91            -- integer representing defined object
  xda Volkswagen
  xd6 Beetle
  xd4 blue
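To make the saving concrete, the following sketch builds both encodings of the two cars above and compares their sizes. All function names are hypothetical, only ASCII short strings and direct integers are handled, and the definition reference of 1 is simply taken from the x91 in the example.

```python
def v1_string(text):
    """Hessian 1.0 string: 'S' b1 b0 data (ASCII only in this sketch)."""
    return b"S" + len(text).to_bytes(2, "big") + text.encode("ascii")

def v1_map(type_name, fields):
    """Hessian 1.0 typed map: 'M' 't' b1 b0 type (key value)* 'z'."""
    out = b"Mt" + len(type_name).to_bytes(2, "big") + type_name.encode("ascii")
    for name, value in fields:
        out += v1_string(name) + v1_string(value)
    return out + b"z"

def v2_int(value):
    """Draft direct integer: one byte, code = value + 0x90."""
    if not -16 <= value <= 63:
        raise ValueError("sketch only handles direct integers")
    return bytes([0x90 + value])

def v2_string(text):
    """Draft short string: one lead byte 0xd0 + length, then the data."""
    if len(text) > 31:
        raise ValueError("sketch only handles short strings")
    return bytes([0xd0 + len(text)]) + text.encode("ascii")

def v2_definition(type_name, fields):
    """'O', type, field count, field names, then the first instance's values."""
    out = b"O" + v2_int(len(type_name)) + type_name.encode("ascii")
    out += v2_int(len(fields))
    out += b"".join(v2_string(name) for name, _ in fields)
    out += b"".join(v2_string(value) for _, value in fields)
    return out

def v2_instance(definition_ref, values):
    """'o', reference to an earlier definition, then the field values."""
    return b"o" + v2_int(definition_ref) + b"".join(v2_string(v) for v in values)

honda = [("model", "Honda"), ("make", "Civic"), ("color", "red")]
beetle = [("model", "Volkswagen"), ("make", "Beetle"), ("color", "blue")]

first_v1 = v1_map("test.Car", honda)
second_v1 = v1_map("test.Car", beetle)
first_v2 = v2_definition("test.Car", honda)
second_v2 = v2_instance(1, [value for _, value in beetle])

print(len(first_v1), len(first_v2))    # 58 44
print(len(second_v1), len(second_v2))  # 65 25
```

Under these assumptions the first car shrinks from 58 to 44 bytes, and each following car no longer pays for the type and field names at all (65 bytes versus 25 for the second car).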
Encodings in Current Hessian 2.0 Grammar draft
32-bit integers
Direct integers:
0x80 - 0xcf
The codes between 0x80 and 0xcf represent integers between -16 and 63, i.e. code - 0x90. For example, integer zero is represented as
0x90
Bytes, i.e. integers between -128 and 127:
0x01 b0
Shorts, i.e. integers between -32768 and 32767
0x02 b1 b0
The Hessian 1.0 encoding for integers is always available:
'I' b3 b2 b1 b0
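A minimal sketch of how an encoder might choose the shortest of the integer forms listed above. The name `encode_int` is hypothetical, and the sketch assumes that b0 and b1 b0 are signed, big-endian values, matching the byte order of the Hessian 1.0 form.

```python
import struct

def encode_int(value):
    """Encode a 32-bit integer using the shortest draft form that fits."""
    if not -2**31 <= value < 2**31:
        raise ValueError("value does not fit in 32 bits")
    if -16 <= value <= 63:
        # Direct integer: a single byte in 0x80..0xcf, code = value + 0x90.
        return bytes([value + 0x90])
    if -128 <= value <= 127:
        # Byte form: 0x01 b0 (assumed signed).
        return b"\x01" + struct.pack(">b", value)
    if -32768 <= value <= 32767:
        # Short form: 0x02 b1 b0 (assumed signed, big-endian).
        return b"\x02" + struct.pack(">h", value)
    # Hessian 1.0 form: 'I' b3 b2 b1 b0.
    return b"I" + struct.pack(">i", value)

print(encode_int(0).hex())      # 90
print(encode_int(300).hex())    # 02012c
```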
64-bit longs
Direct longs. A single byte representation for the smallest long values. The codes between 0x20 and 0x3f represent 64-bit longs between -16 and 15, e.g. long zero is represented by 0x30
0x20 - 0x3f
Bytes, i.e. longs between -128 and 127
0x03 b0
Shorts, i.e. longs between -32768 and 32767
0x04 b1 b0
Longs that fit in 32 bits
0x05 b3 b2 b1 b0
The Hessian 1.0 encoding for longs is always available:
'L' b7 b6 b5 b4 b3 b2 b1 b0
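The long forms above follow the same pattern as the integer forms, as this sketch illustrates. The name `encode_long` is hypothetical, and signed big-endian payloads are an assumption.

```python
import struct

def encode_long(value):
    """Encode a 64-bit long using the shortest draft form that fits."""
    if not -2**63 <= value < 2**63:
        raise ValueError("value does not fit in 64 bits")
    if -16 <= value <= 15:
        # Direct long: a single byte in 0x20..0x3f, code = value + 0x30.
        return bytes([value + 0x30])
    if -128 <= value <= 127:
        return b"\x03" + struct.pack(">b", value)   # 0x03 b0
    if -32768 <= value <= 32767:
        return b"\x04" + struct.pack(">h", value)   # 0x04 b1 b0
    if -2**31 <= value < 2**31:
        return b"\x05" + struct.pack(">i", value)   # 0x05 b3 b2 b1 b0
    # Hessian 1.0 form: 'L' b7 .. b0.
    return b"L" + struct.pack(">q", value)

print(encode_long(0).hex())       # 30
print(encode_long(1000).hex())    # 0403e8
```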
Doubles
Direct values. 0.0 and 1.0 are represented by a single code
0x06 - 0.0
0x07 - 1.0
Single byte integer doubles. Integer values between -128.0 and 127.0 are represented by
0x08 b0
Two byte integer doubles.
0x09 b1 b0
Four byte integer doubles
0x0b b3 b2 b1 b0
Doubles which are equivalent to floats:
0x0c b3 b2 b1 b0
Where b3,b2,b1,b0 are the byte-encoding of a float
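A sketch of how an encoder might select among the double forms above. The name `encode_double` is hypothetical. The sketch assumes signed big-endian integer payloads (so the single byte form covers -128.0 to 127.0), assumes that negative zero must not use the integer forms so its sign survives, and falls back to the Hessian 1.0 form 'D' b7 .. b0, which this section does not list.

```python
import math
import struct

def encode_double(value):
    """Encode a double using the shortest draft form that preserves it."""
    negative_zero = value == 0.0 and math.copysign(1.0, value) < 0
    if value == 0.0 and not negative_zero:
        return b"\x06"
    if value == 1.0:
        return b"\x07"
    if not negative_zero and math.isfinite(value) and value.is_integer():
        n = int(value)
        if -128 <= n <= 127:
            return b"\x08" + struct.pack(">b", n)   # 0x08 b0
        if -32768 <= n <= 32767:
            return b"\x09" + struct.pack(">h", n)   # 0x09 b1 b0
        if -2**31 <= n < 2**31:
            return b"\x0b" + struct.pack(">i", n)   # 0x0b b3 b2 b1 b0
    try:
        as_float = struct.pack(">f", value)
    except OverflowError:
        as_float = None  # magnitude too large for a 32-bit float
    if as_float is not None and struct.unpack(">f", as_float)[0] == value:
        # The double is exactly representable as a float: 0x0c b3 b2 b1 b0.
        return b"\x0c" + as_float
    # Assumed fallback: Hessian 1.0 form 'D' b7 .. b0.
    return b"D" + struct.pack(">d", value)

print(encode_double(0.5).hex())    # 0c3f000000
print(len(encode_double(0.1)))     # 9
```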
Short strings
Strings between length 0 and 31 can have the <type,length> represented by a single byte:
0xd0 - 0xef
So, "hello, world" would look like:
0xdc hello, world
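The short string form can be sketched as follows. The name `encode_string` is hypothetical. The sketch assumes the length counts characters rather than bytes, as in Hessian 1.0, that the data is UTF-8, and it rejects characters outside the Basic Multilingual Plane rather than guess how they should be counted.

```python
import struct

def encode_string(text):
    """Encode a string, using the one-byte <type,length> form when it fits."""
    if any(ord(ch) > 0xFFFF for ch in text):
        raise ValueError("sketch only handles BMP characters")
    data = text.encode("utf-8")
    if len(text) <= 31:
        # Short string: lead byte 0xd0 + length (0xd0..0xef), then the data.
        return bytes([0xd0 + len(text)]) + data
    if len(text) > 0xFFFF:
        raise ValueError("longer strings need chunking, not shown here")
    # Hessian 1.0 form: 'S' b1 b0 data.
    return b"S" + struct.pack(">H", len(text)) + data

print(encode_string("hello, world"))   # b'\xdchello, world'
```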
Short binary data
Binary data between length 0 and 15 can have the <type,length> represented by a single byte:
0xf0 - 0xff
e.g. new byte[] { 1, 2, 3, 4, 5};
xf5 x01 x02 x03 x04 x05
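The same idea applies to short binary data, sketched here with a hypothetical `encode_binary`. The fallback to the Hessian 1.0 form 'B' b1 b0 data for longer values is an assumption, since this section only lists the short form.

```python
import struct

def encode_binary(data):
    """Encode binary data, using the one-byte <type,length> form when it fits."""
    if len(data) <= 15:
        # Short binary: lead byte 0xf0 + length (0xf0..0xff), then the data.
        return bytes([0xf0 + len(data)]) + data
    if len(data) > 0xFFFF:
        raise ValueError("longer data needs chunking, not shown here")
    # Assumed fallback: Hessian 1.0 form 'B' b1 b0 data.
    return b"B" + struct.pack(">H", len(data)) + data

print(encode_binary(bytes([1, 2, 3, 4, 5])).hex())   # f50102030405
```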
Objects (repeated maps)
Maps which have a consistent set of fields, like objects, can be represented by an object definition/object instance pair.
The object definition defines the expected type (required), the number of fields, and the field names. The object definition also includes the data for the first object instance:
'O'
<int> type-name -- length of type encoded as an integer followed by type name
<int> -- number of fields
(<string>)* -- strings representing the field names
(<object>)* -- object data for the first object instance
The object instance refers to an earlier object definition and then follows with the field data:
'o'
<int> -- integer referencing the object definition
(<object>)* -- field values
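A reader for this pair might look like the sketch below. All names are hypothetical. The sketch assumes definitions are numbered from 1 in order of appearance (inferred from the x91 in the Car example), and it only understands direct integers and ASCII short strings as field values.

```python
import io

def read_int(stream):
    """Read a draft direct integer (0x80..0xcf, value = code - 0x90)."""
    code = stream.read(1)[0]
    if not 0x80 <= code <= 0xcf:
        raise ValueError("sketch only handles direct integers")
    return code - 0x90

def read_string(stream):
    """Read a draft short string (0xd0..0xef); ASCII assumed."""
    code = stream.read(1)[0]
    if not 0xd0 <= code <= 0xef:
        raise ValueError("sketch only handles short strings")
    return stream.read(code - 0xd0).decode("ascii")

def read_value(stream):
    """Peek at the lead byte and dispatch to the matching reader."""
    code = stream.read(1)[0]
    stream.seek(-1, io.SEEK_CUR)
    if 0x80 <= code <= 0xcf:
        return read_int(stream)
    return read_string(stream)

def read_objects(data):
    """Decode a sequence of 'O' definitions and 'o' instances."""
    stream = io.BytesIO(data)
    definitions = []  # (type name, field names), in order of appearance
    objects = []
    while True:
        tag = stream.read(1)
        if not tag:
            break
        if tag == b"O":
            type_name = stream.read(read_int(stream)).decode("ascii")
            count = read_int(stream)
            names = [read_string(stream) for _ in range(count)]
            definitions.append((type_name, names))
        elif tag == b"o":
            ref = read_int(stream)
            if not 1 <= ref <= len(definitions):  # assumed 1-based numbering
                raise ValueError("unknown object definition")
            type_name, names = definitions[ref - 1]
        else:
            raise ValueError("unexpected tag")
        # Both forms are followed by one value per field.
        values = [read_value(stream) for _ in names]
        objects.append((type_name, dict(zip(names, values))))
    return objects

cars = read_objects(
    b"O\x98test.Car\x93\xd5model\xd4make\xd5color\xd5Honda\xd5Civic\xd3red"
    b"o\x91\xdaVolkswagen\xd6Beetle\xd4blue"
)
print(cars[1])
```

Keeping the definitions in a list lets each 'o' instance be decoded with a single lookup, which is where the saving over repeated maps comes from.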