August 28, 2005
Using Tomcat Coyote ( part II )

Memory management is the most important thing to keep in mind when working with Coyote. Like most other servers (Apache is a great example), we put a lot of effort into minimizing buffer copies and allocation overhead in the critical path. Apache2 uses memory pools and bucket brigades: the first improves allocation times, the second reduces buffer copying. In Coyote, ByteChunk is used to reduce copying, and aggressive recycling of each object minimizes the GC overhead.
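
The recycling idea can be shown without any Tomcat classes. The following is only a sketch using the plain JDK, not Coyote code: RecyclableBuffer and all of its methods are hypothetical names invented for this illustration. It shows one object and one backing array being reset and reused across requests instead of being reallocated each time.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical reusable buffer: allocated once, reset between requests. */
class RecyclableBuffer {
    private final byte[] data;
    private int length;

    RecyclableBuffer(int capacity) {
        data = new byte[capacity];
    }

    /** Copies bytes in; a real server would read from the socket straight into the array. */
    void fill(byte[] src) {
        if (src.length > data.length) {
            throw new IllegalArgumentException("source larger than buffer capacity");
        }
        System.arraycopy(src, 0, data, 0, src.length);
        length = src.length;
    }

    int length() {
        return length;
    }

    /** Forgets the content but keeps the backing array for the next request. */
    void recycle() {
        length = 0;
    }

    /** Exposed only so the example can show that the array is reused. */
    byte[] backingArray() {
        return data;
    }
}

class RecycleDemo {
    public static void main(String[] args) {
        RecyclableBuffer buf = new RecyclableBuffer(64);
        byte[] before = buf.backingArray();
        String[] requests = {"GET / HTTP/1.1", "GET /index.html HTTP/1.1"};
        for (String line : requests) {
            buf.fill(line.getBytes(StandardCharsets.US_ASCII));
            System.out.println("handled " + buf.length() + " bytes");
            buf.recycle(); // no new allocation for the next request
        }
        System.out.println("same array reused: " + (before == buf.backingArray()));
    }
}
```

The trade-off is the usual one for object reuse: less garbage on the hot path, at the cost of having to remember to call recycle() before the object is handed to the next request.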

When working with Coyote, you need to understand a few classes:

  • ByteChunk is the main buffer used in Tomcat. At its core it is similar to a java.nio.ByteBuffer, but with many additional methods optimized to work on the bytes. The main thing to keep in mind is that we want to avoid copying - the data from the network should be manipulated in place as much as possible. Eventually higher-level objects like OutputStream will have to be created to support the servlet API, but as long as you're in the low-level layer, you should use ByteChunk and its peers as much as possible. There are a few sets of methods:
    • StringBuffer- or ByteBuffer-style append, allowing the buffer to grow up to an optional limit. When the limit is reached, the buffer can flush itself to a sink.
    • Read methods, strangely called 'substract', allowing its use for implementing input streams and parsing the request.
    • A few of the String methods, but operating on the byte buffer: equals, equalsIgnoreCase, startsWith, startsWithIgnoreCase, findChar(s), findNotChars, indexOf().
    • A few conversion methods (getInt, getLong) that parse the bytes directly, without conversion to chars and strings.
  • MessageBytes is a union of a ByteChunk, a String, a CharChunk, an int and a date. This is mainly used to represent headers. The headers are read as bytes, in the ByteChunk (pointers into the input ByteChunk are used, no copy is made). Some of the headers are never used, so no conversion to String is made until the user asks for their string value. Internal header processing can be done on the ByteChunks themselves, at least in the common cases. Another function of MessageBytes is to delay the charset decoding of headers until the right encoding is known. Most of the conversions are optimized to minimize allocations and operate directly on buffers.
  • C2BConverter and B2CConverter are used for charset conversion between char[] and byte[]. They provide a lot of optimization over Writer/Reader, as NIO does. In the future, when Tomcat uses NIO, these classes can simply be replaced with CharsetEncoder and CharsetDecoder. The performance should be similar, but it will be cleaner.
  • MimeHeaders - headers are represented by MessageBytes, with the conversion to String or int performed on demand (most headers will never be requested by the user). Like all other internal classes, the common case operates without generating garbage.
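
To make the ideas in the list above concrete, here is a small sketch that uses only the JDK. It is not the Tomcat implementation: ByteSlice and LazyValue are hypothetical stand-ins for the roles ByteChunk and MessageBytes play, and their methods only imitate the behaviour described above (comparing and parsing bytes in place, and delaying the String conversion until someone asks for it).

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical view over part of a shared byte[]; no bytes are copied. */
class ByteSlice {
    final byte[] buf;
    final int start;
    final int end; // exclusive

    ByteSlice(byte[] buf, int start, int end) {
        this.buf = buf;
        this.start = start;
        this.end = end;
    }

    private static int toLower(int c) {
        return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
    }

    /** ASCII case-insensitive comparison with a String, without building a String. */
    boolean equalsIgnoreCase(String s) {
        if (s.length() != end - start) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            if (toLower(buf[start + i] & 0xff) != toLower(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    /** Parses a non-negative decimal number from the bytes; overflow is not checked. */
    int getInt() {
        if (start == end) {
            throw new NumberFormatException("empty value");
        }
        int value = 0;
        for (int i = start; i < end; i++) {
            int digit = buf[i] - '0';
            if (digit < 0 || digit > 9) {
                throw new NumberFormatException("not a digit at offset " + i);
            }
            value = value * 10 + digit;
        }
        return value;
    }
}

/** Hypothetical lazy value: keeps the bytes and builds the String only on demand. */
class LazyValue {
    private final ByteSlice bytes;
    private String cached;

    LazyValue(ByteSlice bytes) {
        this.bytes = bytes;
    }

    ByteSlice bytes() {
        return bytes;
    }

    boolean isConverted() {
        return cached != null;
    }

    @Override
    public String toString() {
        if (cached == null) {
            // ISO-8859-1 is an assumption made for this sketch only.
            cached = new String(bytes.buf, bytes.start, bytes.end - bytes.start,
                    StandardCharsets.ISO_8859_1);
        }
        return cached;
    }
}

class HeaderDemo {
    public static void main(String[] args) {
        byte[] raw = "Content-Length: 1234".getBytes(StandardCharsets.US_ASCII);
        int colon = 0;
        while (raw[colon] != ':') {
            colon++;
        }
        ByteSlice name = new ByteSlice(raw, 0, colon);
        LazyValue value = new LazyValue(new ByteSlice(raw, colon + 2, raw.length));
        if (name.equalsIgnoreCase("content-length")) {
            System.out.println("length = " + value.bytes().getInt());
        }
        System.out.println("String built yet? " + value.isConverted());
        System.out.println("as String: " + value);
    }
}
```

In this sketch the header name is matched and the length is parsed without creating a single String; the only String allocation happens in the last line, when the value is explicitly asked for.
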
A code example:
    import java.io.IOException;

    import org.apache.coyote.Adapter;
    import org.apache.coyote.Request;
    import org.apache.coyote.Response;
    import org.apache.tomcat.util.buf.ByteChunk;
    import org.apache.tomcat.util.buf.C2BConverter;

    class MyAdapter implements Adapter {
        /**
         * Get the C2BConverter, which allows writing strings and chars to the output.
         */
        protected C2BConverter getOut(final Response res) throws IOException {
            // You can fine tune the buffers. In a real app this should be reused;
            // this is just an example to show how the buffers are used.
            final ByteChunk myChunk = new ByteChunk(1024);
            myChunk.setLimit(4096);
            C2BConverter out = new C2BConverter(myChunk, "UTF8");

            // You can set the limit to -1, add to it, and explicitly write when you
            // want to, or you can have it write automatically when the buffer is filled.
            myChunk.setByteOutputChannel(new ByteChunk.ByteOutputChannel() {
                public void realWriteBytes(byte cbuf[], int off, int len) throws IOException {
                    // Note: this ignores cbuf/off/len and writes myChunk itself, which
                    // assumes the callback is always invoked with myChunk's own buffer.
                    res.doWrite(myChunk);
                }
            });
            return out;
        }

        public void service(Request req, final Response res) throws Exception {
            System.out.println("Incoming request: uri=" + req.requestURI() +
                    " queryString=" + req.queryString().toString() +
                    " protocol=" + req.protocol().toString());

            res.setStatus(200);

            res.setHeader("foo", "bar");
            res.setContentType("text/plain");

            // We must explicitly flush the headers.
            res.sendHeaders();

            // Tomcat doesn't use a stream internally, so the objects can be recycled
            // and buffer-to-buffer copies are avoided.

            // To write to the stream, Tomcat uses a 'byte chunk', a buffer optimized
            // for char-to-byte conversions. This is similar to NIO buffers.
            C2BConverter out = getOut(res);

            // You can also add bytes:
            //ByteChunk myChunk = out.getByteChunk();
            //String h = "hello";
            //byte hB[] = h.getBytes();
            //myChunk.append(hB, 0, hB.length);

            out.convert("Hello World");
            // Make sure all the chars are sent to the byte[].
            // Unlike OutputStream/Writer, you can mix chars and bytes, but you must
            // remember to flush when switching.
            out.flushBuffer();

            // Send it to the stream. Output will also be sent when the buffer is full,
            // based on the limit. The buffer size is set with myChunk.setLimit().
            out.getByteChunk().flushBuffer();

            // Final processing.
            res.finish();

            // Alternatively, you can use a growing ByteChunk and write it explicitly:
            //res.doWrite(myChunk);
            req.recycle();
            res.recycle();
        }
    }
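
The "limit plus output channel" behaviour used in the example above can be hard to picture, so here is a rough JDK-only sketch of the pattern. It is a simplification and not how ByteChunk is implemented: FlushingBuffer, ByteSink and flushCount are hypothetical names, and the sink does not declare IOException (as a real output channel would) only to keep the sketch short.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical sink, standing in for the output channel in the example above. */
interface ByteSink {
    void write(byte[] buf, int off, int len);
}

/** Hypothetical fixed-limit buffer that flushes itself to a sink when it is full. */
class FlushingBuffer {
    private final byte[] buf;
    private final ByteSink sink;
    private int end;
    int flushCount; // kept only so the demo can show when flushes happen

    FlushingBuffer(int limit, ByteSink sink) {
        this.buf = new byte[limit];
        this.sink = sink;
    }

    /** Appends bytes; whenever the buffer is full, it is flushed before copying more. */
    void append(byte[] src, int off, int len) {
        while (len > 0) {
            if (end == buf.length) {
                flush();
            }
            int n = Math.min(buf.length - end, len);
            System.arraycopy(src, off, buf, end, n);
            end += n;
            off += n;
            len -= n;
        }
    }

    /** Pushes whatever is buffered to the sink and resets the buffer for reuse. */
    void flush() {
        if (end > 0) {
            sink.write(buf, 0, end);
            end = 0;
            flushCount++;
        }
    }
}

class FlushDemo {
    public static void main(String[] args) {
        ByteArrayOutputStream network = new ByteArrayOutputStream();
        FlushingBuffer out = new FlushingBuffer(8, (b, o, l) -> network.write(b, o, l));

        byte[] msg = "Hello World".getBytes(StandardCharsets.UTF_8);
        out.append(msg, 0, msg.length); // larger than the limit, so one automatic flush
        System.out.println("sent before explicit flush: " + network.size() + " bytes");

        out.flush(); // the explicit flush sends the remainder
        System.out.println("sent in total: "
                + new String(network.toByteArray(), StandardCharsets.UTF_8));
        System.out.println("flushes: " + out.flushCount);
    }
}
```

As in the adapter above, the caller still has to flush explicitly at the end; the automatic flush only guarantees that the buffer never grows past its limit.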


Posted by costin at August 28, 2005 04:03 PM