Monday 16 March 2009

Tail Call Optimisation on the JVM

It looks like an initial cut of tail-call optimisation (TCO) on the MLVM is in.

This is important, both for general optimisation and also for implementing languages like Scheme on the JVM. The Scheme case is particularly important, because the Scheme standard requires optimisation of tail recursion (as this is the standard idiom for implementing iteration in Scheme).

The basic form of a tail call which we would want to optimise, looks like this in C:


void foo() {
// stuff
// ....

bar();
return;
}


The call to bar() is followed immediately by a return, so if we don't optimise, we're growing the stack unnecessarily. The idea of TCO is to rewrite the current stack frame (and so avoid growing the stack) and then issue an unconditional jump to start executing the callee.

Now, in Java we don't have direct access to modify the current stack frame, so we instead need to have some bytecode which has the semantic meaning of the JVM of "this call must be implemented as a hard tail call". This change requires changes to compiler and interpreter, and is currently implemented like this:

The bytecode for a optimised tail call would look something like this in the patched HotSpot:


0xc4 wide
0xb6 invokevirtual
0x00 // high byte of index of method to call
0x03 // low byte of index of method to call
0xb0 areturn


The wide bytecode is used to indicate that the opcode immediately following it (the embedded instruction) is operating in a variant mode. Correctly implementing interpreters will verify the correctness of the instruction form of wide, parse the form and then evaluate the instruction.

That is, the embedded instruction is considered an operand to wide - not all forms which follow it are legal bytecode as they stand - so in particular, there should never be a jump directly to an embedded instruction. See this link for more details.

The form of wide above is a new, experimental form for supporting TCO (if you look at the current documentation for wide you'll see that all the embedded opcodes it currently supports are of type load / store / inc, so a form which supported invoke* would have to be a new form). To become standardised, the TCO change needs to pass the JCP, which first requires evidence that this will be of sufficient benefit to justify the changes by vendors to support it, so it might be a while - this is a fairly hot-off-the-presses change.

Next up, I'll discuss how to get the TCO patches switched on (and confirm that you have) and some test results for some preliminary cases.

Codegen Update

Since my last post (gosh, a couple of months ago), I've actually been investigating the codegen's for both Kawa and JRuby, and I've come to the conclusion that while Kawa has some very nice features, it was too hard to get it to play nicely with the AST that I'm outputting from sablecc.

The documentation being sparse was also a problem, as was the fact that the classes that Kawa produces are a long way from being usable as regular Java types - and still display a large amount of their Scheme heritage.

I've found that JRuby does a lot better - in some ways it is a lot closer to the AST that I'm producing. In addition, the code is quite easy to follow, and the low-level generation is all handled with ASM.

I have some basic examples of programmatically generating a JRuby AST and then compiling it - there's no nice API method to do it, but by cribbing from the compiler source it's relatively easy.

Adapting this method for purr, in the Main class of the purr compiler, we now have some code like this:


Lexer lexer = new Lexer (new PushbackReader(new BufferedReader(new FileReader(filename)), 1024));

Parser parser = new Parser(lexer);
Start ast = parser.parse();

SemanticAnalyzer sem = new SemanticAnalyzer();
ast.apply(sem);

NodeMapper nmap = new NodeMapper(sem);
ast.apply(nmap);
nmap.stitchTree(ast);

Node jHead = nmap.getNodeMap(ast);
compile(jHead);


where the compile method is as simple as:


private static void compile(Node ast) {

try {
String classname = JavaNameMangler.mangledFilenameForStartupClasspath(filename);

ASTInspector inspector = new ASTInspector();
inspector.inspect(ast);

StandardASMCompiler asmCompiler = new StandardASMCompiler(classname, filename);
ASTCompiler compiler = new ASTCompiler();
compiler.compileRoot(ast, asmCompiler, inspector, false, false);

File file = new File(filename);
asmCompiler.writeClass(file);
}
catch (Exception x) {
System.out.println(x.toString());
}
}


So, still quite a way to go (especially with figuring out all the details of NodeMapper, which is the class which transforms from purr's AST to JRuby's) - but some good progress.