Monday, 16 March 2009

Tail Call Optimisation on the JVM

It looks like an initial cut of tail-call optimisation (TCO) on the MLVM is in.

This is important, both for general optimisation and also for implementing languages like Scheme on the JVM. The Scheme case is particularly important, because the Scheme standard requires optimisation of tail recursion (as this is the standard idiom for implementing iteration in Scheme).

The basic form of a tail call which we would want to optimise, looks like this in C:


void foo() {
    // stuff
    // ....

    bar();
    return;
}


The call to bar() is followed immediately by a return, so if we don't optimise, we're growing the stack unnecessarily. The idea of TCO is to rewrite the current stack frame (and so avoid growing the stack) and then issue an unconditional jump to start executing the callee.
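To make the stack growth concrete, here's a small Java sketch (the names and the example computation are mine, purely illustrative): the tail-recursive method consumes one stack frame per call on a JVM without TCO, while the loop shows what a tail-call-optimising VM could effectively rewrite it into.

```java
public class TailCallDemo {
    // Tail-recursive: the recursive call is the last action before return,
    // so a TCO-capable VM could reuse the current frame instead of pushing
    // a new one. Without TCO, each call grows the stack.
    static long sumTo(long n, long acc) {
        if (n == 0) return acc;
        return sumTo(n - 1, acc + n); // tail position
    }

    // The loop a tail-call-optimising runtime can reduce the above to:
    // same result, constant stack space.
    static long sumToLoop(long n) {
        long acc = 0;
        while (n != 0) {
            acc += n;
            n -= 1;
        }
        return acc;
    }

    public static void main(String[] args) {
        System.out.println(sumTo(10_000L, 0L));    // fine at this depth
        System.out.println(sumToLoop(10_000L));    // same result, O(1) stack
        // Something like sumTo(100_000_000L, 0L) would throw
        // StackOverflowError on a JVM without TCO.
    }
}
```

With a large enough argument, only the loop form (or a TCO-capable VM) survives; the recursive form dies with StackOverflowError.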

Now, in Java we don't have direct access to modify the current stack frame, so we instead need some bytecode which has, to the JVM, the semantic meaning of "this call must be implemented as a hard tail call". This requires changes to both the compiler and the interpreter.

The bytecode for an optimised tail call looks something like this in the patched HotSpot:


0xc4 wide
0xb6 invokevirtual
0x00 // high byte of index of method to call
0x03 // low byte of index of method to call
0xb0 areturn


The wide bytecode is used to indicate that the opcode immediately following it (the embedded instruction) is operating in a variant mode. Correctly implementing interpreters will verify the correctness of the instruction form of wide, parse the form and then evaluate the instruction.
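For comparison, one of the existing, standardised forms of wide - extending iload's one-byte local-variable index to two bytes - looks like this (0x15 is the iload opcode; the index here is 0x0100 = 256):

```
0xc4 wide
0x15 iload
0x01 // high byte of local variable index
0x00 // low byte of local variable index
```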

That is, the embedded instruction is considered an operand to wide - not all forms which follow it are legal bytecode as they stand - so in particular, there should never be a jump directly to an embedded instruction. See this link for more details.

The form of wide above is a new, experimental form for supporting TCO (if you look at the current documentation for wide, you'll see that the embedded opcodes it currently supports are all of type load / store / inc, so a form which supported invoke* would have to be new). To become standardised, the TCO change needs to pass through the JCP, which first requires evidence that the benefit is sufficient to justify the changes vendors would have to make to support it - so it might be a while; this is a fairly hot-off-the-presses change.

Next up, I'll discuss how to get the TCO patches switched on (and how to confirm that you have), along with some test results for preliminary cases.

Codegen Update

Since my last post (gosh, a couple of months ago), I've been investigating the codegens for both Kawa and JRuby, and I've come to the conclusion that while Kawa has some very nice features, it was too hard to get it to play nicely with the AST that I'm outputting from sablecc.

The sparse documentation was also a problem, as was the fact that the classes Kawa produces are a long way from being usable as regular Java types - they still display a large amount of their Scheme heritage.

I've found that JRuby does a lot better - in some ways it is a lot closer to the AST that I'm producing. In addition, the code is quite easy to follow, and the low-level generation is all handled with ASM.

I have some basic examples of programmatically generating a JRuby AST and then compiling it - there's no nice API method to do it, but by cribbing from the compiler source it's relatively easy.

Adapting this method for purr: in the Main class of the purr compiler, we now have code like this:


Lexer lexer = new Lexer (new PushbackReader(new BufferedReader(new FileReader(filename)), 1024));

Parser parser = new Parser(lexer);
Start ast = parser.parse();

SemanticAnalyzer sem = new SemanticAnalyzer();
ast.apply(sem);

NodeMapper nmap = new NodeMapper(sem);
ast.apply(nmap);
nmap.stitchTree(ast);

Node jHead = nmap.getNodeMap(ast);
compile(jHead);


where the compile method is as simple as:


private static void compile(Node ast) {
    try {
        String classname = JavaNameMangler.mangledFilenameForStartupClasspath(filename);

        ASTInspector inspector = new ASTInspector();
        inspector.inspect(ast);

        StandardASMCompiler asmCompiler = new StandardASMCompiler(classname, filename);
        ASTCompiler compiler = new ASTCompiler();
        compiler.compileRoot(ast, asmCompiler, inspector, false, false);

        File file = new File(filename);
        asmCompiler.writeClass(file);
    }
    catch (Exception x) {
        System.out.println(x.toString());
    }
}


So, there's still quite a way to go (especially in figuring out all the details of NodeMapper, the class which transforms purr's AST into JRuby's) - but some good progress.

Monday, 5 January 2009

Which Codegen and IR to use?

I know I haven't really explained much about why I want to build out a JVM with InvokeDynamic, so I thought I'd try and put some of my thoughts down.

Broadly, I'd like to be able to write a compiler for a dynamic language, consisting of a modest subset of Perl 5's syntax. This is largely just to teach myself more about the internals of the JVM, and about language design and implementation in general, but it might come in handy for other people as well - who knows.

I have been getting to grips with the sablecc parser generator toolkit, and already have a grammar for a working subset of the language (although there's quite a long way to go yet).

Sablecc has some features which rather surprised me at first - its Node is an abstract class rather than an interface, and the classes which represent terminals in the abstract syntax tree are all declared final.

This makes it rather hard to use the Decorator pattern to annotate the abstract parse tree (since a Decorator-type approach would subclass the abstract parse tree terminals).

After a bit of head scratching, I'm now generating a set of Analyzer classes, each of which is obtained by visiting the tree and storing additional context information in the Analyzer instances themselves.

Each Analyzer instance can now be thought of as a self-consistent set of annotations about the parse tree - all the annotations relevant to a particular type of analysis are represented by one Analyzer, but there may be multiple Analyzers corresponding to different, disjoint classes of semantic information.
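As a concrete (if simplified) sketch of the pattern - Node here is a stand-in for sablecc's generated node classes, and the "type" analysis itself is invented for illustration - each Analyzer keeps its annotations in an identity-keyed side table rather than on the (final) nodes themselves:

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Stand-in for a sablecc parse-tree node (the real terminal classes are
// final, which is what forces the external-annotation approach).
class Node {
    final String text;
    Node(String text) { this.text = text; }
}

// One Analyzer = one self-consistent class of annotations about the tree.
// It annotates nodes externally, keyed by object identity, instead of
// subclassing or decorating them.
class TypeAnalyzer {
    private final Map<Node, String> types = new IdentityHashMap<>();

    // Called while visiting the tree; records a (made-up) "type" per node.
    void visit(Node n) {
        types.put(n, n.text.matches("\\d+") ? "int" : "identifier");
    }

    String typeOf(Node n) {
        return types.get(n);
    }
}
```

A second, disjoint analysis (say, scope information) would live in its own Analyzer with its own side table, so the two never interfere.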

With this now reasonably well understood, I'm turning my attention to the IR and code generation stages. I'm weighing up 2 possible tools and approaches.
  1. ASM. This has a lot of advantages - it is reasonably well documented, once you're over the initial hump, and seems to have good low-level control as well as consistent use of the Visitor pattern to help with conceptual consistency.
  2. The gnu.bytecode and gnu.expr packages from Kawa. These are part of Kawa (a Scheme-orientated package for producing bytecode).
At the moment I'm still trying to decide which to opt for. Kawa has the advantage that several other languages already use it, so it may have better performance for a REPL and similar approaches (which will be necessary if I want to support eval STRING as a case). It also seems to have a relatively consistent set of classes which could be targeted as an IR (which, so far, I don't really see ASM having).

However, Kawa doesn't (so far) seem to be very well documented - I've found myself having to dig through the source code several times already just to find out whether method names need to be specified as package names or internal names, and so on. The conceptual framework doesn't seem quite as consistent somehow. I'm also slightly scared by the use of bare integers in some places in the gnu.bytecode package - I'd prefer a symbolic constant for each opcode, which I could pull in with a wildcarded static import.

Still, I'll post more as I investigate further.

Thursday, 18 December 2008

Some Updates to the Build Process

Thanks to the bsd-port-dev mailing list for helping remove some cruft from the build process.

One important thing: the latest repo now seems to be broken with SoyLatte. Bootstrapping with an earlier OpenJDK 7 build (eg the ones on Landon Fuller's page: http://landonf.bikemonkey.org/code/java/OpenJDK_7_Binaries.20080820.html) now seems to be essential.

The current repo is now working on Mac OS X again, provided that you apply the patches from Michael Franz (see http://mail.openjdk.java.net/pipermail/bsd-port-dev/2008-December/000189.html for the post; the patch itself is the attachment at http://mail.openjdk.java.net/pipermail/bsd-port-dev/attachments/20081213/9b7550d6/attachment-0001.obj).

It still fails over the "Queens.java" issue mentioned by Karen Kinnear (http://mail.openjdk.java.net/pipermail/bsd-port-dev/2008-September/000044.html) - to work around this, find sources/hotspot/make/bsd/makefiles/buildtree.make and edit it, so that the test_gamma target is simply:

test_gamma:


(ie, so that test_gamma is now an empty target). The build will still fail, complaining that:

build/bsd-i586/hotspot/outputdir/bsd_i486_compiler2/product/test_gamma does not exist.

Create this as an empty file, with executable permission and rerun the make command. It will pick up where it left off and then fail because:

build/bsd-i586/hotspot/outputdir/bsd_i486_compiler1/product/test_gamma does not exist.

These are, presumably, the server (compiler2) and client (compiler1) JVMs respectively. Repeat the fix, creating an empty, executable test_gamma in the compiler1 tree, and all should be well.
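The whole workaround can be scripted; this is a hedged sketch using the paths from the error messages above (run it from the top of the build tree, and adjust the paths if your output directory differs):

```shell
# Create empty, executable test_gamma stubs for both the client (compiler1)
# and server (compiler2) trees, so the build stops tripping over them.
OUT=build/bsd-i586/hotspot/outputdir
for tier in compiler1 compiler2; do
  mkdir -p "$OUT/bsd_i486_${tier}/product"          # ensure the dir exists
  : > "$OUT/bsd_i486_${tier}/product/test_gamma"    # create an empty file
  chmod +x "$OUT/bsd_i486_${tier}/product/test_gamma"
done
```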

This JVM should be saved somewhere (/usr/local seems like a good place) and can be used for testing. I've rerun the build using a JDK generated like this, and it seems happy.

(Btw, patches for the above build process, especially that ungainly test_gamma hack, would be very welcome).

Tuesday, 16 December 2008

Building OpenJDK 7 On Mac OS X 10.5

i) OK, first off: OpenJDK 7 is still a moving target, and will require use of the source control system it is stored in - this is Mercurial (rather than svn or CVS), usually abbreviated hg (as per the chemical symbol for the element mercury).

ii) The Mac user is also at a disadvantage as we're dependent on the BSD port, which may lag behind the Linux and Windows builds.

iii) All of this requires Leopard. Tiger's compilers and build tools are simply a world of pain (and may actually be completely impossible). Just give it up and upgrade - you also need XCode 3.1.

Check that you actually have XCode installed - just checking for the presence of gcc from a command line is not enough, as it is possible to have ended up with gcc installed but not XCode. If XCode is missing, the eventual build step will fail with a cryptic error (possibly about fonts or FreeType) which does not clearly indicate the real problem.

Getting Started

Get Mercurial
Mercurial is a relatively simple, modern version control and patching system. It's written in Python and will install to /usr/local/bin (plus a few well-behaved extensions in the standard python place).

See: http://openjdk.java.net/guide/repositories.html#installConfig

You want version 0.9.4 or 0.9.5 - not any of the post-1.0 versions.

The instructions on getting Forest are missing one very important step. They don't tell you how to actually get the extension itself.

There seem to be two ways:

1) Go to the public repository (http://hg.akoha.org/hgforest/), and click on the top link, then the 'files' link on the changeset description page, then the 'raw' link at the top and save it with the other extensions (/Library/Python/2.5/site-packages/hgext/ on a Mac)

2) There may be some way to use hg itself to pull the latest version from the repository, but I figured I'd done enough yak shaving by this point and didn't work out how to do this. Instructions on how to do this part would be very welcome.

Once you have Forest set up you should be able to continue with the instructions from installConfig, but please ensure that you also add the line:

hgext.mq=

to the end of your extensions section. This tells Mercurial to activate the Mercurial Queues extension, which is used for patching. It's always referred to as MQ in the Mercurial docs, which makes it difficult to find any useful information about it, as that acronym is hopelessly overloaded, and is swamped by hits from IBM's messaging middleware product.
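Putting that together, the [extensions] section of your ~/.hgrc ends up looking something like this - a sketch only, since the forest line depends on exactly where you saved the extension (hgext.mq= is the essential addition):

```
[extensions]
hgext.forest=
hgext.mq=
```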

Note: If you fail to activate this line in .hgrc, then the main patching step will inexplicably fail with a cryptic error message later.

With all of this done, you should have a working hg install. Test it out with a couple of:

hg version
hg showconfig
hg help

and if all seems well, change to your work directory and check out the main Mac source:

hg fclone http://hg.openjdk.java.net/bsd-port/bsd-port

This could take a while, depending on your connection speed.

Next, we want to build this version to ensure that our build process is nice and clean before patching for invokedynamic.

Building a Vanilla OpenJDK7
First we need to download SoyLatte, which is an older Java 6 port for Mac. This is to allow us to bootstrap the compilation of OpenJDK 7.

See:

http://landonf.bikemonkey.org/code/java/SoyLatte_Meets_OpenJDK.20080819.html

and

http://landonf.bikemonkey.org/static/soylatte/

to download the actual binary JDK - we want the 32-bit version for now.

If you've followed the notes from Landon Fuller above, you should now be ready to build. This step is:

make ALT_JDK_IMPORT_PATH=/usr/local/soylatte16-i386-1.0.3 ALT_BOOTDIR=/usr/local/soylatte16-i386-1.0.3 ALT_BINARY_PLUGS_PATH=$HOME/jdk-7-icedtea-plugs ALT_FREETYPE_HEADERS_PATH=/usr/X11R6/include ALT_FREETYPE_LIB_PATH=/usr/X11R6/lib ALT_CUPS_HEADERS_PATH=/usr/include ANT_HOME=/usr/share/ant NO_DOCS=true HOTSPOT_BUILD_JOBS=1

If all goes well, you should now have a build of OpenJDK 7, which you should save to /usr/local - and save the source tree too just in case.

The current state of the repo is that this build is failing and is being actively worked on, so I'll stop here, and will carry on next time with details of how to apply the invokedynamic patches to build what we need for the MLVM.

Sunday, 7 December 2008

Dynamic Languages on Managed Runtimes

Dynamic Languages have been around for a long time (Perl 5 was released in 1994).

They have traditionally run on top of loosely-defined C runtimes, which are custom to each language. Sometimes they have a type of virtual machine used for execution - but not one which corresponds closely with the Java or .NET VMs.

For many reasons - not least of which is taking advantage of the more enterprise-grade features of these VMs, and interoperating with modules written in other languages - there are many groups and developers interested in running dynamic languages on the general-purpose VMs.

The problems involved in doing this are manifold - the most valuable languages (ie those with the most developer marketshare) were not designed with the Java or .NET VMs in mind, and the VMs were designed with type and dispatch systems which are not, at first glance, suitable for dynamic languages.

In the .NET case, the Compilers group at MSFT is producing extra dynamic-dispatch functionality for C# 4.0, and with IronPython and IronRuby is producing a new component called the Dynamic Language Runtime (DLR). John Lam's blog at http://www.iunknown.com/ is a good place to start for the .NET side of things.

For Java, the work is centred around JSR 292 (see http://jcp.org/aboutJava/communityprocess/edr/jsr292/) - and while initially attention was focused around a new bytecode invokedynamic (to add to invokevirtual and friends) there is some debate as to whether a new opcode is the right approach - or whether the same end could be achieved by using invokeinterface as the opcode, and a special class (in the java.dyn package, perhaps) which was treated in a special way by the runtime.

This second approach (classes with special treatment by the runtime) is apparently already used by the implementations of other VM features in HotSpot, so may prove to be the approach used. People still talk about invokedynamic as a name for the featureset, even without the new bytecode.

Next time I'll post my build instructions for trying to get OpenJDK 7 with invokedynamic on Mac. This will only work on Leopard, as Tiger's build infrastructure is not quite up to it.

Intro

Hello,

This is my technical blog.

In it, I'll talk about my work with dynamic languages, the JVM and whatever else comes to mind.