Wednesday, 30 December 2009

New Talk

Here is my recent talk on JVM Performance Tuning at the Skills Matter Open Source In Finance Exchange.

Thanks to Alan Hardy and the Skills Matter people for organising, and for all the great feedback from the attendees.

Monday, 30 November 2009


It's been a very few weeks round at Boxcat Junction.

I was lucky enough to get to Devoxx 2009 in Antwerp a couple of weeks ago - and was very impressed by the presentations and people I met there.

Particular props are due to Brian Goetz and Alex Buckley from Sun, James Strachan for his OSS ESB talk and Holly Cummins (superb performance talk), Zoe Slattery and Robin Fernandes from IBM.

This was followed last weekend by the London Java Community's first Unconference, graciously hosted by IBM at their Southbank location - organised by Barry Cranford and Zoe Slattery. The format went off really well, although I think next time I'd like to see more sessions using less traditional teaching methods (eg group discussions, fishbowl, etc). I hope to make my slides for my Performance talk available soon.

Next week sees me in St Petersburg, meeting with Russian colleagues, and then I'm attending (and speaking at) the SkillsMatter Open Source in Finance Exchange on December 15th.

Saturday, 12 September 2009

Knew I Couldn't Stay Away

In this post, I said that I was done with language hacking for a while.

Well, I guess I can't stay away from it. Mattia Barbon has been running the Language::P project for a while now, over on github, and I've agreed to give him a hand with the Java backend pieces of that project.

The current state is that he has a Perl5 parser, which covers a reasonable (and improving) subset of P5. The parser is written in Perl5, and currently outputs a packed bytecode format (consisting of P5ish opcodes).

He has a branch containing a .NET backend which uses DLR expression trees, and I've started work on a Java backend.

My initial target is an interpreter, but I have several pieces which I would hope to use in an eventual compiler.

The current aim is to support enough of P5 for the Language::P parser to parse itself, and from there to bootstrap parsers in the target backends. For an interpreted backend, such as my current Java prototype, this will mean shipping a packed bytecode parser for the interpreter to fire up. For compiled backends, other avenues are possible.

Both the .NET and Java parts are currently on private branches, but I hope to make the code available soon, and merge to HEAD.

If anyone wants to help, please leave a comment / email me.

Tuesday, 25 August 2009

Spring 2.5 and Eclipse 3.5 JUnit Problems

Spring 2.5 test classes (eg those that use @RunWith(SpringJUnit4ClassRunner.class) ) implicitly rely on a class which was removed in JUnit 4.5

See this Jira:

Unfortunately, the latest Eclipse (3.5) ships JUnit 4.5 by default, meaning that the newly-production Eclipse version is broken for running tests written for the current production Spring version.

No workaround is currently available (and "use Spring 3.0" as a resolution isn't very helpful for production stuff - it's still in beta) - I will investigate with the Eclipse people as to whether anything (eg a downgrade of the JUnit plugin to 4.4 would be feasible) but don't hold out much hope.

For now, looks like it's back to running Spring tests from the command-line with a local JUnit 4.4 jar on your classpath.

(Btw, if anyone does have a good workaround, please let me know.)

Sunday, 16 August 2009

Thanks For All The Fish

One of the major themes of the blog so far has been my attempts to write a version of (perhaps a subset of) Perl 5 that will run on the JVM, and/or find out what makes this a difficult exercise. And, to have fun while doing so.

For reasons which I outline in this post, I think it's time to give up, and I would strongly encourage anyone reading who is tempted to have a go at carrying on where I'm leaving off to think again. The code we wrote is out there, but I would suggest that anyone who's really interested contact me first.

So, here's the lowdown on the problems we faced.

There are really two separate sets of issues - first of all there are issues related to nullability.

Because Perl does not require brackets around function parameters, and has functions which do not need to specify their prototypes, then this line of code:

my $a = dunno + 4;

can be parsed in one of two ways:
  • $a gets 4 + the result of calling the function dunno(), which takes no arguments. That is, + is treated as a binary operator
  • $a gets the result of calling dunno(4), where dunno takes at least one argument. That is, the + is treated as a unary operator on 4.
This line of argument is expanded upon significantly by Jeffrey Kegler at - where he links it to Halting. I have not fully satisfied myself of the full implications yet, but the initial points amount to a hugely significant parsing problem, with no real good solution.

The second source of problems is that Perl 5 is old (1994) - and dates from a time when automated language tools were rather more lacking than they are today. When Larry Wall was working on the first versions of p5, lex and yacc were pretty much the state of the art in terms of what was practical for autogeneration, and a skilled practitioner could outperform them by modifying the output, or writing from scratch.

Perl wasn't written with formal parser theory in mind, and has now reached the stage where the implementation is really all we have. It does not fit well into a rigorous model of language, and during its development flexible language features were considered to be more important than linguistic concerns (such as static analysis).

Simply put, there's no grammar, and attempting to write something which matches the only existing implementation is a major undertaking - no existing automated language tools will help much, it's largely a matter of needing to completely reimplement the existing C code in Java (or bytecode). This is a huge amount of work, if it's possible at all, and is not going to be fun - and will be likely to be very frustrating for a large chunk of the time spent on it.

So, here we are. I've had a lot of fun working on this (and particular thanks to James Laver and Shevek, both of whom provided insight, help and encouragement - and to the many other people in the Perl, Java and Ruby worlds with whom I had interesting and sometimes amazing conversations) and I'd like to close with a short summary of what I've learned from this project:
  • Too much backwards-compatibility is a huge millstone
  • Always ensure that the people you're talking to have the same definitions you do
  • If you're going to use formal language, you must have proofs available. Declaring a problem to be in a particular class by fiat does not help anyone.
  • Perl's block / closure / subroutine definitions are too overlapping and unclear. This is a major problem
  • Indirect object syntax in Perl 5 was a misstep
So, that's it for now.

I'll be moving on to other problems on my stack now, so my next posts will be about broader topics than just language design / implementation but I'm sure I'll return to language design in due course - after all, I just can't seem to stay away from it.

Saturday, 1 August 2009

My Interview Checklist

Someone asked me recently about what sort of job interview prep I do, and having recently found myself a new job, I thought I'd post a sample here.

This is the bare bones of what I polished before my most recent job hunt. It's skewed towards Java for some of the actual technology bits, but the CS fundamentals should be language-independent.

My attitude is that the working practitioner should have a good command of a lot of this (especially the CS topics) at all times, and should only need to briefly revisit each subject to ensure the polish and the details are 100% there.

The books I used most heavily were "Introduction to Algorithms" (Rivest et al) and Doug Lea's "Concurrent Programming in Java".

Comments and suggestions for things other people have found useful would be most welcome.


Details of order notation (eg Omega etc)
String Matching
NP / NP-completeness

Basic Trees and Tree Construction
Red / Black Trees
Hashing / Hashtable
HashMap / TreeMap

Representations of Graphs in code (object / pointers, matrix, adjacency list)
Graph Traversal (BFS, DFS)
Minimal Spanning Tree

Discrete Maths / Probability / "Logic Puzzles"
Probability Exercises
Decision Trees Exercises
n-choose-k Problems
Permutation Groups, Cycle, Reprns, etc
"Perfectly Logical Beings" puzzles
Decision / Ply problems (eg Monty Hall)

DB / Hibernate
Normal Form
Having clause
Outer joins
XML Form of Hibernate - Basics
JPA / Hibernate
Indexes and Optimisation

Java Internals and Details
Bitwise operators
Collections / Generics nitty-gritty (ie to bytecode level)
OO nitty-gritty (and nasty edge cases)
Annotations nitty-gritty
Arrays nitty-gritty

OS/Concurrency details
Safety, Liveness, Performance, Reusability
Permanent Fail: Deadlock, Missed Signals, Nested Monitor Locks, LiveLock, Starvation, Resource Exhaustion, Distributed Fail
Block-structured Locking, Synchronisation, JMM, Fully Synchronized Objects
Other Constructs: Mutex, Latch, Futures, Callable / Command Adapter
Real-world multithreaded application development

Future Web Tech
ECMAScript 4 hassles
Flex vs Silverlight (ref issues with LCDS and the Adobe approach)
Asynch Messaging for webapps

Thursday, 11 June 2009

Last Night's Dynamic Languages Meeting

Yesterday evening, a group of us gathered to talk about dynamic languages, at the British Computer Society in London. The format of the evening was lightning talks - ie 5 minutes per talk - on any subject connected with dynamic languages.

There were a lot of great talks given - from using Ruby for systems programming to improving the test coverage of PHP.

I spoke about the JVM / MLVM as a platform for dynamic languages - my slides are here.

It was really great to meet a lot of people from different parts of the dynamic languages world - especially Rob and Zoe from IBM's sMash / Zero implementation of PHP.

Many thanks to the BCS for hosting, Leon from for organising it and Billy for being the liason with the BCS. Hopefully there'll be another similar event soon.

Wednesday, 10 June 2009

Status Update

It's been a while since last update, largely being driven by my not having many cycles.

The grammar rewrite for p5vm is coming on well, but is still being debugged.

If there's anyone reading who is very good with SableCC (or shift-reduce parsers in general) and could spare a few hours helping us debug the new grammar, please get in touch.

I'm giving a lightning talk at the British Computer Society tonight about some of the work being done to put dynamic languages on the JVM - slides will appear here tomorrow.

Monday, 20 April 2009

State of Play for Perlish PoC

First off, the slides from my talk are here.

A tarball of a Mac MLVM (OpenJDK7 with invokedynamic) dating from 2009-04-10 is here.

So, where are we with the code?
The current codebase uses SableCC, an automated parser generator to transform an EBNF grammar into a parser, lexer, etc.

I'm not saying that's the most powerful way to do it, or that it will ultimately suffice for a reasonable stab at getting Perl onto the JVM. It was simply what was to hand that I had a modicum of experience with and with which I could up to speed with some necessary chunks of parser theory in order to get off the ground.

My current grammar has a number of deficiencies, mostly because I cobbled it together by cribbing from some pre-existing grammars which parse Java and PHP.

Longer term we may need either or both of: a more subtle grammar and/or a parser written in Java (basically, a port of the existing C-based parser to Java, but one which tries to stay out of the later phases of compilation).

In terms of code generation, I have two branches - one which uses a homegrown ASM codegen, and one which translates the AST to JRuby's AST, and then uses JRuby's codegen (which is also based on ASM).

Going forward, the semantic differences between Perl and Ruby (notably Everything-Is-An-Object and string handling) probably make AST translation not viable as a long-term strategy.

However, one thing which did occur is if there are parts of the JRuby codegen libs which could be refactored out into a separate package that we could use for Perl / other langs, that would be helpful.

In addition, when designing the AST for use with a homegrown ASM-based codegen, a good hard look at the JRuby AST seems like a good plan - the scoping constructs that they use are directly relevant to Perl, for example (although I'm aware that for performance reasons, they may need to optimise how they handle scope constructs).

Places to get started

  • A better grammar. One quick way to improve the current situation is to improve the quality of the EBNF grammar. The current grammar is here. It may not be the long term plan, but making progress with what we've got in advance of a parser port should help to figure out issues and will help momentum

  • Design session for how to handle Perlish dispatch in full generality. This is probably the most fundamental design issue which needs to get nailed right now. I have some ideas about how to do it, but they need validating and the input of others. If there's interest, I suggest we get together in the back room of a pub or cafe in London one afternoon and thrash this out.

  • Test cases. Having a set of test cases (grouped by complexity - ie low-hanging fruit first) would be very useful. Ultimately, we want to run as much of the test suite as possible, but little acorns are the first step...

  • Starting up a wiki or similar to track know issues with syntax and semantic issues (eg the semantic issues around 'new', GC, etc)

  • Help from a guru who understands the OP representations. This would be really useful in starting to think about the ultimate form of the parser.

If any of these are appealing, especially the dispatch design task, please get in touch.

Sunday, 19 April 2009

invokedynamic on OS X - Update

OK, so the big news is that the Sun guys (John Rose and co) have formally switched to working primarily off the bsd-port tree, rather than the mainline.

This has the advantage that as a lot of the people on mlvm-dev are on Macs, the patches should just work.

Based on this, I have also confirmed that at present, something appears to be preventing OpenJDK7 from self-hosting, ie that currently, on OSX at least you have to use Java 6 (ie SoyLatte) for builds.

Thus, the sequence is that of, but making sure that you start of with this line:

hg fclone sources

if you're on a Mac - ie download the bsd-port branch, not mainline as the source base.

For the actual build step, ie the gnumake build line, I recommend that you adapt John's build script from here:

The project should then build cleanly. If you do manage to get it to build cleanly on a Mac using an OpenJDK7 as the bootstrapping JVM, please let me (and the mlvm-dev group) know.

With all this in place, it should build cleanly. Install the resulting SDK somewhere sensible (I use /usr/local, labelled something like: openjdk7-sdk-darwin-i386-20090410).

This should now be available to be installed as a primary JVM for your Java IDE (I use Eclipse). You will have to explicitly enable MethodHandle functionality for now. In Eclipse, this is under Preferences > Java > Installed JREs. Highlight the new JRE, and click 'Edit".

Within the IDE detail, you need to add -XX:+EnableMethodHandles to the Default VM arguments.

With all of this done, you should be ready to test.

Remi Forax has posted here with a description of a simple test case.

Download the zip file from here and unpick it - if you've built a JVM as above, all you'll need is a snippet of the code in the src directory of the zip file.

Import the package fr.umlv.davinci.test into a suitable test project. It will not compile, because the class 'Magic' does not exist yet - there's no Java-language source representation of a dynamic invocation like: Magic.invoke(mh, element)

Instead, run the MagicPatcher class. This will generate a binary-only Magic.class file, which you should put on the build path of your project. Refreshing and rebuilding the Eclipse project and the IDE should now be happy, and you should be able to run MethodHandleTest.

For any Perl people who are here - the point which should leap out is the similarity of the second invdyn call (ie the one involving the Java Sum class) to a Perl closure. It's this ability which is motivating my interest from a Perl perspective in this area (I have other interests from a Java perspective, but that's another story).

Now I've got all this down on electrons, I'll move to trying to explain where my proof-of-concept baby-Per-lish dynlang is at, how I see possible approaches from here, and try to flesh out the slides from my talk a bit more so that they make a bit more sense without me talking to them.

Tuesday, 7 April 2009

Akamai IP Application Accelerator

This post is all about a PoC I did with Akamai's IP Application Accelerator technology.

The basic idea is that you change your DNS entries for your services (and think services other than just websites here - this is a Layer 3 solution) to be CNAMEs for Akamaised DNS entries (which will of course resolve to IPs which are close to your end users) and the solution then opens a tunnel over Akamai's private network, which does not use standard Internet routing (and also uses a technique where multiple copies of each packet are sent by diverse routes) or the main exchanges, until the packets are reassembled close to the real origin servers. NATting is used heavily (both SNAT and DNAT) to ensure that this is invisible to the application.

Note that this seamlessness is from an IP perspective - but this does not cover all corner cases completely, and there may be problems with load balancers and integrating fully with your DNS - depending on the details.

Joel Spolsky has written up his description of it here:

Akamai's description of it is here:

So how did we find it? Well, it definitely has a place for some companies. Joel's setup and customer distribution seem to highlight the upside quite well, so I'll leave you to read that on his site.

Some of the problems we found with it:

  1. It only really works well if your destination is in a different "logical metro". This one probably isn't too surprising - if you're in the same city or data centre you wouldn't expect there to be any gain by routing onto the Akamai network and back again.

  2. It has to be either on for all customers, or off for all customers - there's no way to have it only switched on for (say) just Middle Eastern customers with sucky connectivity.

  3. It's charged for both by number of concurrent sessions, and by total bandwidth. Make sure you tie Akamai down about exactly how the costs are calculated - some of the salespeople we spoke to were overly evasive about the costs.

Taking these together means that you may well have to know more about the geographic distribution of your users and their bandwidth usage patterns than you currently do.

What's also worth noting is the use of the phrase "logical metro". Sure, LA customers might see a speedup back to an NY datacentre (Joel's example), but let's suppose you have 3 regional datacentres (NY, LN, TK) covering the Americas, EMEA and Asia respectively.

It's likely that virtually none of your customers in Europe will see any meaningful speedup (leaving aside outages where you fail those customers over to another region) - and certainly no-one in the LN / Paris / ADM / BRX / Frankfurt area will. The links back to the LN datacentre should just be too damned good for the Akamai solution to be able to shave any time off.

Also, it's plausible that your concentration of Asian customers could be in HK / TK / Shanghai so similar remarks could apply about routing back to TK from within Asia.

Akamai didn't give us a straight answer when we asked a question like: "OK, so suppose we only have a very slight, not user-visible speedup. At what stage will the product not route over the Akamai network (and thus not charge us for an acceleration that didn't really do anything)?"

Given that for customers within a logical metro where you have a datacentre, this is the norm, this is absolutely critical for finding out how much this is really going to cost.

In the end, we decided not to go for it, simply because when we'd analysed where our customers were in the world, we realised we'd be paying a lot of money to speed up about 2% of our mid-range customers.

If you are thinking of deploying this tech consider these points as well:

  • Consider how your failover model works in the event of complete loss of a datacentre. Do your customers fail across to another region? What is the impact of that? How often do events like that occur? What is the typical duration?

  • Load balancing and the details of how your DNS is setup. You may be surprised by how much detail you need to understand of this in order to get this tech to work smoothly. Do not assume that it will necessarily be straightforward for an enterprise deployment

Is there a point here? I guess just that you have to see the acceleration tech in context of your application architecture, understand where your customers are coming from, test it out and check that it's not going to cost the company an arm and a leg.

New technologies like this can be a real benefit, but they must always be seen in context of the business, and in this case - in the geographic and spend distribution of your customers.

Monday, 16 March 2009

Tail Call Optimisation on the JVM

It looks like an initial cut of tail-call optimisation (TCO) on the MLVM is in.

This is important, both for general optimisation and also for implementing languages like Scheme on the JVM. The Scheme case is particularly important, because the Scheme standard requires optimisation of tail recursion (as this is the standard idiom for implementing iteration in Scheme).

The basic form of a tail call which we would want to optimise, looks like this in C:

void foo() {
// stuff
// ....


The call to bar() is followed immediately by a return, so if we don't optimise, we're growing the stack unnecessarily. The idea of TCO is to rewrite the current stack frame (and so avoid growing the stack) and then issue an unconditional jump to start executing the callee.

Now, in Java we don't have direct access to modify the current stack frame, so we instead need to have some bytecode which has the semantic meaning of the JVM of "this call must be implemented as a hard tail call". This change requires changes to compiler and interpreter, and is currently implemented like this:

The bytecode for a optimised tail call would look something like this in the patched HotSpot:

0xc4 wide
0xb6 invokevirtual
0x00 // high byte of index of method to call
0x03 // low byte of index of method to call
0xb0 areturn

The wide bytecode is used to indicate that the opcode immediately following it (the embedded instruction) is operating in a variant mode. Correctly implementing interpreters will verify the correctness of the instruction form of wide, parse the form and then evaluate the instruction.

That is, the embedded instruction is considered an operand to wide - not all forms which follow it are legal bytecode as they stand - so in particular, there should never be a jump directly to an embedded instruction. See this link for more details.

The form of wide above is a new, experimental form for supporting TCO (if you look at the current documentation for wide you'll see that all the embedded opcodes it currently supports are of type load / store / inc, so a form which supported invoke* would have to be a new form). To become standardised, the TCO change needs to pass the JCP, which first requires evidence that this will be of sufficient benefit to justify the changes by vendors to support it, so it might be a while - this is a fairly hot-off-the-presses change.

Next up, I'll discuss how to get the TCO patches switched on (and confirm that you have) and some test results for some preliminary cases.

Codegen Update

Since my last post (gosh, a couple of months ago), I've actually been investigating the codegen's for both Kawa and JRuby, and I've come to the conclusion that while Kawa has some very nice features, it was too hard to get it to play nicely with the AST that I'm outputting from sablecc.

The documentation being sparse was also a problem, as was the fact that the classes that Kawa produces are a long way from being usable as regular Java types - and still display a large amount of their Scheme heritage.

I've found that JRuby does a lot better - in some ways it is a lot closer to the AST that I'm producing. In addition, the code is quite easy to follow, and the low-level generation is all handled with ASM.

I have some basic examples of programmatically generating a JRuby AST and then compiling it - there's no nice API method to do it, but by cribbing from the compiler source it's relatively easy.

Adapting this method for purr, in the Main class of the purr compiler, we now have some code like this:

Lexer lexer = new Lexer (new PushbackReader(new BufferedReader(new FileReader(filename)), 1024));

Parser parser = new Parser(lexer);
Start ast = parser.parse();

SemanticAnalyzer sem = new SemanticAnalyzer();

NodeMapper nmap = new NodeMapper(sem);

Node jHead = nmap.getNodeMap(ast);

where the compile method is as simple as:

private static void compile(Node ast) {

try {
String classname = JavaNameMangler.mangledFilenameForStartupClasspath(filename);

ASTInspector inspector = new ASTInspector();

StandardASMCompiler asmCompiler = new StandardASMCompiler(classname, filename);
ASTCompiler compiler = new ASTCompiler();
compiler.compileRoot(ast, asmCompiler, inspector, false, false);

File file = new File(filename);
catch (Exception x) {

So, still quite a way to go (especially with figuring out all the details of NodeMapper, which is the class which transforms from purr's AST to JRuby's) - but some good progress.

Monday, 5 January 2009

Which Codegen and IR to use?

I know I haven't really explained much about why I want to build out a JVM with InvokeDynamic, so I thought I'd try and put some of my thoughts down.

Broadly, I'd like to be able to write a compiler for a dynamic language, consisting of a modest subset of Perl 5's syntax. This is largely just to teach myself more about the internals of the JVM, and about language design and implementation in general, but it might come in handy for other people as well - who knows.

I have been getting to grips with the sablecc parser generator toolkit, and already have a grammar for a working subset of the language (although there's quite a long way to go yet).

Sablecc has some features which rather surprised me at first - its Node is an abstract class rather than an interface and the classes which represent terminals in the abstract syntax are all declared final.

This makes it rather hard to use Decorator to annotate the abstract parse tree (as for a Decorator-type approach one would subclass the abstract parse tree terminals).

After a bit of head scratching, I'm now generating a set of Analyzer classes, each of which are obtained by visiting the tree, and storing additional context information in the Analyzer classes themselves.

Each Analyzer instances can now be though of as being a self-consistent set of annotations about the parse tree - all the annotations relevant to a particular type of analysis can be represented as one Analyzer, but there may be multiple Analyzers corresponding to different, disjoint classes of semantic information.

With this now reasonably well understood, I'm turning my attention to the IR and code generation stages. I'm weighing up 2 possible tools and approaches.
  1. ASM. This has a lot of advantages - it is reasonably well documented, once you're over the initial hump, and seems to have good low-level control as well as consistent use of the Visitor pattern to help with conceptual consistency.
  2. The gnu.bytecode and gnu.expr packages from Kawa. These are part of Kawa (a Scheme-orientated package for producing bytecode).
At the moment I'm still trying to decide which I'm going to opt for. Kawa has the advantage that several other languages are already using it, so it may have better performance for repl and similar approaches (which will be necessary if I want to support eval STRING as a case). It also seems to have a relatively consistent set of classes which could be targeted as an IR (which so far, I don't really see ASM having).

However, Kawa doesn't (so far) to be very well documented - I've found myself having to dig through the source code several times already just to find out whether method names need to be specified as package names or internal names, etc. The conceptual framework doesn't seem to be quite as consistent somehow. I'm also slightly scared by the use of bare integers in some places in the gnu.bytecode package - I'd prefer a symbolic constant for each opcode, which I can pull in with a wildcarded static import.

Still, I'll post more as I investigate further.