Tuesday, 7 April 2009

Akamai IP Application Accelerator

This post is all about a PoC I did with Akamai's IP Application Accelerator technology.

The basic idea is that you change the DNS entries for your services to be CNAMEs for Akamaised DNS entries, which will of course resolve to IPs that are close to your end users. (Think services other than just websites here - this is a Layer 3 solution.) The solution then opens a tunnel over Akamai's private network, which does not use standard Internet routing or the main exchanges, and which sends multiple copies of each packet by diverse routes. The packets are reassembled close to the real origin servers. NATting (both SNAT and DNAT) is used heavily to ensure that this is invisible to the application.
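To make the "multiple copies by diverse routes" idea a bit more concrete, here is a toy Python sketch. It is not Akamai's implementation: the path names, the loss rates and the `send_redundantly` helper are all invented for illustration. The only assumptions are that each copy of a packet is lost independently, and that the far end keeps the first copy of each sequence number it sees.

```python
import random


def send_redundantly(packets, path_loss, rng):
    """Send a copy of every packet down every path; return what the far end delivers.

    packets   -- list of (sequence_number, payload) tuples
    path_loss -- dict mapping a (hypothetical) path name to its loss probability
    rng       -- random.Random instance, so that runs are repeatable
    """
    delivered = {}
    for seq, payload in packets:
        for _path, loss in path_loss.items():
            if rng.random() < loss:
                continue  # this copy was dropped on this path
            # Keep the first copy of each sequence number, ignore later duplicates.
            delivered.setdefault(seq, payload)
    # Hand the surviving packets on in sequence order.
    return [delivered[seq] for seq in sorted(delivered)]


packets = [(i, f"segment-{i}") for i in range(1000)]
rng = random.Random(42)

# Assumed 5% loss per path; real figures will depend entirely on your routes.
single = send_redundantly(packets, {"path-a": 0.05}, rng)
triple = send_redundantly(
    packets, {"path-a": 0.05, "path-b": 0.05, "path-c": 0.05}, rng
)
print(len(single), "of 1000 delivered over one path")
print(len(triple), "of 1000 delivered over three paths")
```

With independent 5% loss on each of three paths, a packet is only lost if all three copies are dropped, which is why the redundant case delivers almost everything. The price is that you push roughly three times the traffic across the network.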

Note that this seamlessness is from an IP perspective only. It does not cover every corner case, and depending on the details there may be problems with load balancers and with integrating fully with your DNS.

Joel Spolsky has written up his description of it here: http://www.joelonsoftware.com/items/2009/02/05.html

Akamai's description of it is here: http://www.akamai.com/ipa

So how did we find it? Well, it definitely has a place for some companies. Joel's setup and customer distribution seem to highlight the upside quite well, so I'll leave you to read that on his site.

Some of the problems we found with it:

  1. It only really works well if your destination is in a different "logical metro" from your users. This one probably isn't too surprising - if you're in the same city or data centre you wouldn't expect there to be any gain by routing onto the Akamai network and back again.

  2. It has to be either on for all customers, or off for all customers - there's no way to have it only switched on for (say) just Middle Eastern customers with sucky connectivity.

  3. It's charged both by number of concurrent sessions and by total bandwidth. Make sure you tie Akamai down about exactly how the costs are calculated - some of the salespeople we spoke to were overly evasive about the costs.

Taking these together means that you may well have to know more about the geographic distribution of your users and their bandwidth usage patterns than you currently do.
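As a rough illustration of how those three points interact, here is a back-of-envelope sketch in Python. Everything in it is an assumption: the rates are placeholders rather than Akamai's prices, the formula (sessions times a rate plus gigabytes times a rate) is simply the most obvious reading of "charged by sessions and by bandwidth", and the regional figures are made up. It also naively adds up per-region session peaks, which overstates the bill if the peaks don't coincide.

```python
from dataclasses import dataclass

# Placeholder rates - NOT real prices. Get the real formula in writing.
SESSION_RATE = 2.0  # assumed cost per peak concurrent session per month
GB_RATE = 0.5       # assumed cost per GB transferred per month


@dataclass
class RegionUsage:
    name: str
    peak_sessions: int
    gb_per_month: float
    benefits: bool  # True if users sit outside the datacentre's logical metro


def monthly_cost(region):
    """Cost attributable to one region under the assumed pricing formula."""
    return region.peak_sessions * SESSION_RATE + region.gb_per_month * GB_RATE


def cost_breakdown(regions):
    """Return (total cost, cost spent on regions that actually get a speedup)."""
    total = sum(monthly_cost(r) for r in regions)
    useful = sum(monthly_cost(r) for r in regions if r.benefits)
    return total, useful


# Made-up usage figures for a London datacentre.
regions = [
    RegionUsage("Western Europe", peak_sessions=4000, gb_per_month=10000, benefits=False),
    RegionUsage("Middle East", peak_sessions=200, gb_per_month=400, benefits=True),
]

total, useful = cost_breakdown(regions)
print(f"total: {total:.0f}, of which buys a real speedup: {useful:.0f}")
```

Because the service is on for everyone or off for everyone, you pay the total, not just the useful part. The gap between the two numbers is the thing to find out before signing anything.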

What's also worth noting is the use of the phrase "logical metro". Sure, LA customers might see a speedup back to an NY datacentre (Joel's example), but let's suppose you have 3 regional datacentres (NY, LN, TK) covering the Americas, EMEA and Asia respectively.

It's likely that virtually none of your customers in Europe will see any meaningful speedup (leaving aside outages where you fail those customers over to another region) - and certainly no-one in the LN / Paris / ADM / BRX / Frankfurt area will. The links back to the LN datacentre should just be too damned good for the Akamai solution to be able to shave any time off.

Also, it's plausible that your concentration of Asian customers could be in HK / TK / Shanghai so similar remarks could apply about routing back to TK from within Asia.

Akamai didn't give us a straight answer when we asked a question like: "OK, so suppose we only have a very slight, not user-visible speedup. At what stage will the product not route over the Akamai network (and thus not charge us for an acceleration that didn't really do anything)?"

For customers within a logical metro where you have a datacentre, a negligible speedup like this is the norm, so the answer is absolutely critical for working out how much this is really going to cost.

In the end, we decided not to go for it, simply because when we'd analysed where our customers were in the world, we realised we'd be paying a lot of money to speed up about 2% of our mid-range customers.
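The analysis itself doesn't need to be fancy. A minimal Python sketch of the kind of count involved is below. The customer list is invented, and so is the `NEAR_DATACENTRE` mapping of which cities are "close enough" to each datacentre. Akamai's actual definition of a logical metro is theirs, not mine, so treat that mapping as a guess you would need to confirm.

```python
from collections import Counter

# Hypothetical: cities assumed to be within each datacentre's logical metro.
NEAR_DATACENTRE = {
    "LN": {"London", "Paris", "Frankfurt"},
    "NY": {"New York"},
    "TK": {"Tokyo"},
}


def share_accelerated(customers):
    """Fraction of customers outside their serving datacentre's logical metro.

    customers -- iterable of (city, datacentre_code) pairs
    """
    counts = Counter(city not in NEAR_DATACENTRE[dc] for city, dc in customers)
    total = sum(counts.values())
    return counts[True] / total if total else 0.0


# Invented customer base, mostly sitting right next to the datacentres.
customers = (
    [("London", "LN")] * 60
    + [("Paris", "LN")] * 25
    + [("Dubai", "LN")] * 5
    + [("New York", "NY")] * 10
)

print(f"{share_accelerated(customers):.0%} of customers might see a speedup")
```

In practice you would weight this by revenue or bandwidth rather than by head count, since a handful of large remote customers can change the answer.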

If you are thinking of deploying this tech, consider these points as well:

  • Consider how your failover model works in the event of complete loss of a datacentre. Do your customers fail across to another region? What is the impact of that? How often do events like that occur? What is the typical duration?

  • Load balancing and the details of how your DNS is set up. You may be surprised by how much of this you need to understand in order to get this tech to work smoothly. Do not assume that it will necessarily be straightforward for an enterprise deployment.

Is there a point here? I guess just that you have to see the acceleration tech in the context of your application architecture, understand where your customers are coming from, test it out and check that it's not going to cost the company an arm and a leg.

New technologies like this can be a real benefit, but they must always be seen in the context of the business and, in this case, of the geographic and spend distribution of your customers.


Unknown said...

How is this seamless from the IP perspective? What source address do your servers see? Akamai's end points or the real one (i.e. NATted)?

If it's the former, how can you tell what the original source IP was? And if it's the latter, doesn't that mean that the route back to the customer is over the regular internet?

Ben said...

If the traffic is HTTP-based, then the original IP address will be present as an X-Something header.

If not, then you lose the original source IP of the incoming user.