
Moshe Kaplan constantly helps successful firms get to the next level, and he is thrilled to uncover some of his secrets. Mr. Kaplan is a seasoned project management and cloud technologies lecturer. He is also known as a cloud and SCRUM evangelist. Moshe is a co-founder. He was an R&D Director at Essence Security, led RockeTier, and served as a board member in the IGT and as a department head at a top IDF IT unit. Moshe holds an M.Sc. and a B.Sc. from TAU. Moshe is a DZone MVB, is not an employee of DZone, and has posted 59 posts at DZone.

Load Balancing Support in Dynamic Environments

The Mission
These days we face a challenging task: designing a very large system of scalable instances. Each of these instances may be in a different geographic location, and many of them are on-demand instances that are started and shut down at a moment's notice.
Another requirement in this system is that a given client must be directed to a defined instance, due to a system restriction (round robin is not an option in this case).

One Step Further
Since the number of IP addresses on the Internet is limited, we would like to use as few public IP addresses as possible. This can be done using a load balancer or a proxy.
At the current stage we would like to avoid hardware load balancers in order to keep initial fixed costs minimal, but we may consider using them in the future.

Is the Amazon Cloud Load Balancer Service (AWS) an Option?

AWS EC2 is a feasible option for hosting on-demand instances. However, AWS charges $0.025 per load balancing rule per hour (plus traffic). Therefore, it can be used, but for a large number of rules (>7) or high traffic, better solutions can be found in the market.
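To get a feel for how the per-rule charge adds up, here is a rough sketch of the arithmetic. It uses only the $0.025 figure quoted above; the 30-day month is an assumption for illustration, and traffic charges are left out entirely.

```python
from decimal import Decimal

# Figure quoted in the article: $0.025 per load balancing rule per hour.
RULE_PRICE_PER_HOUR = Decimal("0.025")
# Assumption for illustration only: a 30-day month (720 hours).
HOURS_PER_MONTH = 24 * 30


def monthly_rule_cost(rule_count: int) -> Decimal:
    """Fixed monthly cost of the rules alone, excluding traffic charges."""
    return RULE_PRICE_PER_HOUR * rule_count * HOURS_PER_MONTH


if __name__ == "__main__":
    for rules in (1, 7, 50):
        print(f"{rules:>3} rules: ${monthly_rule_cost(rules):.2f} per month")
```

Under these assumptions a single rule costs $18 per month and seven rules cost $126 per month, before any traffic is counted.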

So What Can Be Done?
We are left with software load balancers. The major ones are Apache mod_proxy and HAProxy. Supporting a large number of instances behind the load balancer can be done in one of the following two ways:

  1. Pre-register a large number of DNS addresses (subdomains) and associate them with the load balancer IP. The load balancer simply redirects each request to the defined instance, based on a simple rule in the load balancer. Pros: simple. Cons: not fully dynamic; requires additional DNS registration once in a while to keep up with the application growth.
  2. Perform ProxyPass in the load balancer: every request includes an instance identifier in its path. This method does not require mass DNS declarations, but it requires specific definitions in the load balancer that may consume more CPU. In Apache the definition is pretty trivial; however, Apache is less scalable than HAProxy. In HAProxy the task can be done as well, in two phases: switching to the server and rewriting the URI:
    In order to switch to the server, you have to use ACLs to match the path,
    then a use_backend directive to select a server farm ("backend"). Your
    farm may very well support only one server if you want.

    Then in this "backend", you can use a rewrite rule ("reqrep") to replace
    the request line.

    This would basically look like this :

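    frontend xxx
           # route requests whose path starts with the instance prefix
           acl path_instN path_beg /instN/
           use_backend bk_66 if path_instN

    backend bk_66
           # strip the /instN/ prefix from the request line (spaces are escaped as "\ ")
           # reqrep applies to HAProxy versions before 2.1, where the directive was removed
           reqrep ^([^\ :]*\ )/instN/(.*)  \1/\2
           balance roundrobin
           # placeholder: replace with the address and port of the instance
           server srv66 <address>:<port>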
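To see what the rewrite rule does to a request, the same pattern can be traced in Python. This is only a sketch of the regex behavior, not of HAProxy itself: `/instN/` is the placeholder prefix from the configuration above, and the sample request lines are made up for illustration.

```python
import re

# Python equivalent of the rewrite pattern: group 1 captures the HTTP method
# plus the space after it, group 2 captures everything after "/instN/".
# Excluding ":" from group 1 keeps header lines such as "Host: ..." from matching.
REWRITE = re.compile(r"^([^ :]* )/instN/(.*)")


def rewrite_request_line(line: str) -> str:
    """Strip the instance prefix from a request line, if it is present."""
    return REWRITE.sub(r"\1/\2", line, count=1)


if __name__ == "__main__":
    # Hypothetical request lines, used only to exercise the pattern.
    print(rewrite_request_line("GET /instN/images/logo.png HTTP/1.1"))
    print(rewrite_request_line("GET /other/page HTTP/1.1"))
```

A request line that carries the prefix loses it, and any other line passes through unchanged.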

However, Willy Tarreau, the HAProxy author, who kindly provided this hint for me, recommends that you avoid the second part (rewriting) because:

 1) it requires good regex skills which sometimes makes the configs hard
    to maintain for other people

 2) rewriting URIs in applications is the worst ever thing to do, because
    they never know where they are mapped, and regularly emit wrong links
    and wrong Location headers during redirections.
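The second point is easy to reproduce on paper. The sketch below assumes an application that sits behind a prefix-stripping rewrite and issues a redirect to `/login`; the function names and the `/login` target are hypothetical and serve only to show why the emitted Location header goes wrong.

```python
# Hypothetical instance prefix, matching the placeholder used in the configuration.
INSTANCE_PREFIX = "/instN"


def reaches_instance(path: str) -> bool:
    """True if a path-prefix rule on the load balancer would route this path to the instance."""
    return path.startswith(INSTANCE_PREFIX + "/")


def unaware_location(target: str) -> str:
    """Location header from an application that only sees the rewritten URI."""
    return target


def aware_location(target: str) -> str:
    """Location header from an application configured with its public prefix."""
    return INSTANCE_PREFIX + target


if __name__ == "__main__":
    for build in (unaware_location, aware_location):
        location = build("/login")
        print(f"{build.__name__}: Location {location} -> routed: {reaches_instance(location)}")
```

The redirect from the unaware application points at a path the load balancer no longer recognizes, so the client falls off its instance on the very next request.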

Willy Tarreau also advises that the best thing to do is clearly to configure your application correctly, so that it can respond with the real, original URI. Remapping can be used as a transitional setup in order to ease a graceful switchover, though. Bottom line: Pros: no DNS configuration and a fully scalable solution, with no dependence on DNS replication. Cons: CPU-consuming and error-prone declarations.
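Put side by side, the two options differ mainly in where the instance identifier lives: in the host name or in the path. Here is a minimal sketch of both lookups. The instance names, the addresses, and the `example.com` domain are all hypothetical, and a real load balancer would express the same idea as configuration rules rather than code.

```python
from typing import Optional

# Hypothetical registry of running instances; names and addresses are illustrative only.
INSTANCES = {"inst1": "10.0.0.11:8080", "inst2": "10.0.0.12:8080"}


def route_by_host(host: str) -> Optional[str]:
    """Option 1: the instance id is the leftmost DNS label, e.g. inst1.example.com."""
    instance_id = host.split(".", 1)[0].lower()
    return INSTANCES.get(instance_id)


def route_by_path(path: str) -> Optional[str]:
    """Option 2: the instance id is the first path segment, e.g. /inst1/page."""
    first_segment = path.lstrip("/").split("/", 1)[0]
    return INSTANCES.get(first_segment)


if __name__ == "__main__":
    print(route_by_host("inst1.example.com"))
    print(route_by_path("/inst2/page"))
```

With the first option, each new instance needs a DNS name that already resolves to the load balancer. With the second, a new instance only needs a new entry in the registry, at the price of the path handling discussed above.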

So, What to Choose?
The answer depends on your needs and on your belief in your people's regex capabilities. We made our choice.

Keep Performing,
Moshe Kaplan. Performance Expert.
Published at DZone with permission of Moshe Kaplan, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)


Juergen Brendel replied on Tue, 2011/07/05 - 5:47pm

Actually, none of this is necessary if you create your own network topology. With solutions such as vCider, you can create a virtual network even on Amazon EC2, where you have full control over the IP address assignment. In that case, you just set up your load balancer once. Disclaimer: I'm the architect at vCider.

Abhishek Choudhary replied on Wed, 2013/09/18 - 6:36am

I have tried AWS, and specifically for load balancing I found that Jelastic offered more flexibility and performance. My organisation has migrated to Jelastic, and both forms of load balancing, i.e. TCP and HTTP, performed very well. A few details can be found here-
