Resin Clustering

From Resin 3.0

Revision as of 07:35, 17 December 2009 by Reza (Talk | contribs)

<document> <header> <title>Resin Clustering</title> <description>

As traffic grows beyond what a single server can handle, Resin's clustering lets you add machines to share the load. It also improves uptime and reliability by transparently failing over requests from a downed or in-maintenance server to a backup.

</description> </header>

<body>

<localtoc/>

<s1 title="Persistent Sessions">

A session needs to stay on the same JVM that started it. Otherwise, each JVM would only see every second or third request and get confused.

To make sure that sessions stay on the same JVM, Resin encodes the cookie with the host number. In the previous example, the hosts would generate cookies like:

<deftable> <tr>

 <th>index</th>
 <th>cookie prefix</th>

</tr> <tr>

 <td>1</td>
 <td>axxx</td>

</tr> <tr>

 <td>2</td>
 <td>bxxx</td>

</tr> <tr>

 <td>3</td>
 <td>cxxx</td>

</tr> </deftable>

On the web-tier, Resin will decode the cookie and send it to the appropriate host. So bacX8ZwooOz would go to app-b.
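
The routing rule above can be sketched in a few lines of Java. This is an illustration only, not Resin's implementation: the class `StickyRouter` and its methods are hypothetical, the index is zero-based ('a' maps to 0), and falling back to the first server for an unusable id is an assumption made for the sketch.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: pick a backend from the first character of a session id. */
final class StickyRouter {
    private final List<String> serverIds;

    StickyRouter(List<String> serverIds) {
        this.serverIds = serverIds;
    }

    /** Returns the zero-based server index encoded in the id, or -1 if there is none. */
    static int serverIndex(String sessionId) {
        if (sessionId == null || sessionId.isEmpty()) {
            return -1;
        }
        int index = sessionId.charAt(0) - 'a';
        return index >= 0 && index < 26 ? index : -1;
    }

    /** Returns the owning server id; assumption: unusable ids go to the first server. */
    String route(String sessionId) {
        int index = serverIndex(sessionId);
        if (index < 0 || index >= serverIds.size()) {
            return serverIds.get(0);
        }
        return serverIds.get(index);
    }

    public static void main(String[] args) {
        StickyRouter router = new StickyRouter(Arrays.asList("app-a", "app-b", "app-c"));
        System.out.println(router.route("bacX8ZwooOz")); // prints "app-b"
    }
}
```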

In the infrequent case that app-b fails, Resin will send the request to app-a. The user might lose the session but that's a minor problem compared to showing a connection failure error.

The following example is a typical configuration for a distributed server using an external hardware load-balancer, i.e. where each Resin is acting as the HTTP server. Each server will be started as -server app-a or -server app-b to grab its specific configuration.

In this example, sessions will only be stored when the server shuts down, either for maintenance or with a new version of the server. This is the most lightweight configuration, and doesn't affect performance significantly. If the hardware or the JVM crashes, however, the sessions will be lost. (If you want to save sessions for hardware or JVM crashes, remove the <save-only-on-shutdown/> flag.)

<example title="resin.xml"> <resin xmlns="http://caucho.com/ns/resin"> <cluster id="app-tier">

 <server-default>
   <http port='80'/>
 </server-default>
 <server id='app-a' address='192.168.0.1'/>
 <server id='app-b' address='192.168.0.2'/>
 <server id='app-c' address='192.168.0.3'/>
 <web-app-default>
   <!-- enable tcp-store for all hosts/web-apps -->
   <session-config>
     <use-persistent-store/>
     <save-only-on-shutdown/>
   </session-config>
 </web-app-default>
 ...

</cluster> </resin> </example>

<s2 title="Choosing a backend server">

Requests can be made to specific servers in the app-tier. The web-tier uses the value of the jsessionid to maintain sticky sessions. You can include an explicit jsessionid to force the web-tier to use a particular server in the app-tier.

Resin uses the first character of the jsessionid to identify the backend server to use, starting with 'a' as the first backend server. If www.example.com resolves to your web-tier, then you can use:

  1. http://www.example.com/proxooladmin;jsessionid=abc
  2. http://www.example.com/proxooladmin;jsessionid=bcd
  3. http://www.example.com/proxooladmin;jsessionid=cde
  4. http://www.example.com/proxooladmin;jsessionid=def
  5. http://www.example.com/proxooladmin;jsessionid=efg
  6. etc.

</s2>

<s2 title="File Based">

For single-server configurations, the "cluster" store saves session data on disk, allowing for recovery after system restart or during development.

Sessions are stored as files in the resin-data directory. When the session changes, the updates will be written to the file. After Resin loads an Application, it will load the stored sessions.

</s2>

<s2 title="Distributed Sessions">

Distributed sessions are intrinsically more complicated than single-server sessions. Single-server sessions can be implemented as a simple memory-based Hashtable. Distributed sessions must communicate between machines to ensure the session state remains consistent.

Load balancing with multiple machines uses either sticky sessions or symmetrical sessions. Sticky sessions put more intelligence on the load balancer, and symmetrical sessions put more intelligence on the JVMs. The choice of which to use depends on what kind of hardware you have, how many machines you're using and how you use sessions.

Distributed sessions can use a database as a backing store, or they can distribute the backup among all the servers using TCP.

<s3 title="Symmetrical Sessions">

Symmetrical sessions happen with dumb load balancers like DNS round-robin. A single session may bounce from machine A to machine B and back to machine A. For JDBC sessions, the symmetrical session case needs the always-load-session attribute described below. Each request must load the most up-to-date version of the session.

Distributed sessions in a symmetrical environment are required to make sessions work at all. Otherwise the state will end up spread across the JVMs. However, because each request must update its session information, it is less efficient than sticky sessions.

</s3>

<s3 title="Sticky Sessions">

Sticky sessions require more intelligence on the load-balancer, but are easier for the JVM. Once a session starts, the load-balancer will always send it to the same JVM. Resin's load balancing, for example, encodes the session id as 'aaaXXX' and 'baaXXX'. The 'aaa' session will always go to JVM-a and 'baa' will always go to JVM-b.

Distributed sessions with a sticky session environment add reliability. If JVM-a goes down, JVM-b can pick up the session without the user noticing any change. In addition, distributed sticky sessions are more efficient. The distributor only needs to update sessions when they change. So if you update the session once when the user logs in, the distributed sessions can be very efficient.

</s3>

<s3 title="always-load-session">

Symmetrical sessions must use the 'always-load-session' flag to reload the session data on each request. always-load-session is only needed for jdbc-store sessions. tcp-store sessions use a more sophisticated protocol that eliminates the need for always-load-session, so tcp-store ignores the always-load-session flag.

The always-load-session attribute forces sessions to check the store for each request. By default, sessions are only loaded from persistent store when they are created. In a configuration with multiple symmetric web servers, sessions can be loaded on each request to ensure consistency.

</s3>

<s3 title="always-save-session">

By default, Resin only saves session data when you add new values to the session object, i.e. if the request calls setAttribute. This may be insufficient when storing large objects. For example, if you change an internal field of a large object, Resin will not automatically detect that change and will not save the session object.

With always-save-session Resin will always write the session to the store at the end of each request. Although this is less efficient, it guarantees that updates will get stored in the backup after each request.

</s3>

</s2>


<s2 title="Cluster Sessions">

The distributed cluster stores the sessions across the cluster servers. In some configurations, the cluster store may be more efficient than the database store; in others, the database store will be more efficient.

With cluster sessions, each session has an owning JVM and a backup JVM. The session is always stored in both the owning JVM and the backup JVM.

The cluster store is configured in the <cluster>. It uses the <server> hosts in the <cluster> to distribute the sessions. The session store is enabled in the <session-config> with <use-persistent-store>.

<example> <resin xmlns="http://caucho.com/ns/resin">

 ...
 <cluster id="app-tier">
   <server id="app-a" host="192.168.0.1" port="6802"/>
   <server id="app-b" host="192.168.0.2" port="6802"/>
   ...
 </cluster>

</resin> </example>

The configuration is enabled in the web-app.

<example> <web-app xmlns="http://caucho.com/ns/resin">

 <session-config>
   <use-persistent-store/>
 </session-config>

</web-app> </example>

The <server> entries are treated as a cluster of servers. Each server uses the other servers as a backup. When the session changes, the updates will be sent to the backup server. When the server starts, it looks up old sessions in the other servers to update its own version of the persistent store.

<example title="Symmetric load-balanced servers"> <resin xmlns="http://caucho.com/ns/resin"> <cluster id="app-tier">

 <server-default>
   <http port='80'/>
 </server-default>
 <server id="app-a" address="192.168.2.10" port="6802"/>
 <server id="app-b" address="192.168.2.11" port="6803"/>
 <host id="">
   <web-app id="">
     <session-config>
       <use-persistent-store/>
     </session-config>
   </web-app>
 </host>

</cluster> </resin> </example> </s2>

<s2 title="Clustered Distributed Sessions">

Resin's cluster protocol for distributed sessions is an alternative to JDBC-based distributed sessions. In some configurations, the cluster-stored sessions will be more efficient than JDBC-based sessions. Because sessions are always duplicated on separate servers, cluster sessions do not have a single point of failure. As the number of servers increases, JDBC-based sessions can start overloading the backing database. With clustered sessions, each additional server shares the backup load, so the main scalability issue reduces to network bandwidth. Like the JDBC-based sessions, the cluster store sessions use sticky-session caching to avoid unnecessary network traffic.

</s2>

<s2 title="Configuration">

The cluster configuration must tell each host the servers in the cluster, and it must enable the persistent store in the session configuration with <a href="../reference/session-tags.xtp#session-config">use-persistent-store</a>. Because session configuration is specific to a virtual host and a web-application, each web-app needs use-persistent-store enabled individually. The <a href="../reference/webapp-tags.xtp#web-app-default">web-app-default</a> tag can be used to enable distributed sessions across an entire site.

<example title="resin.xml fragment"> <resin xmlns="http://caucho.com/ns/resin">

 ...
 
 <cluster id="app-tier">
   <server id="app-a" host="192.168.0.1"/>
   <server id="app-b" host="192.168.0.2"/>
   <server id="app-c" host="192.168.0.3"/>
   <server id="app-d" host="192.168.0.4"/>
   ...
   <host id="">
   <web-app id='myapp'>
     ...
     <session-config>
       <use-persistent-store/>
     </session-config>
   </web-app>
   </host>
 </cluster>

</resin> </example>

Usually, hosts will share the same resin.xml. Each host will be started with a different -server xx to select the correct block. The startup will look like:

<example title="Starting Server C"> resin-4.0.x> java -jar lib/resin.jar -conf conf/resin.xml -server app-c start </example>

<s3 title="always-save-session">

Resin's distributed sessions need to know when a session has changed in order to save the new session value. Although Resin can detect when an application calls HttpSession.setAttribute, it can't tell if an internal session value has changed. The following Counter class shows the issue:

<example title="Counter.java"> package test;

public class Counter implements java.io.Serializable {

 private int _count;
 public int nextCount() { return _count++; }

} </example>

Assuming a copy of the Counter is saved as a session attribute, Resin doesn't know if the application has called nextCount. If it can't detect a change, Resin will not back up the new session unless always-save-session is set. When always-save-session is true, Resin will back up the session on every request.

<example>
...
<web-app id="/foo">
  ...
  <session-config>
    <use-persistent-store/>
    <always-save-session/>
  </session-config>
  ...
</web-app>
</example>
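
The detection gap can be modelled with a small stand-in for the session. This is a sketch under stated assumptions, not Resin code: `TrackedSession`, its dirty flag and the nested `Counter` are hypothetical names, and the model assumes that a call to setAttribute is the only change signal, as described above. It shows that mutating the stored object leaves the session looking unchanged, while setting the attribute again marks it as changed.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a session that only notices setAttribute calls. */
final class TrackedSession {
    /** Same shape as the Counter example: it changes without touching the session. */
    static final class Counter implements java.io.Serializable {
        private int count;

        int nextCount() {
            return count++;
        }
    }

    private final Map<String, Object> attributes = new HashMap<>();
    private boolean dirty;

    void setAttribute(String name, Object value) {
        attributes.put(name, value);
        dirty = true; // the only change signal this sketch can observe
    }

    Object getAttribute(String name) {
        return attributes.get(name);
    }

    boolean isDirty() {
        return dirty;
    }

    /** Called at the end of a request, once the session was saved or skipped. */
    void clearDirty() {
        dirty = false;
    }

    public static void main(String[] args) {
        TrackedSession session = new TrackedSession();
        Counter counter = new Counter();
        session.setAttribute("counter", counter);
        session.clearDirty(); // end of the first request

        counter.nextCount(); // internal change, invisible to the session
        System.out.println(session.isDirty()); // prints "false"

        session.setAttribute("counter", counter); // explicit re-set
        System.out.println(session.isDirty()); // prints "true"
    }
}
```

In an application, setting the attribute again after a mutation is one way to signal the change without paying for a save on every request; always-save-session remains the simpler option when such changes are hard to track.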


</s3>

<s3 title="Serialization">

Resin's distributed sessions rely on Hessian serialization to save and restore sessions. Application objects must implement java.io.Serializable for distributed sessions to work.

</s3>

</s2>

<s2 title="Protocol Examples">

<s3 title="Session Request">

To see how cluster sessions work, consider a case where the load balancer sends the request to a random host. Server C owns the session but the load balancer gives the request to Server A. In the following figure, the request modifies the session so it must be saved as well as loaded.

<figure src="srunc.gif"/>

The session id encodes the owning host. The example session id, ca8MbyA, decodes to a server index of 3, mapping to Server C. Resin determines the backup host from the cookie as well. Server A must know the owning host for every cookie so it can communicate with the owning srun. The example configuration defines all the sruns Server A needs to know about. If Server C is unavailable, Server A can use its configuration knowledge to use Server D as a backup for ca8MbyA instead.

When the request first accesses the session, Server A asks Server C for the serialized session data (2:load). Since Server A doesn't cache the session data, it must ask Server C for an update on each request. For requests that only read the session, this TCP load is the only extra overhead, i.e. they can skip 3-5. The always-save-session flag, in contrast, will always force a write.

At the end of the request, Server A writes any session updates to Server C (3:store). If always-save-session is false and the session doesn't change, this step can be skipped. Server A sends the new serialized session contents to Server C. Server C saves the session on its local disk (4:save) and saves a backup to Server D (5:backup).
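
The load, store and backup steps above can be modelled in memory to make the data flow concrete. This is a rough sketch, not Resin's protocol code: `ClusterSessionSketch` and its methods are hypothetical, indexes are zero-based (so Server C is index 2), and choosing the next server in the ring as the backup is an assumption made for the sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of the load/store/backup steps. */
final class ClusterSessionSketch {
    /** One server with its own copies of the sessions it owns or backs up. */
    static final class Server {
        final String id;
        final Map<String, String> store = new HashMap<>();

        Server(String id) {
            this.id = id;
        }
    }

    private final List<Server> ring = new ArrayList<>();

    ClusterSessionSketch(String... serverIds) {
        for (String id : serverIds) {
            ring.add(new Server(id));
        }
    }

    Server server(int index) {
        return ring.get(index);
    }

    /** Owner index from the first character of the session id ('a' is index 0). */
    int ownerIndex(String sessionId) {
        return Math.floorMod(sessionId.charAt(0) - 'a', ring.size());
    }

    /** Assumption for this sketch: the backup is the next server in the ring. */
    int backupIndex(String sessionId) {
        return (ownerIndex(sessionId) + 1) % ring.size();
    }

    /** 2:load, the receiving server asks the owner for the current data. */
    String load(String sessionId) {
        return server(ownerIndex(sessionId)).store.get(sessionId);
    }

    /** 3:store, 4:save and 5:backup, the owner keeps the data and copies it to the backup. */
    void store(String sessionId, String data) {
        server(ownerIndex(sessionId)).store.put(sessionId, data);
        server(backupIndex(sessionId)).store.put(sessionId, data);
    }

    public static void main(String[] args) {
        ClusterSessionSketch cluster = new ClusterSessionSketch("a", "b", "c", "d");
        String id = "ca8MbyA";
        cluster.store(id, "cart=1");

        // A request lands on Server A: it loads from the owner and writes back.
        String current = cluster.load(id);
        cluster.store(id, current + ",cart=2");

        System.out.println(cluster.server(2).store.get(id)); // owner copy on Server C
        System.out.println(cluster.server(3).store.get(id)); // backup copy on Server D
    }
}
```

Server A never keeps a copy in this model, which is why a non-sticky request pays for the extra load on every request.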

</s3>

<s3 title="Sticky Session Request">

Smart load balancers that implement sticky sessions can improve cluster performance. In the previous request, Resin's cluster sessions maintain consistency for dumb load balancers or twisted clients like the AOL browsers. The cost is the additional network traffic for 2:load and 3:store. Smart load-balancers can avoid the network traffic of 2 and 3.

<figure src="same_srun.gif"/>

Server C decodes the session id, caaMbyA. Since it owns the session, Server C gives the session to the servlet with no work and no network traffic. For a read-only request, there's zero overhead for cluster sessions. So even a semi-intelligent load balancer will gain a performance advantage. Normal browsers will have zero overhead, and bogus AOL browsers will have the non-sticky session overhead.

A session write saves the new serialized session to disk (2:save) and to Server D (3:backup). always-save-session will determine if Resin can take advantage of read-only sessions or must save the session on each request.

</s3>

<s3 title="Disk copy">

Resin stores a disk copy of the session information, in the location specified by the path. The disk copy serves two purposes. The first is that it allows Resin to keep session information for a large number of sessions. An efficient memory cache keeps the most active sessions in memory and the disk holds all of the sessions without requiring large amounts of memory. The second purpose of the disk copy is that the sessions are recovered from disk when the server is restarted.

</s3>

<s3 title="Failover">

Since the session always has a current copy on two servers, the load balancer can direct requests to the next server in the ring. The backup server is always ready to take control. The failover will succeed even for dumb load balancers, as in the non-sticky-session case, because the srun hosts will use the backup as the new owning server.

In the example, either Server C or Server D can stop and the sessions will use the backup. Of course, the failover will work for scheduled downtime as well as server crashes. A site could upgrade one server at a time with no observable downtime.

</s3>

<s3 title="Recovery">

When Server C restarts, possibly with an upgraded version of Resin, it needs to use the most up-to-date version of the session; its file-saved session will probably be obsolete. When a "new" session arrives, Server C loads the saved session from both the file and from Server D. It will use the newest session as the current value. Once it's loaded the "new" session, it will remain consistent as if the server had never stopped.
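
The "use the newest session" rule can be sketched as a comparison between the two copies. This is illustrative only: `SessionRecovery`, `Copy` and the `version` field are hypothetical, and the source does not say how Resin decides which copy is newer, so a simple counter stands in for that here.

```java
/** Hypothetical sketch of choosing the newest session copy during recovery. */
final class SessionRecovery {
    /** A saved session copy; the version counter is an assumed field. */
    static final class Copy {
        final long version;
        final String data;

        Copy(long version, String data) {
            this.version = version;
            this.data = data;
        }
    }

    /** Returns the newer copy; either argument may be null when a copy is missing. */
    static Copy newest(Copy fromFile, Copy fromBackup) {
        if (fromFile == null) {
            return fromBackup;
        }
        if (fromBackup == null) {
            return fromFile;
        }
        return fromBackup.version >= fromFile.version ? fromBackup : fromFile;
    }

    public static void main(String[] args) {
        Copy file = new Copy(3, "cart=1");
        Copy backup = new Copy(5, "cart=1,cart=2");
        System.out.println(newest(file, backup).data); // prints "cart=1,cart=2"
    }
}
```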

</s3>

<s3 title="No Distributed Locking">

Resin's cluster sessions do not lock sessions. For browser-based sessions, only one request will execute at a time. Since browser sessions have no concurrency, there's no need for distributed locking. However, it's a good idea to be aware of the lack of distributed locking.

</s3>

</s2>

</s1>

 </body>

</document>

<document> <header> <product>resin</product> <title>Dynamic Servers</title> <description>

Resin includes the ability to add servers to clusters dynamically. These dynamic servers are able to use distributed sessions and the distributed object cache. The triad also updates these servers with applications that are deployed via the remote deployment server. The Resin load balancer is also able to dispatch requests to them as with any static server.

</description> </header>

<body>

<localtoc/>

<s1 title="Overview">

Adding a dynamic server to a cluster is a simple two-step process:

  1. Register the dynamic server with a triad server via JMX.
  2. Start the new dynamic server using the registration in the previous step.

</s1>

<s1 title="Preliminaries">

Before adding a dynamic server, you must:

  • Set up and start a cluster with a triad, e.g.
<example title="Example: conf/resin.xml">
<resin xmlns="http://caucho.com/ns/resin">
  <cluster id="app-tier">
    ...
    <server id="triad-a" address="234.56.78.90" port="6800"/>
    <server id="triad-b" address="34.56.78.90" port="6800"/>
    <server id="triad-c" address="45.67.89.12" port="6800"/>
</example>
  • Install at least one admin password, usually in admin-users.xml
  • Enable the RemoteAdminService for the cluster, e.g.
<example>
<resin xmlns="http://caucho.com/ns/resin">
  <cluster id="app-tier">
    ...
    <admin:RemoteAdminService xmlns:admin="urn:java:com.caucho.admin"/>
    ...
</example>
  • Enable the dynamic servers for the cluster, e.g.
<example>
<resin xmlns="http://caucho.com/ns/resin">
  <cluster id="app-tier">
    ...
    <dynamic-server-enable>true</dynamic-server-enable>
    ...
</example>

Check the main <a href="clustering.xtp">Clustering</a> section for more information on this topic.

</s1>

<s1 title="Registering a dynamic server">

For the first step of registration, you can use a JMX tool like jconsole or simply use the Resin administration web console. We'll show how to do the latter method here. For registration, you'll specify three values:

<deftable title="dynamic server registration values"> <tr>

 <th>Name</th>
 <th>Description</th>

</tr> <tr>

 <td>Server id</td>
 <td>Symbolic identifier of the new dynamic server.  
     This is also specified when starting the new server.</td>

</tr> <tr>

 <td>IP</td>
 <td>The IP address of the new dynamic server.  May also be a host name.</td>

</tr> <tr>

 <td>Port</td>
 <td>The server port of the new dynamic server.  Usually 6800.</td>

</tr> </deftable>

With these three values, browse to the Resin administration application's "cluster" tab. If you have enabled dynamic servers for your cluster, you should see a form allowing you to register the server in the "Cluster Overview" table.

<figure src="dynamic-server-add.png"/>

Once you have entered the values and added the server, it should show up in the table as a dead server because we haven't started it yet. The dynamic server's registration will be propagated to all the servers in the cluster.

<figure src="dynamic-server-added.png"/> </s1>

<s1 title="Starting a dynamic server">

Now that we've registered the dynamic server, we can start it and have it join the cluster. In order for the new server to be recognized and accepted by the triad, it needs to start with the same resin.xml that the triad is using, the name of the cluster it is joining, and the values entered in the registration step. These can all be specified on the command line when starting the server:

<example> dynamic-server> java -jar $RESIN_HOME/lib/resin.jar -conf /etc/resin/resin.xml \

                    -dynamic-server app-tier:123.45.67.89:6800 start

</example>
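
The value passed to -dynamic-server in the command above has the shape cluster:address:port. As a sketch of how such a value breaks down, the following hypothetical helper splits it into its three parts. `DynamicServerSpec` is not part of Resin, and the parsing assumes a plain host name or IPv4 address with no colons of its own.

```java
/** Hypothetical helper that splits a "cluster:address:port" value. */
final class DynamicServerSpec {
    final String cluster;
    final String address;
    final int port;

    private DynamicServerSpec(String cluster, String address, int port) {
        this.cluster = cluster;
        this.address = address;
        this.port = port;
    }

    /** Parses the value; assumes the address itself contains no colon. */
    static DynamicServerSpec parse(String value) {
        String[] parts = value.split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                "expected cluster:address:port, got: " + value);
        }
        return new DynamicServerSpec(parts[0], parts[1], Integer.parseInt(parts[2]));
    }

    public static void main(String[] args) {
        DynamicServerSpec spec = parse("app-tier:123.45.67.89:6800");
        System.out.println(spec.cluster); // prints "app-tier"
        System.out.println(spec.address); // prints "123.45.67.89"
        System.out.println(spec.port);    // prints "6800"
    }
}
```

The cluster name, address and port must match the values entered in the registration step.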

Specifying the configuration file allows the new server to configure itself using the <server-default> options, to find the triad servers of the cluster it is joining, and to authenticate using the administration logins. This command starts the server, which immediately contacts the triad to join the cluster. Once it has successfully joined, the "Cluster" tab of the administration application should look like this:

<figure src="dynamic-server-started.png"/> </s1>

</body> </document>
