Friday, September 10, 2010

Tomcat 6 - Clustering


Tomcat Clustering
Clustering refers to running multiple instances of Tomcat that appear as one Tomcat instance. If one instance was to fail the other instances would take over thus the end user would not notice any failures.
Clustering in Tomcat enables a set of Tomcat instances on a LAN to appear to the users a single server, as detailed in the below picture. This architecture allows more requests to handled and can handle if one server were to crash (High Availability) .

Incoming requests are distributed across all servers, thus the service can handle more users. This approach is known as horizontal scaling thus you can buy cheaper hardware and still use existing hardware without having to upgrade your existing hardware.
There are a number of different clustering models that are used Master-Backup, Fail-Over, Tomcat uses both of these and incorporates load balancing as well.
Tomcat Clustering Model
The Tomcat clustering model can be divided into two layers and various components

The two layers that enable clustering are the load-balancing frontend and the state-sharing/synchronization backend. The front-end deals with incoming requests and balancing them over a number of instances, while the backend is concerned with ensuring that shared session data is available to different instances.
There are a number of load-balancing frontends that you can use
  • Round-robin DNS, whereby a domain name resolution results in a set of IP addresses
  • A hardware-based load balancer
  • A software load-balancer like PLB (Pure Load Balancer)
  • Apache mod_proxy or mod_jk as a load balancer
Depending on your budget, most setups either use hardware-based load-balancer or Apache mod_proxy/mod_jk. I have already touched on how to configure a Apache mod_proxy and mod_jk in theApache Server selection. The area I did not cover was the use of Sticky Sessions or Session Affinity, what this means when set is that incoming requests with the same session are routed to the same Tomcat worker.

They only problem with sticky sessions is that if a Tomcat instance were to fail then all sessions are lost, but as with the load-balancing frontend, you have numerous session-sharing backends from which to choose. Each provides a different level of functionality as well as implementation complexity.
mod_proxy and mod_jk offer the following options
  • Sticky Sessions with no session sharing
  • Sticky Sessions with a Persistent Session Manager and a shared file store
  • Sticky Sessions with a Persistent Session Manager and a JDBC store to RDBMS
  • In-memory session replication
Sticky Session with no clustered session sharing ensures that that requests are handled by the same instance, the session ID is encoded with the route name of the server instance that created it, assisting in the routing of the request. This solution can be used by most Production system, it is simple and easy to maintain, there is no additional configuration or resource overhead but this solution has no HA capability, thus if a server were to crash all session data is lost with it.
Sticky Session with a persistence manager and a shared file store, which is already built into Tomcat. The idea is to stored session data thus in the event of a failure the session data can be retrieved. Using a shared disk device (NFS, SMB) all the Tomcat instances has access to the session data. However Tomcat will not guarantee when a sessions data will be persisted to the file store, thus you could have a case where a Tomcat instance crashes but the session data was not written to the file store. It only offers a slightly better solution to the one above.
Sticky session with a persistence session manager and a JDBC-based store, is the same as the file-based store, but uses a RDMBS instead
In-memory session replication, is session data replicated across all Tomcat instances within the cluster, Tomcat offers two solutions, replication across all instances within the cluster or replication to only its backup server, this solution offers a guaranteed session data replication, however this solution is more complex.

The Tomcat server instances running in the cluster are implemented as a communications group. At the Tomcat instance level, the cluster implementation is an instance of the SimpleTcpCluster class. Depending on your needs and the replication pattern, you can configure SimpleTcpCluster with one or two managers
  • DeltaManager - replicate sessions across all Tomcat instances
  • BackupManager - replicate sessions from a master to a backup instance
SimpleTcpCluster uses Apache Tribes to maintain communicate with the communications group. Group membership is established and maintained by Apache Tribes, it handles server crashes and recovery. Apache Tribes also offer several levels of guaranteed message delivery between group members. This is achieved updating in-session memory to reflect any session data changes, the replication is done immediately between members. This solution offers full HA but at a cost to heavy network loads, also additional hardware is often required to make sure there are no single points of failure in the network.
Session Management
Sessions are objects (which can contain and reference to other objects) that are kept on behalf of the a client. Because HTTP is stateless there is no simple way to maintain application state using the protocol alone. A server-side session is the main mechanism used to maintain state, it works as follows
  1. The server writes a cookie to the users browser instance, the cookie contains a token to retrieve the server-side session (data structure)
  2. The cookie is supplied by the browser instance every time it accesses a page on the site
  3. The server reads the token in the cookie to extract the corresponding session.
If a browser does not support cookies, it is possible to use URL rewrite to achieve a similar effect, the URL is decorated with the session ID being used.
Setting up Multiple Instances on one Machine
The cluster consists of three independent Tomcat instances and uses the following, the following is the setup used by all three Session-Sharing methods which i will go into detail later, but first the front-end
  • mod_jk load-balancing frontend
In the real world you would setup each instance on a separate server, but in this case each instance must have the following
  • Its own configuration directory
  • Its own temp directory
  • Its own webapps directory
  • Its own temporary work directory
  • TCP ports that are different from each other (AJP Connector)
  • Optionally other TCP or JDBC resources, depending on the backend session-sharing mechanism being deployed
Three batch files called start1.bat, start2.bat and start3.bat are created and placed in the Tomcat bin directory. Each of the files set the CATALINA_HOME environment variable and then call the startup.bat file, so in each of the startup set the CATALINA_HOME variable to point to each instance
startup file# startup1.bat

set CATALINA_HOME=c:\cluster\machine1
call startup
shutdown file# stop1.bat

set CATALINA_HOME=c:\cluster\machine1
call shutdown
We will use the minimal configuration just to get the cluster up and running, so copy the directory structure below from a clean installation of Tomcat

First you must disable the HTTP connector, then change the AJP Connector and shutdown port for each of the instances
Instance NameFile to ModifyTCP Ports (Shutdown, AJP Connector)
machine 1\cluster\machine1\conf\server.xml8005, 8009
machine 2\cluster\machine2\conf\server.xml8105, 8109
machine 3\cluster\machine3\conf\server.xml8205, 8209
If you still have a problems starting the Tomcat instance due to port binding use the "netstat" command to find what is using the above port or to look for free ports.
Next you need to set a unique jvmRoute for each instance, i have already discussed jvmRoute
jvmRoutejvmRoute="machine1">
Note: the other instances will be machine2 and machine3
To indicate to a Servlet Container that the application can be clustered, a Servlet 2.4 standard element is placed into the applications deployment descriptor (web.xml). If this element is not added, the session maintained by this application across the three Tomcat instances will not be shared. You can also add it to the Context element
distributable
The front-end is setup in the Apache Server, i am not going to go into to much detail as i have already covered load-balancing
workers.properties fileworker.list = loadbal1,stat1

worker.machine1.type = ajp13
worker.machine1.host =192.168.0.1
worker.machine1.port = 8009
worker.machine1.lbfactor = 10

worker.machine2.type = ajp13
worker.machine2.host =192.168.0.2
worker.machine2.port = 8109
worker.machine2.lbfactor = 10

worker.machine3.type = ajp13
worker.machine3.host =192.168.0.3
worker.machine3.port = 8209
worker.machine3.lbfactor = 10

worker.bal1.type = lb
worker.bal1.sticky_session = 1
worker.bal1.balance_workers = machine1, machine2, machine3

worker.stat1.type= status
httpd.conf updatesJkMount /examples/jsp/* bal1
JkMount /jkstatus/ stat1

JkWorkersFile conf/workers.properties
JSP page# Create file sestest.jsp




Session serviced by machine1














Session ID


Created on




Note: create for each instance and change the machine name above
Session-Sharing Backend
The above is the same setup for all three Session-Sharing backends, which I am now going to show you how to set these up.
The first i am going to setup is the In-Memory replication, two components needs to be configured to enable in-memory configuration, the element is responsible for the actual session replication, this includes the sending of new session information to the group, incorporating new incoming session information locally and management of group membership (it uses Apache Tribes). The other component is a replication Valve which is used to reduce the potential session replication traffic by ruling out (filtering) certain requests from session replication.

The only implementation of in-memory replication is called SimpleTcpCluster, it uses Apache Tribes for communication which uses regular multicasts (heartbeat packets) to determine membership. All node that are running must multicasts a heartbeat at regular frequency, if they do not send a heartbeat the node is considered dead and is removed from the cluster. The membership of the cluster is managed dynamically as nodes are added or removed. Session data is replicated between the all nodes in the cluster via TCP by using end-to-end communication.

Because of the amount of network traffic generated, you should only use a small number of nodes within the cluster, unless you have vast amounts of memory and can supply a high bandwidth network. You can reduce the amount of data by using the BackupManager (send only to one node, the backup node) instead of DeltaManager (sends to all nodes within the cluster).
ElementDescription
The cluster element is nested inside an enclosing element, it essentially enables session replication for all applications in the host.
This is a mandatory component, this is where you configure either DeltaManager or BackupManager, they both send replication information to others via Channels from the Apache Tribes group communications library.
A channel is an abstract endpoint, (like a socket) that a member of the group can send and receive replicated information through. Channels are managed and implemented by the Apache Tribes communications framework. Channel has only one attribute.
This attribute selects the physical network interface to use on the server (if you have only one network adapter you generally don't need this). This service is based on sending a multicasts heartbeat regularly, which determines and maintains information on the servers that are considered part of the group (cluster) at any point in time.
This element configures the TCP receiver component of the Apache Tribes Framework, it receives the replicated data information from other members.

This element configures the TCP sender component of the Apache Tribes Framework, it sends the replicated data information to other members.
This element performs the real stuff, tribes support having a pool of senders, so that messages can be sent in parallel and if using NIO sender, you can send messages concurrently as well.
Interceptor components are nested components of and are message processing components that can chained together to alter the behavior or add value to the option of a channel. basically you have a option flag that will trigger its operation.
This element acts as a filter for In-Memory replication, it reduces the actual session replication network traffic by determining if the current session needs to be replicated at the end of the request cycle. Even through this element is inside the element it is consider to be inside the element.
Some of the work of the cluster is performed by hooking up listeners to replication messages that are passing through it.
You must configure org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener if you are using JvmRouteBinderValve to ensure session stickiness transfers with a fail-over

You must also configure org.apache.catalina.ha.session.ClusterSessionListener if using DeltaManager because this listener forwards the messages to the manager for delta and merging operations.
Now to go into details on what options each element can have
Element
Attribute Name
Description
default
className
The implementation Java Class for the cluster manager current uses org.apache.catalina.ha.tcp.SimpleTcpCluster
channelSendOptionsOption flags are included with messages sent and can be used to trigger Apache Tribes channel interceptors. The numerical value is a logical OR flag values including

Channel.SEND_OPTIONS_ASYNCHRONRONUS 8
Channel.SEND_OPTIONS_BYTE_MESSAGE 1
Channel.SEND_OPTIONS_SECURE 16
Channel.SEND_OPTIONS_SYNCHRONIZED_ACK 4
Channel.SEND_OPTIONS_USE_ACK 2
11 (async with ack)
Element (mandatory)
Attribute Name
Description
default
classNameorg.apache.catalina.ha.session.DeltaManager
org.apache.catalina.ha.session.BackupManager
nameA name for the cluster manager, this name should be the same on all instances
notifyListeners-OnReplicationIndicates if any session listeners should be notified when sessions are replicated between instances
false
expireSessions-OnShutdownSpecifies whether it is necessary to expire of all sessions upon application shutdown
false
domainReplicationSpecifies whether replication should be limited to domain members only, this option is only available for DeltaManager
false
mapSendOptionsWhen using the BackupManager, this maps the send options that are set to trigger interceptors
8 (async)
Element
Attribute Name
Description
default
classNameorg.apache.catalina.tribes.group.GroupChannel
Element
Attribute Name
Description
default
classNameorg.apache.catalina.tribes.membership.McastService
addressThe multicast address selected for this instance
228.0.0.4
portThe multicast port used
45564
frequency (milliseconds)Frequency which heartbeat multicasts are sent (in milliseconds)
500
dropTime (milliseconds)The time elapsed without heartbeats before the service considers a member has died and removes it from the group (in milliseconds)
3000
ttlSets the time-to-live for multicast messages sent (may be used if network traffic are going through any routers)
soTimeoutThe SO_TIMEOUT value on the socket that multicasts messages are sent to. Controls the maximum time to wait for a send/receive to complete
0
domainFor partitioning group members into separate domains for replication.
bindThe IP address of the adaptor that the service should bind to.
0.0.0.0
Element
Attribute Name
Description
default
classNameorg.apache.catalina.tribes.transport.nio.NioReceiver
addressThe IP address to bind to, to receive incoming TCP data (you must sent this for multi-homed hosts)
auto
portSelects the port to use for imcoming TCP data.
4000
autoBindTells the framework to hunt for an available port, starting for the specified port number and add up to this number
1000
selectorTimeout (millseconds)Bypass for old NIO bug. sets the milliseconds timeout while polling for incoming messages
5000
maxThreadsThe maximum number of threads to create to receive incoming messages
6
minThreadsThe minimum number of threads to create to receive incoming messages
6
Element (must contain a Transmitter element)
Attribute Name
Description
default
classNameorg.apache.catalina.tribes.transport.ReplicationTransmitter
Element
Attribute Name
Description
default
classNameorg.apache.catalina.tribes.transport.nio.PooledParallelSender
org.apache.catalina.tribes.transport.bio.PooledMultiSender
maxRetryAttemptsThe number of retries the framework conducts when encountering socket-level errors during sending of a message
1
timeoutThe SO_TIMEOUT value on the socket that messages are sent on. Controls the maximum time to wait for a send to complete
3000
poolsizeControls the maximum number of TCP connections opened by the sender between the current and another member in the group. only available when using org.apache.catalina.tribes.transport.nio.PooledParallelSender
25
Element
Attribute Name
Description
org.apache.catalina.tribes.group.interceptors
.TcpFailureDetector
When membership pings do not arrive the interceptor attempts to connect to the problematic member to validate that the member is no longer reachable before the membership list is adjusted
org.apache.catalina.tribes.group.interceptors
.MessageDispatch5Interceptor
The asynchronous message dispatcher, triggered by default send option value 8 (its hard coded)
org.apache.catalina.tribes.group.interceptors
.ThroughputInterceptor
Logs cluster messages throughput information to Tomcat logs
Replication Element
Attribute Name
Description
classNameorg.apache.catalina.ha.tcp.ReplicationValve
filterA semicolon-delimited list of URL pattern for requests that are to be filtered out.
Example
server.xml file




port="4200"
autoBind="100"
selectorTimeout="5000"
maxThreads="6" />
















Note: the port in bold is the one you need to change for each instance
Testing In-Memory Session Replication Cluster
You need to perform the following to test the In-Memory session replication
  1. Ensure that the , are uncommented in all three instances, also make sure that the element is commented out.
  2. Start all three instances
  3. Start the Apache server with the mod_jk module
  4. Open a browser to http:///examples/jsp/sesstest.jsp
  5. Check that its working correctly, then shutdown the instance that had your session
  6. Retest using the browser and hopefully it should pickup the other instance but have the same session ID
I have a catalina log file with cluster logging information, starting up, a node going down and rejoining, this log file is from the example I have given above, although I used different IP address but i am sure you get the idea.
Persistent Session Manager with a shared file store
To setup a persistent session manager you must comment out the element in each instance, this disables the In-Memory replication mechanism, then add a context.xml file to each instance with the below
context.xml


  1. Restart the tomcat instances
  2. Open a browser to http:///examples/jsp/sesstest.jsp
  3. Click a number of times, you should stay with the same instance (sticky session working)
  4. Shutdown an instance
  5. Retest using the browser and hopefully it should pickup the other instance but have the same session ID
Persistent Session Manager with a JDBC store
The only difference between a file-based store and a JDBC-based store is the element
context.xml
Create a table in your databasecreate table tomcat_sessions (
session_id varchar(100) not null primay key,
valid_session char(1) not null,
max_inactive int not null,
last_access bigint not null,
app_context varchar(255),
session_data meduimlob,
KEY kapp_context(app_context)
);
  1. Restart the tomcat instances
  2. Open a browser to http:///examples/jsp/sesstest.jsp
  3. Click a number of times, you should stay with the same instance (sticky session working)
  4. Shutdown an instance
  5. Retest using the browser and hopefully it should pickup the other instance but have the same session ID

No comments:

Post a Comment