Product: xTier™/CLUSTER 2.3
 Depends: os  objpool marshal  log
 Related: cache  grid

Description
The basic feature of the cluster service is the ability of any node to join the cluster at startup and leave it at shutdown. xTier™ has no notion of an administrator node, nor does it require any node to be started before other nodes can start. All nodes in an xTier™ cluster are identical and can be started or stopped concurrently.

Some of the main features of the cluster service are:

  • Getting all cluster nodes - the getAllNodes() method returns a read-only list of all nodes in the cluster, including this node.
  • Cluster filters - select any subset of cluster nodes via a user-provided filter. The xTier™ cluster service comes with the following filters out of the box:
    • ClusterAddressFilter - accepts all nodes for given IP address.
    • ClusterGroupFilter - accepts all nodes that belong to given group.
    • ClusterNodeTypeFilter - selects among the local node, localhost nodes, and remote nodes. For more information about node types, refer to the ClusterNode documentation.
    • ClusterServiceFilter - accepts all nodes that have specified service running.
    Cluster filters are passed into getNodes(ClusterFilter) method. For more information about cluster filters refer to ClusterFilter documentation.
  • Node failure resolving - pluggable resolution of failed nodes. A node is suspected of failure once it has missed a certain number of heartbeats (the loss-threshold parameter in the XML configuration); the configured resolver then verifies whether the node has actually failed. Note that all failed nodes are automatically removed from the cluster. xTier™ comes with the following node failure resolvers out of the box:
    • ClusterBasicFailureResolver - lets the user specify whether a node should always be considered failed or never be considered failed.
    • ClusterTcpNodeFailureResolver - assumes that remote node is not failed if TCP connection to it can be established. This failure resolver should be useful for majority of applications.
    For more information about cluster node failure resolving refer to ClusterNodeFailureResolver documentation.
  • Cluster groups - allow any node in the cluster to specify which cluster groups it belongs to. Cluster groups are defined by the user; xTier™ does not require a node to join any cluster group. The user specifies whether a node becomes a member of a cluster group and what attributes are associated with that membership. Assigning different cluster groups to different sets of nodes lets the user define many virtual clusters within one xTier™ cluster. For more information about cluster group memberships, see the ClusterGroupMembership documentation.
  • Event notifications - the user can subscribe a listener for cluster event notifications and be notified whenever a node joins or leaves the cluster, or whenever a node has crashed.
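
The filter mechanism described above can be pictured as a predicate applied to the list of nodes. The sketch below is not the xTier™ API: the Node class, the NodeFilter interface, and the select helper are hypothetical stand-ins that only illustrate how a group filter or a service filter might narrow down a set of nodes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for illustration; these are NOT the xTier classes.
final class NodeFilterSketch {
    /** Minimal node model: an address plus the groups and services it advertises. */
    static final class Node {
        final String addr;
        final Set<String> groups;
        final Set<String> services;

        Node(String addr, Set<String> groups, Set<String> services) {
            this.addr = addr;
            this.groups = groups;
            this.services = services;
        }
    }

    /** Plays the role that a cluster filter plays in the text above. */
    interface NodeFilter {
        boolean accept(Node node);
    }

    /** Accepts nodes that belong to the given group. */
    static NodeFilter inGroup(String group) {
        return node -> node.groups.contains(group);
    }

    /** Accepts nodes that run the given service. */
    static NodeFilter runsService(String service) {
        return node -> node.services.contains(service);
    }

    /** Returns a read-only list of the nodes accepted by the filter, in input order. */
    static List<Node> select(List<Node> all, NodeFilter filter) {
        List<Node> accepted = new ArrayList<>();
        for (Node node : all) {
            if (filter.accept(node)) {
                accepted.add(node);
            }
        }
        return Collections.unmodifiableList(accepted);
    }

    private static Set<String> setOf(String... items) {
        return new HashSet<>(Arrays.asList(items));
    }

    /** Three made-up nodes used only for demonstration. */
    static List<Node> sampleNodes() {
        return Arrays.asList(
            new Node("192.168.0.1", setOf("cache"), setOf("cache")),
            new Node("192.168.0.2", setOf("cache", "grid"), setOf("cache", "grid")),
            new Node("192.168.0.3", setOf("grid"), setOf("grid")));
    }

    public static void main(String[] args) {
        // Print the addresses of all sample nodes that are members of the 'grid' group.
        for (Node node : select(sampleNodes(), inGroup("grid"))) {
            System.out.println(node.addr);
        }
    }
}
```

Modeling a filter as a single-method interface keeps user-defined filters as easy to plug in as the built-in ones.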


Configuration
The 'cluster' service is configured via the predefined xtier_cluster.xml configuration file. This file follows the standard xTier™ service configuration pattern, which is demonstrated by the following complete example of a cluster configuration:

<region name="examples">
  <!-- Local node settings. -->
  <local-node addr="localhost" port="64001"/>

  <!-- Cluster network settings. -->
  <network timeout="300" mcast-group="233.31.37.41"
    mcast-port="64001" mcast-ttl="0"/>

  <!-- Cluster join configuration properties. -->
  <join retries="2">
    <!-- Optional list of seed nodes. -->
    <seed-nodes>
      <node addr="192.168.0.1" port="64002"/>
      <node addr="192.168.0.2" port="64002"/>
      <node addr="192.168.0.3" port="64002"/>
    </seed-nodes>
  </join>

  <!-- Heartbeat settings for this node. -->
  <heartbeat frequency="3000" loss-threshold="3"/>

  <!--
    Cluster node failure resolver implementation
    used within example.
  -->
  <node-failure-resolver>
    <ioc policy="singleton">
      <java class="com.fitechlabs.xtier.services.cluster.resolvers.ClusterBasicFailureResolver">
        <ctor>
          <!--
            True if behavior is to always fail,
            false if behavior is to never fail.
          -->
          <arg type="boolean">false</arg>
        </ctor>
      </java>
    </ioc>
  </node-failure-resolver>

  <!-- Cluster groups this node belongs to. -->
  <memberships>
    <group name="cache"/>

    <group name="grid">
      <prop name="algorithm" value="bfs"/>
    </group>
  </memberships>
</region>

The formal specification for this service configuration can be found in the xtier_cluster.dtd file in the ${XTIER_ROOT}/config/dtd folder. A detailed description of the cluster configuration parameters follows.

local-node This element specifies the IP address and port for this node. It has two attributes:
  • addr - Optional attribute that specifies the local IP address this node should be bound to. This parameter is useful on multihomed servers. If omitted, the implementation picks the first available IP address.
  • port - TCP port to listen on for cluster messages.
network This element specifies network settings for the cluster. It has the following attributes:
  • timeout - Timeout for waiting for a reply and for establishing or closing connections.
  • mcast-group - IP-multicast group used for cluster communications.
  • mcast-port - IP-multicast port used for cluster communications.
  • mcast-ttl - Time-to-live in router hops for IP-multicast messages.
join This element specifies the parameters used for joining the cluster. When a node tries to join the cluster, it first sends an IP-multicast 'join-cluster' advertisement. If no reply is received, the node attempts to contact each of the specified 'seed-nodes'; if there is still no reply, the node assumes that it is the first one in the cluster. To guarantee that startup completes correctly, make sure to specify all possible cluster members in the list of seed-nodes.

This element has the following attributes and elements:

  • retries - Number of times IP-multicast message will be sent before node will attempt to contact seed nodes.
  • seed-nodes - List of 'seed-nodes' to be contacted. Every seed-node is identified by IP address and port number.
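
The join sequence described for this element can be summarized as a small decision procedure. The sketch below is only an illustration under stated assumptions: JoinSketch, Transport, multicastJoin, and contactSeed are hypothetical names that are not part of xTier™, network I/O is hidden behind an interface so that the control flow stays visible, and 'retries' is assumed to be the total number of multicast advertisements sent.

```java
import java.util.List;

// Hypothetical illustration of the join order; NOT the xTier implementation.
final class JoinSketch {
    enum Outcome { JOINED_VIA_MULTICAST, JOINED_VIA_SEED, FIRST_NODE }

    /** Hypothetical network abstraction standing in for real multicast and TCP I/O. */
    interface Transport {
        /** Sends one 'join-cluster' advertisement; true if any node replied. */
        boolean multicastJoin();

        /** Contacts one seed node; true if it replied. */
        boolean contactSeed(String seed);
    }

    /** Multicast first, then each seed node in order, otherwise assume an empty cluster. */
    static Outcome join(Transport net, int retries, List<String> seeds) {
        for (int attempt = 0; attempt < retries; attempt++) {
            if (net.multicastJoin()) {
                return Outcome.JOINED_VIA_MULTICAST;
            }
        }
        for (String seed : seeds) {
            if (net.contactSeed(seed)) {
                return Outcome.JOINED_VIA_SEED;
            }
        }
        return Outcome.FIRST_NODE;
    }

    /** Test double: multicast answers as given, and only 'liveSeed' (may be null) replies. */
    static Transport stub(final boolean multicastReplies, final String liveSeed) {
        return new Transport() {
            @Override
            public boolean multicastJoin() {
                return multicastReplies;
            }

            @Override
            public boolean contactSeed(String seed) {
                return seed.equals(liveSeed);
            }
        };
    }
}
```

The last branch shows why a complete seed list matters: if multicast is blocked and the only running member is missing from the list, this procedure would conclude that the node is alone.
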
heartbeat This element specifies heartbeat settings for this node. A heartbeat is a small datagram message sent to all other nodes via IP-multicast. Every node emits a heartbeat message at a predefined time interval. This element has the following attributes:
  • frequency - Time interval between two consecutive heartbeat messages.
  • loss-threshold - Number of missed heartbeats after which a node is suspected of failure and is passed to the ClusterNodeFailureResolver#isFailed(ClusterNode) method to verify its failure status.
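
To make the interplay of frequency and loss-threshold concrete: with the example values of 3000 and 3, a node would become suspect after 9000 time units of silence (the text does not state the unit). The sketch below shows one way such bookkeeping might look. HeartbeatTracker is a hypothetical class, not part of xTier™, and time is passed in explicitly so that the arithmetic is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical bookkeeping for missed heartbeats; NOT the xTier implementation.
final class HeartbeatTracker {
    private final long frequency;   // interval between heartbeats, same unit as 'now'
    private final int lossThreshold;
    private final Map<String, Long> lastSeen = new HashMap<>();

    HeartbeatTracker(long frequency, int lossThreshold) {
        if (frequency <= 0 || lossThreshold <= 0) {
            throw new IllegalArgumentException("frequency and lossThreshold must be positive");
        }
        this.frequency = frequency;
        this.lossThreshold = lossThreshold;
    }

    /** Records that a heartbeat from the node arrived at time 'now'. */
    void onHeartbeat(String nodeId, long now) {
        lastSeen.put(nodeId, now);
    }

    /** Number of whole heartbeat intervals that have elapsed since the last heartbeat. */
    long missed(String nodeId, long now) {
        Long last = lastSeen.get(nodeId);
        if (last == null) {
            return 0; // never heard from this node, so there is nothing to miss yet
        }
        return (now - last) / frequency;
    }

    /** True once the node has missed at least 'lossThreshold' heartbeats. */
    boolean isSuspect(String nodeId, long now) {
        return missed(nodeId, now) >= lossThreshold;
    }
}
```

A suspect node is not necessarily a failed one; in the scheme described above, the suspect would next be handed to the configured failure resolver.
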
node-failure-resolver This element defines IoC object that represents cluster node failure resolver for cluster service. See IocService for more details on IoC usage. See ClusterNodeFailureResolver documentation for more information about failure resolving within cluster service.
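
The idea behind a TCP-based resolver, as described for ClusterTcpNodeFailureResolver in the feature list, is to treat a suspected node as alive if a TCP connection to it can be established. The sketch below illustrates that check with the standard java.net classes only. TcpProbe and isReachable are hypothetical names; the class does not implement the xTier™ ClusterNodeFailureResolver interface and makes no claim about how the shipped resolver works.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical reachability probe; NOT the xTier ClusterTcpNodeFailureResolver.
final class TcpProbe {
    /**
     * Returns true if a TCP connection to host:port can be established within
     * the timeout, false if the connection is refused, times out, or fails
     * with any other I/O error.
     */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Address and port are taken from the seed-node example; adjust as needed.
        System.out.println(isReachable("192.168.0.1", 64002, 300));
    }
}
```

A resolver built on such a probe would report a node as failed only when the probe returns false, which guards against removing nodes whose heartbeats were merely lost.
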
memberships This optional element defines the cluster groups, if any, that this node belongs to. It is composed of one or more <group> elements. Every <group> element can optionally have any number of properties associated with it (<prop> elements in the example above). A property has two attributes: name and value. Note that since properties are defined in an XML file, they can only be of type String. For more information about cluster group memberships, refer to the ClusterGroupMembership documentation.


Unix & IP-multicast
The xTier™ cluster service requires the IP-multicast protocol. Although IP-multicast usually works out of the box on Windows, Linux and UNIX systems may need to be configured before it operates properly. Most modern Linux and UNIX systems ship with IP-multicast enabled; older systems, however, may require a kernel reconfiguration to enable it before it can be configured. In most cases you need to add specific routing information to the local box (kernel) and instruct the kernel to forward IP-multicast packets (Linux example):

$ route add -net 224.0.0.0 netmask 240.0.0.0 dev eth0
$ echo 1 > /proc/sys/net/ipv4/ip_forward

Note that you can also ping 224.0.0.1, the all-hosts multicast address; all network nodes that have IP-multicast enabled should answer. Also make sure that the specified IP-multicast group and port are allowed by the local and router firewall settings.
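
The route shown above covers 224.0.0.0 with netmask 240.0.0.0, which is the whole IPv4 multicast range. Before looking into routing or firewall problems, it can be worth confirming that the configured mcast-group actually lies in that range. The sketch below does this with the standard java.net.InetAddress class; McastCheck and isMulticastGroup are illustrative names, not part of xTier™.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Illustrative helper; NOT part of xTier.
final class McastCheck {
    /**
     * Returns true if 'addr' is an IP-multicast address. For a literal IP
     * address only the format is checked; a host name would be resolved first.
     */
    static boolean isMulticastGroup(String addr) {
        try {
            return InetAddress.getByName(addr).isMulticastAddress();
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The group used in the example configuration.
        System.out.println(isMulticastGroup("233.31.37.41"));
    }
}
```

This only validates the address itself; whether multicast packets are actually delivered still depends on the routing and firewall settings discussed above.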


Linux & Failures
Some versions of Linux may continue to operate as if there were no error after a network failure. If Linux does not recognize the network failure, the nodes in the cluster are simply unable to contact other nodes and operate as if they were alone in the cluster. Such a scenario may occur if the router gets unplugged, for example. Note that even after the network problem is fixed (e.g. by recycling the router), Linux may still be unable to connect to neighboring nodes. To ensure that the network is functioning properly on Linux, the user may need to run the following commands to recycle the network interface (the commands below are the Linux equivalent of 'ipconfig /release' and 'ipconfig /renew' on Windows):

$ ifdown eth0
$ ifup eth0

Note that even after the command sequence above is executed, the cluster may still need to be recycled, since some nodes may start receiving heartbeats from nodes that they were not aware of. Make sure to implement the ClusterErrorListener interface in order to be notified programmatically when cluster nodes start receiving heartbeats from nodes that are not members of the cluster.


Examples
Usage of 'cluster' service follows the standard pattern of using xTier™ service: you need to obtain an instance of xTier™ kernel that serves as a service registry. Once you have xTier™ kernel you can get an instance of any service, in our case the cluster service. Once the service instance is obtained the service API can be used.

Note that usage of 'cluster' service depends on 'object pool' and 'marshal' services. See ObjectPoolService and MarshalService services for details on their usage.

The following code snippet is taken from the cluster service example:

// Get the instance of xTier kernel.
XtierKernel xtier = XtierKernel.getInstance();

// Get the instance of 'cluster' service.
ClusterService cluster = xtier.cluster();

// Get local node.
ClusterNode localNode = cluster.getLocalNode();

// Get remote nodes ('false' for local node, 'false'
// for localhost nodes, 'true' for remote nodes).
Set remoteNodes = cluster.getNodes(
    new ClusterNodeTypeFilter(false, false, true));

// Get group memberships for the local node.
Map grpMshps = localNode.getGroupMemberships();

// Lock cluster topology. Note that this method
// doesn't perform any distributed operations
// and can be called frequently. Cluster topology will
// not change until 'unlockTopology()' method is called.
cluster.lockTopology();

try {
    // Get version of cluster topology while the lock is held.
    int version = cluster.getTopologyVersion();

    // Sleep for a short period of time.
    Thread.sleep(100);

    // Assert that cluster topology did not change because
    // we are holding the topology lock.
    assert version == cluster.getTopologyVersion();
}
finally {
    // Unlock cluster topology even if the code above throws.
    cluster.unlockTopology();
}

// Stop the xTier kernel.
xtier.stop();

 Download xTier™ for full examples and documentation.
