Description
The basic feature of the cluster service is the ability for any node to join or leave the cluster at startup
or shutdown. xTier has no notion of an administrator node, nor does it require any particular node to be
started before other nodes can be started. All nodes in an xTier cluster are identical and
can be started or stopped concurrently.
Some of the main features of the cluster service are:
- Getting all cluster nodes - the getAllNodes() method returns a read-only list of all nodes
  in the cluster, including this node.
- Cluster filters - select any subset of cluster nodes via a user-provided filter.
  The xTier cluster service comes with the following filters out of the box:
  - ClusterAddressFilter - accepts all nodes for a given IP address.
  - ClusterGroupFilter - accepts all nodes that belong to a given group.
  - ClusterNodeTypeFilter - selects between the local node, localhost nodes, and remote nodes.
    For more information about types of nodes refer to ClusterNode documentation.
  - ClusterServiceFilter - accepts all nodes that have a specified service running.
  Cluster filters are passed into the getNodes(ClusterFilter) method. For more information about cluster
  filters refer to ClusterFilter documentation.
- Node failure resolving - pluggable resolving for failed nodes. A node is suspected of failure
  whenever it has missed a certain number of heartbeats (loss-threshold parameter in the XML
  configuration); the configured resolver then decides whether the node has actually failed.
  Note that all failed nodes are automatically removed from the cluster. xTier comes with the
  following node failure resolvers out of the box:
  - ClusterBasicFailureResolver - allows you to specify whether a node should always be
    considered failed, or whether a node should never be considered failed.
  - ClusterTcpNodeFailureResolver - assumes that a remote node has not failed if a TCP
    connection to it can be established. This failure resolver should be useful for the
    majority of applications.
  For more information about cluster node failure resolving refer to ClusterNodeFailureResolver
  documentation.
- Cluster groups - any node in the cluster can specify which cluster groups it belongs to.
  Cluster groups are defined by the user; xTier does not require that a node join any cluster group.
  The user specifies whether a node should become a member of a cluster group and what attributes are
  associated with this membership. Specifying different cluster groups for different sets of nodes
  enables the user to define many virtual clusters within an xTier cluster. For more information about
  cluster group memberships see ClusterGroupMembership documentation.
- Event notifications - the user can subscribe a listener for cluster event notifications and be
  notified whenever a node joins or leaves the cluster, or whenever a node has crashed.
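The filter mechanism described above amounts to selecting a subset of nodes with a user-supplied predicate. The following is a minimal, self-contained sketch of that idea. The names Node, NodeFilter, select, byAddress, and byGroup are hypothetical stand-ins written for illustration; they are not the xTier ClusterNode, ClusterFilter, or getNodes(ClusterFilter) types, whose exact signatures are given in their own documentation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for illustration only; not the xTier API.
final class FilterSketch {
    /** Minimal node description: an address plus the groups the node belongs to. */
    static final class Node {
        final String addr;
        final Set<String> groups;

        Node(String addr, String... groups) {
            this.addr = addr;
            this.groups = new HashSet<String>(Arrays.asList(groups));
        }
    }

    /** A filter accepts or rejects a single node. */
    interface NodeFilter {
        boolean accept(Node node);
    }

    /** Returns a read-only list of the nodes accepted by the filter. */
    static List<Node> select(List<Node> all, NodeFilter filter) {
        List<Node> accepted = new ArrayList<Node>();
        for (Node node : all) {
            if (filter.accept(node)) {
                accepted.add(node);
            }
        }
        return Collections.unmodifiableList(accepted);
    }

    /** In the spirit of an address filter: accepts nodes on one IP address. */
    static NodeFilter byAddress(final String addr) {
        return new NodeFilter() {
            public boolean accept(Node node) {
                return node.addr.equals(addr);
            }
        };
    }

    /** In the spirit of a group filter: accepts members of one group. */
    static NodeFilter byGroup(final String group) {
        return new NodeFilter() {
            public boolean accept(Node node) {
                return node.groups.contains(group);
            }
        };
    }

    public static void main(String[] args) {
        List<Node> all = Arrays.asList(
            new Node("192.168.0.1", "cache"),
            new Node("192.168.0.2", "cache", "grid"),
            new Node("192.168.0.3"));

        System.out.println("cache members: " + select(all, byGroup("cache")).size());
        System.out.println("on 192.168.0.3: " + select(all, byAddress("192.168.0.3")).size());
    }
}
```

Selecting by group in this way is also how a set of nodes can be treated as a virtual cluster within the larger cluster.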
Configuration
The 'cluster' service is configured via the predefined xtier_cluster.xml configuration file. This file
follows the standard xTier service configuration pattern, demonstrated by the following complete example
of a cluster configuration:
    <region name="examples">
        <!-- IP address and TCP port of the local node. -->
        <local-node addr="localhost" port="64001"/>

        <!-- Network timeout and IP-multicast settings. -->
        <network timeout="300" mcast-group="233.31.37.41"
            mcast-port="64001" mcast-ttl="0"/>

        <!-- Number of multicast join attempts before seed nodes are contacted. -->
        <join retries="2">
            <!-- Seed nodes contacted if the multicast advertisement gets no reply. -->
            <seed-nodes>
                <node addr="192.168.0.1" port="64002"/>
                <node addr="192.168.0.2" port="64002"/>
                <node addr="192.168.0.3" port="64002"/>
            </seed-nodes>
        </join>

        <!-- Heartbeat interval and number of missed heartbeats before a node is suspected. -->
        <heartbeat frequency="3000" loss-threshold="3"/>

        <!-- IoC definition of the node failure resolver. -->
        <node-failure-resolver>
            <ioc policy="singleton">
                <java class="com.fitechlabs.xtier.services.cluster.resolvers.ClusterBasicFailureResolver">
                    <ctor>
                        <!-- Selects whether suspected nodes are always or never considered failed;
                             see ClusterBasicFailureResolver documentation. -->
                        <arg type="boolean">false</arg>
                    </ctor>
                </java>
            </ioc>
        </node-failure-resolver>

        <!-- Cluster groups this node belongs to. -->
        <memberships>
            <group name="cache"/>

            <group name="grid">
                <prop name="algorithm" value="bfs"/>
            </group>
        </memberships>
    </region>
The formal specification for this service configuration can be found in the xtier_cluster.dtd file in
the ${XTIER_ROOT}/config/dtd folder. The following is a detailed description of the cluster configuration
parameters.
local-node
This element specifies the IP address and port for this node. It has two attributes:
- addr - Optional attribute that specifies the local IP address this node should be bound to.
  This parameter is useful on multihomed servers. If omitted, the implementation will pick the first
  available IP address.
- port - TCP port on which to listen for cluster messages.
network
This element specifies network settings for the cluster. It has the following attributes:
- timeout - Timeout to wait for a reply, or for establishing and closing connections.
- mcast-group - IP-multicast group used for cluster communications.
- mcast-port - IP-multicast port used for cluster communications.
- mcast-ttl - Time-to-live, in router hops, for IP-multicast messages.
join
This element specifies the parameters used for joining the cluster. When a node tries to
join the cluster, it first sends an IP-multicast 'join-cluster' advertisement.
If no reply is received, the node attempts to contact each of the specified 'seed-nodes',
and if there is still no reply, the node assumes that it is the first one in the
cluster. To guarantee correct startup, make sure to specify all
possible cluster members in the list of seed-nodes.
This element has the following attributes and elements:
- retries - Number of times the IP-multicast message is sent before the node
  attempts to contact the seed nodes.
- seed-nodes - List of seed nodes to be contacted. Every seed node is identified
  by IP address and port number.
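The join sequence described above (multicast advertisement, then seed nodes, then assuming to be first) can be sketched as follows. This is an illustration under stated assumptions, not the xTier implementation: Transport, Seed, Result, join, and scripted are hypothetical names, and the actual network calls are replaced by a scripted stand-in so that the control flow can be run on its own.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the join sequence; not the xTier implementation.
final class JoinSketch {
    /** How the node ended up in the cluster. */
    enum Result { JOINED_VIA_MULTICAST, JOINED_VIA_SEED, FIRST_NODE }

    /** Hypothetical network operations; each returns true if a reply arrived. */
    interface Transport {
        boolean sendMulticastJoin();
        boolean contactSeed(String addr, int port);
    }

    /** A seed node, identified by IP address and port. */
    static final class Seed {
        final String addr;
        final int port;

        Seed(String addr, int port) {
            this.addr = addr;
            this.port = port;
        }
    }

    static Result join(Transport net, int retries, List<Seed> seeds) {
        // 1. Send the multicast 'join-cluster' advertisement up to 'retries' times.
        for (int i = 0; i < retries; i++) {
            if (net.sendMulticastJoin()) {
                return Result.JOINED_VIA_MULTICAST;
            }
        }
        // 2. No reply: contact each configured seed node in turn.
        for (Seed seed : seeds) {
            if (net.contactSeed(seed.addr, seed.port)) {
                return Result.JOINED_VIA_SEED;
            }
        }
        // 3. Still no reply: assume this node is the first one in the cluster.
        return Result.FIRST_NODE;
    }

    /** Scripted transport for demonstration: fixed multicast outcome, one reachable seed (or null). */
    static Transport scripted(final boolean multicastReply, final String reachableSeed) {
        return new Transport() {
            public boolean sendMulticastJoin() {
                return multicastReply;
            }

            public boolean contactSeed(String addr, int port) {
                return addr.equals(reachableSeed);
            }
        };
    }

    public static void main(String[] args) {
        List<Seed> seeds = Arrays.asList(
            new Seed("192.168.0.1", 64002),
            new Seed("192.168.0.2", 64002),
            new Seed("192.168.0.3", 64002));

        System.out.println(join(scripted(true, null), 2, seeds));
        System.out.println(join(scripted(false, "192.168.0.2"), 2, seeds));
        System.out.println(join(scripted(false, null), 2, seeds));
    }
}
```

The last case shows why a complete seed list matters: a node that reaches neither multicast peers nor any seed concludes that it is alone, even if other nodes are in fact running.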
heartbeat
This element specifies heartbeat settings for this node. A heartbeat is a small datagram message
sent to all other nodes via IP-multicast. Every node emits a heartbeat message at a
predefined time interval. This element has the following attributes:
- frequency - Time interval between two consecutive heartbeat messages.
- loss-threshold - Number of missed heartbeats after which a node is suspected of failure
  and passed into the ClusterNodeFailureResolver#isFailed(ClusterNode) method to
  verify its failure status.
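To make the relationship between frequency and loss-threshold concrete, the sketch below counts missed heartbeats from the time of the last one received. It rests on two assumptions that the text above does not state: that frequency is given in milliseconds, and that a heartbeat counts as missed once a full interval has passed without one. Under those assumptions the example values (frequency 3000, loss-threshold 3) mean a node becomes suspected after roughly 9000 ms of silence. HeartbeatSketch and its methods are hypothetical names, not part of xTier.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of missed-heartbeat counting; not the xTier implementation.
final class HeartbeatSketch {
    private final long frequencyMs;   // assumed unit: milliseconds
    private final int lossThreshold;
    private final Map<String, Long> lastSeen = new HashMap<String, Long>();

    HeartbeatSketch(long frequencyMs, int lossThreshold) {
        this.frequencyMs = frequencyMs;
        this.lossThreshold = lossThreshold;
    }

    /** Records that a heartbeat from the node arrived at the given time. */
    void onHeartbeat(String nodeId, long nowMs) {
        lastSeen.put(nodeId, nowMs);
    }

    /** Number of whole heartbeat intervals elapsed since the node was last heard. */
    int missed(String nodeId, long nowMs) {
        Long last = lastSeen.get(nodeId);
        if (last == null) {
            return 0; // never heard from this node, so nothing to count yet
        }
        return (int) ((nowMs - last) / frequencyMs);
    }

    /** True once the node has missed at least lossThreshold heartbeats. */
    boolean isSuspected(String nodeId, long nowMs) {
        return missed(nodeId, nowMs) >= lossThreshold;
    }

    public static void main(String[] args) {
        HeartbeatSketch tracker = new HeartbeatSketch(3000, 3);
        tracker.onHeartbeat("node-a", 0);

        System.out.println(tracker.isSuspected("node-a", 8999));
        System.out.println(tracker.isSuspected("node-a", 9000));
    }
}
```

A suspected node is not necessarily a failed one; in the scheme described here the final decision is left to the configured failure resolver.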
node-failure-resolver
This element defines the IoC object that represents the cluster node failure resolver for the cluster service.
See IocService for more details on IoC usage.
See ClusterNodeFailureResolver documentation for more information about failure resolving
within the cluster service.
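The idea behind a TCP-based failure check, treating a node as alive if a TCP connection to it can be established, can be illustrated with plain java.net sockets. This is a sketch of the general technique only: TcpProbeSketch and isReachable are hypothetical names, and how ClusterTcpNodeFailureResolver performs its check internally is not described here.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical sketch of a TCP connectivity probe; not the xTier resolver.
final class TcpProbeSketch {
    /** Returns true if a TCP connection to host:port can be established within timeoutMs. */
    static boolean isReachable(String host, int port, int timeoutMs) {
        Socket socket = new Socket();
        try {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unreachable: treat the node as not reachable.
            return false;
        } finally {
            try {
                socket.close();
            } catch (IOException ignored) {
                // Nothing useful can be done if close fails.
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Open a listener on a free loopback port to stand in for a live node.
        ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
        int port = server.getLocalPort();

        System.out.println("listening: " + isReachable("127.0.0.1", port, 300));
        server.close();
        System.out.println("closed:    " + isReachable("127.0.0.1", port, 300));
    }
}
```

A probe like this distinguishes a node that merely lost a few multicast heartbeats from one whose process or host is actually gone, which is why a TCP-based resolver suits most applications.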
memberships
This optional element defines the cluster groups, if any, this node belongs to. It is composed of one or more
<group> elements. Every <group> element can optionally have any
number of properties (<property> element) associated with it. A property has
two attributes: name and value. Note that since properties are defined in an XML
file, they can only be of type String. For more information about cluster group memberships
refer to ClusterGroupMembership documentation.
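Because group properties are always strings, code that needs a number has to convert it itself. The helper below is a minimal sketch of such a conversion with a fallback value. It assumes the properties of a group are available as a String-to-String map, which is an assumption made for illustration; the actual accessor is described in the ClusterGroupMembership documentation. GroupPropsSketch, intProp, and the max-depth property are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; assumes group properties are available as a String-to-String map.
final class GroupPropsSketch {
    /** Returns the property parsed as an int, or dflt if it is missing or not a number. */
    static int intProp(Map<String, String> props, String name, int dflt) {
        String raw = props.get(name);
        if (raw == null) {
            return dflt;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return dflt;
        }
    }

    public static void main(String[] args) {
        Map<String, String> grid = new HashMap<String, String>();
        grid.put("algorithm", "bfs");  // property from the configuration example
        grid.put("max-depth", "8");    // hypothetical property, for illustration only

        System.out.println(grid.get("algorithm"));
        System.out.println(intProp(grid, "max-depth", 1));
        System.out.println(intProp(grid, "missing", 1));
    }
}
```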
Unix & IP-multicast
The xTier cluster service requires IP-multicast. Although IP-multicast usually works out of the box on
Windows, Linux and UNIX systems may need to be configured before IP-multicast can operate
properly. Most modern Linux and UNIX systems ship with IP-multicast enabled; however,
older systems may require kernel reconfiguration to enable IP-multicast before configuring it. In most
cases you need to add specific routing information to the local box (kernel) and instruct the kernel to
forward IP-multicast packets (Linux example):
$ route add -net 224.0.0.0 netmask 240.0.0.0 dev eth0
$ echo 1 > /proc/sys/net/ipv4/ip_forward
Note that you can also ping the all-hosts multicast address 224.0.0.1; network nodes that have IP-multicast
enabled should answer. Also make sure that the specified IP-multicast group and port are allowed by local
and router firewall settings.
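The route in the Linux example (network 224.0.0.0 with netmask 240.0.0.0) covers the whole IPv4 multicast range, 224.0.0.0 through 239.255.255.255. The following sketch checks whether a configured multicast group, such as 233.31.37.41 from the configuration example, falls inside that route by applying the netmask. McastRangeSketch, toInt, and inNetwork are hypothetical helper names written for this illustration.

```java
// Hypothetical helper for checking an address against a network and netmask.
final class McastRangeSketch {
    /** Converts a dotted-quad IPv4 literal such as "224.0.0.0" into a 32-bit value. */
    static int toInt(String ipv4) {
        String[] parts = ipv4.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not an IPv4 literal: " + ipv4);
        }
        int value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("octet out of range: " + ipv4);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    /** True if addr lies within the network described by net and mask. */
    static boolean inNetwork(String addr, String net, String mask) {
        int m = toInt(mask);
        return (toInt(addr) & m) == (toInt(net) & m);
    }

    public static void main(String[] args) {
        // Multicast group from the configuration example.
        System.out.println(inNetwork("233.31.37.41", "224.0.0.0", "240.0.0.0"));
        // An ordinary unicast address, for comparison.
        System.out.println(inNetwork("192.168.0.1", "224.0.0.0", "240.0.0.0"));
    }
}
```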
Linux & Failures
Some versions of Linux may continue to operate as if there were no error when the network fails. If Linux is
not able to recognize the network failure, the nodes in the cluster are simply unable to contact other nodes
and operate as if they were alone in the cluster. Such a scenario may occur if the router gets unplugged, for
example. Note that after fixing the network problem (e.g. recycling the router), Linux may still not be able to
connect to neighboring nodes. To ensure that the network is functioning properly on Linux, the user may need to
run the following commands to recycle the network (the commands below are the Linux equivalent of
'ipconfig /release' and 'ipconfig /renew' on Windows):
$ ifdown eth0
$ ifup eth0
Note that even after the command sequence above is executed, the cluster may still need to be recycled, since
some nodes may start receiving heartbeats from nodes that they were not aware of. Make sure to implement the
ClusterErrorListener interface in order to be programmatically notified when cluster nodes start
getting heartbeats from nodes that are not members of the cluster.
Examples
Usage of the 'cluster' service follows the standard pattern for xTier services: first obtain
an instance of the xTier kernel, which serves as a service registry. Once you have the xTier kernel, you can get
an instance of any service, in our case the cluster service. Once the service instance is obtained,
the service API can be used.
Note that the 'cluster' service depends on the 'object pool' and 'marshal' services.
See ObjectPoolService and MarshalService for details on their usage.
The following code snippet is taken from the cluster service example:
    // Get the xTier kernel instance (the service registry).
    XtierKernel xtier = XtierKernel.getInstance();

    // Get the cluster service.
    ClusterService cluster = xtier.cluster();

    // Get the local node.
    ClusterNode localNode = cluster.getLocalNode();

    // Get remote nodes only. The three flags appear to correspond to
    // the local node, localhost nodes, and remote nodes.
    Set remoteNodes = cluster.getNodes(
        new ClusterNodeTypeFilter(false, false, true));

    // Get the group memberships of the local node.
    Map grpMshps = localNode.getGroupMemberships();

    // Lock the topology; as the assertion below suggests, the topology
    // version is expected to stay the same until it is unlocked.
    cluster.lockTopology();

    // Remember the current topology version.
    int version = cluster.getTopologyVersion();

    // Wait for a while.
    Thread.sleep(100);

    // The topology version should not have changed while locked.
    assert version == cluster.getTopologyVersion();

    // Unlock the topology.
    cluster.unlockTopology();

    // Stop the xTier kernel.
    xtier.stop();
Download xTier for full examples and documentation.