Tomcat Clustering Configuration
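Overview
Tomcat clustering replicates sessions across multiple Tomcat instances and provides failover, which gives an application high availability and scalability. This article covers how to configure, manage, and tune a Tomcat cluster.
1. Cluster Architecture Overview
1.1 Cluster Topology
Cluster architecture
           Load Balancer
                 │
       ┌─────────┼─────────┐
       │         │         │
    Tomcat1   Tomcat2   Tomcat3
       │         │         │
       └─────────┼─────────┘
              Cluster
       (Session Replication)
                 │
          Shared Database
1.2 Cluster Components
Cluster components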
├── Cluster Manager
│ ├── SimpleTcpCluster
│ └── DeltaManager
├── Channel Components
│ ├── Membership Service
│ ├── Receiver
│ ├── Sender
│ └── Interceptors
├── Valves
│ ├── ReplicationValve
│ └── JvmRouteBinderValve
└── Deployer
└── FarmWarDeployer
2. Basic Cluster Configuration
2.1 Minimal Cluster Configuration
<!-- server.xml: basic cluster configuration -->
<Engine name="Catalina" defaultHost="localhost" jvmRoute="tomcat1">
  <!-- Simple TCP cluster; channelSendOptions="8" selects asynchronous replication -->
  <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
           channelSendOptions="8">
    <!-- Session manager -->
    <Manager className="org.apache.catalina.ha.session.DeltaManager"
             expireSessionsOnShutdown="false"
             notifyListenersOnReplication="true"/>
    <!-- Channel configuration -->
    <Channel className="org.apache.catalina.tribes.group.GroupChannel">
      <!-- Membership discovery (multicast heartbeat) -->
      <Membership className="org.apache.catalina.tribes.membership.McastService"
                  address="228.0.0.4"
                  port="45564"
                  frequency="500"
                  dropTime="3000"/>
      <!-- Receiver -->
      <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                address="auto"
                port="4000"
                autoBind="100"
                selectorTimeout="5000"
                maxThreads="6"/>
      <!-- Sender -->
      <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
        <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
      </Sender>
      <!-- Interceptors (on Tomcat 7 the dispatch class is MessageDispatch15Interceptor) -->
      <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
      <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor"/>
    </Channel>
    <!-- Cluster valves -->
    <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
           filter=""/>
    <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve"/>
    <!-- Cluster listener (JvmRouteSessionIDBinderListener is only needed on Tomcat 7) -->
    <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
  </Cluster>
  <Host name="localhost" appBase="webapps">
    <!-- Host configuration -->
  </Host>
</Engine>
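The channelSendOptions value is a bit mask, so 8 is not an arbitrary number. As a rough illustration, the sketch below decodes a value into flag names. The flag values are assumed to match the SEND_OPTIONS_* constants in org.apache.catalina.tribes.Channel, and decode_send_options is a hypothetical helper name, not part of Tomcat.

```shell
#!/bin/sh
# Decode a Tribes channelSendOptions bit mask into flag names.
# Assumption: the values mirror the SEND_OPTIONS_* constants in
# org.apache.catalina.tribes.Channel.
decode_send_options() {
    value=$1
    names=""
    for pair in 1:BYTE_MESSAGE 2:USE_ACK 4:SYNCHRONIZED_ACK \
                8:ASYNCHRONOUS 16:SECURE 32:UDP 64:MULTICAST; do
        bit=${pair%%:*}    # number before the colon
        name=${pair#*:}    # name after the colon
        if [ $((value & bit)) -ne 0 ]; then
            names="${names:+$names|}$name"
        fi
    done
    echo "${names:-NONE}"
}

decode_send_options 8    # prints ASYNCHRONOUS
decode_send_options 6    # prints USE_ACK|SYNCHRONIZED_ACK
```

Read this way, 8 trades delivery confirmation for throughput, while 6 makes the sending node wait for an acknowledgement before the request completes.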
2.2 Multi-Node Cluster Configuration
Node 1 (tomcat1) configuration:
<Engine name="Catalina" defaultHost="localhost" jvmRoute="node1">
  <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster">
    <Manager className="org.apache.catalina.ha.session.DeltaManager"/>
    <Channel className="org.apache.catalina.tribes.group.GroupChannel">
      <Membership className="org.apache.catalina.tribes.membership.McastService"
                  address="228.0.0.4"
                  port="45564"/>
      <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                address="192.168.1.10"
                port="4000"/>
      <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
        <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
      </Sender>
    </Channel>
  </Cluster>
</Engine>
Node 2 (tomcat2) configuration; only jvmRoute and the Receiver address differ:
<Engine name="Catalina" defaultHost="localhost" jvmRoute="node2">
  <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster">
    <Manager className="org.apache.catalina.ha.session.DeltaManager"/>
    <Channel className="org.apache.catalina.tribes.group.GroupChannel">
      <Membership className="org.apache.catalina.tribes.membership.McastService"
                  address="228.0.0.4"
                  port="45564"/>
      <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                address="192.168.1.11"
                port="4000"/>
      <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
        <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
      </Sender>
    </Channel>
  </Cluster>
</Engine>
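Each node's jvmRoute is appended to the session IDs that node issues, separated by a dot, which is what lets a sticky load balancer send a request back to the node that owns the session. A minimal sketch of reading the route from an ID follows; route_of_session is a hypothetical helper and the sample ID is made up.

```shell
#!/bin/sh
# Print the jvmRoute suffix of a session ID, or an empty line if there is none.
route_of_session() {
    case "$1" in
        *.*) echo "${1##*.}" ;;   # keep everything after the last dot
        *)   echo "" ;;
    esac
}

route_of_session "A1B2C3D4E5F6.node1"   # prints node1
```

When checking failover by hand, comparing this suffix before and after stopping a node shows which instance is currently serving the session.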
3. Advanced Cluster Configuration
3.1 Static Membership
<!-- Static membership (no multicast).
     channelStartOptions="3" starts the channel without the multicast membership service. -->
<Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
         channelStartOptions="3">
  <Channel className="org.apache.catalina.tribes.group.GroupChannel">
    <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
              address="auto"
              port="4000"/>
    <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
      <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
    </Sender>
    <!-- Failure detection -->
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
    <!-- Static members are declared on an interceptor, not on a Membership element.
         Each uniqueId must be exactly 16 bytes. Depending on the Tomcat version,
         the entry for the local node may need to be omitted on each instance. -->
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.StaticMembershipInterceptor">
      <Member className="org.apache.catalina.tribes.membership.StaticMember"
              port="4000"
              host="192.168.1.10"
              domain="cluster"
              uniqueId="{0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15}"/>
      <Member className="org.apache.catalina.tribes.membership.StaticMember"
              port="4000"
              host="192.168.1.11"
              domain="cluster"
              uniqueId="{0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,16}"/>
      <Member className="org.apache.catalina.tribes.membership.StaticMember"
              port="4000"
              host="192.168.1.12"
              domain="cluster"
              uniqueId="{0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,17}"/>
    </Interceptor>
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor"/>
  </Channel>
</Cluster>
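A StaticMember uniqueId has to be exactly 16 bytes and distinct for every node. Instead of hand-editing the last byte, an ID can be generated; the sketch below is one way to do that, assuming /dev/urandom and od are available. generate_unique_id is a hypothetical helper name. It emits signed values (-128 to 127) on the assumption that Tomcat parses each element as a Java byte.

```shell
#!/bin/sh
# Print a random 16-byte uniqueId in the {b0,b1,...,b15} form used by StaticMember.
generate_unique_id() {
    # od prints 16 signed decimal bytes; tr turns the whitespace into single commas;
    # sed strips the leading and trailing comma left over from the padding.
    bytes=$(od -An -td1 -N16 /dev/urandom | tr -s ' \n' ',,' | sed 's/^,//; s/,$//')
    echo "{$bytes}"
}

generate_unique_id
```

Generate one value per node and paste the same value for that node into the configuration of every other instance, since all members must agree on each other's IDs.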
3.2 Encrypted Communication
<!-- Encrypted cluster communication -->
<Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster">
  <Channel className="org.apache.catalina.tribes.group.GroupChannel">
    <Membership className="org.apache.catalina.tribes.membership.McastService"
                address="228.0.0.4"
                port="45564"/>
    <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
              address="auto"
              port="4000"/>
    <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
      <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
    </Sender>
    <!-- Compression interceptor: listed first so that data is compressed before it is encrypted -->
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.GzipInterceptor"/>
    <!-- Encryption interceptor: must come before TcpFailureDetector.
         encryptionKey is hex-encoded (32 hex characters for AES-128);
         the value below is a placeholder, replace it with your own key. -->
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.EncryptInterceptor"
                 encryptionKey="0123456789abcdef0123456789abcdef"/>
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor"/>
  </Channel>
</Cluster>
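EncryptInterceptor expects encryptionKey to be hex-encoded bytes of a length the cipher accepts (16 bytes, that is 32 hex characters, for AES-128), not a free-form passphrase. One possible way to produce such a key is sketched below, assuming /dev/urandom and od are available; generate_encryption_key is a hypothetical helper name.

```shell
#!/bin/sh
# Print a random 128-bit key as 32 lowercase hex characters.
generate_encryption_key() {
    od -An -tx1 -N16 /dev/urandom | tr -d ' \n'
    echo
}

generate_encryption_key
```

Every node in the cluster needs the same key, so generate it once and distribute it with the rest of the configuration rather than running this on each node.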
4. Session Management Configuration
4.1 Application-Level Session Configuration
<!-- web.xml: enable distributed sessions -->
<web-app>
  <!-- Mark the application as distributable -->
  <distributable/>
  <!-- Session configuration -->
  <session-config>
    <session-timeout>30</session-timeout>
    <cookie-config>
      <name>JSESSIONID</name>
      <path>/</path>
      <http-only>true</http-only>
      <secure>false</secure>
    </cookie-config>
  </session-config>
</web-app>
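Sessions are only replicated for applications whose web.xml carries the distributable marker, so a missing element is a common reason replication silently does nothing. The check below is a rough sketch of a pre-deployment test; is_distributable is a hypothetical helper, and it uses a plain text match rather than an XML parser, so a commented-out element would also match.

```shell
#!/bin/sh
# Return success if the given web.xml contains a <distributable/> element.
is_distributable() {
    grep -Eq '<distributable[[:space:]]*/>' "$1"
}

# Demo against a throwaway file
sample=$(mktemp)
printf '<web-app>\n  <distributable/>\n</web-app>\n' > "$sample"
if is_distributable "$sample"; then
    echo "session replication enabled"
else
    echo "missing <distributable/>: sessions will not be replicated"
fi
rm -f "$sample"
```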
4.2 Session Manager Configuration
<!-- Context-level session manager -->
<Context path="/myapp" docBase="myapp.war">
  <!-- Delta session manager. The session timeout comes from <session-timeout> in web.xml;
       the session ID length is set on the nested SessionIdGenerator. -->
  <Manager className="org.apache.catalina.ha.session.DeltaManager"
           expireSessionsOnShutdown="false"
           notifyListenersOnReplication="true">
    <SessionIdGenerator className="org.apache.catalina.util.StandardSessionIdGenerator"
                        sessionIdLength="16"/>
  </Manager>
  <!-- Alternative: persistent session manager backed by a shared database -->
  <!--
  <Manager className="org.apache.catalina.session.PersistentManager"
           saveOnRestart="true"
           maxActiveSessions="1000">
    <Store className="org.apache.catalina.session.JDBCStore"
           driverName="com.mysql.cj.jdbc.Driver"
           connectionURL="jdbc:mysql://localhost:3306/sessions"
           connectionName="sessionuser"
           connectionPassword="sessionpass"
           sessionTable="tomcat_sessions"
           sessionAppCol="app_name"
           sessionIdCol="session_id"
           sessionDataCol="session_data"
           sessionValidCol="valid"
           sessionMaxInactiveCol="max_inactive"
           sessionLastAccessedCol="last_access"/>
  </Manager>
  -->
</Context>
5. Cluster Monitoring and Management
5.1 Monitoring Cluster State
// ClusterMonitor.java
package com.example.cluster;

import javax.management.MBeanServer;
import javax.management.ObjectName;
import java.lang.management.ManagementFactory;

/**
 * Prints cluster membership and session statistics from Tomcat's JMX MBeans.
 * It reads the platform MBeanServer of the current process, so it must run
 * inside the Tomcat JVM (for example from a servlet or a listener).
 */
public class ClusterMonitor {
    private final MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();

    public void monitorCluster() throws Exception {
        ObjectName clusterName = new ObjectName("Catalina:type=Cluster");
        if (mbs.isRegistered(clusterName)) {
            // Fetch cluster member information. Confirm the attribute name against
            // the Cluster MBean of your Tomcat version (for example in JConsole).
            String[] memberNames = (String[]) mbs.getAttribute(clusterName, "memberNames");
            System.out.println("=== Cluster status ===");
            System.out.println("Cluster members: " + memberNames.length);
            for (String memberName : memberNames) {
                System.out.println("Member: " + memberName);
            }
            // Fetch session replication statistics
            monitorSessionReplication();
        } else {
            System.out.println("Cluster is not configured or not started");
        }
    }

    private void monitorSessionReplication() throws Exception {
        ObjectName managerName = new ObjectName("Catalina:type=Manager,*");
        for (ObjectName name : mbs.queryNames(managerName, null)) {
            String className = (String) mbs.getAttribute(name, "className");
            if (className.contains("DeltaManager")) {
                Integer activeSessions = (Integer) mbs.getAttribute(name, "activeSessions");
                Long sessionCounter = (Long) mbs.getAttribute(name, "sessionCounter");
                // Manager MBeans are keyed by host and context (not by path)
                System.out.println("Session manager: " + name.getKeyProperty("context"));
                System.out.println("  Active sessions: " + activeSessions);
                System.out.println("  Session counter: " + sessionCounter);
            }
        }
    }
}
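5.2 Cluster Health Check
#!/bin/bash
# cluster-health-check.sh
TOMCAT_NODES=(
    "192.168.1.10:8080"
    "192.168.1.11:8080"
    "192.168.1.12:8080"
)

check_cluster_health() {
    echo "=== Cluster health check $(date) ==="
    local healthy_nodes=0
    local total_nodes=${#TOMCAT_NODES[@]}
    for node in "${TOMCAT_NODES[@]}"; do
        echo -n "Checking node $node: "
        # -f makes curl fail on HTTP error statuses; --max-time bounds the whole request
        if curl -sf --max-time 5 "http://$node/health" > /dev/null; then
            echo "healthy"
            healthy_nodes=$((healthy_nodes + 1))
        else
            echo "unhealthy"
        fi
    done
    echo "Healthy nodes: $healthy_nodes/$total_nodes"
    # Return instead of exit so that the "all" mode can still run the session test
    if [ "$healthy_nodes" -eq 0 ]; then
        echo "CRITICAL: all nodes are unavailable"
        return 2
    elif [ $((healthy_nodes * 2)) -lt "$total_nodes" ]; then
        echo "WARNING: fewer than half of the nodes are available"
        return 1
    else
        echo "Cluster status OK"
        return 0
    fi
}

check_session_replication() {
    echo "=== Session replication test ==="
    local cookie_jar session_id
    cookie_jar=$(mktemp)
    # Create a session. The ID is read from the cookie jar because the
    # Set-Cookie header is not part of the response body.
    curl -s -c "$cookie_jar" "http://${TOMCAT_NODES[0]}/session/create" > /dev/null
    session_id=$(awk '$6 == "JSESSIONID" {print $7}' "$cookie_jar")
    rm -f "$cookie_jar"
    if [ -n "$session_id" ]; then
        echo "Session created: $session_id"
        # Check whether the other nodes can see the session. The cookie is passed
        # explicitly because curl does not send a cookie stored for another host.
        for node in "${TOMCAT_NODES[@]:1}"; do
            echo -n "Testing session replication on node $node: "
            if curl -s -b "JSESSIONID=$session_id" "http://$node/session/check" | grep -q "session found"; then
                echo "success"
            else
                echo "failure"
            fi
        done
    else
        echo "Could not create a test session"
    fi
}

case "$1" in
    health)  check_cluster_health ;;
    session) check_session_replication ;;
    all)
        check_cluster_health
        status=$?
        echo
        check_session_replication
        exit "$status"
        ;;
    *) echo "Usage: $0 {health|session|all}" >&2; exit 64 ;;
esac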
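6. Cluster Deployment Automation
6.1 Cluster Deployment Script
#!/bin/bash
# deploy-cluster.sh
CLUSTER_NODES=(
    "192.168.1.10"
    "192.168.1.11"
    "192.168.1.12"
)
TOMCAT_HOME="/opt/tomcat9"
APP_WAR="/tmp/myapp.war"   # default WAR when none is given on the command line
# Manager credentials; override via the environment instead of hard-coding real passwords
MANAGER_AUTH="${MANAGER_AUTH:-admin:admin123}"

deploy_to_cluster() {
    local war_file="$1"
    if [ ! -f "$war_file" ]; then
        echo "WAR file not found: $war_file"
        exit 1
    fi
    local war_name
    war_name=$(basename "$war_file")
    echo "Starting cluster deployment..."
    for node in "${CLUSTER_NODES[@]}"; do
        echo "Deploying to node: $node"
        # Stop the application
        ssh "$node" "curl -s -u '$MANAGER_AUTH' 'http://localhost:8080/manager/text/stop?path=/myapp'"
        # Undeploy the application
        ssh "$node" "curl -s -u '$MANAGER_AUTH' 'http://localhost:8080/manager/text/undeploy?path=/myapp'"
        # Upload the WAR file
        scp "$war_file" "$node:/tmp/"
        # Deploy the application
        ssh "$node" "curl -s -u '$MANAGER_AUTH' -T '/tmp/$war_name' 'http://localhost:8080/manager/text/deploy?path=/myapp'"
        # Check deployment status
        sleep 5
        if ssh "$node" "curl -s http://localhost:8080/myapp/health" | grep -q "OK"; then
            echo "Node $node deployed successfully"
        else
            echo "Node $node deployment failed"
        fi
    done
    echo "Cluster deployment finished"
}

rolling_update() {
    local war_file="$1"
    echo "Starting rolling update..."
    for node in "${CLUSTER_NODES[@]}"; do
        echo "Updating node: $node"
        # Remove the node from the load balancer (placeholder: add the real command here)
        echo "Removing node $node from the load balancer"
        # Wait for in-flight requests to finish
        sleep 30
        # Stop and redeploy
        ssh "$node" "systemctl stop tomcat"
        scp "$war_file" "$node:$TOMCAT_HOME/webapps/"
        ssh "$node" "systemctl start tomcat"
        # Wait for startup to complete
        sleep 60
        # Health check
        if ssh "$node" "curl -s http://localhost:8080/health" | grep -q "OK"; then
            echo "Node $node updated successfully"
            # Add the node back to the load balancer (placeholder)
            echo "Adding node $node back to the load balancer"
        else
            echo "Node $node update failed, rolling back..."
            # Rollback logic goes here. Stop so the remaining nodes keep the old version.
            return 1
        fi
    done
}

case "$1" in
    deploy)  deploy_to_cluster "${2:-$APP_WAR}" ;;
    rolling) rolling_update "${2:-$APP_WAR}" ;;
    *) echo "Usage: $0 {deploy|rolling} <war-file>" ;;
esac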
6.2 Cluster Configuration Generator
#!/bin/bash
# generate-cluster-config.sh
generate_cluster_config() {
    local node_id="$1"
    local node_ip="$2"
    local cluster_port="$3"
    # Unquoted EOF: the shell expands $node_id, $node_ip and $cluster_port inside the template
    cat > "server-$node_id.xml.template" << EOF
<Engine name="Catalina" defaultHost="localhost" jvmRoute="$node_id">
  <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
           channelSendOptions="8">
    <Manager className="org.apache.catalina.ha.session.DeltaManager"
             expireSessionsOnShutdown="false"
             notifyListenersOnReplication="true"/>
    <Channel className="org.apache.catalina.tribes.group.GroupChannel">
      <Membership className="org.apache.catalina.tribes.membership.McastService"
                  address="228.0.0.4"
                  port="45564"
                  frequency="500"
                  dropTime="3000"/>
      <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                address="$node_ip"
                port="$cluster_port"
                autoBind="100"
                selectorTimeout="5000"
                maxThreads="6"/>
      <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
        <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
      </Sender>
      <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
      <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor"/>
    </Channel>
    <Valve className="org.apache.catalina.ha.tcp.ReplicationValve" filter=""/>
    <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve"/>
    <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
  </Cluster>
  <Host name="localhost" appBase="webapps">
    <!-- Host configuration -->
  </Host>
</Engine>
EOF
    echo "Generated configuration file: server-$node_id.xml.template"
}

# Generate the configuration for each node (id:ip:cluster port)
nodes=(
    "node1:192.168.1.10:4000"
    "node2:192.168.1.11:4000"
    "node3:192.168.1.12:4000"
)
for node_config in "${nodes[@]}"; do
    IFS=':' read -r node_id node_ip cluster_port <<< "$node_config"
    generate_cluster_config "$node_id" "$node_ip" "$cluster_port"
done
echo "All cluster configuration files generated"
7. Cluster Troubleshooting
7.1 Diagnosing Common Problems
#!/bin/bash
# cluster-troubleshoot.sh
CLUSTER_NODES=(
    "192.168.1.10"
    "192.168.1.11"
    "192.168.1.12"
)
# catalina.out lives under CATALINA_BASE; fall back to CATALINA_HOME when it is not set
LOG_FILE="${CATALINA_BASE:-$CATALINA_HOME}/logs/catalina.out"

diagnose_cluster_issues() {
    echo "=== Cluster diagnostics ==="
    # 1. Check network connectivity
    echo "1. Checking network connectivity between cluster nodes"
    for node in "${CLUSTER_NODES[@]}"; do
        if ping -c 2 "$node" > /dev/null 2>&1; then
            echo "   ✓ $node is reachable"
        else
            echo "   ✗ $node is unreachable"
        fi
    done
    # 2. Check the cluster port
    echo "2. Checking the cluster communication port"
    for node in "${CLUSTER_NODES[@]}"; do
        if nc -z -w 3 "$node" 4000; then
            echo "   ✓ $node:4000 is open"
        else
            echo "   ✗ $node:4000 is not reachable"
        fi
    done
    # 3. Check multicast
    echo "3. Checking multicast configuration"
    echo "Multicast address: 228.0.0.4:45564"
    # 4. Check the firewall
    echo "4. Checking firewall settings"
    if command -v ufw &> /dev/null; then
        ufw status | grep -E "(4000|45564)"
    elif command -v firewall-cmd &> /dev/null; then
        firewall-cmd --list-ports | grep -E "(4000|45564)"
    fi
}

analyze_cluster_logs() {
    echo "=== Cluster log analysis ==="
    # Show recent cluster-related log entries
    grep -i "cluster\|member\|replication" "$LOG_FILE" | tail -20
    # Look for session replication errors
    grep -i "session.*error\|replication.*fail" "$LOG_FILE" | tail -10
}

case "$1" in
    diagnose) diagnose_cluster_issues ;;
    logs)     analyze_cluster_logs ;;
    all)
        diagnose_cluster_issues
        echo
        analyze_cluster_logs
        ;;
    *) echo "Usage: $0 {diagnose|logs|all}" ;;
esac
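Summary
After working through this article, you should understand:
- The architecture and components of a Tomcat cluster
- Basic and advanced cluster configuration
- Session replication and management
- Cluster monitoring and health checks
- Automated cluster deployment
- Cluster troubleshooting
- Cluster performance tuning techniques
The next article covers Tomcat load balancing configuration.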