Tomcat Cluster Configuration

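Overview

Tomcat clustering replicates sessions across multiple Tomcat instances and supports failover, which improves availability and scalability. This article covers cluster configuration, session management, monitoring, deployment automation, and troubleshooting.

1. Cluster Architecture Overview

1.1 Cluster Topology

Cluster architecture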
      Load Balancer
           │
    ┌──────┼──────┐
    │      │      │
 Tomcat1 Tomcat2 Tomcat3
    │      │      │
    └──────┼──────┘
        Cluster
    (Session Replication)
           │
    Shared Database

1.2 Cluster Components

Cluster components
├── Cluster and Session Manager
│   ├── SimpleTcpCluster
│   └── DeltaManager
├── Channel Components
│   ├── Membership Service
│   ├── Receiver
│   ├── Sender
│   └── Interceptors
├── Valves
│   ├── ReplicationValve
│   └── JvmRouteBinderValve
└── Deployer
    └── FarmWarDeployer

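2. Basic Cluster Configuration

2.1 Minimal Cluster Configuration

<!-- server.xml - basic cluster configuration -->
<Engine name="Catalina" defaultHost="localhost" jvmRoute="tomcat1">

  <!-- SimpleTcpCluster; channelSendOptions="8" selects asynchronous replication -->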
  <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
           channelSendOptions="8">

    <!-- Session manager -->
    <Manager className="org.apache.catalina.ha.session.DeltaManager"
             expireSessionsOnShutdown="false"
             notifyListenersOnReplication="true"/>

    <!-- Channel configuration -->
    <Channel className="org.apache.catalina.tribes.group.GroupChannel">

      <!-- Membership discovery (multicast heartbeat) -->
      <Membership className="org.apache.catalina.tribes.membership.McastService"
                  address="228.0.0.4"
                  port="45564"
                  frequency="500"
                  dropTime="3000"/>

      <!-- Receiver -->
      <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                address="auto"
                port="4000"
                autoBind="100"
                selectorTimeout="5000"
                maxThreads="6"/>

      <!-- Sender -->
      <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
        <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
      </Sender>

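      <!-- Interceptors (MessageDispatch15Interceptor was renamed to MessageDispatchInterceptor; Tomcat 9 ships only the new name) -->
      <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
      <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor"/>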
    </Channel>

    <!-- Cluster valves -->
    <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
           filter=""/>
    <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve"/>

    <!-- Cluster listener (JvmRouteSessionIDBinderListener is not present in Tomcat 9) -->
    <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
  </Cluster>

  <Host name="localhost" appBase="webapps">
    <!-- Host configuration -->
  </Host>
</Engine>
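
The channelSendOptions attribute is a bitmask of Tribes send flags. As a minimal sketch, the helper below decodes the three flags most relevant to replication (use ack = 2, synchronized ack = 4, asynchronous = 8, matching the Channel.SEND_OPTIONS_* constants). The function name describe_send_options is hypothetical and exists only for illustration.

```shell
#!/bin/sh
# Sketch: decode a channelSendOptions bitmask into the send flags it enables.
# describe_send_options is a hypothetical helper, not part of Tomcat.

describe_send_options() {
    opts="$1"
    out=""
    if [ $((opts & 2)) -ne 0 ]; then out="$out use_ack"; fi
    if [ $((opts & 4)) -ne 0 ]; then out="$out synchronized_ack"; fi
    if [ $((opts & 8)) -ne 0 ]; then out="$out asynchronous"; fi
    # Strip the leading space; print "none" when no known flag is set.
    out="${out# }"
    echo "${out:-none}"
}

describe_send_options 8   # prints: asynchronous
describe_send_options 6   # prints: use_ack synchronized_ack
```

With the value 8 used above, the request thread does not wait for replication to finish, so a failover immediately after a write may see slightly stale session data. A value of 6 trades latency for stronger delivery guarantees.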

2.2 Multi-Node Cluster Configuration

Node 1 (tomcat1) configuration:

<Engine name="Catalina" defaultHost="localhost" jvmRoute="node1">
  <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster">
    <Manager className="org.apache.catalina.ha.session.DeltaManager"/>

    <Channel className="org.apache.catalina.tribes.group.GroupChannel">
      <Membership className="org.apache.catalina.tribes.membership.McastService"
                  address="228.0.0.4"
                  port="45564"/>
      <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                address="192.168.1.10"
                port="4000"/>
      <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
        <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
      </Sender>
    </Channel>
  </Cluster>
</Engine>

Node 2 (tomcat2) configuration:

<Engine name="Catalina" defaultHost="localhost" jvmRoute="node2">
  <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster">
    <Manager className="org.apache.catalina.ha.session.DeltaManager"/>

    <Channel className="org.apache.catalina.tribes.group.GroupChannel">
      <Membership className="org.apache.catalina.tribes.membership.McastService"
                  address="228.0.0.4"
                  port="45564"/>
      <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                address="192.168.1.11"
                port="4000"/>
      <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
        <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
      </Sender>
    </Channel>
  </Cluster>
</Engine>
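
The two node configurations differ only in jvmRoute and the Receiver address, and jvmRoute must be unique per node because it is appended to the session ID for sticky routing. The following is a rough sketch of a pre-deployment check for duplicate jvmRoute values. The helper names and the throwaway demo files are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: detect duplicate jvmRoute values across per-node server.xml files.
# extract_jvm_route and find_duplicate_routes are hypothetical helpers.

# Print the first jvmRoute attribute value found in the given file.
extract_jvm_route() {
    sed -n 's/.*jvmRoute="\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Print every jvmRoute value that occurs in more than one of the given files.
find_duplicate_routes() {
    for f in "$@"; do
        extract_jvm_route "$f"
    done | sort | uniq -d
}

# Demo with two throwaway files standing in for the node configurations.
demo_dir=$(mktemp -d)
echo '<Engine name="Catalina" defaultHost="localhost" jvmRoute="node1">' > "$demo_dir/node1.xml"
echo '<Engine name="Catalina" defaultHost="localhost" jvmRoute="node2">' > "$demo_dir/node2.xml"

dups=$(find_duplicate_routes "$demo_dir/node1.xml" "$demo_dir/node2.xml")
if [ -z "$dups" ]; then
    echo "jvmRoute values are unique"
else
    echo "duplicate jvmRoute: $dups"
fi
rm -rf "$demo_dir"
```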

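3. Advanced Cluster Configuration

3.1 Static Membership Configuration

<!-- Static membership (no multicast). channelStartOptions="3" starts only the
     sender and receiver, so the default multicast membership service stays off. -->
<Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
         channelStartOptions="3">

  <Channel className="org.apache.catalina.tribes.group.GroupChannel">

    <!-- Static members are declared on StaticMembershipInterceptor, which is an
         <Interceptor>, not a <Membership>. Each uniqueId must be exactly 16 bytes. -->
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.StaticMembershipInterceptor">
      <Member className="org.apache.catalina.tribes.membership.StaticMember"
              port="4000"
              host="192.168.1.10"
              domain="cluster"
              uniqueId="{0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,1}"/>
      <Member className="org.apache.catalina.tribes.membership.StaticMember"
              port="4000"
              host="192.168.1.11"
              domain="cluster"
              uniqueId="{0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,2}"/>
      <Member className="org.apache.catalina.tribes.membership.StaticMember"
              port="4000"
              host="192.168.1.12"
              domain="cluster"
              uniqueId="{0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,3}"/>
    </Interceptor>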

    <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
              address="auto"
              port="4000"/>

    <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
      <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
    </Sender>

    <!-- Failure detection (Tomcat 9 uses MessageDispatchInterceptor; the "15" variant no longer exists) -->
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor"/>
  </Channel>
</Cluster>
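
A StaticMember uniqueId is a 16-byte array written as a brace-enclosed, comma-separated list, and each member needs a different value. One possible way to produce such values is sketched below. The helper static_member_unique_id and its scheme (15 fixed bytes plus a node index) are assumptions for illustration, not a Tomcat convention.

```shell
#!/bin/sh
# Sketch: build a 16-byte uniqueId literal for node number $1 (0-255).
# static_member_unique_id is a hypothetical helper.

static_member_unique_id() {
    idx="$1"
    case "$idx" in
        ''|*[!0-9]*)
            echo "index must be a non-negative integer" >&2
            return 1
            ;;
    esac
    if [ "$idx" -gt 255 ]; then
        echo "index must fit in one byte (0-255)" >&2
        return 1
    fi
    # 15 fixed bytes followed by the node index gives 16 bytes in total.
    printf '{0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,%s}\n' "$idx"
}

static_member_unique_id 1
static_member_unique_id 2
```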

3.2 Encrypted Communication

<!-- Encrypted cluster communication -->
<Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster">

  <Channel className="org.apache.catalina.tribes.group.GroupChannel">

    <Membership className="org.apache.catalina.tribes.membership.McastService"
                address="228.0.0.4"
                port="45564"/>

    <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
              address="auto"
              port="4000"/>

    <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
      <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
    </Sender>

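    <!-- Compression interceptor. Listed first so that outgoing messages are
         compressed before they are encrypted; encrypted data compresses poorly. -->
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.GzipInterceptor"/>

    <!-- Encryption interceptor; must come before TcpFailureDetector.
         encryptionKey is hex-encoded (32 hex characters for AES-128).
         The value below is a placeholder; generate your own key. -->
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.EncryptInterceptor"
                 encryptionKey="00112233445566778899aabbccddeeff"/>

    <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor"/>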
  </Channel>
</Cluster>
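
EncryptInterceptor expects encryptionKey as hex-encoded bytes of a length that suits the cipher, for example 16 bytes (32 hex characters) for AES-128. A minimal sketch for generating such a key from /dev/urandom follows. The helper name generate_cluster_key is hypothetical, and every node in the cluster must be configured with the same key.

```shell
#!/bin/sh
# Sketch: print a random 128-bit key as 32 lowercase hex characters.
# generate_cluster_key is a hypothetical helper.

generate_cluster_key() {
    # od reads 16 random bytes and prints them as hex pairs;
    # tr removes the spaces and the trailing newline between them.
    od -An -tx1 -N16 /dev/urandom | tr -d ' \n'
    echo
}

key=$(generate_cluster_key)
echo "encryptionKey=\"$key\""
```

Keep the generated value out of version control, since anyone holding it can read and forge cluster traffic.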

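4. Session Management Configuration

4.1 Application-Level Session Configuration

<!-- web.xml - enable distributed sessions -->
<web-app>
    <!-- Mark the application as distributable -->
    <distributable/>

    <!-- Session configuration -->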
    <session-config>
        <session-timeout>30</session-timeout>
        <cookie-config>
            <name>JSESSIONID</name>
            <path>/</path>
            <http-only>true</http-only>
            <secure>false</secure>
        </cookie-config>
    </session-config>
</web-app>
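
Sessions are only replicated for applications whose web.xml contains the distributable element, and every session attribute must implement java.io.Serializable. As a rough sketch, the check below looks for the element in a deployment descriptor before a WAR is rolled out. The helper is_distributable is hypothetical, and the plain text match would also accept an element that sits inside an XML comment.

```shell
#!/bin/sh
# Sketch: report whether a web.xml declares <distributable/>.
# is_distributable is a hypothetical helper based on a simple text match.

is_distributable() {
    grep -Eq '<distributable[[:space:]]*/?>' "$1"
}

# Demo with a throwaway descriptor.
demo=$(mktemp)
printf '<web-app>\n    <distributable/>\n</web-app>\n' > "$demo"
if is_distributable "$demo"; then
    echo "distributable: yes"
else
    echo "distributable: no"
fi
rm -f "$demo"
```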

4.2 Session Manager Configuration

<!-- Context-level session manager -->
<Context path="/myapp" docBase="myapp.war">

  <!-- Delta session manager. Recent Tomcat versions (9.x) take the session
       timeout from <session-timeout> in web.xml and configure the ID length on
       a nested <SessionIdGenerator>, so maxInactiveInterval and sessionIdLength
       are not set on the Manager here. -->
  <Manager className="org.apache.catalina.ha.session.DeltaManager"
           expireSessionsOnShutdown="false"
           notifyListenersOnReplication="true"/>

  <!-- Alternative (disabled): persistent session manager with a JDBC store.
       These classes live in org.apache.catalina.session, not in the ha package. -->
  <!--
  <Manager className="org.apache.catalina.session.PersistentManager"
           saveOnRestart="true"
           maxActiveSessions="1000">
    <Store className="org.apache.catalina.session.JDBCStore"
           driverName="com.mysql.cj.jdbc.Driver"
           connectionURL="jdbc:mysql://localhost:3306/sessions"
           connectionName="sessionuser"
           connectionPassword="sessionpass"
           sessionTable="tomcat_sessions"
           sessionAppCol="app_name"
           sessionIdCol="session_id"
           sessionDataCol="session_data"
           sessionValidCol="valid"
           sessionMaxInactiveCol="max_inactive"
           sessionLastAccessedCol="last_access"/>
  </Manager>
  -->
</Context>

5. Cluster Monitoring and Management

5.1 Monitoring Cluster State

// ClusterMonitor.java
// Must run inside the Tomcat JVM (for example from a servlet), because it
// queries the platform MBeanServer of the current process.
package com.example.cluster;

import javax.management.MBeanServer;
import javax.management.ObjectName;
import java.lang.management.ManagementFactory;

public class ClusterMonitor {

    private final MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();

    public void monitorCluster() throws Exception {
        ObjectName clusterName = new ObjectName("Catalina:type=Cluster");

        if (mbs.isRegistered(clusterName)) {
            // Read cluster member information. MBean attribute names differ between
            // Tomcat versions; confirm "memberNames" in JConsole before relying on it.
            String[] memberNames = (String[]) mbs.getAttribute(clusterName, "memberNames");

            System.out.println("=== Cluster status ===");
            System.out.println("Cluster members: " + memberNames.length);

            for (String memberName : memberNames) {
                System.out.println("Member: " + memberName);
            }

            // Print session replication statistics
            monitorSessionReplication();
        } else {
            System.out.println("Cluster is not configured or not started");
        }
    }

    private void monitorSessionReplication() throws Exception {
        ObjectName managerName = new ObjectName("Catalina:type=Manager,*");

        for (ObjectName name : mbs.queryNames(managerName, null)) {
            String className = (String) mbs.getAttribute(name, "className");

            if (className != null && className.contains("DeltaManager")) {
                Integer activeSessions = (Integer) mbs.getAttribute(name, "activeSessions");
                Long sessionCounter = (Long) mbs.getAttribute(name, "sessionCounter");

                // Since Tomcat 7 the Manager ObjectName uses the key "context", not "path"
                System.out.println("Session manager: " + name.getKeyProperty("context"));
                System.out.println("  Active sessions: " + activeSessions);
                System.out.println("  Session counter: " + sessionCounter);
            }
        }
    }
}

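5.2 Cluster Health Check

#!/bin/bash
# cluster-health-check.sh
# Assumes the application exposes /health, /session/create and /session/check.

TOMCAT_NODES=(
    "192.168.1.10:8080"
    "192.168.1.11:8080"
    "192.168.1.12:8080"
)

check_cluster_health() {
    echo "=== Cluster health check $(date) ==="

    local healthy_nodes=0
    local total_nodes=${#TOMCAT_NODES[@]}

    for node in "${TOMCAT_NODES[@]}"; do
        echo -n "Checking node $node: "

        # -f makes curl fail on HTTP 4xx/5xx; --max-time bounds the request
        if curl -sf --max-time 5 "http://$node/health" > /dev/null; then
            echo "healthy"
            healthy_nodes=$((healthy_nodes + 1))
        else
            echo "unhealthy"
        fi
    done

    echo "Healthy nodes: $healthy_nodes/$total_nodes"

    # Return instead of exit so that the "all" mode can continue.
    # Multiply instead of dividing: 3 / 2 would round down to 1.
    if [ "$healthy_nodes" -eq 0 ]; then
        echo "CRITICAL: all nodes are unavailable"
        return 2
    elif [ $((healthy_nodes * 2)) -lt "$total_nodes" ]; then
        echo "WARNING: fewer than half of the nodes are available"
        return 1
    else
        echo "Cluster status OK"
        return 0
    fi
}

check_session_replication() {
    echo "=== Session replication test ==="

    local cookie_jar session_id
    cookie_jar=$(mktemp)

    # Create a session; the JSESSIONID cookie arrives in a response header,
    # so read it from the cookie jar rather than from the response body
    curl -s -c "$cookie_jar" "http://${TOMCAT_NODES[0]}/session/create" > /dev/null
    session_id=$(awk '$6 == "JSESSIONID" {print $7}' "$cookie_jar")

    if [ -n "$session_id" ]; then
        echo "Session created: $session_id"

        # Check whether the other nodes can see the session. The cookie is sent
        # explicitly because curl will not send a jar cookie to a different host.
        for node in "${TOMCAT_NODES[@]:1}"; do
            echo -n "Testing session replication on node $node: "

            if curl -s -b "JSESSIONID=$session_id" "http://$node/session/check" | grep -q "session found"; then
                echo "success"
            else
                echo "failure"
            fi
        done
    else
        echo "Could not create a test session"
    fi

    rm -f "$cookie_jar"
}

case "$1" in
    "health") check_cluster_health ;;
    "session") check_session_replication ;;
    "all")
        check_cluster_health
        health_status=$?
        echo
        check_session_replication
        exit $health_status
        ;;
    *) echo "Usage: $0 {health|session|all}"; exit 1 ;;
esac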
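
Integer division makes a "fewer than half" test easy to get wrong: with 3 nodes, 3 / 2 evaluates to 1, so the warning would only fire when no node is healthy. The sketch below isolates that decision in a small function that compares healthy * 2 against the total instead. The function name health_status and the status words are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: classify cluster health from the healthy and total node counts.
# health_status is a hypothetical helper.

health_status() {
    healthy="$1"
    total="$2"
    if [ "$healthy" -eq 0 ]; then
        echo "critical"
    elif [ $((healthy * 2)) -lt "$total" ]; then
        # Fewer than half of the nodes are healthy.
        echo "warning"
    else
        echo "ok"
    fi
}

health_status 1 3   # prints: warning
health_status 2 3   # prints: ok
```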

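6. Cluster Deployment Automation

6.1 Cluster Deployment Script

#!/bin/bash
# deploy-cluster.sh

CLUSTER_NODES=(
    "192.168.1.10"
    "192.168.1.11"
    "192.168.1.12"
)

TOMCAT_HOME="/opt/tomcat9"
APP_WAR="/tmp/myapp.war"
# Example credentials only; override MANAGER_AUTH in the environment
MANAGER_AUTH="${MANAGER_AUTH:-admin:admin123}"

deploy_to_cluster() {
    local war_file="$1"
    local war_name failed=0

    if [ ! -f "$war_file" ]; then
        echo "WAR file not found: $war_file"
        exit 1
    fi
    war_name=$(basename "$war_file")

    echo "Starting cluster deployment..."

    for node in "${CLUSTER_NODES[@]}"; do
        echo "Deploying to node: $node"

        # Upload the WAR file
        scp "$war_file" "$node:/tmp/"

        # Deploy through the Manager text API; update=true undeploys an existing
        # /myapp first, which replaces the separate stop and undeploy calls
        ssh "$node" "curl -s -u '$MANAGER_AUTH' -T '/tmp/$war_name' 'http://localhost:8080/manager/text/deploy?path=/myapp&update=true'"

        # Check deployment status
        sleep 5
        if ssh "$node" "curl -sf http://localhost:8080/myapp/health" | grep -q "OK"; then
            echo "Node $node deployed successfully"
        else
            echo "Node $node deployment failed"
            failed=1
        fi
    done

    echo "Cluster deployment finished"
    return $failed
}

rolling_update() {
    local war_file="$1"

    if [ ! -f "$war_file" ]; then
        echo "WAR file not found: $war_file"
        exit 1
    fi

    echo "Starting rolling update..."

    for node in "${CLUSTER_NODES[@]}"; do
        echo "Updating node: $node"

        # Remove the node from the load balancer (placeholder; depends on your balancer)
        echo "Removing node $node from the load balancer"

        # Wait for in-flight requests to finish
        sleep 30

        # Stop, replace the WAR, and restart
        ssh "$node" "systemctl stop tomcat"
        scp "$war_file" "$node:$TOMCAT_HOME/webapps/"
        ssh "$node" "systemctl start tomcat"

        # Wait for startup to complete
        sleep 60

        # Health check
        if ssh "$node" "curl -sf http://localhost:8080/health" | grep -q "OK"; then
            echo "Node $node updated successfully"
            # Add the node back to the load balancer (placeholder)
            echo "Adding node $node back to the load balancer"
        else
            echo "Node $node update failed, rolling back..."
            # Rollback logic goes here
        fi
    done
}

case "$1" in
    "deploy") deploy_to_cluster "${2:-$APP_WAR}" ;;
    "rolling") rolling_update "${2:-$APP_WAR}" ;;
    *) echo "Usage: $0 {deploy|rolling} <war-file>"; exit 1 ;;
esac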
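
Fixed sleeps before a health check are either too short on a slow start or needlessly long on a fast one. One alternative, sketched here, is to poll a command until it succeeds or a retry budget runs out. The helper name wait_until and its argument order (attempts, delay in seconds, command) are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: retry a command until it succeeds or the attempts are used up.
# wait_until is a hypothetical helper: wait_until <attempts> <delay> <command...>

wait_until() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Run the remaining arguments as the probe command.
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Sleep only when another attempt follows.
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Demo: "true" succeeds on the first attempt, "false" never does.
wait_until 3 0 true && echo "probe succeeded"
wait_until 3 0 false || echo "probe gave up after 3 attempts"
```

In the deployment script this could replace a fixed sleep, for example `wait_until 12 5 curl -sf http://localhost:8080/myapp/health`, which allows up to about a minute for the application to come up.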

6.2 Cluster Configuration Generator

#!/bin/bash
# generate-cluster-config.sh

generate_cluster_config() {
    local node_id="$1"
    local node_ip="$2"
    local cluster_port="$3"

    cat > "server-$node_id.xml.template" << EOF
<Engine name="Catalina" defaultHost="localhost" jvmRoute="$node_id">

  <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
           channelSendOptions="8">

    <Manager className="org.apache.catalina.ha.session.DeltaManager"
             expireSessionsOnShutdown="false"
             notifyListenersOnReplication="true"/>

    <Channel className="org.apache.catalina.tribes.group.GroupChannel">

      <Membership className="org.apache.catalina.tribes.membership.McastService"
                  address="228.0.0.4"
                  port="45564"
                  frequency="500"
                  dropTime="3000"/>

      <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                address="$node_ip"
                port="$cluster_port"
                autoBind="100"
                selectorTimeout="5000"
                maxThreads="6"/>

      <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
        <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
      </Sender>

      <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
      <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor"/>
    </Channel>

    <Valve className="org.apache.catalina.ha.tcp.ReplicationValve" filter=""/>
    <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve"/>

    <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
  </Cluster>

  <Host name="localhost" appBase="webapps">
    <!-- Host configuration -->
  </Host>
</Engine>
EOF

    echo "Generated configuration file: server-$node_id.xml.template"
}

# Generate the configuration for each node (format: jvmRoute:ip:receiver-port)
nodes=(
    "node1:192.168.1.10:4000"
    "node2:192.168.1.11:4000"
    "node3:192.168.1.12:4000"
)

for node_config in "${nodes[@]}"; do
    IFS=':' read -r node_id node_ip cluster_port <<< "$node_config"
    generate_cluster_config "$node_id" "$node_ip" "$cluster_port"
done

echo "All cluster configuration files have been generated"
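
The generator writes whatever it is given into the XML, so a malformed node entry yields a configuration that fails only when Tomcat starts. A minimal validation sketch for the `id:ip:port` format follows. The helper validate_node_spec is hypothetical, and restricting the node ID to letters, digits, underscores and hyphens is a conservative assumption rather than a Tomcat rule.

```shell
#!/bin/sh
# Sketch: validate a "node_id:ip:port" entry before generating configuration.
# validate_node_spec is a hypothetical helper; it returns 0 when the entry is usable.

validate_node_spec() {
    spec="$1"
    # Require at least two colons.
    case "$spec" in
        *:*:*) ;;
        *) return 1 ;;
    esac
    id=${spec%%:*}
    rest=${spec#*:}
    ip=${rest%%:*}
    port=${rest#*:}
    # Node ID: non-empty, limited to a conservative character set.
    case "$id" in
        ''|*[!A-Za-z0-9_-]*) return 1 ;;
    esac
    [ -n "$ip" ] || return 1
    # Port: digits only (this also rejects a fourth field) and within TCP range.
    case "$port" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$port" -ge 1 ] && [ "$port" -le 65535 ]
}

for spec in "node1:192.168.1.10:4000" "node2:192.168.1.11:port"; do
    if validate_node_spec "$spec"; then
        echo "valid:   $spec"
    else
        echo "invalid: $spec"
    fi
done
```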

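7. Cluster Troubleshooting

7.1 Diagnosing Common Problems

#!/bin/bash
# cluster-troubleshoot.sh

CLUSTER_NODES=(
    "192.168.1.10"
    "192.168.1.11"
    "192.168.1.12"
)
# Logs live under CATALINA_BASE, which defaults to CATALINA_HOME
LOG_FILE="${CATALINA_BASE:-$CATALINA_HOME}/logs/catalina.out"

diagnose_cluster_issues() {
    echo "=== Cluster diagnostics ==="

    # 1. Check network connectivity
    echo "1. Checking network connectivity between cluster nodes"
    for node in "${CLUSTER_NODES[@]}"; do
        if ping -c 2 "$node" > /dev/null 2>&1; then
            echo "  ✓ $node is reachable"
        else
            echo "  ✗ $node is unreachable"
        fi
    done

    # 2. Check the cluster port
    echo "2. Checking the cluster communication port"
    for node in "${CLUSTER_NODES[@]}"; do
        if nc -z -w 3 "$node" 4000; then
            echo "  ✓ $node:4000 is open"
        else
            echo "  ✗ $node:4000 is not reachable"
        fi
    done

    # 3. Multicast settings (printed for reference only; nothing is probed here)
    echo "3. Multicast configuration"
    echo "Multicast address: 228.0.0.4:45564"

    # 4. Check firewall rules for the cluster ports
    echo "4. Checking firewall settings"
    if command -v ufw &> /dev/null; then
        ufw status | grep -E "(4000|45564)"
    elif command -v firewall-cmd &> /dev/null; then
        firewall-cmd --list-ports | grep -E "(4000|45564)"
    fi
}

analyze_cluster_logs() {
    echo "=== Cluster log analysis ==="

    # Show recent cluster-related log entries
    grep -i "cluster\|member\|replication" "$LOG_FILE" | tail -20

    # Show recent session replication errors
    grep -i "session.*error\|replication.*fail" "$LOG_FILE" | tail -10
}

case "$1" in
    "diagnose") diagnose_cluster_issues ;;
    "logs") analyze_cluster_logs ;;
    "all")
        diagnose_cluster_issues
        echo
        analyze_cluster_logs
        ;;
    *) echo "Usage: $0 {diagnose|logs|all}"; exit 1 ;;
esac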
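
Raw grep output shows individual events but not whether membership is unstable. As a rough sketch, the helper below counts how often members joined and left according to a log file. The name membership_summary is hypothetical, and the phrases "member added" and "member disappeared" are assumptions about the log wording, which can differ between Tomcat versions, so adjust the patterns to what your catalina.out actually contains.

```shell
#!/bin/sh
# Sketch: summarize membership churn from a log file.
# membership_summary is a hypothetical helper; the match phrases are assumptions.

membership_summary() {
    log_file="$1"
    # grep -c prints the number of matching lines (0 when there are none).
    added=$(grep -ci 'member added' "$log_file")
    gone=$(grep -ci 'member disappeared' "$log_file")
    echo "added=$added disappeared=$gone"
}

# Demo with a throwaway log containing made-up sample lines.
demo_log=$(mktemp)
printf '%s\n' \
    'INFO Replication member added: node2' \
    'INFO Received member disappeared: node2' \
    'INFO Replication member added: node2' > "$demo_log"
membership_summary "$demo_log"   # prints: added=2 disappeared=1
rm -f "$demo_log"
```

A disappeared count that keeps growing while the nodes themselves stay up usually points at the network or at heartbeat settings such as dropTime rather than at the application.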

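Summary

After working through this article, you should understand:

  1. The architecture and components of a Tomcat cluster
  2. Basic and advanced cluster configuration
  3. Session replication and session management
  4. Cluster monitoring and health checks
  5. Automated cluster deployment
  6. Cluster troubleshooting

The next article covers Tomcat load balancing configuration.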

© 2025 编外计划 | Last modified: 2025-08-29 15:40:15
