Tomcat Load Balancing
Overview
Load balancing is a key technique for ensuring high availability and scalability of Tomcat applications. This article covers common load-balancing solutions, how to configure them, and best practices.
1. Load Balancing Architecture Overview
1.1 Load Balancing Patterns
Load balancing architecture patterns
├── Front-end load balancing
│   ├── Hardware load balancers (F5, A10)
│   ├── Software load balancers (Nginx, HAProxy)
│   └── Cloud load balancers (ALB, ELB)
├── Application-layer load balancing
│   ├── HTTP load balancing
│   ├── AJP load balancing
│   └── Cluster load balancing
└── Database load balancing
    ├── Read/write splitting
    ├── Master-replica replication
    └── Sharding (by database and table)
1.2 Typical Architecture
Internet
↓
Load Balancer (Nginx/HAProxy)
↓
┌─────────────────────────────────┐
│ Tomcat1 Tomcat2 Tomcat3 │
│ :8080 :8080 :8080 │
└─────────────────────────────────┘
↓
Database Cluster / Shared Storage
2. Nginx Load Balancing Configuration
2.1 Basic HTTP Load Balancing
# nginx.conf
upstream tomcat_backend {
    # Load-balancing strategy
    least_conn;  # least-connections algorithm
    # Tomcat server list
    server 192.168.1.10:8080 weight=3 max_fails=3 fail_timeout=30s;
    server 192.168.1.11:8080 weight=2 max_fails=3 fail_timeout=30s;
    server 192.168.1.12:8080 weight=1 max_fails=3 fail_timeout=30s backup;
    # Keep-alive connections to the upstream
    keepalive 32;
    keepalive_requests 1000;
    keepalive_timeout 60s;
}
server {
    listen 80;
    server_name app.example.com;
    location / {
        proxy_pass http://tomcat_backend;
        # Proxy headers
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        # Connection settings (required for upstream keepalive)
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        # Timeouts
        proxy_connect_timeout 5s;
        proxy_read_timeout 30s;
        proxy_send_timeout 30s;
        # Failover
        proxy_next_upstream error timeout http_500 http_502 http_503;
        proxy_next_upstream_tries 3;
        proxy_next_upstream_timeout 10s;
    }
    # Health check endpoint
    location /health {
        proxy_pass http://tomcat_backend;
        proxy_connect_timeout 1s;
        proxy_read_timeout 1s;
        access_log off;
    }
    # Serve static assets directly
    location ~* \.(js|css|png|jpg|jpeg|gif|ico|svg)$ {
        root /var/www/static;
        expires 1y;
        access_log off;
    }
}
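With the weights above, the two non-backup servers should split traffic roughly 3:2 (least_conn only departs from the weights when live connection counts differ, and the backup server receives traffic only when the others fail). The selection can be sketched with nginx's smooth weighted round-robin algorithm; this simulation is illustrative only, not nginx's actual code:

```shell
#!/bin/bash
# Simulate smooth weighted round-robin for the two active servers
# in the upstream above (the third is marked backup and receives
# no regular traffic).
servers=("192.168.1.10:8080" "192.168.1.11:8080")
weights=(3 2)
current=(0 0)
total=$((weights[0] + weights[1]))

pick_server() {
    # Smooth WRR: raise every current weight by its configured weight,
    # pick the maximum, then lower the winner by the total weight.
    local i best=0
    for i in "${!servers[@]}"; do
        current[i]=$((current[i] + weights[i]))
        if [ "${current[i]}" -gt "${current[best]}" ]; then
            best=$i
        fi
    done
    current[best]=$((current[best] - total))
    picked=${servers[best]}
}

declare -A hits
for _ in $(seq 1 100); do
    pick_server
    hits[$picked]=$(( ${hits[$picked]:-0} + 1 ))
done
for s in "${servers[@]}"; do
    echo "$s -> ${hits[$s]} requests"
done
```

Over 100 draws the 3:2 weights yield exactly 60 and 40 picks, and the smooth variant interleaves them (a b a b a ...) rather than sending bursts to the heavier server.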
2.2 Session Stickiness Configuration
upstream tomcat_backend {
    # IP hash for session stickiness
    ip_hash;
    server 192.168.1.10:8080;
    server 192.168.1.11:8080;
    server 192.168.1.12:8080;
}
# Alternatively, use cookie-based stickiness
upstream tomcat_backend {
    # $cookie_jsessionid refers to the JSESSIONID session cookie
    hash $cookie_jsessionid consistent;
    server 192.168.1.10:8080;
    server 192.168.1.11:8080;
    server 192.168.1.12:8080;
}
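The reason cookie hashing gives stickiness is simply that an unchanged key always hashes to the same bucket. A minimal sketch of that property (nginx's `consistent` flag actually uses ketama-style consistent hashing, not this simple modulo scheme):

```shell
#!/bin/bash
# Illustrative only: show that an unchanged JSESSIONID always maps
# to the same backend, while the hash spreads sessions across the
# server list. The session id value below is made up.
servers=("192.168.1.10:8080" "192.168.1.11:8080" "192.168.1.12:8080")

backend_for() {
    # Hash the session id and reduce it to an index into the list.
    local hex idx
    hex=$(printf '%s' "$1" | md5sum | cut -c1-8)
    idx=$(( 16#$hex % ${#servers[@]} ))
    echo "${servers[$idx]}"
}

a=$(backend_for "A1B2C3D4E5F6")
b=$(backend_for "A1B2C3D4E5F6")
echo "same session twice: $a $b"
```

Consistent hashing improves on the modulo scheme in one important way: when a server is added or removed, only a fraction of sessions get remapped instead of nearly all of them.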
2.3 Advanced Load Balancing Configuration
# Dynamic upstream configuration
upstream tomcat_backend {
    zone backend 64k;
    # Server group
    server 192.168.1.10:8080 weight=5 max_conns=100;
    server 192.168.1.11:8080 weight=3 max_conns=80;
    server 192.168.1.12:8080 weight=2 max_conns=60;
    # Slow start (NGINX Plus only)
    server 192.168.1.13:8080 weight=1 slow_start=30s;
    # Backup server
    server 192.168.1.20:8080 backup;
    # Active health checks (requires the third-party
    # nginx_upstream_check_module; not available in stock nginx)
    check interval=3000 rise=2 fall=3 timeout=1000;
}
# URI-based load balancing
map $request_uri $backend_pool {
    ~^/api/    api_backend;
    ~^/admin/  admin_backend;
    default    web_backend;
}
upstream api_backend {
    server 192.168.1.30:8080;
    server 192.168.1.31:8080;
}
upstream admin_backend {
    server 192.168.1.40:8080;
    server 192.168.1.41:8080;
}
upstream web_backend {
    server 192.168.1.10:8080;
    server 192.168.1.11:8080;
}
server {
    listen 80;
    server_name app.example.com;
    location / {
        proxy_pass http://$backend_pool;
        # Other proxy settings...
    }
}
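The map block above picks a pool by URI prefix, falling back to a default. The same selection logic can be sketched in shell (the helper name is hypothetical; the pool names mirror the config):

```shell
#!/bin/bash
# Mirror the nginx `map $request_uri $backend_pool` logic above:
# the first matching prefix wins, otherwise the default pool is used.
pool_for_uri() {
    case "$1" in
        /api/*)   echo "api_backend" ;;
        /admin/*) echo "admin_backend" ;;
        *)        echo "web_backend" ;;
    esac
}

p1=$(pool_for_uri "/api/v1/users")
p2=$(pool_for_uri "/admin/login")
p3=$(pool_for_uri "/index.html")
echo "$p1 $p2 $p3"
```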
3. HAProxy Load Balancing Configuration
3.1 Basic HAProxy Configuration
# /etc/haproxy/haproxy.cfg
global
    daemon
    user haproxy
    group haproxy
    pidfile /var/run/haproxy.pid
    # Logging
    log 127.0.0.1:514 local0
    # SSL
    tune.ssl.default-dh-param 2048
    # Performance tuning (nbproc is deprecated since HAProxy 2.0
    # and removed in 2.5; use nbthread on modern versions)
    nbproc 4
    maxconn 4096
defaults
    mode http
    log global
    option httplog
    option dontlognull
    # Timeouts
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms
    # Error pages
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http
# Frontend
frontend tomcat_frontend
    bind *:80
    bind *:443 ssl crt /etc/ssl/certs/app.example.com.pem
    # Redirect HTTP to HTTPS
    redirect scheme https if !{ ssl_fc }
    # Access control
    acl network_allowed src 192.168.1.0/24
    acl admin_path path_beg /admin
    acl api_path path_beg /api
    # Routing rules
    use_backend admin_backend if admin_path network_allowed
    use_backend api_backend if api_path
    default_backend web_backend
# Backends
backend web_backend
    balance roundrobin
    option httpchk GET /health HTTP/1.1\r\nHost:\ app.example.com
    # Server list
    server tomcat1 192.168.1.10:8080 check inter 3s rise 2 fall 3 weight 3
    server tomcat2 192.168.1.11:8080 check inter 3s rise 2 fall 3 weight 2
    server tomcat3 192.168.1.12:8080 check inter 3s rise 2 fall 3 weight 1
backend api_backend
    balance leastconn
    option httpchk GET /api/health
    server api1 192.168.1.30:8080 check inter 2s rise 2 fall 3
    server api2 192.168.1.31:8080 check inter 2s rise 2 fall 3
backend admin_backend
    balance source
    option httpchk GET /admin/health
    server admin1 192.168.1.40:8080 check inter 5s rise 2 fall 3
    server admin2 192.168.1.41:8080 check inter 5s rise 2 fall 3 backup
# Statistics page
listen stats
    bind *:8404
    stats enable
    stats uri /stats
    stats refresh 30s
    stats admin if TRUE
    acl network_allowed src 192.168.1.0/24
    stats http-request auth unless network_allowed
3.2 Session Stickiness Configuration
backend web_backend
    balance roundrobin
    # Cookie-based stickiness
    cookie SERVERID insert indirect nocache
    server tomcat1 192.168.1.10:8080 check cookie tomcat1
    server tomcat2 192.168.1.11:8080 check cookie tomcat2
    server tomcat3 192.168.1.12:8080 check cookie tomcat3
# Alternatively, source-IP stickiness
backend web_backend
    balance source
    hash-type consistent
    server tomcat1 192.168.1.10:8080 check
    server tomcat2 192.168.1.11:8080 check
    server tomcat3 192.168.1.12:8080 check
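With `cookie SERVERID insert indirect nocache`, HAProxy adds a SERVERID cookie naming the chosen server, and later requests presenting that cookie are pinned to it. Stickiness can be verified by inspecting the header; a sketch that parses a sample Set-Cookie line (the header value below is illustrative, not captured from a live instance — in practice it would come from `curl -i`):

```shell
#!/bin/bash
# Extract the server id from a Set-Cookie header such as HAProxy
# emits for `cookie SERVERID insert indirect nocache`.
extract_serverid() {
    # Input: one Set-Cookie header line; output: the SERVERID value.
    sed -n 's/.*SERVERID=\([^;]*\).*/\1/p' <<< "$1"
}

header='Set-Cookie: SERVERID=tomcat2; path=/'
sid=$(extract_serverid "$header")
echo "pinned to: $sid"
```

Repeating the request with `-b "SERVERID=$sid"` should then always land on the same backend while it stays healthy.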
4. Apache httpd + mod_jk Configuration
4.1 Installing mod_jk
# Install mod_jk
sudo apt-get install libapache2-mod-jk
# Enable the module
sudo a2enmod jk
4.2 workers.properties
# /etc/libapache2-mod-jk/workers.properties
# Worker list (jkstatus is the status worker used by /jkstatus below)
worker.list=loadbalancer,jkstatus
# Tomcat instance 1
worker.tomcat1.port=8009
worker.tomcat1.host=192.168.1.10
worker.tomcat1.type=ajp13
worker.tomcat1.lbfactor=3
# Tomcat instance 2
worker.tomcat2.port=8009
worker.tomcat2.host=192.168.1.11
worker.tomcat2.type=ajp13
worker.tomcat2.lbfactor=2
# Tomcat instance 3
worker.tomcat3.port=8009
worker.tomcat3.host=192.168.1.12
worker.tomcat3.type=ajp13
worker.tomcat3.lbfactor=1
# Load balancer worker
worker.loadbalancer.type=lb
worker.loadbalancer.balanced_workers=tomcat1,tomcat2,tomcat3
worker.loadbalancer.sticky_session=True
worker.loadbalancer.method=Request
# Status worker (serves the /jkstatus page)
worker.jkstatus.type=status
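lbfactor sets each worker's relative weight: with factors 3, 2 and 1 the instances should receive roughly 3/6, 2/6 and 1/6 of requests. A quick sketch computing those expected shares from the configured factors:

```shell
#!/bin/bash
# Compute each worker's expected traffic share from its lbfactor,
# using the values from workers.properties above.
factors="tomcat1=3 tomcat2=2 tomcat3=1"
shares=$(echo "$factors" | tr ' ' '\n' | awk -F= '
    { name[NR] = $1; w[NR] = $2; total += $2 }
    END { for (i = 1; i <= NR; i++)
              printf "%s %.0f%%\n", name[i], 100 * w[i] / total }')
echo "$shares"
```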
4.3 Apache Virtual Host Configuration
# /etc/apache2/sites-available/app.example.com.conf
<VirtualHost *:80>
    ServerName app.example.com
    DocumentRoot /var/www/app
    # JK mounts
    JkMount /app/* loadbalancer
    JkMount /manager loadbalancer
    # Serve static assets locally
    JkUnMount /app/static/* loadbalancer
    # Health check
    JkMount /app/health loadbalancer
    # Logging
    ErrorLog ${APACHE_LOG_DIR}/app_error.log
    CustomLog ${APACHE_LOG_DIR}/app_access.log combined
    # JK status page
    <Location "/jkstatus">
        JkMount jkstatus
        Require ip 192.168.1.0/24
    </Location>
</VirtualHost>
5. Tomcat Cluster Configuration
5.1 Basic Cluster Configuration
<!-- Cluster configuration in server.xml (class names as of Tomcat 8.5/9) -->
<Engine name="Catalina" defaultHost="localhost" jvmRoute="tomcat1">
  <!-- Cluster -->
  <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
           channelSendOptions="8">
    <!-- Session manager -->
    <Manager className="org.apache.catalina.ha.session.DeltaManager"
             expireSessionsOnShutdown="false"
             notifyListenersOnReplication="true"/>
    <!-- Group channel -->
    <Channel className="org.apache.catalina.tribes.group.GroupChannel">
      <!-- Multicast membership -->
      <Membership className="org.apache.catalina.tribes.membership.McastService"
                  address="228.0.0.4"
                  port="45564"
                  frequency="500"
                  dropTime="3000"/>
      <!-- Receiver -->
      <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                address="auto"
                port="4000"
                autoBind="100"
                selectorTimeout="5000"
                maxThreads="6"/>
      <!-- Sender -->
      <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
        <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
      </Sender>
      <!-- Interceptors (MessageDispatch15Interceptor was renamed
           MessageDispatchInterceptor in Tomcat 8) -->
      <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
      <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor"/>
    </Channel>
    <!-- Cluster valves -->
    <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
           filter=""/>
    <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve"/>
    <!-- Farm deployer -->
    <Deployer className="org.apache.catalina.ha.deploy.FarmWarDeployer"
              tempDir="/tmp/war-temp/"
              deployDir="/tmp/war-deploy/"
              watchDir="/tmp/war-listen/"
              watchEnabled="false"/>
    <!-- Cluster listener (JvmRouteSessionIDBinderListener was removed
         in Tomcat 7; session-id rewriting is handled by JvmRouteBinderValve) -->
    <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
  </Cluster>
  <Host name="localhost" appBase="webapps">
    <!-- Host configuration -->
  </Host>
</Engine>
5.2 Application Cluster Configuration
<!-- Enable session replication in web.xml -->
<web-app>
  <!-- Mark the application as distributable -->
  <distributable/>
  <!-- Session configuration -->
  <session-config>
    <session-timeout>30</session-timeout>
    <cookie-config>
      <name>JSESSIONID</name>
      <path>/</path>
      <http-only>true</http-only>
      <secure>false</secure>
    </cookie-config>
  </session-config>
</web-app>
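Session replication silently does nothing for an application whose web.xml lacks `<distributable/>` (and whose session attributes are not Serializable), so it is worth checking every node's descriptor before enabling the cluster. A minimal pre-flight sketch; the descriptor here is written to a temp file purely for illustration, whereas a real check would point at each webapp's WEB-INF/web.xml:

```shell
#!/bin/bash
# Pre-flight check: verify a deployment descriptor declares
# <distributable/> before relying on cluster session replication.
webxml=$(mktemp)
cat > "$webxml" << 'EOF'
<web-app>
  <distributable/>
</web-app>
EOF

if grep -q '<distributable/>' "$webxml"; then
    result="replication enabled"
else
    result="NOT distributable"
fi
echo "$result"
rm -f "$webxml"
```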
6. Health Check Configuration
6.1 Tomcat Health Check Servlet
// HealthCheckServlet.java
// (javax.* imports target Tomcat 9 and earlier; use jakarta.* on Tomcat 10+)
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.sql.Connection;
import java.sql.PreparedStatement;

import javax.naming.Context;
import javax.naming.InitialContext;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.sql.DataSource;

@WebServlet("/health")
public class HealthCheckServlet extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest request,
                         HttpServletResponse response)
            throws ServletException, IOException {
        boolean healthy = true;
        StringBuilder status = new StringBuilder();
        // Check database connectivity
        try {
            Context ctx = new InitialContext();
            DataSource ds = (DataSource) ctx.lookup("java:comp/env/jdbc/MyDB");
            try (Connection conn = ds.getConnection();
                 PreparedStatement stmt = conn.prepareStatement("SELECT 1")) {
                stmt.executeQuery();
            }
            status.append("DB: OK\n");
        } catch (Exception e) {
            healthy = false;
            status.append("DB: ERROR - ").append(e.getMessage()).append("\n");
        }
        // Check memory usage
        Runtime runtime = Runtime.getRuntime();
        long totalMemory = runtime.totalMemory();
        long freeMemory = runtime.freeMemory();
        long usedMemory = totalMemory - freeMemory;
        double memoryUsage = (double) usedMemory / totalMemory;
        if (memoryUsage > 0.9) {
            healthy = false;
            status.append("MEMORY: WARNING - Usage ")
                  .append(String.format("%.2f%%", memoryUsage * 100)).append("\n");
        } else {
            status.append("MEMORY: OK - Usage ")
                  .append(String.format("%.2f%%", memoryUsage * 100)).append("\n");
        }
        // Check thread count
        ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();
        int threadCount = threadBean.getThreadCount();
        if (threadCount > 500) {
            healthy = false;
            status.append("THREADS: WARNING - Count ").append(threadCount).append("\n");
        } else {
            status.append("THREADS: OK - Count ").append(threadCount).append("\n");
        }
        // Write the response
        response.setContentType("text/plain");
        if (healthy) {
            response.setStatus(HttpServletResponse.SC_OK);
            response.getWriter().println("STATUS: HEALTHY");
        } else {
            response.setStatus(HttpServletResponse.SC_SERVICE_UNAVAILABLE);
            response.getWriter().println("STATUS: UNHEALTHY");
        }
        response.getWriter().println(status.toString());
    }
}
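Load balancers only act on the status code (200 vs 503), but monitoring scripts can also parse the servlet's plain-text body. A hedged sketch that turns that body into an exit status; the sample body below is illustrative, whereas in practice it would come from `curl -s http://host:8080/health`:

```shell
#!/bin/bash
# Convert the health servlet's first output line into a verdict
# usable as an exit status by a monitoring job.
body="STATUS: HEALTHY
DB: OK
MEMORY: OK - Usage 42.00%
THREADS: OK - Count 87"

if printf '%s\n' "$body" | head -1 | grep -q "STATUS: HEALTHY"; then
    verdict=0
else
    verdict=1
fi
echo "verdict=$verdict"
```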
6.2 Advanced Health Check
// Detailed health check (requires Gson on the classpath;
// ServerInfo comes from Tomcat's catalina.jar)
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.net.HttpURLConnection;
import java.net.URL;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import javax.naming.Context;
import javax.naming.InitialContext;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.sql.DataSource;

import com.google.gson.Gson;
import org.apache.catalina.util.ServerInfo;

@WebServlet("/health/detailed")
public class DetailedHealthCheckServlet extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest request,
                         HttpServletResponse response)
            throws ServletException, IOException {
        Map<String, Object> healthStatus = new HashMap<>();
        // Application info
        healthStatus.put("application", getApplicationInfo());
        // System info
        healthStatus.put("system", getSystemInfo());
        // Dependency checks
        healthStatus.put("dependencies", checkDependencies());
        // Serialize to JSON
        Gson gson = new Gson();
        String jsonResponse = gson.toJson(healthStatus);
        response.setContentType("application/json");
        response.setCharacterEncoding("UTF-8");
        response.getWriter().println(jsonResponse);
    }

    private Map<String, Object> getApplicationInfo() {
        Map<String, Object> appInfo = new HashMap<>();
        appInfo.put("name", "My Tomcat App");
        appInfo.put("version", "1.0.0");
        appInfo.put("buildTime", "2023-12-01T10:00:00Z");
        appInfo.put("javaVersion", System.getProperty("java.version"));
        appInfo.put("tomcatVersion", ServerInfo.getServerInfo());
        return appInfo;
    }

    private Map<String, Object> getSystemInfo() {
        Map<String, Object> systemInfo = new HashMap<>();
        Runtime runtime = Runtime.getRuntime();
        systemInfo.put("processors", runtime.availableProcessors());
        systemInfo.put("totalMemory", runtime.totalMemory());
        systemInfo.put("freeMemory", runtime.freeMemory());
        systemInfo.put("maxMemory", runtime.maxMemory());
        ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();
        systemInfo.put("threadCount", threadBean.getThreadCount());
        systemInfo.put("peakThreadCount", threadBean.getPeakThreadCount());
        return systemInfo;
    }

    private List<Map<String, Object>> checkDependencies() {
        List<Map<String, Object>> dependencies = new ArrayList<>();
        // Database check
        dependencies.add(checkDatabase());
        // External API check
        dependencies.add(checkExternalAPI());
        return dependencies;
    }

    private Map<String, Object> checkDatabase() {
        Map<String, Object> dbCheck = new HashMap<>();
        dbCheck.put("name", "MySQL Database");
        try {
            Context ctx = new InitialContext();
            DataSource ds = (DataSource) ctx.lookup("java:comp/env/jdbc/MyDB");
            long startTime = System.currentTimeMillis();
            try (Connection conn = ds.getConnection();
                 PreparedStatement stmt = conn.prepareStatement("SELECT 1")) {
                stmt.executeQuery();
            }
            long responseTime = System.currentTimeMillis() - startTime;
            dbCheck.put("status", "UP");
            dbCheck.put("responseTime", responseTime + "ms");
        } catch (Exception e) {
            dbCheck.put("status", "DOWN");
            dbCheck.put("error", e.getMessage());
        }
        return dbCheck;
    }

    private Map<String, Object> checkExternalAPI() {
        Map<String, Object> apiCheck = new HashMap<>();
        apiCheck.put("name", "External API");
        try {
            URL url = new URL("http://external-api.example.com/health");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("GET");
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            int responseCode = conn.getResponseCode();
            if (responseCode == 200) {
                apiCheck.put("status", "UP");
            } else {
                apiCheck.put("status", "DOWN");
                apiCheck.put("responseCode", responseCode);
            }
        } catch (Exception e) {
            apiCheck.put("status", "DOWN");
            apiCheck.put("error", e.getMessage());
        }
        return apiCheck;
    }
}
7. Load Balancer Monitoring
7.1 Monitoring Script
#!/bin/bash
# load-balancer-monitor.sh
# Note: the /nginx_status endpoint requires the stub_status module.
NGINX_STATUS_URL="http://localhost/nginx_status"
TOMCAT_SERVERS=(
    "192.168.1.10:8080"
    "192.168.1.11:8080"
    "192.168.1.12:8080"
)

check_nginx_status() {
    echo "=== Nginx status ==="
    curl -s "$NGINX_STATUS_URL" || echo "Nginx status page is unreachable"
}

check_tomcat_servers() {
    echo "=== Tomcat server status ==="
    for server in "${TOMCAT_SERVERS[@]}"; do
        echo -n "Checking $server: "
        response=$(curl -s -o /dev/null -w "%{http_code},%{time_total}" \
            "http://$server/health")
        http_code=$(echo "$response" | cut -d',' -f1)
        time_total=$(echo "$response" | cut -d',' -f2)
        if [ "$http_code" = "200" ]; then
            echo "UP (${time_total}s)"
        else
            echo "DOWN (HTTP: $http_code)"
        fi
    done
}

check_backend_balance() {
    echo "=== Load distribution check ==="
    # Send test requests and tally how they are distributed
    # (assumes /test returns a "Server: ..." marker in its body)
    for i in {1..100}; do
        curl -s -H "X-Test-Request: $i" http://localhost/test | \
            grep -o "Server: [^,]*" >> /tmp/lb_test.log
    done
    echo "Request distribution:"
    sort /tmp/lb_test.log | uniq -c
    rm -f /tmp/lb_test.log
}

generate_report() {
    echo "=== Load balancer monitoring report $(date) ==="
    check_nginx_status
    echo
    check_tomcat_servers
    echo
    check_backend_balance
}

case "$1" in
    "nginx") check_nginx_status ;;
    "tomcat") check_tomcat_servers ;;
    "balance") check_backend_balance ;;
    "report") generate_report ;;
    *) echo "Usage: $0 {nginx|tomcat|balance|report}" ;;
esac
7.2 Automatic Failover
#!/bin/bash
# auto-failover.sh
NGINX_CONFIG="/etc/nginx/conf.d/upstream.conf"
TOMCAT_SERVERS=(
    "192.168.1.10:8080"
    "192.168.1.11:8080"
    "192.168.1.12:8080"
)

check_and_update_upstream() {
    local healthy_servers=()
    # Probe each server's health endpoint
    for server in "${TOMCAT_SERVERS[@]}"; do
        if curl -s -f "http://$server/health" > /dev/null; then
            healthy_servers+=("$server")
            echo "✓ $server healthy"
        else
            echo "✗ $server unhealthy"
        fi
    done
    # Never write an empty upstream block (nginx -t would reject it)
    if [ ${#healthy_servers[@]} -eq 0 ]; then
        echo "No healthy servers; keeping the current configuration"
        return
    fi
    # Render the new configuration to a temp file
    local tmp_config
    tmp_config=$(mktemp)
    {
        echo "upstream tomcat_backend {"
        echo "    least_conn;"
        for server in "${healthy_servers[@]}"; do
            echo "    server $server;"
        done
        echo "}"
    } > "$tmp_config"
    # Rewrite and reload only when the healthy set actually changed;
    # comparing files rather than counts also restores servers to the
    # pool once they recover
    if ! cmp -s "$tmp_config" "$NGINX_CONFIG"; then
        echo "Updating Nginx upstream configuration..."
        mv "$tmp_config" "$NGINX_CONFIG"
        nginx -t && nginx -s reload
        echo "Configuration updated; healthy server count: ${#healthy_servers[@]}"
    else
        rm -f "$tmp_config"
    fi
}

# Continuous monitoring loop
while true; do
    check_and_update_upstream
    sleep 30
done
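The risky step in any failover script is the config rewrite, so it pays to make the rendering step a pure function that can be dry-run and diffed against the deployed file before touching a live nginx. A small sketch of that idea (function name hypothetical; addresses reuse the pool above):

```shell
#!/bin/bash
# Dry-run sketch: render the upstream block for a given healthy set
# to stdout so it can be inspected or diffed before any reload.
render_upstream() {
    echo "upstream tomcat_backend {"
    echo "    least_conn;"
    local s
    for s in "$@"; do
        echo "    server $s;"
    done
    echo "}"
}

out=$(render_upstream 192.168.1.10:8080 192.168.1.12:8080)
echo "$out"
```

Separating rendering from deployment also makes the failover logic unit-testable without a live load balancer.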
Summary
After working through this article, you should understand:
- Common load-balancing architectures and patterns
- Detailed Nginx load-balancing configuration
- Advanced HAProxy load-balancing features
- AJP load balancing with Apache httpd + mod_jk
- Tomcat cluster configuration and management
- Implementing health-check mechanisms
- Load balancer monitoring and automatic failover
- Session stickiness and session replication strategies
Together, these techniques keep Tomcat applications highly available and scalable.