2024-07-15

MongoDB最佳实践：从文档设计到性能优化的全面指南

MongoDB作为领先的NoSQL文档数据库，以其灵活的文档模型、强大的查询能力和水平扩展特性，在现代应用开发中占据重要地位。本文将深入探讨MongoDB的核心技术和最佳实践，帮助开发者构建高性能、可扩展的MongoDB应用。

MongoDB核心概念与架构

文档数据模型

MongoDB采用BSON（Binary JSON）格式存储数据，支持丰富的数据类型和嵌套结构：

// 用户文档示例
{
  "_id": ObjectId("507f1f77bcf86cd799439011"),
  "username": "john_doe",
  "email": "john@example.com",
  "profile": {
    "firstName": "John",
    "lastName": "Doe",
    "age": 30,
    "address": {
      "street": "123 Main St",
      "city": "New York",
      "zipCode": "10001"
    }
  },
  "interests": ["technology", "sports", "music"],
  "createdAt": ISODate("2024-07-15T10:30:00Z"),
  "lastLogin": ISODate("2024-07-15T15:45:00Z"),
  "isActive": true,
  "metadata": {
    "source": "web",
    "campaign": "summer2024"
  }
}

// 订单文档示例
{
  "_id": ObjectId("507f1f77bcf86cd799439012"),
  "orderNumber": "ORD-2024-001",
  "customerId": ObjectId("507f1f77bcf86cd799439011"),
  "items": [
    {
      "productId": ObjectId("507f1f77bcf86cd799439013"),
      "name": "Laptop",
      "price": 999.99,
      "quantity": 1,
      "category": "Electronics"
    },
    {
      "productId": ObjectId("507f1f77bcf86cd799439014"),
      "name": "Mouse",
      "price": 29.99,
      "quantity": 2,
      "category": "Accessories"
    }
  ],
  "totalAmount": 1059.97,
  "status": "pending",
  "shippingAddress": {
    "street": "123 Main St",
    "city": "New York",
    "zipCode": "10001",
    "country": "USA"
  },
  "paymentMethod": "credit_card",
  "orderDate": ISODate("2024-07-15T14:30:00Z"),
  "estimatedDelivery": ISODate("2024-07-20T00:00:00Z")
}

MongoDB架构组件

mongod：MongoDB数据库服务进程
mongos：分片集群路由服务
Config Servers：存储集群元数据
Replica Set：副本集提供高可用性
Sharded Cluster：分片集群实现水平扩展

文档设计最佳实践

嵌入式 vs 引用式设计

嵌入式设计（适用场景）

// 博客文章嵌入评论（一对少量关系）
{
  "_id": ObjectId("..."),
  "title": "MongoDB最佳实践",
  "content": "文章内容...",
  "author": {
    "name": "张三",
    "email": "zhangsan@example.com"
  },
  "comments": [
    {
      "_id": ObjectId("..."),
      "author": "李四",
      "content": "很好的文章",
      "createdAt": ISODate("2024-07-15T10:00:00Z")
    },
    {
      "_id": ObjectId("..."),
      "author": "王五",
      "content": "学到了很多",
      "createdAt": ISODate("2024-07-15T11:00:00Z")
    }
  ],
  "tags": ["MongoDB", "NoSQL", "数据库"],
  "publishedAt": ISODate("2024-07-15T09:00:00Z")
}

// 用户嵌入地址信息（一对一关系）
{
  "_id": ObjectId("..."),
  "username": "john_doe",
  "email": "john@example.com",
  "addresses": [
    {
      "type": "home",
      "street": "123 Main St",
      "city": "New York",
      "zipCode": "10001",
      "isDefault": true
    },
    {
      "type": "work",
      "street": "456 Business Ave",
      "city": "New York",
      "zipCode": "10002",
      "isDefault": false
    }
  ]
}

引用式设计（适用场景）

// 用户文档
{
  "_id": ObjectId("507f1f77bcf86cd799439011"),
  "username": "john_doe",
  "email": "john@example.com",
  "profile": {
    "firstName": "John",
    "lastName": "Doe"
  }
}

// 订单文档（引用用户ID）
{
  "_id": ObjectId("507f1f77bcf86cd799439012"),
  "orderNumber": "ORD-2024-001",
  "customerId": ObjectId("507f1f77bcf86cd799439011"), // 引用用户
  "items": [...],
  "totalAmount": 1059.97,
  "status": "pending"
}

// 产品文档
{
  "_id": ObjectId("507f1f77bcf86cd799439013"),
  "name": "Laptop",
  "description": "高性能笔记本电脑",
  "price": 999.99,
  "category": "Electronics",
  "inventory": 50
}

// 订单项文档（多对多关系）
{
  "_id": ObjectId("507f1f77bcf86cd799439015"),
  "orderId": ObjectId("507f1f77bcf86cd799439012"),
  "productId": ObjectId("507f1f77bcf86cd799439013"),
  "quantity": 1,
  "unitPrice": 999.99,
  "totalPrice": 999.99
}

文档设计原则

原子性考虑：相关数据放在同一文档中
查询模式优化：根据应用查询模式设计文档结构
文档大小限制：单个文档不超过16MB
避免深度嵌套：嵌套层级不宜过深
数组大小控制：避免无限增长的数组

索引设计与优化

索引类型与创建

// 1. 单字段索引
db.users.createIndex({ "email": 1 })  // 升序
db.users.createIndex({ "createdAt": -1 })  // 降序

// 2. 复合索引
db.orders.createIndex({ 
  "customerId": 1, 
  "status": 1, 
  "orderDate": -1 
})

// 3. 多键索引（数组字段）
db.products.createIndex({ "tags": 1 })

// 4. 文本索引
db.articles.createIndex({ 
  "title": "text", 
  "content": "text" 
})

// 5. 地理空间索引
db.locations.createIndex({ "coordinates": "2dsphere" })

// 6. 哈希索引
db.users.createIndex({ "userId": "hashed" })

// 7. 部分索引（条件索引）
db.orders.createIndex(
  { "customerId": 1, "orderDate": -1 },
  { partialFilterExpression: { "status": "active" } }
)

// 8. 稀疏索引
db.users.createIndex(
  { "phoneNumber": 1 },
  { sparse: true }
)

// 9. 唯一索引
db.users.createIndex(
  { "email": 1 },
  { unique: true }
)

// 10. TTL索引（自动过期）
db.sessions.createIndex(
  { "createdAt": 1 },
  { expireAfterSeconds: 3600 }  // 1小时后过期
)

索引优化策略

// 查询计划分析
db.orders.find({ 
  "customerId": ObjectId("..."), 
  "status": "pending" 
}).explain("executionStats")

// 索引使用统计
db.orders.aggregate([
  { $indexStats: {} }
])

// 查找未使用的索引
db.runCommand({ "collStats": "orders", "indexDetails": true })

// 索引大小查询
db.orders.totalIndexSize()

// 重建索引
db.orders.reIndex()

// 后台创建索引
db.orders.createIndex(
  { "customerId": 1, "orderDate": -1 },
  { background: true }
)

复合索引设计原则

ESR规则：Equality（等值查询）、Sort（排序）、Range（范围查询）
选择性原则：高选择性字段放在前面
查询覆盖：尽可能使用覆盖索引
索引前缀：利用复合索引的前缀特性

// ESR规则示例
// 查询：等值查询status，按orderDate排序，范围查询totalAmount
db.orders.createIndex({ 
  "status": 1,        // Equality
  "orderDate": -1,    // Sort
  "totalAmount": 1    // Range
})

// 对应查询
db.orders.find({ 
  "status": "pending",
  "totalAmount": { $gte: 100, $lte: 1000 }
}).sort({ "orderDate": -1 })

查询优化技巧

高效查询模式

// 1. 使用投影减少网络传输
db.users.find(
  { "status": "active" },
  { "username": 1, "email": 1, "_id": 0 }
)

// 2. 限制返回结果数量
db.orders.find({ "status": "pending" })
  .limit(20)
  .skip(0)

// 3. 使用hint强制使用特定索引
db.orders.find({ "customerId": ObjectId("...") })
  .hint({ "customerId": 1, "orderDate": -1 })

// 4. 批量操作
db.orders.insertMany([
  { /* 文档1 */ },
  { /* 文档2 */ },
  // ...
], { ordered: false })  // 无序插入，提高性能

// 5. 使用$in替代多个$or
// 低效
db.products.find({
  $or: [
    { "category": "Electronics" },
    { "category": "Books" },
    { "category": "Clothing" }
  ]
})

// 高效
db.products.find({
  "category": { $in: ["Electronics", "Books", "Clothing"] }
})

// 6. 避免否定查询
// 低效
db.users.find({ "status": { $ne: "inactive" } })

// 高效
db.users.find({ "status": { $in: ["active", "pending", "suspended"] } })

// 7. 使用正则表达式优化
// 低效
db.users.find({ "email": /.*@gmail.com$/ })

// 高效（使用索引）
db.users.find({ "email": /^[^@]+@gmail\.com$/ })

聚合管道优化

// 订单统计聚合示例
db.orders.aggregate([
  // 1. 尽早过滤数据
  {
    $match: {
      "orderDate": {
        $gte: ISODate("2024-01-01"),
        $lt: ISODate("2024-08-01")
      },
      "status": "completed"
    }
  },
  
  // 2. 尽早投影，减少数据传输
  {
    $project: {
      "customerId": 1,
      "totalAmount": 1,
      "orderDate": 1,
      "items.category": 1
    }
  },
  
  // 3. 按客户分组统计
  {
    $group: {
      "_id": "$customerId",
      "totalOrders": { $sum: 1 },
      "totalSpent": { $sum: "$totalAmount" },
      "avgOrderValue": { $avg: "$totalAmount" },
      "firstOrder": { $min: "$orderDate" },
      "lastOrder": { $max: "$orderDate" }
    }
  },
  
  // 4. 添加计算字段
  {
    $addFields: {
      "customerLifetime": {
        $divide: [
          { $subtract: ["$lastOrder", "$firstOrder"] },
          1000 * 60 * 60 * 24  // 转换为天数
        ]
      }
    }
  },
  
  // 5. 排序
  {
    $sort: { "totalSpent": -1 }
  },
  
  // 6. 限制结果
  {
    $limit: 100
  },
  
  // 7. 关联用户信息
  {
    $lookup: {
      "from": "users",
      "localField": "_id",
      "foreignField": "_id",
      "as": "userInfo",
      "pipeline": [
        {
          $project: {
            "username": 1,
            "email": 1,
            "profile.firstName": 1,
            "profile.lastName": 1
          }
        }
      ]
    }
  },
  
  // 8. 展开用户信息
  {
    $unwind: "$userInfo"
  },
  
  // 9. 最终投影
  {
    $project: {
      "customerId": "$_id",
      "username": "$userInfo.username",
      "email": "$userInfo.email",
      "fullName": {
        $concat: [
          "$userInfo.profile.firstName",
          " ",
          "$userInfo.profile.lastName"
        ]
      },
      "totalOrders": 1,
      "totalSpent": 1,
      "avgOrderValue": { $round: ["$avgOrderValue", 2] },
      "customerLifetime": { $round: ["$customerLifetime", 0] },
      "_id": 0
    }
  }
])

// 使用索引提示优化聚合
db.orders.aggregate([
  { $match: { "status": "completed" } }
  // ... 其他阶段
], {
  hint: { "status": 1, "orderDate": -1 }
})

// 允许磁盘使用（处理大数据集）
db.orders.aggregate([
  // ... 聚合管道
], {
  allowDiskUse: true
})

分页查询优化

// 传统分页（性能随页数增加而下降）
function getOrdersPage(pageNumber, pageSize) {
  return db.orders.find()
    .sort({ "orderDate": -1 })
    .skip(pageNumber * pageSize)
    .limit(pageSize)
}

// 基于游标的分页（性能稳定）
function getOrdersPageCursor(lastOrderDate, lastOrderId, pageSize) {
  let query = {};
  
  if (lastOrderDate && lastOrderId) {
    query = {
      $or: [
        { "orderDate": { $lt: lastOrderDate } },
        {
          "orderDate": lastOrderDate,
          "_id": { $lt: lastOrderId }
        }
      ]
    };
  }
  
  return db.orders.find(query)
    .sort({ "orderDate": -1, "_id": -1 })
    .limit(pageSize)
}

// 范围查询分页
function getOrdersByDateRange(startDate, endDate, lastOrderId, pageSize) {
  let query = {
    "orderDate": {
      $gte: startDate,
      $lte: endDate
    }
  };
  
  if (lastOrderId) {
    query["_id"] = { $gt: lastOrderId };
  }
  
  return db.orders.find(query)
    .sort({ "_id": 1 })
    .limit(pageSize)
}

副本集配置与管理

副本集部署

// 初始化副本集
rs.initiate({
  "_id": "myReplicaSet",
  "members": [
    {
      "_id": 0,
      "host": "mongodb1.example.com:27017",
      "priority": 2
    },
    {
      "_id": 1,
      "host": "mongodb2.example.com:27017",
      "priority": 1
    },
    {
      "_id": 2,
      "host": "mongodb3.example.com:27017",
      "priority": 1
    }
  ]
})

// 添加副本集成员
rs.add("mongodb4.example.com:27017")

// 添加仲裁者
rs.addArb("mongodb-arbiter.example.com:27017")

// 设置成员优先级
var config = rs.conf()
config.members[1].priority = 0.5
rs.reconfig(config)

// 设置隐藏成员（用于备份）
var config = rs.conf()
config.members[2].hidden = true
config.members[2].priority = 0
rs.reconfig(config)

// 设置延迟成员（灾难恢复）
var config = rs.conf()
config.members[3].slaveDelay = 3600  // 延迟1小时
config.members[3].priority = 0
config.members[3].hidden = true
rs.reconfig(config)

读写分离配置

// Java驱动示例
MongoClientSettings settings = MongoClientSettings.builder()
    .applyConnectionString(new ConnectionString(
        "mongodb://mongodb1.example.com:27017,mongodb2.example.com:27017,mongodb3.example.com:27017/mydb?replicaSet=myReplicaSet"
    ))
    .readPreference(ReadPreference.secondaryPreferred())
    .readConcern(ReadConcern.MAJORITY)
    .writeConcern(WriteConcern.MAJORITY)
    .build();

MongoClient mongoClient = MongoClients.create(settings);

// 针对特定操作设置读偏好
MongoCollection<Document> collection = database.getCollection("orders")
    .withReadPreference(ReadPreference.secondary());

// 读取最新数据时使用主节点
MongoCollection<Document> primaryCollection = database.getCollection("orders")
    .withReadPreference(ReadPreference.primary());

故障转移测试

# 模拟主节点故障
sudo systemctl stop mongod  # 在主节点执行

# 查看副本集状态
rs.status()

# 强制选举新主节点
rs.stepDown(60)  # 当前主节点下台60秒

# 查看oplog状态
db.oplog.rs.find().sort({"ts": -1}).limit(5)

# 检查副本集延迟
rs.printSlaveReplicationInfo()

分片集群架构

分片集群部署

# 1. 启动配置服务器（副本集）
mongod --configsvr --replSet configReplSet --port 27019 --dbpath /data/configdb

# 2. 初始化配置服务器副本集
mongo --port 27019
rs.initiate({
  "_id": "configReplSet",
  "configsvr": true,
  "members": [
    { "_id": 0, "host": "config1.example.com:27019" },
    { "_id": 1, "host": "config2.example.com:27019" },
    { "_id": 2, "host": "config3.example.com:27019" }
  ]
})

# 3. 启动分片服务器（每个分片都是副本集）
mongod --shardsvr --replSet shard1ReplSet --port 27018 --dbpath /data/shard1

# 4. 启动mongos路由服务
mongos --configdb configReplSet/config1.example.com:27019,config2.example.com:27019,config3.example.com:27019 --port 27017

# 5. 添加分片
mongo --port 27017
sh.addShard("shard1ReplSet/shard1-1.example.com:27018,shard1-2.example.com:27018,shard1-3.example.com:27018")
sh.addShard("shard2ReplSet/shard2-1.example.com:27018,shard2-2.example.com:27018,shard2-3.example.com:27018")

分片键设计

// 1. 启用数据库分片
sh.enableSharding("ecommerce")

// 2. 选择合适的分片键

// 用户ID分片（适合用户相关查询）
sh.shardCollection("ecommerce.users", { "_id": 1 })

// 复合分片键（提高查询性能）
sh.shardCollection("ecommerce.orders", { 
  "customerId": 1, 
  "orderDate": 1 
})

// 哈希分片（均匀分布）
sh.shardCollection("ecommerce.products", { "_id": "hashed" })

// 基于地理位置的分片
sh.shardCollection("ecommerce.stores", { 
  "region": 1, 
  "storeId": 1 
})

// 基于时间的分片（时间序列数据）
sh.shardCollection("analytics.events", { 
  "timestamp": 1, 
  "userId": 1 
})

分片键选择原则

高基数：分片键值应该有足够的唯一性
查询模式：分片键应该包含在大多数查询中
写入分布：避免热点写入
范围查询：支持高效的范围查询
不可变性：分片键值不应该频繁变更

// 分片状态查询
sh.status()

// 查看集合分片信息
db.orders.getShardDistribution()

// 查看块分布
sh.getBalancerState()

// 手动分割块
sh.splitAt("ecommerce.orders", { "customerId": ObjectId("...") })

// 移动块
sh.moveChunk("ecommerce.orders", 
  { "customerId": ObjectId("...") }, 
  "shard2ReplSet"
)

// 平衡器控制
sh.stopBalancer()
sh.startBalancer()

性能监控与调优

性能监控指标

// 1. 数据库统计信息
db.stats()
db.serverStatus()

// 2. 集合统计信息
db.orders.stats()

// 3. 索引统计信息
db.orders.aggregate([{ $indexStats: {} }])

// 4. 当前操作监控
db.currentOp()

// 5. 慢查询分析
db.setProfilingLevel(2, { slowms: 100 })  // 记录超过100ms的操作
db.system.profile.find().sort({ "ts": -1 }).limit(5)

// 6. 连接状态
db.serverStatus().connections

// 7. 内存使用情况
db.serverStatus().mem

// 8. 网络统计
db.serverStatus().network

// 9. 锁统计
db.serverStatus().locks

// 10. WiredTiger存储引擎统计
db.serverStatus().wiredTiger

性能调优配置

# mongod.conf 配置文件
storage:
  dbPath: /data/db
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      cacheSizeGB: 8  # 设置缓存大小
      journalCompressor: snappy
      directoryForIndexes: true
    collectionConfig:
      blockCompressor: snappy
    indexConfig:
      prefixCompression: true

systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log
  logRotate: reopen

net:
  port: 27017
  bindIp: 0.0.0.0
  maxIncomingConnections: 1000

processManagement:
  fork: true
  pidFilePath: /var/run/mongodb/mongod.pid

operationProfiling:
  slowOpThresholdMs: 100
  mode: slowOp

replication:
  replSetName: myReplicaSet
  oplogSizeMB: 2048

sharding:
  clusterRole: shardsvr

Java应用性能优化

@Configuration
public class MongoConfig {
    
    @Bean
    public MongoClient mongoClient() {
        ConnectionString connectionString = new ConnectionString(
            "mongodb://mongodb1:27017,mongodb2:27017,mongodb3:27017/mydb?replicaSet=myReplicaSet"
        );
        
        MongoClientSettings settings = MongoClientSettings.builder()
            .applyConnectionString(connectionString)
            .applyToConnectionPoolSettings(builder -> 
                builder.maxSize(100)  // 最大连接数
                       .minSize(10)   // 最小连接数
                       .maxWaitTime(2, TimeUnit.SECONDS)
                       .maxConnectionLifeTime(30, TimeUnit.MINUTES)
                       .maxConnectionIdleTime(10, TimeUnit.MINUTES)
            )
            .applyToSocketSettings(builder -> 
                builder.connectTimeout(2, TimeUnit.SECONDS)
                       .readTimeout(5, TimeUnit.SECONDS)
            )
            .readPreference(ReadPreference.secondaryPreferred())
            .readConcern(ReadConcern.MAJORITY)
            .writeConcern(WriteConcern.MAJORITY.withWTimeout(5, TimeUnit.SECONDS))
            .retryWrites(true)
            .retryReads(true)
            .build();
        
        return MongoClients.create(settings);
    }
    
    @Bean
    public MongoTemplate mongoTemplate(MongoClient mongoClient) {
        return new MongoTemplate(mongoClient, "mydb");
    }
}

// 批量操作优化
@Service
public class OrderService {
    
    @Autowired
    private MongoTemplate mongoTemplate;
    
    /**
     * 批量插入订单
     */
    public void batchInsertOrders(List<Order> orders) {
        BulkOperations bulkOps = mongoTemplate.bulkOps(
            BulkOperations.BulkMode.UNORDERED, Order.class
        );
        
        for (Order order : orders) {
            bulkOps.insert(order);
        }
        
        BulkWriteResult result = bulkOps.execute();
        log.info("Batch insert completed: {} inserted", result.getInsertedCount());
    }
    
    /**
     * 批量更新订单状态
     */
    public void batchUpdateOrderStatus(List<String> orderIds, String newStatus) {
        BulkOperations bulkOps = mongoTemplate.bulkOps(
            BulkOperations.BulkMode.UNORDERED, Order.class
        );
        
        for (String orderId : orderIds) {
            Query query = Query.query(Criteria.where("id").is(orderId));
            Update update = Update.update("status", newStatus)
                                 .set("updatedAt", new Date());
            bulkOps.updateOne(query, update);
        }
        
        BulkWriteResult result = bulkOps.execute();
        log.info("Batch update completed: {} modified", result.getModifiedCount());
    }
    
    /**
     * 使用聚合查询优化复杂统计
     */
    public List<CustomerOrderSummary> getCustomerOrderSummary(Date startDate, Date endDate) {
        Aggregation aggregation = Aggregation.newAggregation(
            // 匹配时间范围
            Aggregation.match(Criteria.where("orderDate")
                .gte(startDate).lte(endDate)
                .and("status").is("completed")
            ),
            
            // 按客户分组统计
            Aggregation.group("customerId")
                .count().as("totalOrders")
                .sum("totalAmount").as("totalSpent")
                .avg("totalAmount").as("avgOrderValue")
                .min("orderDate").as("firstOrder")
                .max("orderDate").as("lastOrder"),
            
            // 关联客户信息
            Aggregation.lookup("users", "_id", "_id", "customerInfo"),
            
            // 展开客户信息
            Aggregation.unwind("customerInfo"),
            
            // 投影最终结果
            Aggregation.project()
                .and("_id").as("customerId")
                .and("customerInfo.username").as("username")
                .and("customerInfo.email").as("email")
                .and("totalOrders").as("totalOrders")
                .and("totalSpent").as("totalSpent")
                .and("avgOrderValue").as("avgOrderValue")
                .and("firstOrder").as("firstOrder")
                .and("lastOrder").as("lastOrder"),
            
            // 排序
            Aggregation.sort(Sort.Direction.DESC, "totalSpent"),
            
            // 限制结果
            Aggregation.limit(100)
        );
        
        AggregationResults<CustomerOrderSummary> results = mongoTemplate.aggregate(
            aggregation, "orders", CustomerOrderSummary.class
        );
        
        return results.getMappedResults();
    }
    
    /**
     * 使用投影优化查询
     */
    public List<OrderSummary> getOrderSummaries(String customerId, int page, int size) {
        Query query = Query.query(Criteria.where("customerId").is(customerId))
            .with(Sort.by(Sort.Direction.DESC, "orderDate"))
            .skip(page * size)
            .limit(size);
        
        // 只查询需要的字段
        query.fields()
            .include("orderNumber")
            .include("totalAmount")
            .include("status")
            .include("orderDate")
            .include("items.name")
            .include("items.quantity")
            .exclude("_id");
        
        return mongoTemplate.find(query, OrderSummary.class, "orders");
    }
}

// 数据传输对象
@Data
public class CustomerOrderSummary {
    private String customerId;
    private String username;
    private String email;
    private int totalOrders;
    private BigDecimal totalSpent;
    private BigDecimal avgOrderValue;
    private Date firstOrder;
    private Date lastOrder;
}

@Data
public class OrderSummary {
    private String orderNumber;
    private BigDecimal totalAmount;
    private String status;
    private Date orderDate;
    private List<OrderItemSummary> items;
}

@Data
public class OrderItemSummary {
    private String name;
    private int quantity;
}

数据备份与恢复

备份策略

# 1. mongodump 备份
mongodump --host mongodb1.example.com:27017 \
          --db ecommerce \
          --out /backup/mongodb/$(date +%Y%m%d)

# 2. 副本集备份
mongodump --host mongodb-secondary.example.com:27017 \
          --db ecommerce \
          --out /backup/mongodb/$(date +%Y%m%d)

# 3. 分片集群备份
mongodump --host mongos.example.com:27017 \
          --db ecommerce \
          --out /backup/mongodb/$(date +%Y%m%d)

# 4. 压缩备份
mongodump --host mongodb1.example.com:27017 \
          --db ecommerce \
          --gzip \
          --out /backup/mongodb/$(date +%Y%m%d)

# 5. 增量备份（基于oplog）
mongodump --host mongodb1.example.com:27017 \
          --db local \
          --collection oplog.rs \
          --query '{"ts":{"$gt":{"$timestamp":{"t":1627776000,"i":1}}}}' \
          --out /backup/mongodb/oplog/$(date +%Y%m%d)

恢复操作

# 1. 完整恢复
mongorestore --host mongodb1.example.com:27017 \
             --db ecommerce \
             /backup/mongodb/20240715/ecommerce

# 2. 选择性恢复
mongorestore --host mongodb1.example.com:27017 \
             --db ecommerce \
             --collection orders \
             /backup/mongodb/20240715/ecommerce/orders.bson

# 3. 恢复到不同数据库
mongorestore --host mongodb1.example.com:27017 \
             --db ecommerce_restore \
             /backup/mongodb/20240715/ecommerce

# 4. 压缩备份恢复
mongorestore --host mongodb1.example.com:27017 \
             --gzip \
             /backup/mongodb/20240715

自动化备份脚本

#!/bin/bash
# mongodb_backup.sh

# 配置变量
MONGO_HOST="mongodb1.example.com:27017"
BACKUP_DIR="/backup/mongodb"
RETENTION_DAYS=30
DATE=$(date +%Y%m%d_%H%M%S)
LOG_FILE="/var/log/mongodb_backup.log"

# 创建备份目录
mkdir -p "$BACKUP_DIR/$DATE"

# 记录开始时间
echo "$(date): Starting MongoDB backup" >> "$LOG_FILE"

# 执行备份
mongodump --host "$MONGO_HOST" \
          --gzip \
          --out "$BACKUP_DIR/$DATE" \
          2>> "$LOG_FILE"

if [ $? -eq 0 ]; then
    echo "$(date): Backup completed successfully" >> "$LOG_FILE"
    
    # 压缩备份文件
    cd "$BACKUP_DIR"
    tar -czf "mongodb_backup_$DATE.tar.gz" "$DATE"
    rm -rf "$DATE"
    
    # 清理旧备份
    find "$BACKUP_DIR" -name "mongodb_backup_*.tar.gz" -mtime +$RETENTION_DAYS -delete
    
    echo "$(date): Backup cleanup completed" >> "$LOG_FILE"
else
    echo "$(date): Backup failed" >> "$LOG_FILE"
    exit 1
fi

# 验证备份文件
if [ -f "$BACKUP_DIR/mongodb_backup_$DATE.tar.gz" ]; then
    BACKUP_SIZE=$(du -h "$BACKUP_DIR/mongodb_backup_$DATE.tar.gz" | cut -f1)
    echo "$(date): Backup file created: mongodb_backup_$DATE.tar.gz ($BACKUP_SIZE)" >> "$LOG_FILE"
else
    echo "$(date): Backup file not found" >> "$LOG_FILE"
    exit 1
fi

总结与最佳实践

MongoDB应用核心原则

文档设计优化：根据查询模式设计文档结构，合理使用嵌入和引用
索引策略：创建高效索引，定期分析和优化索引使用
查询优化：使用投影、限制结果集、避免全表扫描
聚合优化：合理设计聚合管道，尽早过滤和投影数据
连接管理：使用连接池，合理配置连接参数

高可用架构建议

副本集部署：至少3个节点，配置合理的优先级和读偏好
分片集群：根据数据量和查询模式选择合适的分片策略
监控告警：建立完善的监控体系，及时发现性能问题
备份恢复：制定定期备份策略，测试恢复流程
容量规划：根据业务增长预测，提前规划扩容方案

性能优化要点

硬件配置：使用SSD存储，充足的内存和CPU资源
网络优化：减少网络延迟，使用高速网络连接
存储引擎：合理配置WiredTiger参数，启用压缩
应用层优化：批量操作、连接池管理、查询缓存

常见问题解决方案

慢查询优化：分析执行计划，创建合适索引
内存不足：调整缓存大小，优化文档结构
写入瓶颈：使用分片，优化写入模式
读取延迟：配置读偏好，使用索引覆盖查询

MongoDB作为现代应用的重要数据存储解决方案，其灵活性和扩展性为开发者提供了强大的支持。通过合理的架构设计、性能优化和运维管理，MongoDB能够满足各种复杂的业务需求，为应用提供高性能、高可用的数据服务。

编外计划 - 日志

To be or not to be,--that is question.