MongoDB as a Distributed Document Store: Backup and Restore in Practice

lbyd0 2020-11-17

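  Why back up?

A backup is one form of data redundancy: it lets us keep data loss to a minimum when something goes wrong. The replica sets we built for MongoDB earlier also provide redundancy, but that redundancy only keeps the data service available through non-human failures such as system crashes or service errors. It cannot protect against human mistakes, because a mistaken write or delete is replicated to every member. To keep data safe and limit losses, we must back up the database periodically.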

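  Common backup methods

[Figure: common MongoDB backup strategies]

Note: the figure above summarizes the backup strategies commonly used with MongoDB. A logical backup exports the data held in the database, usually with a dedicated export tool, and a matching import tool restores it. A physical backup, put simply, archives the database files; to restore, unpack the files back into place. A faster form of physical backup is a snapshot, which preserves the data in the state it had when the snapshot was taken; to recover, restore the snapshot.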

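Logical vs. physical backup in MongoDB

[Figure: comparison of logical and physical backups in MongoDB]

Note: as the figure above shows, physical backups are generally faster than logical ones, both to take and to restore. A logical backup reads the data through the database interface and writes it out in the target format, whereas a physical backup simply archives the data files without reading documents one by one and writing them to another file. Skipping that per-document read and write is what makes physical backup faster, and restore is faster for the same reason.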

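  Logical backup tools for MongoDB

MongoDB has two sets of logical backup tools. The first is mongodump/mongorestore. It writes BSON, a binary format that cannot normally be opened in a text editor and is hard for humans to read, but that produces smaller files. Restoring such a dump depends on the MongoDB version: the BSON output differs slightly between versions, so a restore may fail when the versions differ. The second set is mongoexport/mongoimport. It writes JSON, which can be opened directly in a text editor and is easy to read, but is larger than the BSON equivalent; restoring it does not depend on the version. For a cross-version backup, therefore, check compatibility first: if the versions are compatible, use mongodump/mongorestore; if not, mongoexport/mongoimport is the suggested fallback. Note that JSON, readable and portable as it is, carries only the data: indexes, accounts, and other such metadata are not preserved. MongoDB's documentation also cautions that these tools may not preserve every BSON data type faithfully, so they are better suited to moving data than to full production backups.

Back up data with mongodump

Insert data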

> use testdb
switched to db testdb
> for(i=1;i<=1000;i++) db.test.insert({id:i,name:"test"+i,age:(i%120),classes:(i%25)})
WriteResult({ "nInserted" : 1 })
> show tables
test
> db.test.findOne()
{
 "_id" : ObjectId("5fb130da012870b3c8e3c4ad"),
 "id" : 1,
 "name" : "test1",
 "age" : 1,
 "classes" : 1
}
> db.test.count()
1000
>

Back up all databases

[root@node11 ~]# mongodump -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -o ./node12_mongodb_full_backup
2020-11-15T21:47:45.439+0800 writing admin.system.users to node12_mongodb_full_backup/admin/system.users.bson
2020-11-15T21:47:45.442+0800 done dumping admin.system.users (4 documents)
2020-11-15T21:47:45.443+0800 writing admin.system.version to node12_mongodb_full_backup/admin/system.version.bson
2020-11-15T21:47:45.447+0800 done dumping admin.system.version (2 documents)
2020-11-15T21:47:45.448+0800 writing testdb.test to node12_mongodb_full_backup/testdb/test.bson
2020-11-15T21:47:45.454+0800 done dumping testdb.test (1000 documents)
[root@node11 ~]# ls
node12_mongodb_full_backup
[root@node11 ~]# ll node12_mongodb_full_backup/
total 0
drwxr-xr-x 2 root root 128 Nov 15 21:47 admin
drwxr-xr-x 2 root root 49 Nov 15 21:47 testdb
[root@node11 ~]# tree node12_mongodb_full_backup/
node12_mongodb_full_backup/
├── admin
│ ├── system.users.bson
│ ├── system.users.metadata.json
│ ├── system.version.bson
│ └── system.version.metadata.json
└── testdb
 ├── test.bson
 └── test.metadata.json
 
2 directories, 6 files
[root@node11 ~]#

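Note: -u specifies the user, -p that user's password, -h the database address, --authenticationDatabase the database against which the user and password are authenticated, and -o the directory in which to store the backup files.

Back up only the testdb database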

[root@node11 ~]# mongodump -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb -o ./node12_testdb
2020-11-15T21:53:36.523+0800 writing testdb.test to node12_testdb/testdb/test.bson
2020-11-15T21:53:36.526+0800 done dumping testdb.test (1000 documents)
[root@node11 ~]# tree ./node12_testdb
./node12_testdb
└── testdb
 ├── test.bson
 └── test.metadata.json
 
1 directory, 2 files
[root@node11 ~]#

Note: -d specifies the database to back up.

Back up only the test collection in testdb

[root@node11 ~]# mongodump -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb -c test -o ./node12_testdb_test-collection
2020-11-15T21:55:48.217+0800 writing testdb.test to node12_testdb_test-collection/testdb/test.bson
2020-11-15T21:55:48.219+0800 done dumping testdb.test (1000 documents)
[root@node11 ~]# tree ./node12_testdb_test-collection
./node12_testdb_test-collection
└── testdb
 ├── test.bson
 └── test.metadata.json
 
1 directory, 2 files
[root@node11 ~]#

Note: -c specifies the collection to back up.

Back up testdb with compression

[root@node11 ~]# mongodump -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb --gzip -o ./node12_mongodb_testdb-gzip
2020-11-15T22:00:52.268+0800 writing testdb.test to node12_mongodb_testdb-gzip/testdb/test.bson.gz
2020-11-15T22:00:52.273+0800 done dumping testdb.test (1000 documents)
[root@node11 ~]# tree ./node12_mongodb_testdb-gzip
./node12_mongodb_testdb-gzip
└── testdb
 ├── test.bson.gz
 └── test.metadata.json.gz
 
1 directory, 2 files
[root@node11 ~]#

Note: to compress the backup, just add the --gzip option; the backup files then end in .gz.

Back up the test collection in testdb with compression

[root@node11 ~]# mongodump -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb -c test --gzip -o ./node12_mongodb_testdb-test-gzip
2020-11-15T22:01:31.492+0800 writing testdb.test to node12_mongodb_testdb-test-gzip/testdb/test.bson.gz
2020-11-15T22:01:31.500+0800 done dumping testdb.test (1000 documents)
[root@node11 ~]# tree ./node12_mongodb_testdb-test-gzip
./node12_mongodb_testdb-test-gzip
└── testdb
 ├── test.bson.gz
 └── test.metadata.json.gz
 
1 directory, 2 files
[root@node11 ~]#
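
The mongodump invocations above differ only in which of -d, -c and --gzip are present. The sketch below assembles the command line for each scope and prints it instead of running it, so nothing here contacts a server. build_dump_cmd, DUMP_HOST and DUMP_USER are hypothetical names introduced for this example. It also leaves out -p on the assumption that mongodump then prompts for the password, which keeps the password out of shell history; check that behaviour against your tool version.

```bash
#!/usr/bin/env bash
# Hypothetical settings for this example; the values mirror the transcript.
DUMP_HOST="192.168.0.52:27017"
DUMP_USER="tom"

# build_dump_cmd OUT_DIR [DB] [COLLECTION] [gzip]
# Prints a mongodump command line for the requested scope (dry run only).
build_dump_cmd() {
  local out_dir="$1" db="${2:-}" coll="${3:-}" gzip="${4:-}"
  local cmd=(mongodump -u "$DUMP_USER" -h "$DUMP_HOST" --authenticationDatabase admin)
  if [ -n "$db" ]; then cmd+=(-d "$db"); fi           # limit to one database
  if [ -n "$coll" ]; then cmd+=(-c "$coll"); fi       # limit to one collection
  if [ "$gzip" = "gzip" ]; then cmd+=(--gzip); fi     # compress the output files
  cmd+=(-o "$out_dir")
  echo "${cmd[*]}"
}

build_dump_cmd ./node12_mongodb_full_backup            # all databases
build_dump_cmd ./node12_testdb testdb                  # one database
build_dump_cmd ./node12_testdb_gzip testdb test gzip   # one collection, compressed
```

Printing the command first makes it easy to review the scope before anything is written to disk.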

Restore data with mongorestore

Drop testdb on node12

> db
testdb
> db.dropDatabase()
{ "dropped" : "testdb", "ok" : 1 }
> show dbs
admin 0.000GB
config 0.000GB
local 0.000GB
>

Restore all databases from the full backup

[root@node11 ~]# mongorestore -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin --drop ./node12_mongodb_full_backup
2020-11-15T22:07:35.465+0800 preparing collections to restore from
2020-11-15T22:07:35.467+0800 reading metadata for testdb.test from node12_mongodb_full_backup/testdb/test.metadata.json
2020-11-15T22:07:35.475+0800 restoring testdb.test from node12_mongodb_full_backup/testdb/test.bson
2020-11-15T22:07:35.486+0800 no indexes to restore
2020-11-15T22:07:35.486+0800 finished restoring testdb.test (1000 documents, 0 failures)
2020-11-15T22:07:35.486+0800 restoring users from node12_mongodb_full_backup/admin/system.users.bson
2020-11-15T22:07:35.528+0800 1000 document(s) restored successfully. 0 document(s) failed to restore.
[root@node11 ~]#

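Note: with --drop, each collection that already exists on the target is dropped before it is restored from the dump, so that the restored data matches the backup.

Verify: log in to 192.168.0.52:27017 and check whether testdb has been restored.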

[root@node11 ~]# mongo -utom -p123456 192.168.0.52:27017/admin
MongoDB shell version v4.4.1
connecting to: mongodb://192.168.0.52:27017/admin?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("af96cb64-a2a4-4d59-b60a-86ccbbe77e3e") }
MongoDB server version: 4.4.1
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see
 https://docs.mongodb.com/
Questions? Try the MongoDB Developer Community Forums
 https://community.mongodb.com
---
The server generated these startup warnings when booting:
 2020-11-15T20:42:23.774+08:00: ***** SERVER RESTARTED *****
 2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/enabled is 'always'. We suggest setting it to 'never'
 2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/defrag is 'always'. We suggest setting it to 'never'
---
---
 Enable MongoDB's free cloud-based monitoring service, which will then receive and display
 metrics about your deployment (disk utilization, CPU, operation statistics, etc).
 
 The monitoring data will be available on a MongoDB website with a unique URL accessible to you
 and anyone you share the URL with. MongoDB may use this information to make product
 improvements and to suggest MongoDB products and deployment options to you.
 
 To enable free monitoring, run the following command: db.enableFreeMonitoring()
 To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
> show dbs
admin 0.000GB
config 0.000GB
local 0.000GB
testdb 0.000GB
> use testdb
switched to db testdb
> show collections
test
> db.test.count()
1000
> db.test.findOne()
{
 "_id" : ObjectId("5fb130da012870b3c8e3c4ad"),
 "id" : 1,
 "name" : "test1",
 "age" : 1,
 "classes" : 1
}
>

Restore a single database

Drop the testdb database

> show dbs
admin 0.000GB
config 0.000GB
local 0.000GB
testdb 0.000GB
> use testdb
switched to db testdb
> db.dropDatabase()
{ "dropped" : "testdb", "ok" : 1 }
> show dbs
admin 0.000GB
config 0.000GB
local 0.000GB
>

Restore the testdb database with mongorestore

[root@node11 ~]# mongorestore -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb --drop ./node12_testdb/testdb/
2020-11-15T22:29:03.718+0800 The --db and --collection flags are deprecated for this use-case; please use --nsInclude instead, i.e. with --nsInclude=${DATABASE}.${COLLECTION}
2020-11-15T22:29:03.718+0800 building a list of collections to restore from node12_testdb/testdb dir
2020-11-15T22:29:03.719+0800 reading metadata for testdb.test from node12_testdb/testdb/test.metadata.json
2020-11-15T22:29:03.736+0800 restoring testdb.test from node12_testdb/testdb/test.bson
2020-11-15T22:29:03.755+0800 no indexes to restore
2020-11-15T22:29:03.755+0800 finished restoring testdb.test (1000 documents, 0 failures)
2020-11-15T22:29:03.755+0800 1000 document(s) restored successfully. 0 document(s) failed to restore.
[root@node11 ~]# mongo -utom -p123456 192.168.0.52:27017/admin
MongoDB shell version v4.4.1
connecting to: mongodb://192.168.0.52:27017/admin?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("f5e73939-bb87-4d45-bf80-9ff1e7f6f15d") }
MongoDB server version: 4.4.1
---
The server generated these startup warnings when booting:
 2020-11-15T20:42:23.774+08:00: ***** SERVER RESTARTED *****
 2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/enabled is 'always'. We suggest setting it to 'never'
 2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/defrag is 'always'. We suggest setting it to 'never'
---
---
 Enable MongoDB's free cloud-based monitoring service, which will then receive and display
 metrics about your deployment (disk utilization, CPU, operation statistics, etc).
 
 The monitoring data will be available on a MongoDB website with a unique URL accessible to you
 and anyone you share the URL with. MongoDB may use this information to make product
 improvements and to suggest MongoDB products and deployment options to you.
 
 To enable free monitoring, run the following command: db.enableFreeMonitoring()
 To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
> show dbs
admin 0.000GB
config 0.000GB
local 0.000GB
testdb 0.000GB
> use testdb
switched to db testdb
> show tables
test
> db.test.count()
1000
> db.test.findOne()
{
 "_id" : ObjectId("5fb130da012870b3c8e3c4ad"),
 "id" : 1,
 "name" : "test1",
 "age" : 1,
 "classes" : 1
}
>

Restore a single collection

Drop the test collection in testdb

> db
testdb
> show collections
test
> db.test.drop()
true
> show collections
>

Restore the test collection in testdb with mongorestore

[root@node11 ~]# mongorestore -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb -c test --drop ./node12_testdb_test-collection/testdb/test.bson
2020-11-15T22:36:15.615+0800 checking for collection data in node12_testdb_test-collection/testdb/test.bson
2020-11-15T22:36:15.616+0800 reading metadata for testdb.test from node12_testdb_test-collection/testdb/test.metadata.json
2020-11-15T22:36:15.625+0800 restoring testdb.test from node12_testdb_test-collection/testdb/test.bson
2020-11-15T22:36:15.669+0800 no indexes to restore
2020-11-15T22:36:15.669+0800 finished restoring testdb.test (1000 documents, 0 failures)
2020-11-15T22:36:15.669+0800 1000 document(s) restored successfully. 0 document(s) failed to restore.
[root@node11 ~]# mongo -utom -p123456 192.168.0.52:27017/admin
MongoDB shell version v4.4.1
connecting to: mongodb://192.168.0.52:27017/admin?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("27d15d9e-3fdf-4efc-b871-1ec6716e51e3") }
MongoDB server version: 4.4.1
---
The server generated these startup warnings when booting:
 2020-11-15T20:42:23.774+08:00: ***** SERVER RESTARTED *****
 2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/enabled is 'always'. We suggest setting it to 'never'
 2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/defrag is 'always'. We suggest setting it to 'never'
---
---
 Enable MongoDB's free cloud-based monitoring service, which will then receive and display
 metrics about your deployment (disk utilization, CPU, operation statistics, etc).
 
 The monitoring data will be available on a MongoDB website with a unique URL accessible to you
 and anyone you share the URL with. MongoDB may use this information to make product
 improvements and to suggest MongoDB products and deployment options to you.
 
 To enable free monitoring, run the following command: db.enableFreeMonitoring()
 To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
> show dbs
admin 0.000GB
config 0.000GB
local 0.000GB
testdb 0.000GB
> use testdb
switched to db testdb
> show collections
test
> db.test.count()
1000
> db.test.findOne()
{
 "_id" : ObjectId("5fb130da012870b3c8e3c4ad"),
 "id" : 1,
 "name" : "test1",
 "age" : 1,
 "classes" : 1
}
>

Restore a database from compressed files

Drop the testdb database

> db
testdb
> db.dropDatabase()
{ "dropped" : "testdb", "ok" : 1 }
> show dbs
admin 0.000GB
config 0.000GB
local 0.000GB
>

Restore the database from the compressed files with mongorestore

[root@node11 ~]# mongorestore -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb --gzip --drop ./node12_mongodb_testdb-gzip/testdb/
2020-11-15T22:39:55.313+0800 The --db and --collection flags are deprecated for this use-case; please use --nsInclude instead, i.e. with --nsInclude=${DATABASE}.${COLLECTION}
2020-11-15T22:39:55.313+0800 building a list of collections to restore from node12_mongodb_testdb-gzip/testdb dir
2020-11-15T22:39:55.314+0800 reading metadata for testdb.test from node12_mongodb_testdb-gzip/testdb/test.metadata.json.gz
2020-11-15T22:39:55.321+0800 restoring testdb.test from node12_mongodb_testdb-gzip/testdb/test.bson.gz
2020-11-15T22:39:55.332+0800 no indexes to restore
2020-11-15T22:39:55.332+0800 finished restoring testdb.test (1000 documents, 0 failures)
2020-11-15T22:39:55.332+0800 1000 document(s) restored successfully. 0 document(s) failed to restore.
[root@node11 ~]# mongo -utom -p123456 192.168.0.52:27017/admin
MongoDB shell version v4.4.1
connecting to: mongodb://192.168.0.52:27017/admin?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("73d98c33-f8f7-40e3-89bd-fda8c702e407") }
MongoDB server version: 4.4.1
---
The server generated these startup warnings when booting:
 2020-11-15T20:42:23.774+08:00: ***** SERVER RESTARTED *****
 2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/enabled is 'always'. We suggest setting it to 'never'
 2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/defrag is 'always'. We suggest setting it to 'never'
---
---
 Enable MongoDB's free cloud-based monitoring service, which will then receive and display
 metrics about your deployment (disk utilization, CPU, operation statistics, etc).
 
 The monitoring data will be available on a MongoDB website with a unique URL accessible to you
 and anyone you share the URL with. MongoDB may use this information to make product
 improvements and to suggest MongoDB products and deployment options to you.
 
 To enable free monitoring, run the following command: db.enableFreeMonitoring()
 To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
> show dbs
admin 0.000GB
config 0.000GB
local 0.000GB
testdb 0.000GB
> use testdb
switched to db testdb
> show collections
test
> db.test.count()
1000
> db.test.findOne()
{
 "_id" : ObjectId("5fb130da012870b3c8e3c4ad"),
 "id" : 1,
 "name" : "test1",
 "age" : 1,
 "classes" : 1
}
>

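Note: to restore a single database with mongorestore, name it with -d; to restore a single collection, add -c with the collection name; to restore from compressed files, add --gzip. In short, the options used at backup time should be matched at restore time, and they differ little from the mongodump options. As the warnings in the restore output above show, this version of mongorestore treats -d and -c as deprecated for this use and suggests --nsInclude instead.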
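
The deprecation warning printed by mongorestore points at --nsInclude. A minimal sketch of the equivalent invocations follows. ns_pattern is a hypothetical helper, and the commands are only printed, not run. With --nsInclude the path argument is the top-level dump directory rather than its testdb/ subdirectory; that is my reading of the tool's behaviour, so verify it against your mongorestore version. The -p option is omitted on the assumption that the tool then prompts for the password.

```bash
#!/usr/bin/env bash
# ns_pattern DB [COLLECTION]
# Hypothetical helper: prints the namespace pattern to pass to --nsInclude.
# With no collection it prints "DB.*", which matches every collection in DB.
ns_pattern() {
  local db="$1" coll="${2:-*}"
  printf '%s.%s\n' "$db" "$coll"
}

# Print (not run) the restore commands for a whole database and for one collection.
printf "mongorestore -u tom -h 192.168.0.52:27017 --authenticationDatabase admin --nsInclude='%s' --drop ./node12_testdb\n" "$(ns_pattern testdb)"
printf "mongorestore -u tom -h 192.168.0.52:27017 --authenticationDatabase admin --nsInclude='%s' --drop ./node12_testdb_test-collection\n" "$(ns_pattern testdb test)"
```

The pattern is single-quoted in the printed command so that the shell does not try to expand the asterisk when the line is pasted into a terminal.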

Back up data with mongoexport

Create the peoples database and insert data into the peoples_info collection

> use peoples
switched to db peoples
> for(i=1;i<=10000;i++) db.peoples_info.insert({id:i,name:"peoples"+i,age:(i%120),classes:(i%25)})
WriteResult({ "nInserted" : 1 })
> show dbs
admin 0.000GB
config 0.000GB
local 0.000GB
peoples 0.000GB
testdb 0.000GB
> db.peoples_info.count()
10000
> db.peoples_info.findOne()
{
 "_id" : ObjectId("5fb13f35012870b3c8e3c895"),
 "id" : 1,
 "name" : "peoples1",
 "age" : 1,
 "classes" : 1
}
>

Export the peoples_info collection in the peoples database with mongoexport

[root@node11 ~]# mongoexport -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d peoples -c peoples_info --type json -o ./peoples-peopels_info.json
2020-11-15T22:54:18.287+0800 connected to: mongodb://192.168.0.52:27017/
2020-11-15T22:54:18.370+0800 exported 10000 records
[root@node11 ~]# ll
total 1004
-rw-r--r-- 1 root root 1024609 Nov 15 22:54 peoples-peopels_info.json
[root@node11 ~]# head -n 1 peoples-peopels_info.json
{"_id":{"$oid":"5fb13f35012870b3c8e3c895"},"id":1.0,"name":"peoples1","age":1.0,"classes":1.0}
[root@node11 ~]#

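Note: --type selects the format of the exported file; the default is json, and csv is also available. Also note that mongoexport requires both a database and a collection: it cannot export every collection in a database in one run, only one collection at a time.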
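
Because mongoexport handles one collection per run, exporting several collections means looping over their names. The sketch below prints one command per collection rather than running it. export_cmds is a hypothetical helper, the collection list is written by hand, and peoples_extra is a made-up collection name used only to show a second iteration. The -p option is omitted on the assumption that the tool then prompts for the password.

```bash
#!/usr/bin/env bash
# export_cmds DB COLLECTION...
# Hypothetical helper: prints one mongoexport command per collection (dry run).
export_cmds() {
  local db="$1" coll
  shift
  for coll in "$@"; do
    echo "mongoexport -u tom -h 192.168.0.52:27017 --authenticationDatabase admin -d $db -c $coll --type json -o ./$db-$coll.json"
  done
}

# peoples_info exists in the walkthrough; peoples_extra is invented for illustration.
export_cmds peoples peoples_info peoples_extra
```

Each output file is named after its database and collection, so the exports do not overwrite one another.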

Export a CSV data file

[root@node11 ~]# mongoexport -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d peoples -c peoples_info --type csv -o ./peoples-peopels_info.csv
2020-11-15T22:58:30.495+0800 connected to: mongodb://192.168.0.52:27017/
2020-11-15T22:58:30.498+0800 Failed: CSV mode requires a field list
[root@node11 ~]# mongoexport -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d peoples -c peoples_info --type csv -f id,name,age -o ./peoples-peopels_info.csv 
2020-11-15T22:59:26.090+0800 connected to: mongodb://192.168.0.52:27017/
2020-11-15T22:59:26.143+0800 exported 10000 records
[root@node11 ~]# head -n 1 ./peoples-peopels_info.csv
id,name,age
[root@node11 ~]# head ./peoples-peopels_info.csv 
id,name,age
1,peoples1,1
2,peoples2,2
3,peoples3,3
4,peoples4,4
5,peoples5,5
6,peoples6,6
7,peoples7,7
8,peoples8,8
9,peoples9,9
[root@node11 ~]#

Note: when the export format is csv, the fields to export must be listed with -f, separated by commas.

Import the data into MongoDB on node11

Import JSON data

[root@node11 ~]# systemctl start mongod.service
[root@node11 ~]# ss -tnl
State Recv-Q Send-Q  Local Address:Port   Peer Address:Port  
LISTEN 0 128   *:22     *:*   
LISTEN 0 100  127.0.0.1:25     *:*   
LISTEN 0 128  127.0.0.1:27017     *:*   
LISTEN 0 128   :::22     :::*   
LISTEN 0 100   ::1:25     :::*   
[root@node11 ~]# ll
total 1200
-rw-r--r-- 1 root root 198621 Nov 15 22:59 peoples-peopels_info.csv
-rw-r--r-- 1 root root 1024609 Nov 15 22:54 peoples-peopels_info.json
[root@node11 ~]# mongoimport -d testdb -c peoples_info --drop peoples-peopels_info.json
2020-11-15T23:05:03.004+0800 connected to: mongodb://localhost/
2020-11-15T23:05:03.005+0800 dropping: testdb.peoples_info
2020-11-15T23:05:03.186+0800 10000 document(s) imported successfully. 0 document(s) failed to import.
[root@node11 ~]#

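Note: on import you are free to choose any database and collection name.

Verify: does testdb on node11 contain the peoples_info collection, and does the collection hold data?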

[root@node11 ~]# mongo
MongoDB shell version v4.4.1
connecting to: mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("4e3a00b0-8367-4b3a-9a77-e61d03bb1b3d") }
MongoDB server version: 4.4.1
---
The server generated these startup warnings when booting:
 2020-11-15T23:03:39.669+08:00: ***** SERVER RESTARTED *****
 2020-11-15T23:03:40.681+08:00: Access control is not enabled for the database. Read and write access to data and configuration is unrestricted
 2020-11-15T23:03:40.681+08:00: /sys/kernel/mm/transparent_hugepage/enabled is 'always'. We suggest setting it to 'never'
 2020-11-15T23:03:40.681+08:00: /sys/kernel/mm/transparent_hugepage/defrag is 'always'. We suggest setting it to 'never'
---
---
 Enable MongoDB's free cloud-based monitoring service, which will then receive and display
 metrics about your deployment (disk utilization, CPU, operation statistics, etc).
 
 The monitoring data will be available on a MongoDB website with a unique URL accessible to you
 and anyone you share the URL with. MongoDB may use this information to make product
 improvements and to suggest MongoDB products and deployment options to you.
 
 To enable free monitoring, run the following command: db.enableFreeMonitoring()
 To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
> show dbs
admin 0.000GB
config 0.000GB
local 0.000GB
testdb 0.000GB
> use testdb
switched to db testdb
> show collections
peoples_info
> db.peoples_info.count()
10000
> db.peoples_info.findOne()
{
 "_id" : ObjectId("5fb13f35012870b3c8e3c895"),
 "id" : 1,
 "name" : "peoples1",
 "age" : 1,
 "classes" : 1
}
>

Import the CSV data into the test1 collection in testdb on node12

[root@node11 ~]# mongoimport -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb -c test1 --type csv --headerline --file ./peoples-peopels_info.csv
2020-11-15T23:11:42.595+0800 connected to: mongodb://192.168.0.52:27017/
2020-11-15T23:11:42.692+0800 10000 document(s) imported successfully. 0 document(s) failed to import.
[root@node11 ~]#

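Note: importing CSV data requires --type csv explicitly; --headerline tells mongoimport to treat the first line as the field names rather than as a data row; and --file names the CSV file to import.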

Verify: log in to MongoDB on node12 and check whether testdb contains the test1 collection and whether it holds data.

[root@node11 ~]# mongo -utom -p123456 192.168.0.52:27017/admin
MongoDB shell version v4.4.1
connecting to: mongodb://192.168.0.52:27017/admin?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("72a07318-ac04-46f9-a310-13b1241d2f77") }
MongoDB server version: 4.4.1
---
The server generated these startup warnings when booting:
 2020-11-15T20:42:23.774+08:00: ***** SERVER RESTARTED *****
 2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/enabled is 'always'. We suggest setting it to 'never'
 2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/defrag is 'always'. We suggest setting it to 'never'
---
---
 Enable MongoDB's free cloud-based monitoring service, which will then receive and display
 metrics about your deployment (disk utilization, CPU, operation statistics, etc).
 
 The monitoring data will be available on a MongoDB website with a unique URL accessible to you
 and anyone you share the URL with. MongoDB may use this information to make product
 improvements and to suggest MongoDB products and deployment options to you.
 
 To enable free monitoring, run the following command: db.enableFreeMonitoring()
 To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
> show dbs
admin 0.000GB
config 0.000GB
local 0.000GB
peoples 0.000GB
testdb 0.000GB
> use testdb
switched to db testdb
> show collections
test
test1
> db.test1.count()
10000
> db.test1.findOne()
{
 "_id" : ObjectId("5fb1452ef09b563b65405f7c"),
 "id" : 1,
 "name" : "peoples1",
 "age" : 1
}
>

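Note: the documents in testdb.test1 have no classes field, because classes was not among the fields listed with -f at export time, so the imported data naturally lacks it. That covers the use and testing of mongodump/mongorestore and mongoexport/mongoimport.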

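Full backup plus oplog: restoring MongoDB to a specified point in time

When backing up with mongodump, the --oplog option records the oplog entries for the changes made between the start and the end of the dump. The oplog is the log of write operations on MongoDB collections, similar to the binlog in MySQL; it exists only on replica set members, which is why the prompts below read test_replset:PRIMARY. Replaying it at restore time brings in the changes made while the backup was running, so the restored data reflects the state at the end of the dump rather than a mix of intermediate states.

Simulate a backup that runs while data is being inserted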

test_replset:PRIMARY> use testdb
switched to db testdb
test_replset:PRIMARY> for(i=1;i<=1000000;i++) db.test3.insert({id:i,name:"test3-oplog"+i,commit:"test3"+i})

Meanwhile, back up the data from another terminal

[root@node11 ~]# rm -rf *
[root@node11 ~]# ll
total 0
[root@node11 ~]# mongodump -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin --oplog -o ./alldatabase
2020-11-15T23:51:40.606+0800 writing admin.system.users to alldatabase/admin/system.users.bson
2020-11-15T23:51:40.606+0800 done dumping admin.system.users (4 documents)
2020-11-15T23:51:40.607+0800 writing admin.system.version to alldatabase/admin/system.version.bson
2020-11-15T23:51:40.608+0800 done dumping admin.system.version (2 documents)
2020-11-15T23:51:40.609+0800 writing testdb.test1 to alldatabase/testdb/test1.bson
2020-11-15T23:51:40.611+0800 writing testdb.test3 to alldatabase/testdb/test3.bson
2020-11-15T23:51:40.612+0800 writing testdb.test to alldatabase/testdb/test.bson
2020-11-15T23:51:40.612+0800 writing peoples.peoples_info to alldatabase/peoples/peoples_info.bson
2020-11-15T23:51:40.696+0800 done dumping peoples.peoples_info (10000 documents)
2020-11-15T23:51:40.761+0800 done dumping testdb.test3 (54167 documents)
2020-11-15T23:51:40.803+0800 done dumping testdb.test (31571 documents)
2020-11-15T23:51:40.966+0800 done dumping testdb.test1 (79830 documents)
2020-11-15T23:51:40.972+0800 writing captured oplog to
2020-11-15T23:51:40.980+0800  dumped 916 oplog entries
[root@node11 ~]# ll
total 0
drwxr-xr-x 5 root root 66 Nov 15 23:51 alldatabase
[root@node11 ~]# tree alldatabase/
alldatabase/
├── admin
│ ├── system.users.bson
│ ├── system.users.metadata.json
│ ├── system.version.bson
│ └── system.version.metadata.json
├── oplog.bson
├── peoples
│ ├── peoples_info.bson
│ └── peoples_info.metadata.json
└── testdb
 ├── test1.bson
 ├── test1.metadata.json
 ├── test3.bson
 ├── test3.metadata.json
 ├── test.bson
 └── test.metadata.json
 
3 directories, 13 files
[root@node11 ~]#

Note: the backup now contains one extra file, oplog.bson.

Look at the first and the last record in oplog.bson

[root@node11 ~]# ls
alldatabase
[root@node11 ~]# cd alldatabase/
[root@node11 alldatabase]# ls
admin oplog.bson peoples testdb
[root@node11 alldatabase]# bsondump oplog.bson > /tmp/oplog.bson.tmp
2020-11-15T23:55:04.801+0800 916 objects found
[root@node11 alldatabase]# head -n 1 /tmp/oplog.bson.tmp
{"op":"i","ns":"testdb.test3","ui":{"$binary":{"base64":"7PmE47CASOiQZt5sMGDZKw==","subType":"04"}},"o":{"_id":{"$oid":"5fb14e8c01fff06b2b50a2ac"},"id":{"$numberDouble":"54101.0"},"name":"test3-oplog54101","commit":"test354101"},"ts":{"$timestamp":{"t":1605455500,"i":1880}},"t":{"$numberLong":"1"},"wall":{"$date":{"$numberLong":"1605455500608"}},"v":{"$numberLong":"2"}}
[root@node11 alldatabase]# tail -n 1 /tmp/oplog.bson.tmp
{"op":"i","ns":"testdb.test3","ui":{"$binary":{"base64":"7PmE47CASOiQZt5sMGDZKw==","subType":"04"}},"o":{"_id":{"$oid":"5fb14e8c01fff06b2b50a63f"},"id":{"$numberDouble":"55016.0"},"name":"test3-oplog55016","commit":"test355016"},"ts":{"$timestamp":{"t":1605455500,"i":2795}},"t":{"$numberLong":"1"},"wall":{"$date":{"$numberLong":"1605455500961"}},"v":{"$numberLong":"2"}}
[root@node11 alldatabase]#

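Note: the oplog recorded the inserts with id 54101 through 55016, which shows that the data kept changing from the start of the dump to its end, so the dumped collection files on their own capture an intermediate state. One more point: mongodump --oplog cannot be combined with a database selection, because the oplog covers all databases rather than a single one, so --oplog only works when backing up all databases.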

Drop the testdb database, then restore from the dump we just took

test_replset:PRIMARY> show dbs
admin 0.000GB
config 0.000GB
local 0.014GB
peoples 0.000GB
testdb 0.019GB
test_replset:PRIMARY> use testdb
switched to db testdb
test_replset:PRIMARY> db.dropDatabase()
{
 "dropped" : "testdb",
 "ok" : 1,
 "$clusterTime" : {
  "clusterTime" : Timestamp(1605456134, 4),
  "signature" : {
   "hash" : BinData(0,"cRAdXcUj5c48Q77rCJ1DeeF10u8="),
   "keyId" : NumberLong("6895378399531892740")
  }
 },
 "operationTime" : Timestamp(1605456134, 4)
}
test_replset:PRIMARY> show dbs
admin 0.000GB
config 0.000GB
local 0.014GB
peoples 0.000GB
test_replset:PRIMARY>

Restore the data with mongorestore

[root@node11 ~]# ls
alldatabase
[root@node11 ~]# mongorestore -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin --oplogReplay --drop ./alldatabase/
2020-11-16T00:06:32.049+0800 preparing collections to restore from
2020-11-16T00:06:32.053+0800 reading metadata for testdb.test1 from alldatabase/testdb/test1.metadata.json
2020-11-16T00:06:32.060+0800 reading metadata for testdb.test3 from alldatabase/testdb/test3.metadata.json
2020-11-16T00:06:32.064+0800 reading metadata for testdb.test from alldatabase/testdb/test.metadata.json
2020-11-16T00:06:32.064+0800 restoring testdb.test1 from alldatabase/testdb/test1.bson
2020-11-16T00:06:32.074+0800 restoring testdb.test3 from alldatabase/testdb/test3.bson
2020-11-16T00:06:32.093+0800 restoring testdb.test from alldatabase/testdb/test.bson
2020-11-16T00:06:32.098+0800 reading metadata for peoples.peoples_info from alldatabase/peoples/peoples_info.metadata.json
2020-11-16T00:06:32.110+0800 restoring peoples.peoples_info from alldatabase/peoples/peoples_info.bson
2020-11-16T00:06:32.333+0800 no indexes to restore
2020-11-16T00:06:32.333+0800 finished restoring peoples.peoples_info (10000 documents, 0 failures)
2020-11-16T00:06:32.766+0800 no indexes to restore
2020-11-16T00:06:32.766+0800 finished restoring testdb.test (31571 documents, 0 failures)
2020-11-16T00:06:33.023+0800 no indexes to restore
2020-11-16T00:06:33.023+0800 finished restoring testdb.test3 (54167 documents, 0 failures)
2020-11-16T00:06:33.370+0800 no indexes to restore
2020-11-16T00:06:33.370+0800 finished restoring testdb.test1 (79830 documents, 0 failures)
2020-11-16T00:06:33.370+0800 restoring users from alldatabase/admin/system.users.bson
2020-11-16T00:06:33.416+0800 replaying oplog
2020-11-16T00:06:33.850+0800 applied 916 oplog entries
2020-11-16T00:06:33.850+0800 175568 document(s) restored successfully. 0 document(s) failed to restore.
[root@node11 ~]#

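Note: the restore needs --oplogReplay to replay the contents of oplog.bson. The restore log shows that 916 oplog entries were applied; in other words, 916 changes were recorded between the moment the dump started and the moment it ended.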

Verify: connect to the database and check how many documents were restored into testdb.test3.

test_replset:PRIMARY> use testdb
switched to db testdb
test_replset:PRIMARY> show tables
test
test1
test3
test_replset:PRIMARY> db.test3.count()
55016
test_replset:PRIMARY>

Note: test3 now holds 55016 documents, which matches the id of the last entry in oplog.bson.
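
The figures in the transcript can be cross-checked with a little arithmetic. The sketch below rests on assumptions not stated in the walkthrough: that the id values are contiguous, that the dumped test3.bson holds ids 1 through 54167, and that replaying an insert for a document already present does not create a duplicate.

```bash
#!/usr/bin/env bash
# Figures taken from the transcript above.
dumped=54167      # documents written to test3.bson
first_id=54101    # id in the first oplog entry
last_id=55016     # id in the last oplog entry

entries=$(( last_id - first_id + 1 ))     # oplog entries, assuming contiguous ids
overlap=$(( dumped - first_id + 1 ))      # ids present in both the dump and the oplog
total=$(( dumped + entries - overlap ))   # distinct documents after the replay

echo "entries=$entries overlap=$overlap total=$total"
```

Under those assumptions the entry count agrees with the 916 entries reported by mongorestore, and the total agrees with the db.test3.count() result.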

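Back up oplog.rs to restore to a specific point in time

To make the effect easier to see, I emptied the database again and turned off authentication.

Insert data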

test_replset:PRIMARY> show dbs
admin 0.000GB
config 0.000GB
local 0.000GB
test_replset:PRIMARY> use testdb
switched to db testdb
test_replset:PRIMARY> for(i=1;i<=100000;i++) db.test.insert({id:(i+10000),name:"test-oplog"+i,commit:"test"+i})

Back up at the same time, this time without the --oplog option

[root@node11 ~]# ll
total 0
[root@node11 ~]# mongodump -h node12:27017 -o ./alldatabase
2020-11-16T09:38:00.921+0800 writing admin.system.version to alldatabase/admin/system.version.bson
2020-11-16T09:38:00.923+0800 done dumping admin.system.version (1 document)
2020-11-16T09:38:00.924+0800 writing testdb.test to alldatabase/testdb/test.bson
2020-11-16T09:38:00.960+0800 done dumping testdb.test (16377 documents)
[root@node11 ~]# ll
total 0
drwxr-xr-x 4 root root 33 Nov 16 09:38 alldatabase
[root@node11 ~]# tree ./alldatabase
./alldatabase
├── admin
│ ├── system.version.bson
│ └── system.version.metadata.json
└── testdb
 ├── test.bson
 └── test.metadata.json
 
2 directories, 4 files
[root@node11 ~]#

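Note: we were inserting and backing up at the same time. The backup log above shows 16377 documents dumped from testdb.test, which is clearly only part of the collection; once the inserts finish, testdb.test should hold 100000 documents.

Verify: does testdb.test hold 100000 documents?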

test_replset:PRIMARY> db
testdb
test_replset:PRIMARY> show collections
test
test_replset:PRIMARY> db.test.count()
100000
test_replset:PRIMARY>

Simulate a mistake: delete every document in testdb.test

test_replset:PRIMARY> db
testdb
test_replset:PRIMARY> show collections
test
test_replset:PRIMARY> db.test.remove({})
WriteResult({ "nRemoved" : 100000 })
test_replset:PRIMARY>

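Note: we have now accidentally deleted every document in testdb.test, and the earlier backup can only bring back part of the data. What can we do? We can dump the oplog.rs collection in the local database, which is where the oplog is stored.

Back up the oplog.rs collection in the local database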

[root@node11 ~]# ll
total 0
drwxr-xr-x 4 root root 33 Nov 16 09:38 alldatabase
[root@node11 ~]# mongodump -h node12:27017 -d local -c oplog.rs -o ./oplog-rs
2020-11-16T09:43:38.594+0800 writing local.oplog.rs to oplog-rs/local/oplog.rs.bson
2020-11-16T09:43:38.932+0800 done dumping local.oplog.rs (200039 documents)
[root@node11 ~]# ll
total 0
drwxr-xr-x 4 root root 33 Nov 16 09:38 alldatabase
drwxr-xr-x 3 root root 19 Nov 16 09:43 oplog-rs
[root@node11 ~]# tree ./oplog-rs
./oplog-rs
└── local
 ├── oplog.rs.bson
 └── oplog.rs.metadata.json
 
1 directory, 2 files
[root@node11 ~]#

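Note: the oplog lives in the oplog.rs collection of the local database, and the step above dumped all of it. The oplog is ready, but we cannot replay it as it stands: that would replay the mistaken deletes as well, which would be pointless. We first need to find the point in time of the mistake, and then restore.

Find the time of the accidental delete in the oplog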

[root@node11 ~]# bsondump oplog-rs/local/oplog.rs.bson |egrep "\"op\":\"d\""|head -n 3
{"op":"d","ns":"testdb.test","ui":{"$binary":{"base64":"C3FH7g1eSWWwHd2AZEJhiw==","subType":"04"}},"o":{"_id":{"$oid":"5fb1d7eb25343e833cbaa146"}},"ts":{"$timestamp":{"t":1605490915,"i":1}},"t":{"$numberLong":"1"},"wall":{"$date":{"$numberLong":"1605490915399"}},"v":{"$numberLong":"2"}}
{"op":"d","ns":"testdb.test","ui":{"$binary":{"base64":"C3FH7g1eSWWwHd2AZEJhiw==","subType":"04"}},"o":{"_id":{"$oid":"5fb1d7eb25343e833cbaa147"}},"ts":{"$timestamp":{"t":1605490915,"i":2}},"t":{"$numberLong":"1"},"wall":{"$date":{"$numberLong":"1605490915400"}},"v":{"$numberLong":"2"}}
{"op":"d","ns":"testdb.test","ui":{"$binary":{"base64":"C3FH7g1eSWWwHd2AZEJhiw==","subType":"04"}},"o":{"_id":{"$oid":"5fb1d7eb25343e833cbaa148"}},"ts":{"$timestamp":{"t":1605490915,"i":3}},"t":{"$numberLong":"1"},"wall":{"$date":{"$numberLong":"1605490915400"}},"v":{"$numberLong":"2"}}
2020-11-16T09:46:20.363+0800 100074 objects found
2020-11-16T09:46:20.363+0800 write /dev/stdout: broken pipe
[root@node11 ~]#

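Note: to restore the data as it was before the first delete, take the $timestamp of the first delete entry, {"t":1605490915,"i":1}; this is the time of the first delete. Passing it to --oplogLimit stops the replay just before that entry, since entries at or after the given timestamp are not applied.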
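
Reading the timestamp off the bsondump output by eye is error-prone, so here is a small sketch that extracts it. oplog_limit_from_entry and ts_to_utc are hypothetical helpers; the sample entry is a shortened copy of the first delete record shown above. ts_to_utc relies on "date -d @N", which GNU date accepts and other platforms may not.

```bash
#!/usr/bin/env bash
# oplog_limit_from_entry: reads bsondump JSON lines on stdin and prints the
# "ts" value of each as SECONDS:INCREMENT, the form passed to --oplogLimit.
oplog_limit_from_entry() {
  sed -n -E 's/.*"\$timestamp":\{"t":([0-9]+),"i":([0-9]+)\}.*/\1:\2/p'
}

# ts_to_utc SECONDS: prints the UTC wall-clock time of an oplog timestamp.
ts_to_utc() {
  date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

# Shortened copy of the first delete entry from the transcript.
entry='{"op":"d","ns":"testdb.test","o":{"_id":{"$oid":"5fb1d7eb25343e833cbaa146"}},"ts":{"$timestamp":{"t":1605490915,"i":1}},"t":{"$numberLong":"1"}}'

limit=$(printf '%s\n' "$entry" | oplog_limit_from_entry)
echo "$limit"               # prints 1605490915:1
ts_to_utc "${limit%%:*}"    # human-readable time of the first delete, in UTC
```

In a real run the function would sit at the end of the pipeline from the walkthrough, for example after the egrep and head -n 1 stages, so that only the first delete entry reaches it.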

Copy oplog.rs.bson into the backup directory as oplog.bson, to mimic a backup taken with the --oplog option

[root@node11 ~]# cp ./oplog-rs/local/oplog.rs.bson ./alldatabase/oplog.bson
[root@node11 ~]#

Then restore with mongorestore, up to the point just before the first delete

[root@node11 ~]# mongorestore -h node12:27017 --oplogReplay --oplogLimit "1605490915:1" --drop ./alldatabase/
2020-11-16T09:51:19.658+0800 preparing collections to restore from
2020-11-16T09:51:19.668+0800 reading metadata for testdb.test from alldatabase/testdb/test.metadata.json
2020-11-16T09:51:19.693+0800 restoring testdb.test from alldatabase/testdb/test.bson
2020-11-16T09:51:19.983+0800 no indexes to restore
2020-11-16T09:51:19.983+0800 finished restoring testdb.test (16377 documents, 0 failures)
2020-11-16T09:51:19.983+0800 replaying oplog
2020-11-16T09:51:22.657+0800 oplog 537KB
2020-11-16T09:51:25.657+0800 oplog 1.12MB
2020-11-16T09:51:28.657+0800 oplog 1.72MB
2020-11-16T09:51:31.657+0800 oplog 2.32MB
2020-11-16T09:51:34.657+0800 oplog 2.92MB
2020-11-16T09:51:37.657+0800 oplog 3.51MB
2020-11-16T09:51:40.657+0800 oplog 4.11MB
2020-11-16T09:51:43.657+0800 oplog 4.71MB
2020-11-16T09:51:46.657+0800 oplog 5.30MB
2020-11-16T09:51:49.657+0800 oplog 5.90MB
2020-11-16T09:51:52.657+0800 oplog 6.46MB
2020-11-16T09:51:55.657+0800 oplog 7.04MB
2020-11-16T09:51:58.657+0800 oplog 7.61MB
2020-11-16T09:52:01.657+0800 oplog 8.20MB
2020-11-16T09:52:04.657+0800 oplog 8.77MB
2020-11-16T09:52:07.657+0800 oplog 9.36MB
2020-11-16T09:52:10.657+0800 oplog 9.96MB
2020-11-16T09:52:13.657+0800 oplog 10.6MB
2020-11-16T09:52:16.656+0800 oplog 11.2MB
2020-11-16T09:52:19.657+0800 oplog 11.8MB
2020-11-16T09:52:22.657+0800 oplog 12.4MB
2020-11-16T09:52:25.657+0800 oplog 13.0MB
2020-11-16T09:52:28.657+0800 oplog 13.6MB
2020-11-16T09:52:31.657+0800 oplog 14.2MB
2020-11-16T09:52:34.657+0800 oplog 14.8MB
2020-11-16T09:52:37.657+0800 oplog 15.4MB
2020-11-16T09:52:40.657+0800 oplog 16.0MB
2020-11-16T09:52:43.657+0800 oplog 16.6MB
2020-11-16T09:52:46.657+0800 oplog 17.2MB
2020-11-16T09:52:49.657+0800 oplog 17.8MB
2020-11-16T09:52:52.433+0800 skipping applying the config.system.sessions namespace in applyOps
2020-11-16T09:52:52.433+0800 applied 100008 oplog entries
2020-11-16T09:52:52.433+0800 oplog 18.4MB
2020-11-16T09:52:52.433+0800 16377 document(s) restored successfully. 0 document(s) failed to restore.
[root@node11 ~]#

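Note: the restore log shows that 100008 oplog entries were applied and that the 16377 dumped documents were restored successfully.

Verify: has testdb.test been restored, and with how many documents?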

test_replset:PRIMARY> show dbs
admin 0.000GB
config 0.000GB
local 0.010GB
testdb 0.004GB
test_replset:PRIMARY> use testdb
switched to db testdb
test_replset:PRIMARY> show tables
test
test_replset:PRIMARY> db.test.count()
100000
test_replset:PRIMARY>

Note: testdb.test is back to 100000 documents.
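
The point-in-time recovery above can be collected into two steps. The sketch below is a dry run by default: it prints each command with a leading "+" instead of executing it. run, dump_oplog, pitr_restore and DRY_RUN are hypothetical names for this example; the commands themselves are the ones used in the walkthrough. The limit passed to pitr_restore has to come from inspecting the dumped oplog between the two steps.

```bash
#!/usr/bin/env bash
# DRY_RUN=1 (the default here) prints commands; DRY_RUN=0 would execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# dump_oplog HOST OPLOG_DIR: dump local.oplog.rs so it can be inspected.
dump_oplog() {
  run mongodump -h "$1" -d local -c oplog.rs -o "$2"
}

# pitr_restore HOST DUMP_DIR OPLOG_DIR LIMIT
# Place the dumped oplog where mongorestore looks for it, then replay the
# oplog up to, but not including, LIMIT (given as SECONDS:INCREMENT).
pitr_restore() {
  local host="$1" dump_dir="$2" oplog_dir="$3" limit="$4"
  run cp "$oplog_dir/local/oplog.rs.bson" "$dump_dir/oplog.bson"
  run mongorestore -h "$host" --oplogReplay --oplogLimit "$limit" --drop "$dump_dir"
}

dump_oplog node12:27017 ./oplog-rs
# ...inspect ./oplog-rs with bsondump here to find the first bad operation...
pitr_restore node12:27017 ./alldatabase ./oplog-rs 1605490915:1
```

Keeping the dump and the restore as separate functions leaves room for the manual inspection step, which is where the limit is decided.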

That concludes this hands-on look at MongoDB backup and restore.
