Monday, January 19, 2015

Monitor and then Understand Volume Mirroring in MapR

This article explains how to monitor MapR volume mirroring and, by doing so, helps you understand how it works.
The tests were done in the MapR 4.0.2 sandbox, which you can download here.

1. Create the source and mirror volumes and put a 10GB file in the source volume.

maprcli volume create  -name srcvol -path /srcvol
maprcli volume create  -name mirvol1 -source srcvol@demo.mapr.com  -path /mirvol1 -type mirror
cd /mapr/demo.mapr.com/srcvol
dd if=/dev/zero of=10g.binary bs=65536 count=163840
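
As a quick sanity check on the dd arguments, the file size is the block size multiplied by the block count. A minimal sketch of that arithmetic in Python (the variable names are illustrative only):

```python
# Values taken from the dd command above.
block_size = 65536      # bs, in bytes
block_count = 163840    # count

total_bytes = block_size * block_count
total_gib = total_bytes / 2**30  # 1 GiB = 2**30 bytes

print(total_bytes, total_gib)
```

So the "10GB" file is exactly 10 GiB.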

2. Before starting mirroring, open some sessions to monitor it.

Note: "mrconfig" must be run on every node, or at least on the nodes you are interested in. Since I am using a single-node sandbox, this does not matter here.
Once the sessions below are ready, start mirroring:
maprcli volume mirror start -name mirvol1

2.1 Get the container ID pairs between source and mirror volumes.

maprcli  dump volumeinfo -volumename mirvol1 -json |egrep "ContainerId|CreatorContainerId|NameContainer"
   "ContainerId":2243,
   "NameContainer":"true",
   "CreatorContainerId":2242,
   "ContainerId":2249,
   "NameContainer":"false",
   "CreatorContainerId":2248,
   "ContainerId":2250,
   "NameContainer":"false",
   "CreatorContainerId":2245,
   "ContainerId":2251,
   "NameContainer":"false",
   "CreatorContainerId":2244,
   "ContainerId":2252,
   "NameContainer":"false",
   "CreatorContainerId":2247,
   "ContainerId":2253,
   "NameContainer":"false",
   "CreatorContainerId":2246,
From the above output, we know that container 2243 of "mirvol1" is mirrored from container 2242 of "srcvol", and that it is the name container. The same reading applies to the other pairs.
Now take a look at the containers of "srcvol" using the same command:
maprcli  dump volumeinfo -volumename srcvol -json |egrep "ContainerId|NameContainer"
   "ContainerId":2242,
   "NameContainer":"true",
   "CreatorContainerId":0,
   "ContainerId":2244,
   "NameContainer":"false",
   "CreatorContainerId":0,
   "ContainerId":2245,
   "NameContainer":"false",
   "CreatorContainerId":0,
   "ContainerId":2246,
   "NameContainer":"false",
   "CreatorContainerId":0,
   "ContainerId":2247,
   "NameContainer":"false",
   "CreatorContainerId":0,
   "ContainerId":2248,
   "NameContainer":"false",
   "CreatorContainerId":0,
So we know that "srcvol" has 6 containers: 2242, 2244, 2245, 2246, 2247, 2248;
"mirvol1" also has 6 corresponding containers: 2243, 2249, 2250, 2251, 2252, 2253.
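
Pairing the IDs by eye gets tedious on larger volumes. The following is a minimal sketch, not a MapR tool, that turns the filtered egrep output into (mirror CID, source CID, is name container) tuples. The function name `container_pairs` is hypothetical, and the parser assumes the fields appear in the order shown above, with "CreatorContainerId" last for each container.

```python
import re

# Sample lines copied from the "mirvol1" output above.
sample = """\
   "ContainerId":2243,
   "NameContainer":"true",
   "CreatorContainerId":2242,
   "ContainerId":2249,
   "NameContainer":"false",
   "CreatorContainerId":2248,
"""

def container_pairs(text):
    """Return (mirror_cid, source_cid, is_name_container) tuples."""
    pairs = []
    current = {}
    for line in text.splitlines():
        match = re.match(r'\s*"(\w+)":"?(\w+)"?,?\s*$', line)
        if not match:
            continue
        key, value = match.groups()
        current[key] = value
        # Assumption: CreatorContainerId closes each container's group of fields.
        if key == "CreatorContainerId":
            pairs.append((
                int(current["ContainerId"]),
                int(current["CreatorContainerId"]),
                current.get("NameContainer") == "true",
            ))
            current = {}
    return pairs

for mirror_cid, source_cid, is_name in container_pairs(sample):
    print(mirror_cid, "<-", source_cid, "(name container)" if is_name else "")
```

For "srcvol" the second element is always 0, since its containers were not created from another container.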

2.2 Show mirroring percentage of completion for "mirvol1".

maprcli volume info -name mirvol1 -json
When mirroring is in progress:
   "mirrorSrcVolume":"srcvol",
   "mirrorSrcVolumeId":157247757,
   "mirrorSrcCluster":"demo.mapr.com",
   "mirrorDataSrcVolume":"srcvol",
   "mirrorDataSrcVolumeId":157247757,
   "mirrorDataSrcCluster":"demo.mapr.com",
   "lastSuccessfulMirrorTime":1421708885449,
   "mirror-percent-complete":31,
   "mirrorId":2,
   "nextMirrorId":3,
   "mirrorstatus":1
When mirroring has completed:
   "mirrorSrcVolume":"srcvol",
   "mirrorSrcVolumeId":157247757,
   "mirrorSrcCluster":"demo.mapr.com",
   "mirrorDataSrcVolume":"srcvol",
   "mirrorDataSrcVolumeId":157247757,
   "mirrorDataSrcCluster":"demo.mapr.com",
   "lastSuccessfulMirrorTime":1421709330678,
   "mirror-percent-complete":100,
   "mirrorId":3,
   "nextMirrorId":3,
   "mirrorstatus":0
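
"lastSuccessfulMirrorTime" is a millisecond epoch timestamp, which is hard to read directly. A small sketch for converting it, assuming nothing beyond the Python standard library (the helper name `millis_to_utc` is my own):

```python
from datetime import datetime, timezone

def millis_to_utc(ms):
    """Convert a millisecond epoch timestamp to an aware UTC datetime."""
    seconds, millis = divmod(ms, 1000)
    # Integer arithmetic avoids floating-point rounding of the sub-second part.
    return datetime.fromtimestamp(seconds, tz=timezone.utc).replace(
        microsecond=millis * 1000
    )

# Value from the "mirroring has completed" output above.
finished = millis_to_utc(1421709330678)
print(finished.isoformat())  # 2015-01-19T23:15:30.678000+00:00
```

23:15:30 UTC is 15:15:30 PST, the time zone the sandbox reports in its snapshot listings.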

2.3 Show running threads information on each node.

/opt/mapr/server/mrconfig info threads |egrep "InodeResyncWA|ContainerRestoreWA"
Sample output is:
Thread:ContainerRestoreWA Thread (On Replica workarea:6e9d8000 line:1727 additional_info:srccid 256000062 replicacid 2249 SourceNode 10.250.0.85:5660
Thread:ContainerRestoreWA Thread (On Replica workarea:6dd9a000 line:1727 additional_info:srccid 256000059 replicacid 2250 SourceNode 10.250.0.85:5660
Thread:ContainerRestoreWA Thread (On Replica workarea:6d65a000 line:1727 additional_info:srccid 256000058 replicacid 2251 SourceNode 10.250.0.85:5660
Thread:ContainerRestoreWA Thread (On Replica workarea:6d5ee000 line:1727 additional_info:srccid 256000061 replicacid 2252 SourceNode 10.250.0.85:5660
Thread:ContainerRestoreWA Thread (On Replica workarea:6d624000 line:1727 additional_info:srccid 256000060 replicacid 2253 SourceNode 10.250.0.85:5660
Thread:InodeResyncWA Thread (On Source) workarea:6dab0000 line:1126 additional_info:srccid 256000058, replicacid 2251, inode 256000058.48.262688
Thread:InodeResyncWA Thread (On Source) workarea:6ed42000 line:1826 additional_info:srccid 256000058, replicacid 2251, inode 256000058.43.262678
Thread:InodeResyncWA Thread (On Source) workarea:6edb4000 line:1126 additional_info:srccid 256000058, replicacid 2251, inode 256000058.42.262676
Thread:InodeResyncWA Thread (On Source) workarea:6ed68000 line:1126 additional_info:srccid 256000058, replicacid 2251, inode 256000058.41.262674
Thread:InodeResyncWA Thread (On Source) workarea:6eabc000 line:1826 additional_info:srccid 256000059, replicacid 2250, inode 256000059.46.262322
The "replicacid" values are the CIDs (container IDs) of the mirror volume "mirvol1".
However, the "srccid" values are the CIDs of the SNAPSHOT of the source volume, not the CIDs of the source volume "srcvol" itself. I will explain this later.
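
To see at a glance which container pairs are being worked on, the thread lines can be reduced to (srccid, replicacid) pairs. This is a rough sketch that only assumes the line layout shown in the sample output above; `THREAD_RE` and `resync_pairs` are hypothetical names, and other MapR versions may print these lines differently.

```python
import re

# Two sample lines copied from the output above.
sample = (
    "Thread:ContainerRestoreWA Thread (On Replica workarea:6e9d8000 line:1727 "
    "additional_info:srccid 256000062 replicacid 2249 SourceNode 10.250.0.85:5660\n"
    "Thread:InodeResyncWA Thread (On Source) workarea:6dab0000 line:1126 "
    "additional_info:srccid 256000058, replicacid 2251, inode 256000058.48.262688\n"
)

# The comma after srccid is optional because the two thread types differ there.
THREAD_RE = re.compile(
    r"Thread:(?P<name>\w+) Thread .*?"
    r"srccid (?P<srccid>\d+),? replicacid (?P<replicacid>\d+)"
)

def resync_pairs(text):
    """Group the distinct (srccid, replicacid) pairs by thread name."""
    pairs = {}
    for line in text.splitlines():
        m = THREAD_RE.search(line)
        if m:
            pair = (int(m["srccid"]), int(m["replicacid"]))
            pairs.setdefault(m["name"], set()).add(pair)
    return pairs

print(resync_pairs(sample))
```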

2.4 Show resync progress of each container on each node.

/opt/mapr/server/mrconfig cntr resyncprogress
On each node, it shows output for the source volume's containers, and only while a resync is in progress.
Resync Progress Info
--------------------
Cid: 256000058, Snapshot Cid: 256000058, Vol Id: 157247757, Location: Source, Peer Addr: 127.0.0.1:5660
 ResyncType: Mirror Volume Resync, Status: Resync In Progress, Total Inodes: 255, Resync Complete: 254

Cid: 256000059, Snapshot Cid: 256000059, Vol Id: 157247757, Location: Source, Peer Addr: 127.0.0.1:5660
 ResyncType: Mirror Volume Resync, Status: Resync In Progress, Total Inodes: 255, Resync Complete: 254

Cid: 256000060, Snapshot Cid: 256000060, Vol Id: 157247757, Location: Source, Peer Addr: 127.0.0.1:5660
 ResyncType: Mirror Volume Resync, Status: Resync In Progress, Total Inodes: 255, Resync Complete: 254

Cid: 256000061, Snapshot Cid: 256000061, Vol Id: 157247757, Location: Source, Peer Addr: 127.0.0.1:5660
 ResyncType: Mirror Volume Resync, Status: Resync In Progress, Total Inodes: 255, Resync Complete: 254

Cid: 256000062, Snapshot Cid: 256000062, Vol Id: 157247757, Location: Source, Peer Addr: 127.0.0.1:5660
 ResyncType: Mirror Volume Resync, Status: Resync In Progress, Total Inodes: 255, Resync Complete: 254
Here the "Cid" values are also the CIDs of the SNAPSHOT of the source volume.
"Vol Id: 157247757" is the volume ID of the source volume "srcvol".
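
If you want a per-container percentage rather than raw counts, the two-line records can be parsed as below. This is only a sketch: it assumes the record layout shown above and reads "Resync Complete" as the number of inodes already resynced, which is my interpretation of the output rather than something documented here. The names `RECORD_RE` and `resync_progress` are hypothetical.

```python
import re

# One record copied from the output above.
sample = (
    "Cid: 256000058, Snapshot Cid: 256000058, Vol Id: 157247757, "
    "Location: Source, Peer Addr: 127.0.0.1:5660\n"
    " ResyncType: Mirror Volume Resync, Status: Resync In Progress, "
    "Total Inodes: 255, Resync Complete: 254\n"
)

# Each record is a "Cid:" line followed by an indented "ResyncType:" line.
RECORD_RE = re.compile(
    r"^Cid: (\d+), Snapshot Cid: \d+.*\n"
    r"\s*ResyncType:.*Total Inodes: (\d+), Resync Complete: (\d+)",
    re.MULTILINE,
)

def resync_progress(text):
    """Return (cid, total_inodes, inodes_done) for each record."""
    return [tuple(int(g) for g in m.groups()) for m in RECORD_RE.finditer(text)]

for cid, total, done in resync_progress(sample):
    percent = 100 * done / total if total else 100.0
    print(f"cid {cid}: {done}/{total} inodes ({percent:.1f}%)")
```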

2.5 Find the snapshot of the source volume while mirroring is in progress.

maprcli volume snapshot list -volume srcvol
For example:
cumulativeReclaimSizeMB  creationtime                  ownername  snapshotid  snapshotname                                volumeid   volumename  ownertype  volumepath
0                        Mon Jan 19 15:14:52 PST 2015  root       256000051   mirrorsrcsnap.6464203.19-Jan-2015-15-14-52  157247757  srcvol      1          /srcvol

2.6 Find the snapshot of the mirror volume while mirroring is in progress.

maprcli volume snapshot list -volume mirvol1
For example:
cumulativeReclaimSizeMB  creationtime                  ownername  snapshotid  snapshotname                     expirytime                    volumeid  volumename  ownertype  volumepath
1299                     Mon Jan 19 15:14:55 PST 2015  root       256000052   mirrorsnap.19-Jan-2015-15-14-55  Mon Jan 26 15:14:55 PST 2015  6464203   mirvol1     1          /mirvol1

2.7 List all the containers in the snapshots of the source and mirror volumes.

/opt/mapr/server/mrconfig info volume snapshot srcvol mirrorsrcsnap.6464203.19-Jan-2015-15-14-52
Volume snapshot containers
256000057:2242
256000058:2244
256000059:2245
256000060:2246
256000061:2247
256000062:2248
/opt/mapr/server/mrconfig info volume snapshot mirvol1 mirrorsnap.19-Jan-2015-15-14-55
Volume snapshot containers
256000063:2243
256000064:2249
256000065:2250
256000066:2251
256000067:2252
256000068:2253
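
These listings are what tie the "srccid" values from 2.3 and 2.4 back to real containers. As a hedged illustration, the sketch below reads each line as "snapshot CID:volume CID" (inferred from the output, not from MapR documentation) and checks one thread pair from 2.3 against the container pairs from 2.1. The function name and the `mirror_of` dictionary are my own constructions.

```python
def parse_snapshot_containers(text):
    """Map snapshot CID -> volume CID from 'snapcid:volcid' lines."""
    mapping = {}
    for line in text.splitlines():
        left, sep, right = line.strip().partition(":")
        if sep and left.isdigit() and right.isdigit():
            mapping[int(left)] = int(right)
    return mapping

# Output of "mrconfig info volume snapshot srcvol ..." copied from above.
src_snapshot_listing = """\
Volume snapshot containers
256000057:2242
256000058:2244
256000059:2245
256000060:2246
256000061:2247
256000062:2248
"""

# Source CID -> mirror CID, taken from the CreatorContainerId pairs in 2.1.
mirror_of = {2242: 2243, 2244: 2251, 2245: 2250,
             2246: 2253, 2247: 2252, 2248: 2249}

snap_to_src = parse_snapshot_containers(src_snapshot_listing)

# A thread line in 2.3 reported "srccid 256000058 replicacid 2251".
srccid, replicacid = 256000058, 2251
source_cid = snap_to_src[srccid]
print(srccid, "->", source_cid, "->", mirror_of[source_cid])
```

The chain resolves to 2244 and then 2251, which agrees with the replicacid in the thread output.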

3. Review cldb.log to understand how volume mirroring works.

This is a timeline analysis of this mirroring operation, based on cldb.log.

3.1 Mirroring starts.

2015-01-19 15:14:50,725 INFO CLDBServer [RPC-8]: VolumeUpdate: VolName: mirvol1Starting mirror op STATE_UPDATE

3.2 Snapshot of source volume is created.

2015-01-19 15:14:52,347 INFO VolumeMirrorInfo [pool-4-thread-5]: 
Creating source snapshot mirrorsrcsnap.6464203.19-Jan-2015-15-14-52 of volume srcvol@demo.mapr.com 
src vol id 157247757 src root cid 2242 for mirroring of mirvol1@demo.mapr.com
Then the source snapshot is created:
2015-01-19 15:14:53,514 INFO VolumeMirrorInfo [pool-4-thread-5]: 
Created Source volume snapshotmirrorsrcsnap.6464203.19-Jan-2015-15-14-52 snapshotId 256000051 
for src volume srcvol@demo.mapr.com mirror volume mirvol1@demo.mapr.com dataSrcSnapCreateTimeMillis 1421709292420

3.3 Snapshot of mirror volume is created.

2015-01-19 15:14:55,762 INFO VolumeMirrorInfo [pool-4-thread-8]:
Creating destination snapshot mirrorsnap.19-Jan-2015-15-14-55 of volume mirvol1@demo.mapr.com dest vol id 6464203 dest rw cid 2243 snap expiry time 1422314095762
2015-01-19 15:14:56,782 INFO VolumeMirrorInfo [pool-4-thread-8]: 
Created destination volume snapshotmirrorsnap.19-Jan-2015-15-14-55 snapshotId 256000052 for src volume srcvol@demo.mapr.com mirror volume mirvol1@demo.mapr.com retry count 0

3.4 Resync between snapshots of source and mirror volume.

2015-01-19 15:14:57,983 INFO VolumeMirrorInfo [pool-4-thread-10]: 
Update mirror status of volume mirvol1@demo.mapr.com newstate STATE_MIRROR_RESYNC_INPROGRESS mirrorid 2 nextMirrorId 3 completed successfully.  
srcsnapshot name mirrorsrcsnap.6464203.19-Jan-2015-15-14-52 destsnap name mirrorsnap.19-Jan-2015-15-14-55 destsnap id 256000052 srcsnap id 256000051

3.5 Resync completed between snapshots.

2015-01-19 15:15:28,619 INFO VolumeMirrorInfo [VolumeMirrorThread0]: 
Completed resync of containers of volume mirvol1@demo.mapr.com

3.6 Mirror volume completes roll-forward using the synced snapshot.

2015-01-19 15:15:28,731 INFO VolumeMirrorInfo [pool-4-thread-4]: 
Update mirror status of volume mirvol1@demo.mapr.com newstate STATE_MIRROR_ROLLFORWARD_INPROGRESS mirrorid 2 nextMirrorId 3 completed successfully.  
srcsnapshot name mirrorsrcsnap.6464203.19-Jan-2015-15-14-52 destsnap name mirrorsnap.19-Jan-2015-15-14-55 destsnap id 256000052 srcsnap id 256000051

2015-01-19 15:15:28,976 INFO VolumeMirrorInfo [VolumeMirrorThread0]: 
Completed rollforward of containers of volume mirvol1@demo.mapr.com

3.7 Remove source snapshot.

2015-01-19 15:15:29,040 INFO CLDBServer [RPC-1]: SnapshotRemove: 
Removing snapshot mirrorsrcsnap.6464203.19-Jan-2015-15-14-52 with snapshotid 256000051 for volume srcvol

3.8 Mirroring completed.

2015-01-19 15:15:30,777 INFO VolumeMirrorInfo [VolumeMirrorThread0]: 
Mirroring successfully completed for volume mirvol1@demo.mapr.com From srcvol@demo.mapr.com
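
To see where the time went, the timestamps quoted in 3.1 through 3.8 can be turned into offsets from the start of mirroring. A small sketch using only the log timestamps above (the event labels are my own shorthand, not strings from cldb.log):

```python
from datetime import datetime

# (label, timestamp) pairs copied from the cldb.log lines above.
events = [
    ("mirroring starts", "2015-01-19 15:14:50,725"),
    ("source snapshot created", "2015-01-19 15:14:53,514"),
    ("destination snapshot created", "2015-01-19 15:14:56,782"),
    ("resync in progress", "2015-01-19 15:14:57,983"),
    ("resync completed", "2015-01-19 15:15:28,619"),
    ("rollforward completed", "2015-01-19 15:15:28,976"),
    ("mirroring completed", "2015-01-19 15:15:30,777"),
]

def parse_ts(stamp):
    """Parse a cldb.log timestamp; %f reads the milliseconds after the comma."""
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f")

start = parse_ts(events[0][1])
elapsed = {label: (parse_ts(stamp) - start).total_seconds()
           for label, stamp in events}

for label, seconds in elapsed.items():
    print(f"{seconds:7.3f}s  {label}")

resync_seconds = elapsed["resync completed"] - elapsed["resync in progress"]
print(f"resync phase: {resync_seconds:.3f}s")
```

In this run the whole operation took about 40 seconds, of which roughly 30.6 seconds were spent in the resync phase.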

Takeaways:
  • In short, MapR volume mirroring resyncs the containers between the source and mirror snapshots, and then rolls the mirror volume forward.
  • The source snapshot is removed once resync and roll-forward have completed.
  • "mrconfig cntr resyncprogress" and "mrconfig info threads" show the CIDs of the source snapshot, not those of the source volume.
