Network RAID1

The Linux SCSI Target Wiki


Revision as of 02:08, 26 November 2010

Network RAID 1 Mirror:


What is an iSCSI Network RAID 1 Mirror..?

A network RAID 1 mirror allows two or more iSCSI target machines to provide physical redundancy for LIO-NR1 mirrored volumes, so that the volumes survive the complete failure of a machine or its storage array.

What does the current prototype look like..?

The current prototype contains four (4) Xen paravirtualized machines (two T/I VMs doing LIO-NR1 and two Initiator VMs running ext3/ocfs2 tests) running across two physical dom0 machines, each a dual-socket, dual-core x86_64 system with 8 GB of memory.

The two LIO-NR1 T/I virtual machines have no local storage (other than a Xen block device for the root filesystem) and access their storage through Open-iSCSI to core LIO targets. On both LIO-NR1 nodes, volumes are created on top of the available SCSI block devices. Then, on the primary LIO-NR1 node, the LIO-NR1 array is built with:

  mdadm --create /dev/md0 --level=1 --raid-devices=2 --bitmap=internal /dev/LIO-NR1-Elements/NR1-Local-Element --write-mostly /dev/LIO-NR1-Elements/NR1-Remote-Element
 [root@bbtest2 ~]# cat /proc/mdstat 
   Personalities : [raid1] 
   md0 : active raid1 dm-2[0] dm-3[1](W)
       10477504 blocks [2/2] [UU]
       bitmap: 1/160 pages [4KB], 32KB chunk
   unused devices: <none>
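A quick health check of the array can be scripted from the same text. The following is a minimal sketch, not part of the prototype: the function name md_state is hypothetical, and it only assumes the mdstat layout shown above, where the line containing "blocks" ends in a status field such as [UU] and an underscore marks a missing member.

```shell
#!/bin/sh
# Report whether an MD array has all of its members up.
# Reads mdstat-formatted text on stdin; $1 is the array name (e.g. md0).
md_state() {
    awk -v name="$1" '
        # Header line of the array, e.g. "md0 : active raid1 dm-2[0] dm-3[1](W)"
        $1 == name && $2 == ":" { found = 1; next }
        # First following line with "blocks" carries the status, e.g. [UU]
        found && /blocks/ {
            if ($NF ~ /_/) print "degraded"; else print "healthy"
            exit
        }
    '
}

# Typical use against the live kernel state:
#   md_state md0 < /proc/mdstat
```

For the output shown above the status field is [UU], so the function reports the array as healthy. A status such as [U_] would be reported as degraded.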

From there, a new volume group (LIO-NR1-VOL) and a new volume (NR1-PRIMARY-VOL) are created on the LIO-NR1 array (/dev/md0).

 [root@bbtest2 ~]# lvs -v
   Finding all logical volumes
 LV                 VG               #Seg Attr   LSize  Maj Min KMaj KMin Origin Snap%  Move Copy%  Log LV UUID                               
 NR1-Local-Element  LIO-NR1-Elements    1 -wimao 10.00G 253   2 253  2                                  Qu7YhW-vdWo-IZPd-yDxP-sEbm-xM8L-y96RPD
 NR1-Remote-Element LIO-NR1-Elements    1 -wimao  9.99G 253   3 253  3                                  EEQewk-dhCW-UoMY-LgIK-QV8C-5Zlx-0Hppxc
 NR1-PRIMARY-VOL    LIO-NR1-VOL         1 -wimao  9.98G 253   4 253  4                                  JYElqI-kJOD-QwRo-A68B-s6X9-jw6g-Jfyy1p
 LogVol00           VolGroup00          1 -wi-ao  3.75G  -1  -1 253  0                                  69PKY5-5nIM-7TZX-vjoh-4sRJ-pALn-QDzNTq
 LogVol01           VolGroup00          1 -wi-ao  1.00G  -1  -1 253  1                                  UwLWHP-J3Iv-s03T-q1gk-nXbE-hvBd-If7hg0
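The page does not show the commands used to create the volume group and volume on /dev/md0. One plausible sequence with the standard lvm2 tools is sketched below. The run helper, the create_primary_vol function and the choice to give the volume all free extents (-l 100%FREE) are assumptions for illustration; only the names LIO-NR1-VOL and NR1-PRIMARY-VOL come from the prototype.

```shell
#!/bin/sh
# Print the command when DRY_RUN is set; execute it otherwise.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Layer LVM on top of the mirror: physical volume, volume group, volume.
# $1 is the MD device, e.g. /dev/md0
create_primary_vol() {
    md_dev=$1
    run pvcreate "$md_dev"
    run vgcreate LIO-NR1-VOL "$md_dev"
    # Assumption: the volume takes every free extent in the group.
    run lvcreate -l 100%FREE -n NR1-PRIMARY-VOL LIO-NR1-VOL
}

# Preview the commands without touching any device:
#   DRY_RUN=1 create_primary_vol /dev/md0
```

Running with DRY_RUN set only prints the three commands, which is a safe way to review them before running against a real array as root.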

These iSCSI volumes and LIO-NR1 volumes need to be accessible at boot by LIO-Primary. From there, the LVM UUID is passed into a virtual iBlock (BIO sync ack) or FILEIO (buffered ack) device within the storage engine of LIO-Target.

  [root@bbtest2 ~]# target-ctl listluninfo tpgt=1
       -----------------------------[LUN Info for iSCSI TPG 1]-----------------------------
       Status: ACTIVATED  Execute/Left/Max Queue Depth: 0/32/32  SectorSize: 512  MaxSectors: 128
       iBlock device: dm-4  LVM UUID: JYElqI-kJOD-QwRo-A68B-s6X9-jw6g-Jfyy1p
       Major: 253 Minor: 4  CLAIMED: IBLOCK
       Type: Direct-Access     ANSI SCSI revision: 02  Unit Serial: JYElqI-kJOD-QwRo-A68B-s6X9-jw6g-Jfyy1p  DIRECT  EXPORTED
       iSCSI Host ID: 0 iSCSI LUN: 0  Active Cmds: 0  Total Bytes: 10716446720
       ACLed iSCSI Initiator Node(s):
       0 -> 0
       0 -> 0

For testing purposes, the disk images of all four virtual machines are located on iSCSI storage attached to their respective host virtualization machines. This storage comes from one of the core LIO target nodes, and is MD RAID6 on SATA disks with lvm2 on top of the array.

For typical production systems, we expect people to be using entire software or hardware RAID arrays, or Linux v2.6 lvm2 block devices.

What is a T/I Repeater node..?

This is a physical or virtual machine that runs both the iSCSI Target and Initiator stacks.

What RAID1 code does LIO-NR1 use..?

LIO-NR1 uses Linux MD RAID1 with an internal write-intent bitmap and the write-mostly element flag.

The internal bitmap tracks which blocks have changed, so after a node failure the failed LIO-NR1 primary or secondary node can recover quickly by resynchronizing only the changed blocks rather than the whole array.
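The granularity of that tracking can be worked out from the /proc/mdstat figures of the prototype (10477504 blocks of 1 KB, 32KB chunk). The sketch below is only illustrative arithmetic. The function names are hypothetical, and the page count assumes that MD keeps one 16-bit counter per chunk in 4 KB pages in memory, which is consistent with the "1/160 pages" reported for md0 but is not stated on this page.

```shell
#!/bin/sh
# Number of bitmap chunks for an array of $1 KB tracked in chunks of $2 KB.
bitmap_chunks() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# In-memory bitmap pages, ASSUMING one 16-bit counter per chunk
# and 4 KB pages, i.e. 2048 counters per page.
bitmap_pages() {
    chunks=$(bitmap_chunks "$1" "$2")
    echo $(( (chunks + 2047) / 2048 ))
}

bitmap_chunks 10477504 32   # prints 327422
bitmap_pages 10477504 32    # prints 160
```

Each of the 327422 chunks is the smallest unit that is resynchronized after a failure, so a single stray write costs at most 32 KB of recovery traffic on this array.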

The write-mostly flag is set on the primary LIO-NR1 node's remote iSCSI volume, which represents the secondary LIO-NR1 node's local storage. This ensures that READ ops coming from frontend iSCSI initiators are issued to the primary LIO-NR1 node's local storage.
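Besides the (W) marker in /proc/mdstat, the flag is visible in the per-member state files that MD exposes in sysfs. A small sketch for listing the flagged members follows. The function name write_mostly_members is hypothetical, and the sketch assumes the usual layout of one dev-NAME/state file per member under the array's md directory.

```shell
#!/bin/sh
# List member devices of an MD array whose state includes write_mostly.
# $1 is the array's md sysfs directory, e.g. /sys/block/md0/md
write_mostly_members() {
    for state in "$1"/dev-*/state; do
        [ -r "$state" ] || continue          # skip an unmatched glob
        if grep -q write_mostly "$state"; then
            member=${state%/state}           # drop the trailing /state
            echo "${member##*/dev-}"         # keep only the device name
        fi
    done
}

# Typical use on the primary node:
#   write_mostly_members /sys/block/md0/md
```

On the prototype's primary node this would be expected to list only the member backed by NR1-Remote-Element (dm-3 in the /proc/mdstat output).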

What are the plans for production usage of LIO-NR1..?

The production plans are to run LIO-NR1 on Dom0 on top of software SATA RAID6+LVM, hardware RAID5+LVM, and software SAS RAID10+LVM with Linux/HA. As the prototype has so far proved very stable when tested against possible failure scenarios, getting LIO-NR1 into Dom0 testing is the next step.

What iSCSI Initiators can be used with LIO to create T/I Repeater Nodes..?

The current development work toward a stable LIO-NR1 has been done with LIO Target and Open-iSCSI running in DomU under Xen.
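On the initiator side of a T/I repeater node, attaching the remote storage with Open-iSCSI usually comes down to a discovery followed by a login. The sketch below shows that sequence with iscsiadm. The portal address and target name in the usage comment are placeholders, not values from the prototype, and the run and attach_remote_element helpers are hypothetical.

```shell
#!/bin/sh
# Print the command when DRY_RUN is set; execute it otherwise.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Discover the targets behind a portal, then log in to one of them.
# $1 is the portal (address:port), $2 is the target IQN.
attach_remote_element() {
    portal=$1
    target=$2
    run iscsiadm -m discovery -t sendtargets -p "$portal"
    run iscsiadm -m node -T "$target" -p "$portal" --login
}

# Preview with placeholder values (not from the prototype):
#   DRY_RUN=1 attach_remote_element 192.0.2.10:3260 iqn.2003-01.org.example:nr1-secondary
```

After a successful login the remote volume appears as a local SCSI block device, which is what the LIO-NR1 elements are built on.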

Why did you choose to use DomU for the current prototype..?

Basically for ease of development.

There are also plans in the near future to provide this ability inside of LIO-VM itself using Open-iSCSI, for testing and educational purposes.

You can also import Host OS local iSCSI storage through a virtualization hypervisor into LIO-VM for a multi-OS T/I repeater node.

What about performance of the current setup..?

Running LIO-NR1 on Dom0 will definitely increase performance.

Using LVM volume block devices on the DomU Primary and Secondary T/I VMs as elements of /dev/md0 on the LIO-NR1 machines seems to be a bit slower than using raw SCSI block devices. We then create an LVM volume (NR1-PRIMARY-VOL in the prototype) on top of /dev/md0, and this is the storage object that is exported to frontside iSCSI Initiators.

There is also a concern that using an internal write intent bitmap (which is pretty much a requirement for production) with MD has performance implications.

What about latency..?

Having dedicated 1 Gb/sec or 10 Gb/sec ports between LIO-NR1 nodes running jumbo frames for dedicated traffic on Dom0 should help improve latency and performance by reducing the number of interrupts produced by networking hardware.
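Enabling jumbo frames on a dedicated replication port is a matter of raising the interface MTU. A minimal sketch with the iproute2 ip tool is given below. The interface name eth1 in the usage comment and the 9000-byte default are assumptions, the helper names are hypothetical, and the peer node and any switch in between must be configured for the same MTU.

```shell
#!/bin/sh
# Print the command when DRY_RUN is set; execute it otherwise.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Raise the MTU of the dedicated LIO-NR1 link.
# $1 is the interface name, $2 the MTU (assumed default: 9000).
set_jumbo() {
    iface=$1
    mtu=${2:-9000}
    run ip link set dev "$iface" mtu "$mtu"
}

# Preview for a hypothetical dedicated port:
#   DRY_RUN=1 set_jumbo eth1
```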

Also, using dedicated CPU affinity for LIO-Target threads on Dom0 is something that should be considered for production.

What about growing the amount of LIO-NR1 storage available for frontend iSCSI initiators..?

There are at least two ways of doing this:

Where can I download the actual LIO-NR1 Xen DomU images..?

Where can I find the scripts to test this myself in Xen DomU..?
