VHACS

VHACS-VM x86_64 across 2x physical node (12 Ghz 45 nanometer) x86_64 across 2x open platforms (Linux x86_64 / Linux i386) with 8 active 5 & 2.5 GB storage clouds.
VHACS 4x Physical Node "Bare-Metal" x86_64 VHACS with 4x 5GB clouds running with server/client enabled.

What is VHACS..?

VHACS (pronounced vee-hacks) is a Cloud Storage implementation running on Linux v2.6. VHACS is an acronym for Virtualization, High Availability, and Cluster Storage. VHACS is a combination of at least eight (8) long-term OSS/Linux based projects, along with a CLI management interface for controlling VHACS nodes, clouds, and vservers within the VHACS cluster.

The original author of VHACS is "Jerome Martin" - tramjoe.merin@gmail.com - code under GPL v2

What features does VHACS have in its current form..?

VHACS implements an M+N (Active+Spares) model that provides:

Server side VHACS cloud: Active/Active High Availability with Local synchronous Data Replication with iSCSI Target Export.

Client Side VHACS cloud: iSCSI Initiator using EXT3 mounts for VSERVER usage.

What is the easiest way for me to try out VHACS..?

The VHACS VM Alpha images are now available (see VHACS-VM) so that folks can get an idea of how the admin-level interface works. Using two VM images to test the VHACS cloud will initially be the easiest way to try things out, at least until we can get everything packaged.

What technologies does VHACS use..?

Prototype Platform: Debian Etch v4 on x86_64 with v2.6.22.16 kdb or 2.6.22-4-vserver kernels

CLUSTER:

  • Pacemaker: The scalable High-Availability cluster resource manager, formerly part of Heartbeat
  • OpenAIS: The OpenAIS Standards Based Cluster Framework is an OSI Certified implementation of the Service Availability Forum Application Interface Specification (AIS)

SERVER:

CLIENT:

What has been tested so far..?

The current test bed is a 2-node cluster running on multi-socket, single-core x86_64, with 32 active VHACS clouds (both client and server) of 1 GB and 100 MB sizes. The latter are used for multi-cloud ops, e.g. 'vhacs storage -S yourVHACScloud01-4' would put those 4 into STANDBY.

What are the current limitations..?

In order to scale to the number of cluster RAs required to monitor 32 cloud clusters, the decision was made to convert VHACS v0.6.0 from Heartbeat to OpenAIS. As of June 26th, 2008, almost all major functionality is up and running with OpenAIS+Pacemaker.

Also, because we export DRBD's struct block_device directly via LIO-Target IBLOCK, each DRBD device maps 1:1 to an iSCSI TargetName+TargetPortalGroupTag tuple. Using volumes on top of the DRBD block device and exporting those from LIO-Target IBLOCK is another option for increasing cloud density and reducing the total number of required kernel threads.

Speaking of kernel threads, there are ~256 for a 32 cloud cluster on a fully loaded node running BOTH roles (see below). Also, there are 128 cluster RAs for this same multi-role fully loaded VHACS cluster node.
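
As a back-of-the-envelope reading of these figures, the per-cloud cost can be worked out as below. This is only a sketch: it assumes thread and RA counts scale linearly with the number of clouds, which is not stated above, and the helper name estimate_load is made up for illustration.

```python
# Figures quoted above for a fully loaded node running BOTH roles.
CLOUDS = 32
KERNEL_THREADS = 256   # approximate ("~256" in the text)
CLUSTER_RAS = 128

threads_per_cloud = KERNEL_THREADS / CLOUDS   # kernel threads per cloud
ras_per_cloud = CLUSTER_RAS / CLOUDS          # cluster RAs per cloud


def estimate_load(clouds):
    """Hypothetical linear estimate of (kernel threads, cluster RAs)."""
    return clouds * threads_per_cloud, clouds * ras_per_cloud


if __name__ == "__main__":
    print(threads_per_cloud, ras_per_cloud)
    print(estimate_load(16))
```

Under that linear assumption, each cloud costs roughly 8 kernel threads and 4 cluster RAs on a node running both roles.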

How are the network interfaces within the VHACS design allocated..?

In the v0.8.15 release, there are currently two:

       lio-drbd-ruler:~# cat /etc/vhacs.conf | grep IFNAME
       # STORAGE_IFNAME network interface to use for accessing the storage network
       STORAGE_IFNAME = "eth2"
       # HEARTBEAT_IFNAME network interface to be used for cluster communications
       HEARTBEAT_IFNAME = "eth2"

The OpenAIS Totem multicast address information is also defined at the top of /etc/ais/openais.conf:

       totem {
               version: 2
               secauth: off
               threads: 0
               interface {
                       ringnumber: 0
                       bindnetaddr: 192.168.0.0
                       mcastaddr: 224.0.0.1
                       mcastport: 5405
               }
       }

For VHACS v1.0, an additional IFNAME will be defined: REPLICATION_IFNAME, for DRBD replication traffic between VHACS nodes.
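
The IFNAME settings shown above can be read back with a few lines of Python. This is a minimal sketch, not the parser VHACS itself uses: it assumes every setting has the form KEY = "value" as in the grep output, and the function name read_ifnames and the sample text are made up for illustration.

```python
import re

# Sample text mirroring the two settings shown above. The real
# /etc/vhacs.conf may contain other keys, which this sketch ignores.
SAMPLE_CONF = '''
# comment lines start with '#' and are skipped
STORAGE_IFNAME = "eth2"
HEARTBEAT_IFNAME = "eth2"
'''

# Matches assignments such as: STORAGE_IFNAME = "eth2"
_ASSIGN = re.compile(r'^\s*([A-Z_]+IFNAME)\s*=\s*"([^"]*)"\s*$')


def read_ifnames(text):
    """Return a dict of the *_IFNAME settings found in the text."""
    found = {}
    for line in text.splitlines():
        m = _ASSIGN.match(line)
        if m:
            found[m.group(1)] = m.group(2)
    return found


if __name__ == "__main__":
    names = read_ifnames(SAMPLE_CONF)
    print(names)
    # True when storage and cluster traffic share one interface.
    print(names["STORAGE_IFNAME"] == names["HEARTBEAT_IFNAME"])
```

Because the pattern matches any key ending in IFNAME, a future REPLICATION_IFNAME line would be picked up without changes.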

Can I run different STORAGE_IFNAME and HEARTBEAT_IFNAME interfaces in the current version of VHACS..?

Most certainly! In the simplest example, this consists of:

  • Having two (2) network interfaces on each node in the VHACS cluster. They should be on different local subnets or network ranges from each other. Also, in the current release the STORAGE_IFNAME and HEARTBEAT_IFNAME values MUST be the same on both machines, e.g. eth0 for STORAGE_IFNAME and eth1 for HEARTBEAT_IFNAME on BOTH machines.

The current setup using two (2) network bridges looks something like:

  • Having 192.168.0.0/eth0 for STORAGE_IFNAME:
       vhacs-node0: 192.168.0.*/eth0 via DHCP
       vhacs-node1: 192.168.0.*/eth0 via DHCP
  • Having 10.10.0.0/eth1 for HEARTBEAT_IFNAME:
       vhacs-node0: 10.10.0.15/eth1 via static IP
       vhacs-node1: 10.10.0.20/eth1 via static IP

For the 2 port example with VHACS, please have a look at VHACS-VM#Can_I_run_different_STORAGE_IFNAME_and_HEARTBEAT_IFNAME_interfaces_in_the_current_release_of_VHACS-VM...3F.
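
The requirement that the two interfaces sit on different network ranges can be checked with Python's standard ipaddress module. This is a sketch under stated assumptions: the /24 prefix lengths are assumed (the layout above only gives 192.168.0.0 and 10.10.0.0), and the function name layout_problems is hypothetical.

```python
import ipaddress

# Addresses taken from the layout above. The /24 prefix lengths are an
# assumption; the text only names the networks, not their masks.
STORAGE_NET = ipaddress.ip_network("192.168.0.0/24")
HEARTBEAT_NET = ipaddress.ip_network("10.10.0.0/24")

HEARTBEAT_ADDRS = {
    "vhacs-node0": ipaddress.ip_address("10.10.0.15"),
    "vhacs-node1": ipaddress.ip_address("10.10.0.20"),
}


def layout_problems(storage_net, heartbeat_net, heartbeat_addrs):
    """Return a list of human-readable problems; empty means OK."""
    problems = []
    if storage_net.overlaps(heartbeat_net):
        problems.append("storage and heartbeat networks overlap")
    for node, addr in heartbeat_addrs.items():
        if addr not in heartbeat_net:
            problems.append(f"{node}: {addr} is outside {heartbeat_net}")
    return problems


if __name__ == "__main__":
    print(layout_problems(STORAGE_NET, HEARTBEAT_NET, HEARTBEAT_ADDRS))
```

The storage addresses are left out of the check because they are assigned by DHCP and are not known in advance.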

What are VHACS roles..?

VHACS roles are assigned to nodes in the VHACS cluster. Each node can be assigned both roles, only one, or none. These roles are defined as:

  • storage: This VHACS node can provide the SERVER side cloud of the VHACS cluster
  • vhost: This VHACS node can provide the CLIENT side cloud of the VHACS cluster
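
The role model above can be sketched as a small set-based helper. The accepted spellings (vhost, storage, vhost,storage and ALL) follow the ROLES syntax printed by 'vhacs node'; the function names parse_roles and sides are hypothetical, and this is not how the vhacs CLI itself is implemented.

```python
VALID_ROLES = ("vhost", "storage")


def parse_roles(arg):
    """Expand a ROLES argument such as 'vhost,storage' into a set."""
    if arg == "ALL":
        return set(VALID_ROLES)
    roles = set()
    for name in arg.split(","):
        if name not in VALID_ROLES:
            raise ValueError(f"unknown role: {name!r}")
        roles.add(name)
    return roles


def sides(roles):
    """List which side(s) of the cloud a node with these roles provides."""
    result = []
    if "storage" in roles:
        result.append("SERVER")
    if "vhost" in roles:
        result.append("CLIENT")
    return result


if __name__ == "__main__":
    print(sides(parse_roles("ALL")))
    print(sides(set()))   # a node with no roles provides neither side
```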

Does this mean that ONLY VHACS server/clients can communicate, and not other 3rd party clients..?

No. Because VHACS uses traditional iSCSI on the server side of the cloud, any iSCSI initiator can take advantage of the VHACS server side cloud.

What about providing the VHACS cloud with other non iSCSI storage fabrics..?

Yes. As work continues on v3.0.0 LIO-Core upstream (see LIO-Target), LIO-Core's mature set of plugins for accessing every possible physical/virtual, past/present/future storage device in target mode will be made available on fabrics other than RFC 3720 (iSCSI). There are several options for where to get started: iSER/IB, iSER/iWARP, FCoE, AoE and SRP, just to name a few.

What does a running prototype look like..?

       lio-drbd-ruler:~# vhacs cluster -M
        __________________________________________________________________________________________________________________
       |                      |                      |                      |                      |                      |
       | NODE                 | HA STATUS            | FREE STORAGE         | STORAGE ROLE         | VHOST ROLE           |
       |______________________|______________________|______________________|______________________|______________________|
       |                      |                      |                      |                      |                      |
       | (A)lio-drbd-viking   | online               | 44.36G/68.36G        | 0 exported           | 2 mounted            |
       | (A)lio-drbd-sabbath  | online               | 68.36G/68.36G        | N/A                  | N/A                  |
       | (A)lio-drbd-ruler    | online               | 50.53G/74.53G        | 16 exported          | 14 mounted           |
       |______________________|______________________|______________________|______________________|______________________|
        _________________________________________________________________________________________________________________
       |                  |                  |                  |                  |                  |                  |
       | STORAGE          | DRBD:0           | DRBD:1           | DRBD TARGET      | ISCSI MOUNT      | FREE SPACE       |
       |__________________|__________________|__________________|__________________|__________________|__________________|
       |                  |                  |                  |                  |                  |                  |
       | (A)liocloud0     |(S)lio-drbd-viking| (P)lio-drbd-ruler| (S)lio-drbd-ruler|(S)lio-drbd-viking| 940M/1008M (98%) |
       | (A)liocloud1     |(S)lio-drbd-viking| (P)lio-drbd-ruler| (S)lio-drbd-ruler|(S)lio-drbd-viking| 940M/1008M (98%) |
       | (A)liocloud7     |(S)lio-drbd-viking| (P)lio-drbd-ruler| (S)lio-drbd-ruler| (S)lio-drbd-ruler| 940M/1008M (98%) |
       | (A)liocloud8     |(S)lio-drbd-viking| (P)lio-drbd-ruler| (S)lio-drbd-ruler| (S)lio-drbd-ruler| 940M/1008M (98%) |
       | (A)morecloud0    |(S)lio-drbd-viking| (P)lio-drbd-ruler| (S)lio-drbd-ruler| (S)lio-drbd-ruler| 940M/1008M (98%) |
       | (A)morecloud1    |(S)lio-drbd-viking| (P)lio-drbd-ruler| (S)lio-drbd-ruler| (S)lio-drbd-ruler| 940M/1008M (98%) |
       | (A)morecloud2    | (P)lio-drbd-ruler|(S)lio-drbd-viking| (S)lio-drbd-ruler| (S)lio-drbd-ruler| 940M/1008M (98%) |
       | (A)morecloud3    |(S)lio-drbd-viking| (P)lio-drbd-ruler| (S)lio-drbd-ruler| (S)lio-drbd-ruler| 940M/1008M (98%) |
       | (A)westcloud0    |(S)lio-drbd-viking| (P)lio-drbd-ruler| (S)lio-drbd-ruler| (S)lio-drbd-ruler| 1.9G/2.0G (98%)  |
       | (A)westcloud1    | (P)lio-drbd-ruler|(S)lio-drbd-viking| (S)lio-drbd-ruler| (S)lio-drbd-ruler| 1.9G/2.0G (98%)  |
       | (A)westcloud2    |(S)lio-drbd-viking| (P)lio-drbd-ruler| (S)lio-drbd-ruler| (S)lio-drbd-ruler| 1.9G/2.0G (98%)  |
       | (A)westcloud3    | (P)lio-drbd-ruler|(S)lio-drbd-viking| (S)lio-drbd-ruler| (S)lio-drbd-ruler| 1.9G/2.0G (98%)  |
       | (A)eastcloud0    |(S)lio-drbd-viking| (P)lio-drbd-ruler| (S)lio-drbd-ruler| (S)lio-drbd-ruler| 1.9G/2.0G (98%)  |
       | (A)eastcloud1    | (P)lio-drbd-ruler|(S)lio-drbd-viking| (S)lio-drbd-ruler| (S)lio-drbd-ruler| 1.9G/2.0G (98%)  |
       | (A)eastcloud2    |(S)lio-drbd-viking| (P)lio-drbd-ruler| (S)lio-drbd-ruler| (S)lio-drbd-ruler| 1.9G/2.0G (98%)  |
       | (A)eastcloud3    |(S)lio-drbd-viking| (P)lio-drbd-ruler| (S)lio-drbd-ruler| (S)lio-drbd-ruler| 1.9G/2.0G (98%)  |
       |__________________|__________________|__________________|__________________|__________________|__________________|
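
For scripting against this output, a node row from the first table can be split into fields as sketched below. This assumes the column layout shown above stays fixed and that free storage is always reported in G; the meaning of the "(A)" prefix is not documented here, so it is kept as an opaque flag. The function name parse_node_row is hypothetical.

```python
import re

_NODE = re.compile(r"^\((\w)\)(\S+)$")            # e.g. (A)lio-drbd-viking
_FREE = re.compile(r"^([\d.]+)G/([\d.]+)G$")      # e.g. 44.36G/68.36G

ROW = ("| (A)lio-drbd-viking   | online               | 44.36G/68.36G        "
       "| 0 exported           | 2 mounted            |")


def parse_node_row(line):
    """Return a dict for a node row, or None for borders and headers."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    if len(cells) != 5:
        return None
    node = _NODE.match(cells[0])
    free = _FREE.match(cells[2])
    if not node or not free:
        return None
    return {
        "flag": node.group(1),
        "node": node.group(2),
        "ha_status": cells[1],
        "free_g": float(free.group(1)),
        "total_g": float(free.group(2)),
        "storage_role": cells[3],
        "vhost_role": cells[4],
    }


if __name__ == "__main__":
    print(parse_node_row(ROW))
```

Rows from the second (storage) table have six columns and are rejected by the length check.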


What does the interface look like..?

       halfdome:~# vhacs 
       usage: /usr/sbin/vhacs cluster|node|storage|vserver [options]
       cluster: cluster-level admin functions in a vhacs cluster.
               Run /usr/sbin/vhacs cluster to get specific usage information.
       node:    node-level admin functions in a vhacs cluster.
               Run /usr/sbin/vhacs node to get specific usage information.
       storage:   storage-level admin functions in a vhacs cluster.
               Run /usr/sbin/vhacs storage to get specific usage information.
       vserver: vserver-level admin functions in a vhacs cluster.
               Run /usr/sbin/vhacs vserver to get specific usage information.
       halfdome:~# vhacs cluster
       usage: 
         For full description, try :
         vhacs cluster -h|--help
         With all options, use -V LEVEL to increase verbosity
         vhacs cluster -c|--check
         vhacs cluster -I|--init [NODES]
         vhacs cluster -m|--monitor
         vhacs cluster -M|--monitor1
         vhacs cluster [NODES] -E|--exec COMMAND|-
         vhacs cluster [NODES] -P|--exec COMMAND|-
       syntax for NODES argument:
         foobar        just the node named foobar
         foobar1,foobar2,foobar3
                       run the subcommand recursively for all listed nodes
         foobar1-3     equivalent to foobar1,foobar2,foobar3
         foobar1-3,foobar5
                       equivalent to foobar1,foobar2,foobar3,foobar5
         ALL           special node name that converts to the list of 
                       all nodes in the hearbeat cluster local node is 
                       in, if any
       halfdome:~# vhacs node
       usage: 
         For full description, try :
         vhacs node -h|--help
         vhacs node -s|--setrole ROLES NODES
         vhacs node -d|--delrole ROLES NODES
         vhacs node -l|--list
         vhacs node -i|--info NODES
         vhacs node -S|--standby NODES
         vhacs node -A|--active NODES
       syntax for NODES argument:
         foobar        just the node named foobar
         foobar1,foobar2,foobar3
                       run the subcommand recursively for all listed nodes
         foobar1-3     equivalent to foobar1,foobar2,foobar3
         foobar1-3,foobar5
                       equivalent to foobar1,foobar2,foobar3,foobar5
         ALL           special name that converts to the list of all nodes
       syntax for ROLES argument:
         vhost        the node can mount remote storage and will run resources
                      like virtual machines off it
         storage      the node can host physical disk partitions part
                      of user-created storage
         vhost,storage
                      the node can do both
         ALL          equivalent to vhost,storage
       halfdome:~# vhacs storage
       usage: 
         For full description, try :
         vhacs storage -h|--help
         With all options, use -V LEVEL to increase verbosity
         vhacs storage -c|--create STORAGES -s|--size SIZE [-n|--nodes DRBD_NODES]
         vhacs storage -D|--destroy STORAGES
         vhacs storage -u|--unfail STORAGES
         vhacs storage -r|--restart STORAGES
         vhacs storage -l|--list
         vhacs storage -L|--listbig
         vhacs storage -m|--monitor
         vhacs storage -M|--monitor1
         vhacs storage -i|--info STORAGES
         vhacs storage -S|--standby STORAGES
         vhacs storage -A|--active STORAGES
         vhacs storage -p|--prefers NODES STORAGES
       syntax for STORAGES argument:
         foobar        just the storage named foobar
         foobar1,foobar2,foobar3
                       run the subcommand recursively for all listed storages
         foobar1-3     equivalent to foobar1,foobar2,foobar3
         foobar1-3,foobar5
                       equivalent to foobar1,foobar2,foobar3,foobar5
         ALL           special name that converts to the list of all storages
 
       syntax for NODES argument:
         nodefoobar    migrate storage mount on node nodefoobar if possible and
                       assign scores so that this is the preferred node in the 
                                       future for mounting storages.
         node1-2,node4 try to migrate storage on node1, then node2, then node4
                       and assign scores so that the nodes will be preferred
                       in that order for future migrations
       syntax for DRBD_NODES argument:
         Same as above, but this is used when creating a storage to set your
         preferred nodes to be used for hosting the disk backend.
         The ALL keyword here has no special meaning.
       halfdome:~# vhacs vserver
       Not implemented yet.
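
The NODES and STORAGES list syntax shown in the usage text above (single names, comma lists, numeric ranges, and ALL) can be illustrated with a short expander. This is a sketch of the documented syntax only, not the code vhacs uses; the function name expand_names is hypothetical, and how zero-padded ranges are handled is not specified above, so the sketch does not preserve padding.

```python
import re

# A trailing "<digits>-<digits>" marks a range, e.g. foobar1-3.
_RANGE = re.compile(r"^(.*?)(\d+)-(\d+)$")


def expand_names(arg, all_names=()):
    """Expand 'foobar1-3,foobar5' into a list of individual names."""
    if arg == "ALL":
        # ALL stands for every known name; the caller supplies that list.
        return list(all_names)
    names = []
    for item in arg.split(","):
        m = _RANGE.match(item)
        if m:
            prefix = m.group(1)
            first, last = int(m.group(2)), int(m.group(3))
            names.extend(f"{prefix}{i}" for i in range(first, last + 1))
        else:
            names.append(item)
    return names


if __name__ == "__main__":
    print(expand_names("foobar1-3,foobar5"))
    print(expand_names("lio-drbd-ruler"))
```

Names that merely contain hyphens, such as lio-drbd-ruler, are left untouched because the pattern requires digits on both sides of the final hyphen.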