SAN Box Used via iSCSI - Funtoo Linux


    1 Introduction

    2 The SAN box

    2.1 Hardware and setup considerations

    2.2 Operating system choice and storage pool configuration

2.3 Networking considerations

    2.4 Architectural overview

3 Preliminary BIOS operations on the SAN Box

    3.1 BIOS Upgrade

    3.2 SAN box BIOS setup

    4 Phase 1: Setting up Solaris 11 on the SAN box

    4.1 Activating SSH to work from a remote machine

4.1.1 DNS servers configuration

    4.1.2 Network configuration

    4.1.3 Bringing up SSH

    4.1.4 Enabling some extra virtual terminals

    4.1.5 Be a Watt-saver!

    5 Phase 2: Creating the SAN storage pool on the SAN box

    5.1 Enabling the hotplug service

5.2 Creating the SAN storage pool (RAID-Z1)

    5.3 Torturing the SAN pool

5.3.1 Removal of one disk

    5.3.2 Returning to normal operation

    5.3.3 Removal of two disks

    5.4 Creating zvols

    6 Phase 3: Bringing a bit of iSCSI magic

    6.1 iSCSI concepts overview

    6.1.1 Initiator, target and LUN

    6.1.2 IQN vs EUI

    6.2 Making the SAN box zvols accessible through iSCSI

6.2.1 Installing pre-requisites

    6.2.2 Creating the Logical Units (LU)

6.2.3 Mapping concepts

    6.2.4 Setting up the LU mapping

6.2.5 Restricting the listening interface of a target (portals and portal groups)

    6.2.6 Testing from the SAN Box

    7 Phase 4: Mounting a remote volume from a Funtoo Linux box

7.1 Installing requirements

    7.2 Linux kernel configuration

    7.3 Configuring the Open iSCSI initiator

    7.4 Spawning iscsid

    7.5 Querying the SAN box for available targets

    7.6 Connecting to the remote target

7.7 Using the remote LU

    7.8 Unmounting the iSCSI disk and logging out from the target

7.9 Taking and restoring a zvol snapshot

    7.10 Automating the connection to a remote iSCSI disk

    8 Optimizations / security


8.1 Activating stakeholders authentication (CHAP)

    8.2 Configuring the SAN box to send iSCSI blocks in round-robin

    8.2.1 Activating multi-pathing

    8.3 Other ways to gain transmission speed

    8.3.1 Changing the logical unit block size

    8.3.2 Using Ethernet jumbo frames

    9 Final words

    10 Footnotes and references

Introduction

In this lab we experiment with a Funtoo Linux machine that mounts, through iSCSI, a remote volume located on a Solaris machine. The remote volume resides on a RAID-Z ZFS pool as an emulated block device (zvol).

The SAN box

Hardware and setup considerations

The SAN box is composed of the following:

    Jetway NF81 Mini-ITX Motherboard:

    AMD eOntario (G-Series) T56N Dual Core 1.6 GHz APU @ 18W

    2 x DDR3-800/1066 Single Channel SO-DIMM slots (up to 8 GB)

    Integrated ATI Radeon HD 6310 Graphics

    2 x Realtek RTL8111E PCI-E Gigabit Ethernet

    5 x SATA3 6Gb/s Connectors

    LVDS and Inverter connectors (used to connect a LCD panel directly on the motherboard)

Computer case: Chenbro ES34069. For a compact case it gives a big deal:

4 *hot swappable* hard drive bays plus 1 internal bay for a 2.5" drive. The case is compartmented and two fans ventilate the compartment where the 4 hard drives reside (drives remain cool).

Lockable front door preventing access to the hard drive bays (mandatory if the machine is in an open place and if you have little devils at home who love playing with funny things behind your back; this can save you the bereavement of several terabytes of data after hearing a "Dad, what is that stuff?" :-) )

4x Western Digital Caviar Green 3TB (SATA-II) - storage area, will be divided in several virtual volumes exported via iSCSI

1x OCZ Agility 2 60GB (SATA-II) - system disk

2x 4GB Kingston DDR-3 1066 SO-DIMM memory modules

1x external USB DVD drive (Samsung S084, chosen for the aesthetics, personal choice)

For the record, a diagram of the CPU/chipset was shown here in the original article.


Operating system choice and storage pool configuration

Operating system: Oracle Solaris 11 Express (the T56N APU has hardware acceleration for AES; according to Wikipedia, the Solaris Crypto Framework has support for it, Solaris 10 and later required).

Storage: all 4 WD drives will be used to create a big ZFS RAID-Z1 zpool (RAID-5 on steroids, extremely robust in terms of preserving and automatically restoring data integrity) with several emulated virtual block devices (zvols) exported via iSCSI.

Solaris 11 Express licensing terms changed since the acquisition of Sun: check the OTN
License terms to see whether or not you can use OSE 11 in your context. As a general rule
of thumb: if your SAN box will hold strictly personal/family data (i.e. you don't use it for
supporting commercial activities / generating financial profits) you should be entitled to
freely use OSE 11. However, we are neither lawyers nor Oracle commercial partners, so
always check with Oracle to see if they allow you to freely use OSE 11 or if they require
you to buy a support contract in your context.

Networking considerations

Each of the two Ethernet NICs will be connected to two different (linked) Gigabit switches. For the record, the switches used here are TRENDnet TEG-S80Dg, connected directly to a 120 V plug (no external PSU).

Preliminary BIOS operations on the SAN Box

Before going on with the installation of Solaris, we have to do two operations:

Upgrade the BIOS to its latest revision (our NF81 motherboard came with an older BIOS revision)

Set up the BIOS

BIOS Upgrade

At date of writing, JetWay published a newer BIOS revision (A03) for the motherboard (http://www.jetwaycomputer.com/NF81.html). Unfortunately, the board has no built-in flash utility accessible at machine startup, so the only solution is to build a


bootable DOS USB key using QEMU (UNetbootin with a FreeDOS image has been tried, but no joy):

Flashing a BIOS is always a risky operation: always connect the computer to a UPS-protected power plug and let the flash complete.

On your Funtoo box:

Emerge the package app-emulation/qemu.

Download Balder (http://www.finnix.org/Balder), a minimalistic FreeDOS image; at date of writing balder10.img is the only version available.

Plug your USB key in, note how the kernel recognizes it (e.g. /dev/sdz) and back up any valuable data it holds.

With the partitioning tool of your choice (cfdisk, fdisk...), create a single partition on your USB key, change its type to 0xB (Win95 FAT32) and mark it as bootable.

Run qemu, make it boot on the Balder image and make it use your USB key as a hard drive: qemu -boot a -fda balder10.img -hda /dev/sdz (your USB key is seen as drive C: and the Balder image as floppy drive A:).

In the boot menu shown in the QEMU window, select option "3 Start FreeDOS for i386 + FDXMS"; after a fraction of a second an extremely well-known prompt coming from the antiquity of computer science welcomes you :)

It is now time to install the boot code on the "hard drive" and copy all the files contained in the Balder image. To achieve that, execute in sequence sys c: followed by xcopy /E /N a: c:

Close QEMU and test your key by running QEMU again: qemu /dev/sdz

Do you see the Balder boot menu? Perfect, close QEMU again.

Mount your USB key within the VFS: mount /dev/sdz /mnt/usbkey

On the motherboard manufacturer's website, download the update archive (in the case of JetWay, get xxxxxxxx.zip, not xxxxxxxx_w.zip; the latter is for flashing from a Windows instance). In general BIOS update archives come as a ZIP archive containing not only a BIOS image but also the adequate tools (for MS-DOS ;) to reflash the BIOS EEPROM. This is the case with the JetWay NF81, but some motherboard models also require you to change a jumper position on the motherboard before flashing the BIOS EEPROM (read-only/read-write).

Extract the files contained in the BIOS update ZIP archive into the above mount point.

Now unmount the USB key - DO NOT UNPLUG THE KEY WITHOUT UNMOUNTING IT FIRST, YOU WILL GET CORRUPTED FILES ON IT.
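To recap, the whole key preparation boils down to the following sequence (a sketch assuming your key shows up as /dev/sdz; double-check the device name before running anything):

# emerge app-emulation/qemu
# fdisk /dev/sdz                                (single primary FAT32 partition, marked bootable)
# qemu -boot a -fda balder10.img -hda /dev/sdz
(inside FreeDOS, after choosing option 3 in the Balder menu:)
A:\> sys c:
A:\> xcopy /E /N a: c:
# qemu /dev/sdz                                 (the Balder menu should appear again, this time from the key)
# mount /dev/sdz /mnt/usbkey
(copy the files of the BIOS update archive there, then:)
# umount /mnt/usbkey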

On the SAN Box:

Plug the USB key into the SAN box, turn the machine on and go into the BIOS settings to make sure the first boot device is the USB key. For the JetWay NF81 (a more or less UEFI BIOS?): if you see your USB key twice in the boot devices list, with one instance name starting with "UEFI...", make sure this name is selected first, else the system won't consider the USB key as a bootable device.

Save your changes and exit from the BIOS setup program; when your USB key is probed, the BIOS should boot on it.

In the Balder boot menu, choose again "3 Start FreeDOS for i386 + FDXMS".

At the FreeDOS C:> prompt, just run the magic command FLASH.BAT (this DOS batch file runs the BIOS flashing utility with the adequate arguments).

When the flashing finishes, power cycle the machine (in our case, for some reason, the flashing utility didn't return to the prompt).

SAN box BIOS setup

Go again into the BIOS settings and load the default settings (highly recommended). Pay attention to the following settings:

    Main:

Check the date and time; they should be set to your local date and time.

SATA Config: No drives will be listed here but the one connected on the 5th SATA port (assuming you have enabled the SATA-IDE combined mode; if SATA-IDE combined mode has been disabled, no drive will show up). The option is a bit misleading, it should have been called "IDE Drives"...

    Advanced:

    CPU Config -> C6 mode: enabled (C6 not shown in powertop?)


Shutdown temperature: 75°C (you can choose a bit lower)

Chipset - Southbridge:

Onchip SATA channel: Enabled

Onchip SATA type: AHCI (mandatory to be able to use hot plug; this also enables the drives' Native Command Queuing (NCQ))

SATA-IDE Combined mode: Disabled (this option affects *only* the 5th SATA port, treating it as an IDE one even if AHCI has been enabled for "Onchip SATA type"; undocumented in the manual :-( )

Boot: set the device priority to the USB DVD-ROM drive first, then the 2.5" OCZ SSD drive.

Phase 1: Setting up Solaris 11 on the SAN box

Ok, this is not especially Funtoo related, but the following notes can definitely save you several hours. The whole process of setting up a Solaris operating system won't be explained because it is far beyond the scope of this article, but here is some pertinent information for the SAN setup thematic. A few points:

Everything we need is properly recognized out of the box (video drivers untested, Xorg won't be used), especially the Ethernet RTL8111E chipset (prtconf -D reports the kernel is using a driver for the RTL8168; this is normal, both chipsets are very close so an alias has been put in /etc/driver_aliases).

3TB drives are properly recognized and usable under Solaris, no need for the HBA adapter provided by Western Digital.

We won't partition the 4 Western Digital drives with GPT; we will directly use the whole devices.

According to Wikipedia, the AES hardware acceleration of the AMD T56N APU is supported by the Solaris Cryptographic Framework (you won't see it listed as a hardware crypto provider, though).

Gotcha: the Canadian-French keyboard layout gives an AZERTY keyboard on the console :(

Activating SSH to work from a remote machine

Solaris 11 Express comes with a text-only CD-ROM [1]. Because the machine serves as a storage box with no screen connected to it most of the time, and working from an 80x25 console is not very comfortable (especially with an incorrect keyboard mapping), the best starting point is to enable the network and activate SSH.

    DNS servers configuration

The SAN box has 2 NICs which will be used as 2 different active interfaces with no fail-over/link aggregation configuration. The DNS servers have been configured to resolve the SAN box in a round-robin manner. Depending on the version of ISC BIND you use:

    ISC BIND version 8:

Set the multiple-cnames option to yes (else you will get an error).

Define your direct resolution like below:

(...)
sanbox-if0    IN  A      192.168.1.100
sanbox-if1    IN  A      192.168.1.101
(...)
sanbox        IN  CNAME  sanbox-if0
sanbox        IN  CNAME  sanbox-if1

    ISC BIND version 9

    You MUST use multiple A resource records

    Define your direct resolution like below:

(...)
sanbox-if0    IN  A  192.168.1.100
sanbox-if1    IN  A  192.168.1.101
sanbox        IN  A  192.168.1.100
sanbox        IN  A  192.168.1.101
(...)


    Network configuration

Let's configure the network on the SAN box. If you are used to Linux, you will see that the process in the Solaris world is a bit similar, with some substantial differences however. First, let's see which NICs are recognized by the Solaris kernel at system startup:

# dladm show-phys
LINK   MEDIA      STATE     SPEED  DUPLEX   DEVICE
rge0   Ethernet   unknown   0      unknown  rge0
rge1   Ethernet   unknown   0      unknown  rge1

Both Ethernet interfaces are connected to a switch (their link LEDs are lit) but they still need to be "plumbed" (plumbing does not persist between reboots; we will make the configuration persistent below):

# ifconfig rge0 plumb
# ifconfig rge1 plumb
# dladm show-phys
LINK   MEDIA      STATE   SPEED  DUPLEX  DEVICE
rge0   Ethernet   up      1000   full    rge0
rge1   Ethernet   up      1000   full    rge1

    Once plumbed, both interfaces will also be listed when doing an ifconfig:

# ifconfig -a
lo0: flags=2001000849 mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
rge0: flags=1000843 mtu 1500 index 2
        inet 0.0.0.0 netmask 0
        ether 0:30:18:a1:73:c6
rge1: flags=1000842 mtu 1500 index 4
        inet 0.0.0.0 netmask 0
        ether 0:30:18:a1:73:c6

    Time to assign a static IP address to both interfaces (we won't use link aggregation or fail-over configuration in that case):

# ifconfig rge0 192.168.1.100/24 up
# ifconfig rge1 192.168.1.101/24 up

Notice the up at the end; if you forget it, Solaris will assign an IP address to the NIC but will consider it as being inactive (down).

    Checking again:

# ifconfig -a
lo0: flags=2001000849 mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
rge0: flags=1000843 mtu 1500 index 2
        inet 192.168.1.100 netmask ffffff00 broadcast 192.168.1.255
        ether 0:30:18:a1:73:c6
rge1: flags=1000842 mtu 1500 index 4
        inet 0.0.0.0 netmask 0
        ether 0:30:18:a1:73:c6

To see rge0 and rge1 automatically brought up at system startup, we need to define 2 files named /etc/hostname.rgeX containing the NIC IP addresses:

    # echo " 192. 168. 1. 100/ 24" > / et c/ host name. r ge0# echo " 192. 168. 1. 100/ 24" > / et c/ host name. r ge1

    Now let's add a default route and make it persistent across reboots:

# echo 192.168.1.1 > /etc/defaultrouter
# route add default `cat /etc/defaultrouter`
# ping 192.168.1.1
192.168.1.1 is alive


    Once everything is in place, reboot the SAN box and see if both NICs are up with an IP address assigned and with a default

    route:

# ifconfig -a
lo0: flags=2001000849 mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
rge0: flags=1000843 mtu 1500 index 2
        inet 192.168.1.100 netmask ffffff00 broadcast 192.168.1.255
        ether 0:30:18:a1:73:c6
rge1: flags=1000843 mtu 1500 index 4
        inet 192.168.1.101 netmask ffffff00 broadcast 192.168.1.255
        ether 0:30:18:a1:73:c6
(...)

# netstat -rn

Routing Table: IPv4
  Destination        Gateway            Flags  Ref     Use    Interface
------------------- ------------------- ----- ----- --------- ---------
default             192.168.1.1          UG      2       682
127.0.0.1           127.0.0.1            UH      2       172   lo0
192.168.1.0         192.168.1.101        U       3    127425   rge1
192.168.1.0         192.168.1.100        U       3      1260   rge0

    A final step for this paragraph: name resolution configuration. The environment used here has DNS servers (primary 192.168.1.2

    secondary 192.168.1.1) so /etc/resolv.conf looks like this:

search mytestdomain.lan
nameserver 192.168.1.2
nameserver 192.168.1.1

By default the hosts resolution is configured (/etc/nsswitch.conf) to use the DNS servers first, then the file /etc/hosts, so make sure that /etc/nsswitch.conf has the following line:

hosts: dns files

    Make sure name resolution is in order:

# nslookup sanbox-if1
Server:   192.168.1.2
Address:  192.168.1.2#53

Name:     sanbox-if1.mytestdomain.lan
Address:  192.168.1.101

# nslookup sanbox-if0
Server:   192.168.1.2
Address:  192.168.1.2#53

Name:     sanbox-if0.mytestdomain.lan
Address:  192.168.1.100

# nslookup 192.168.1.100
Server:   192.168.1.2
Address:  192.168.1.2#53

100.1.168.192.in-addr.arpa    name = sanbox-if0.mytestdomain.lan.

# nslookup 192.168.1.101
Server:   192.168.1.2
Address:  192.168.1.2#53

101.1.168.192.in-addr.arpa    name = sanbox-if1.mytestdomain.lan.

    Does round-robin canonical name resolution work?

# nslookup sanbox
Server:   192.168.1.2
Address:  192.168.1.2#53

Name:     sanbox.mytestdomain.lan
Address:  192.168.1.100
Name:     sanbox.mytestdomain.lan
Address:  192.168.1.101

# nslookup sanbox
Server:   192.168.1.2
Address:  192.168.1.2#53

Name:     sanbox.mytestdomain.lan
Address:  192.168.1.101
Name:     sanbox.mytestdomain.lan
Address:  192.168.1.100

Perfect! Two different requests give the list in a different order. Now reboot the box and check everything is in order:

    Both network interfaces automatically brought up with an IP address assigned

    The default route is set to the right gateway

    Name resolution is setup (see /etc/resolv.conf)

    Bringing up SSH

    This is very straightforward:

    Check if the service is disabled (on a fresh install it is):

# svcs -a | grep ssh
disabled  8:07:01 svc:/network/ssh:default

Edit /etc/ssh/sshd_config (nano and vi are included with Solaris) to match your preferences. Do not bother with remote root access: the RBAC security policies of Solaris do not allow it, and you must log on with a normal user account prior to gaining root privileges with su -.

Now enable the SSH server and check its status:

# svcadm enable ssh
# svcs -a | grep ssh
online    8:07:01 svc:/network/ssh:default

If you get maintenance instead of online, it means the service encountered an error somewhere; usually it is due to a typo in the configuration file. You can get some information with svcs -x ssh.

SSH is now ready and listening for connections (here coming from any IPv4/IPv6 address, hence the two lines):

# netstat -an | grep "22"
*.22    *.*    0  0  128000  0  LISTEN
*.22    *.*    0  0  128000  0  LISTEN
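You can now do the rest of the setup from your Funtoo box (the account name is of course an example; remote root login is denied, as explained above):

$ ssh youruser@sanbox-if0
Password:
$ su -
Password:
#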

    Enabling some extra virtual terminals

    Although the machine is very likely to be used via some SSH remote connections, it can however be useful to have multiple

virtual consoles enabled on the box (just like in Linux, you will be able to switch between them using the magic key combination Ctrl-Alt-Fn). By default Solaris gives you a single console, but here is how to enable some extra virtual terminals:

    Enable the service vtdaemon and make sure it runs:

# svcadm enable vtdaemon
# svcs -l vtdaemon

    Box used via iSCSI - Funtoo Linux http://www.funtoo.org/wiki/SAN_Box_used_v

    4 6/29/2011

  • 7/27/2019 SAN Box Used via iSCSI - Funtoo Linux

    9/34

fmri         svc:/system/vtdaemon:default
name         vtdaemon for virtual console secure switch
enabled      true
state        online
next_state   none
state_time   June 19, 2011 09:21:53 AM EDT
logfile      /var/svc/log/system-vtdaemon:default.log
restarter    svc:/system/svc/restarter:default
contract_id  97
dependency   require_all/none svc:/system/console-login:default (online)

Another form of service checking (long form) has been used here just for the sake of demonstration.

Now enable five more virtual consoles (vt2 to vt6) and enable the hotkeys property of vtdaemon, else you won't be able to switch between the virtual consoles:

# for i in 2 3 4 5 6; do svcadm enable console-login:vt$i; done
# svccfg -s vtdaemon setprop options/hotkeys=true

If you want to disable the screen auto-locking capability of your newly added virtual terminals:

# svccfg -s vtdaemon setprop options/secure=false

    Refresh the configuration parameters of vtdaemon (mandatory!) and restart it:

# svcadm refresh vtdaemon
# svcadm restart vtdaemon

    The service vtdaemon and the virtual consoles are now up and running:

    # svcs - a | grep "vt "onl i ne 9:43: 14 svc: / syst em/ vtdaemon: def aul tonl i ne 9: 43: 14 svc: / system/ consol e-l ogi n: vt3onl i ne 9: 43: 14 svc: / system/ consol e-l ogi n: vt2onl i ne 9: 43: 14 svc: / system/ consol e-l ogi n: vt5onl i ne 9: 43: 14 svc: / system/ consol e-l ogi n: vt6onl i ne 9: 43: 14 svc: / system/ consol e-l ogi n: vt4

    console-login:vtX won't be enabled and online if vtdaemon itself is not enabled and

    online.

Now try to switch between virtual terminals with Ctrl-Alt-F2 to Ctrl-Alt-F6 (virtual terminals do auto-lock when you switch between them if you did not set the vtdaemon property options/secure to false). You can return to the system console with Ctrl-Alt-F1.

    Be a Watt-saver!

With all of the disks spun up, the SAN box consumes near 40 W at the plug when idle and near 60 W when dealing with a lot of I/O activity. 40 W may not seem a big figure, yet it represents a good 1 kWh/day (40 W x 24 h = 0.96 kWh), i.e. 365 kWh/year (for your wallet, at a rate of 10¢/kWh, an expense of $36.50/year). Not a big deal, but preserving every bit of natural resources for Earth sustainability is


welcomed, and the SAN box can easily do its part, especially when you are out of your house for several hours or days :-) Solaris has device power management features and they are easy to configure:

Enforcing power management is mandatory on many non Oracle/Sun/Fujitsu machines
because Solaris will fail to detect the machine's power management capabilities, hence
leaving this feature disabled. Of course this has the consequence of not having the
drives automatically spun down by Solaris when not solicited for a couple of
minutes.

    First run the format command (results can differ in your case):

# format
Searching for disks... done

AVAILABLE DISK SELECTIONS:
       0. c7t0d0
          /pci@0,0/pci1002,4393@11/disk@0,0
       1. c7t1d0
          /pci@0,0/pci1002,4393@11/disk@1,0
       2. c7t2d0
          /pci@0,0/pci1002,4393@11/disk@2,0
       3. c7t3d0
          /pci@0,0/pci1002,4393@11/disk@3,0
       4. c7t4d0
          /pci@0,0/pci1002,4393@11/disk@4,0
Specify disk (enter its number):

Note the disks' physical paths (here /pci@0,0/pci1002,4393@11/disk@0,0 to /pci@0,0/pci1002,4393@11/disk@3,0 are of interest) and exit by pressing Ctrl-C.

Second, edit the file /etc/power.conf and look for the following line:

autopm auto

Change it to:

autopm enable

Third, at the end of the same file, add one device-thresholds line per storage device you want to bring down (device-thresholds needs the physical path of the device), followed by the delay before the device reaches standby mode. In our case we want to wait 10 minutes, so it gives:

device-thresholds /pci@0,0/pci1002,4393@11/disk@0,0 10m
device-thresholds /pci@0,0/pci1002,4393@11/disk@1,0 10m
device-thresholds /pci@0,0/pci1002,4393@11/disk@2,0 10m
device-thresholds /pci@0,0/pci1002,4393@11/disk@3,0 10m

Finally, run pmconfig (no arguments) to make this new configuration active.

This (theoretically) puts the 4 hard drives in standby mode after 10 minutes of inactivity. It is possible that your drives come with preconfigured values preventing them from being spun down before a factory-set delay, or from being spun down at all (experiments showed our WD Caviar Green drives are put in standby after a factory-set delay of ~8 minutes).

    With the hardware used to build the SAN Box, nothing more than 27W is drained from the power plug when all of the


drives are in standby and the CPU is idle (a 32% energy saving compared to the original 40 W) :-).

Phase 2: Creating the SAN storage pool on the SAN box

At date of writing (June 2011), Solaris does not support adding additional hard drives to
an existing zpool created as a RAID-Z array to increase its size. It is however
possible to increase the storage space by replacing all of the drives composing a
RAID-Z array, one after another (wait for the completion of the re-silvering process after
each drive replacement).
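For the record, such a grow-by-replacement would look like the following sketch (it reuses the pool and device names created below; the autoexpand property is an assumption, check zpool(1M) on your release):

# zpool replace san c7t0d0        (after having physically swapped in the bigger drive)
# zpool status san                (wait until the resilvering completes)
(repeat for c7t1d0, c7t2d0 and c7t3d0, one drive at a time, then:)
# zpool set autoexpand=on san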

Enabling the hotplug service

The very first step of this section is not creating the zpool but, instead, checking that the hotplug service is enabled and online, as the computer case we use has hot-pluggable SATA bays:

# svcs -l hotplug
fmri         svc:/system/hotplug:default
name         hotplug daemon
enabled      true
state        online
next_state   none
state_time   June 19, 2011 10:22:22 AM EDT
logfile      /var/svc/log/system-hotplug:default.log
restarter    svc:/system/svc/restarter:default
contract_id  61
dependency   require_all/none svc:/system/device/local (online)
dependency   require_all/none svc:/system/filesystem/local (online)

In case the service is shown as disabled, just enable it by doing svcadm enable hotplug and check that it is stated as being online after that. When this service is online, the Solaris kernel is aware whenever a disk is plugged in or removed from the computer frame (your SATA bays must support hot plugging, which is the case here).

Creating the SAN storage pool (RAID-Z1)

RAID-Z pools can be created in two flavors:

RAID-Z1 (~RAID-5): supports the failure of only one disk, at the price of "sacrificing" the capacity of one disk (4x 3 TB gives 9 TB of storage)

RAID-Z2 (~RAID-6): supports the failure of up to two disks, at the price of "sacrificing" the capacity of two disks (4x 3 TB gives 6 TB of storage) and requiring more computational power from the CPU

A good trade-off here between computational power/storage (main priority)/reliability is to use RAID-Z1.
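The arithmetic behind those figures, for the 4x 3 TB drives used here (raw capacity, before filesystem overhead):

RAID-Z1: usable space = (4 - 1) x 3 TB = 9 TB (one disk's worth of parity)
RAID-Z2: usable space = (4 - 2) x 3 TB = 6 TB (two disks' worth of parity)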

    The first step is to identify the logical device name of the drives that will be used for our RAID-Z1 storage pool. If you have

    configured the power-management of the SAN box (see above paragraphs), you should have noticed that the format command

    returns something (e.g. c7t0d0) just before the drive identification string. This is the logical device name we will need

    to create the storage pool!

0. c7t0d0
   /pci@0,0/pci1002,4393@11/disk@0,0
1. c7t1d0
   /pci@0,0/pci1002,4393@11/disk@1,0
2. c7t2d0
   /pci@0,0/pci1002,4393@11/disk@2,0
3. c7t3d0


   /pci@0,0/pci1002,4393@11/disk@3,0

As each of the SATA drives is connected on its own SATA bus, Solaris sees them
as 4 different "SCSI targets" plugged into the same "SCSI HBA" (one LUN per target,
hence the d0). The zpool command will automatically create on each of the given disks a GPT label
covering the whole disk for you. GPT labels overcome the 2.19 TB limit and are
redundant structures (one copy is put at the beginning of the disk, one exact copy at
the end of the disk).

    Creating the storage pool (named san here) as a RAID-Z1 array is simply achieved with the following incantation (of course

    adapt to your case... here we use c7t0d0, c7t1d0, c7t2d0 and c7t3d0):

# zpool create san raidz c7t0d0 c7t1d0 c7t2d0 c7t3d0

    Now check status and say hello to your brand new SAN storage pool (notice it has been mounted with its name directly under

    the VFS root):

# zpool status san
  pool: san
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        san         ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            c7t0d0  ONLINE       0     0     0
            c7t1d0  ONLINE       0     0     0
            c7t2d0  ONLINE       0     0     0
            c7t3d0  ONLINE       0     0     0

errors: No known data errors

# df -h
Filesystem   Size  Used  Avail  Use%  Mounted on
(...)
san          8.1T   45K   8.1T    1%  /san

The above just says: everything is in order, the pool is fully functional (not in a DEGRADED state).

Torturing the SAN pool

Not mandatory, but at this point we will do some (evil) tests.

    Removal of one disk

What would happen if a disk should die? To see, simply remove one of the disks from its bay without doing anything else :-). The Solaris kernel should see that a drive has just been removed:

# dmesg
(...)
Jun 19 12:03:25 xxxxx genunix: [ID 408114 kern.info] /pci@0,0/pci1002,4393@11/disk@2,0 (sd3) removed

    Now check the array status:

# zpool status san
  pool: san


 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        san         ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            c7t0d0  ONLINE       0     0     0
            c7t1d0  ONLINE       0     0     0
            c7t2d0  ONLINE       0     0     0
            c7t3d0  ONLINE       0     0     0

errors: No known data errors
(...)

Nothing?... This is absolutely normal if no write operations occurred on the pool from the time you removed the hard drive. If you create some activity you will see the pool switch to a degraded state:

# dd if=/dev/random of=/san/test.dd count=1000
# zpool status san
  pool: san
 state: DEGRADED
status: One or more devices has been removed by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        san         DEGRADED     0     0     0
          raidz1-0  DEGRADED     0     0     0
            c7t0d0  ONLINE       0     0     0
            c7t1d0  ONLINE       0     0     0
            c7t2d0  REMOVED      0     0     0
            c7t3d0  ONLINE       0     0     0

errors: No known data errors

    Returning to normal operation

Now put the drive back in its bay and see what happens (wait a couple of seconds, the time for the drive to spin up):

# dmesg
(...)
Jun 19 12:23:12 xxxxxxx SATA device detected at port 2
Jun 19 12:23:12 xxxxxxx sata: [ID 663010 kern.info] /pci@0,0/pci1002,4393@11 :
Jun 19 12:23:12 xxxxxxx sata: [ID 761595 kern.info]  SATA disk device at port 2
Jun 19 12:23:12 xxxxxxx sata: [ID 846691 kern.info]  model WDC WD30EZRS-00J99B0
Jun 19 12:23:12 xxxxxxx sata: [ID 693010 kern.info]  firmware 80.00A80
Jun 19 12:23:12 xxxxxxx sata: [ID 163988 kern.info]  serial number WD-*************
Jun 19 12:23:12 xxxxxxx sata: [ID 594940 kern.info]  supported features:
Jun 19 12:23:12 xxxxxxx sata: [ID 981177 kern.info]  48-bit LBA, DMA, Native Command Queueing, SMART, SMART self-test
Jun 19 12:23:12 xxxxxxx sata: [ID 643337 kern.info]  SATA Gen2 signaling speed (3.0Gbps)
Jun 19 12:23:12 xxxxxxx sata: [ID 349649 kern.info]  Supported queue depth 32
Jun 19 12:23:12 xxxxxxx sata: [ID 349649 kern.info]  capacity = 5860533168 sectors

    Solaris now sees the drive again! What about the storage pool?

# zpool status san
  pool: san
 state: DEGRADED
status: One or more devices has been removed by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        san         DEGRADED     0     0     0
          raidz1-0  DEGRADED     0     0     0
            c7t0d0  ONLINE       0     0     0
            c7t1d0  ONLINE       0     0     0
            c7t2d0  REMOVED      0     0     0


            c7t3d0  ONLINE       0     0     0

errors: No known data errors

Still in degraded state; correct. ZFS does not take any initiative when a new drive is connected, and it is up to you to handle the operations. Let's start by asking for the list of the SATA ports on the motherboard (we can safely ignore sata0/5 in our case; our motherboard chipset has support for 6 SATA ports but only 5 of them have physical connectors on the motherboard PCB):

# cfgadm -a sata
Ap_Id                Type        Receptacle  Occupant      Condition
sata0/0::dsk/c7t0d0  disk        connected   configured    ok
sata0/1::dsk/c7t1d0  disk        connected   configured    ok
sata0/2              disk        connected   unconfigured  unknown
sata0/3::dsk/c7t3d0  disk        connected   configured    ok
sata0/4::dsk/c7t4d0  disk        connected   configured    ok
sata0/5              sata-port   empty       unconfigured  ok

To return to a normal state, just "configure" the drive; this is accomplished by:

# cfgadm -c configure sata0/2
# cfgadm -a sata
Ap_Id                Type        Receptacle  Occupant      Condition
sata0/0::dsk/c7t0d0  disk        connected   configured    ok
sata0/1::dsk/c7t1d0  disk        connected   configured    ok
sata0/2::dsk/c7t2d0  disk        connected   configured    ok
sata0/3::dsk/c7t3d0  disk        connected   configured    ok
sata0/4::dsk/c7t4d0  disk        connected   configured    ok

    What happens on the storage pool side?

# zpool status san
  pool: san
 state: ONLINE
  scan: resilvered 24.6M in 0h0m with 0 errors on Sun Jun 19 12:45:30 2011
config:

        NAME        STATE     READ WRITE CKSUM
        san         ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            c7t0d0  ONLINE       0     0     0
            c7t1d0  ONLINE       0     0     0
            c7t2d0  ONLINE       0     0     0
            c7t3d0  ONLINE       0     0     0

errors: No known data errors

Fully operational again! Note that the system took the initiative of re-silvering the storage pool for you :-) This is very quick here because the pool is nearly empty, but it can take several minutes or hours to complete. Here we just pushed the drive back in; ZFS is intelligent enough to scan what the drive contains and to know it has to attach the drive to the "san" storage pool. In the case where we replace the drive with a brand new blank one, we would have to explicitly tell ZFS to replace the drive (Solaris will resilver the array, but you will still see it in a degraded state until then). Assuming the old and new devices are assigned to the same drive bay (here giving c7t2d0):

# zpool replace san c7t2d0 c7t2d0

    Removal of two disks

What would happen if two disks die? Do not do this, but for the sake of demonstration we have removed two disks from their bays. When the second drive is removed, the following appears on the console (also visible with dmesg):

SUNW-MSG-ID: ZFS-8000-HC, TYPE: Error, VER: 1, SEVERITY: Major
EVENT-TIME: Sun Jun 19 13:19:44 EDT 2011
PLATFORM: To-be-filled-by-O.E.M., CSN: To-be-filled-by-O.E.M., HOSTNAME: uranium
SOURCE: zfs-diagnosis, REV: 1.0
EVENT-ID: 139ee67e-e294-6709-9225-bff20ea7418e
DESC: The ZFS pool has experienced currently unrecoverable I/O
      failures. Refer to http://sun.com/msg/ZFS-8000-HC for more information.
AUTO-RESPONSE: No automated response will be taken.
IMPACT: Read and write I/Os cannot be serviced.


REC-ACTION: Make sure the affected devices are connected, then run 'zpool clear'.

Any current I/O operation is suspended (and the processes waiting on them are put in uninterruptible sleep state)... Querying the pool status gives:

# zpool status san
  pool: san
 state: UNAVAIL
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://www.sun.com/msg/ZFS-8000-HC
  scan: resilvered 24.6M in 0h0m with 0 errors on Sun Jun 19 12:45:30 2011
config:

        NAME        STATE     READ WRITE CKSUM
        san         UNAVAIL      0     0     0  insufficient replicas
          raidz1-0  UNAVAIL      0     0     0  insufficient replicas
            c7t0d0  ONLINE       0     0     0
            c7t1d0  REMOVED      0     0     0
            c7t2d0  REMOVED      0     0     0
            c7t3d0  ONLINE       0     0     0

errors: 2 data errors, use '-v' for a list

Creating zvols

How will we use the storage pool? Well, this is not a critical question, although it remains important, because you can make a zvol grow on the fly (of course this makes sense only if the filesystem on it also has the capability to grow). For the demonstration we will use the following layout (sizes match the commands below):

san/os - 2.5 TB - Operating systems local mirror (several Linux, Solaris and *BSD)
san/home-host1 - 500 GB - Home directories of the other Linux boxes
san/test - 20 GB - Scrapable test data

To use the pool space in a clever manner, we will use a space allocation policy known as thin provisioning (also known as sparse volumes; notice the -s in the commands):

# zfs create -s -V 2.5t san/os
# zfs create -s -V 500g san/home-host1
# zfs create -s -V 20g san/test
# zfs list
(...)
san             3.10T  4.91T  44.9K  /san
san/home-host1   516G  5.41T  80.8K  -
san/os          2.58T  7.48T  23.9K  -
san/test        20.6G  4.93T  23.9K  -

Notice the last column in the last example: it shows no path (only a dash!). And doing ls -la in our SAN pool reports:

# ls -la /san
total 4
drwxr-xr-x  2 root root  2 2011-06-19 13:59 .
drwxr-xr-x 25 root root 27 2011-06-19 11:33 ..

This is absolutely correct; zvols are a bit different from standard datasets. If you peek at what lies under /dev you will notice a pseudo-directory named zvol. Let's see what it contains:

# ls -l /dev/zvol
total 0
drwxr-xr-x 3 root sys 0 2011-06-19 10:22 dsk
drwxr-xr-x 4 root sys 0 2011-06-19 10:34 rdsk

    Ah! Interesting, what lies beneath? Let's take the red pill and dive one level deeper (we choose arbitrarily rdsk, dsk will show

    similar results):


# ls -l /dev/zvol/rdsk
total 0
drwxr-xr-x 6 root sys 0 2011-06-19 10:34 rpool1
drwxr-xr-x 2 root sys 0 2011-06-19 14:04 san

rpool1 corresponds to the system pool (your Solaris installation itself lives in a zpool, so you can do snapshots and rollbacks with it!) and san is our brand new storage pool. What is inside the san directory?

# ls -l /dev/zvol/rdsk/san
lrwxrwxrwx 1 root root 0 2011-06-19 14:11 home-host1 -> ../../../..//devices/pseudo/zfs@0:3,raw
lrwxrwxrwx 1 root root 0 2011-06-19 14:11 os -> ../../../..//devices/pseudo/zfs@0:4,raw
lrwxrwxrwx 1 root root 0 2011-06-19 14:11 test -> ../../../..//devices/pseudo/zfs@0:5,raw

Wow! The three zvols we have created are seen just as if they were three physical disks, but indeed they lie in a storage pool created as a RAID-Z1 array. "Just like" really means what it says, i.e. you can do everything with them just as you would do with real disks (format won't see them however, because it just scans /dev/rdsk and not /dev/zvol/rdsk).

iSCSI will bring the cherry on the sundae!

Phase 3: Bringing a bit of iSCSI magic

Now the real fun begins!

Solaris 11 uses a new COMSTAR implementation which differs from the COMSTAR
implementation found in Solaris 10. This new version brings some slight changes, like not
supporting the shareiscsi property on zvols.

iSCSI concepts overview

Before going further, you must know iSCSI a bit, along with several fundamental concepts (it is not complicated but you need to get a precise idea of it). First, iSCSI is exactly what its name suggests: a protocol to carry SCSI commands over a network. In the TCP/IP world, iSCSI relies on TCP (and not UDP); thus, splitting/retransmission/reordering of iSCSI packets is transparently done by the magic of the TCP/IP stack. Should the network connection break, TCP, being a stateful protocol, will be aware of the situation. So no gambling on data integrity here: iSCSI disks are as reliable as if they were DAS (Directly Attached Storage) peripherals. Of course, using TCP does not avoid packet tampering and man-in-the-middle attacks, just like any other protocol relying on TCP.

Although iSCSI is a typical client-server architecture, the stakeholders are given the
name of nodes. A node can act either as a server or as a client, or can mix both roles (use
remote iSCSI disks and provide its own iSCSI disks to other iSCSI nodes).

Because your virtual SCSI cable is nothing more than a network stream, you can do whatever you want with it, starting with applying QoS (Quality of Service) policies to it in your routers, encrypting it, or sniffing it if you are curious about the details. The drawback for the end user of an iSCSI disk is the network speed: slow or overloaded networks will make I/O operations from/to iSCSI devices


extremely irritating... This is not necessarily an issue for our SAN box used in a home gigabit LAN with a few machines, but it will become an issue if you intend to share your iSCSI disks over the Internet or use remote iSCSI disks over the Internet.

    Initiator, target and LUN

It is important to figure out some concepts before going further (if you are used to SCSI, iSCSI will sound familiar). Here are some formal definitions coming from stmfadm(1M):

Initiator: a device responsible for issuing SCSI I/O commands to a SCSI target and logical unit.

Target: a device responsible for receiving SCSI I/O commands for a logical unit.

Logical unit number (LUN): a device within a target responsible for executing SCSI I/O commands.

The initiator and the target are respectively the client and the server sides: an iSCSI initiator establishes a connection to an iSCSI target (located on a remote iSCSI node). A target can contain one or more logical devices which execute the SCSI I/O commands.

Initiators and targets can:

be either pure software solutions (just like here) or hardware solutions (iSCSI HBAs, costing a couple of hundred dollars) with dedicated electronic boards to off-load the computer's CPUs. In the case of pure software initiators and targets, several open source/commercial solutions exist, like Open iSCSI (http://www.open-iscsi.org) (initiator), iSCSI Enterprise Target (http://iscsitarget.sourceforge.net) (target), COMSTAR (target, Solaris specific) and several others.

use authentication mechanisms (CHAP). It is possible to do either one-way authentication (the target authenticates the initiator) or two-way authentication (the initiator and the target mutually authenticate each other).

To make a little analogy: imagine you are a king (iSCSI "client" node) who needs to establish communication channels with a friendly warlord (LUN) in a remote castle (iSCSI "server" node):

1. You send a messenger (initiator) who travels through dangerous lands (network) for you.

2. Your messenger knocks at the door D1 (target) of the remote castle (server).

3. The guard at the door tries to determine whether your messenger has really been sent by you (CHAP authentication).

4. Once authenticated, your messenger, who only knows his interlocutor by a nickname (LUN), says "I need to speak to the someone called here 'the one-eyed'". The guard knows the real name (physical device) of this famous 'one-eyed' and can put your messenger in touch with him.

A bit schematic, but things work exactly in that way. Oh, something the previous analogy does not illustrate: aside from emulating a remote SCSI disk, some iSCSI implementations have the capability to forward the command stream to a directly attached SCSI storage device in total transparency.

    IQN vs EUI

Each initiator and target is given a unique identifier, just like a World Wide Name (WWN) in the Fibre Channel world. This identifier can adopt three different formats:

iSCSI Qualified Name (IQN): described in RFC 3720 (http://tools.ietf.org/html/rfc3720), an IQN identifier is composed of several parts, detailed below.

Extended-unique identifier (EUI): the literal eui followed by 16 hexadecimal characters (an IEEE EUI-64 identifier).

T11 Network Address Authority (NAA): described in RFC 3980 (http://tools.ietf.org/html/rfc3980), NAA identifiers bring iSCSI identifiers compatible with the naming conventions used in Fibre Channel (FC) and Serial Attached SCSI (SAS) storage technologies.

IQN remains probably the most commonly seen identifier format. An IQN identifier can be up to 223 ASCII characters long and is composed of the following four parts:

the literal iqn

the date the naming authority took ownership of its Internet domain name

the Internet domain (in reversed order) of the naming authority

optional: a colon ":" prefixing an arbitrary storage designation


For example, all COMSTAR IQN identifiers start with iqn.1986-03.com.sun whereas Open iSCSI IQNs start with iqn.2005-03.org.open-iscsi. The following are all valid IQNs:

    iqn.2005-03.org.open-iscsi:d8d1606ef0cb

    iqn.2005-03.org.open-iscsi:storage-01

    iqn.1986-03.com.sun:01:0003ba2d0f67.46e47e5d

In the case of COMSTAR, the optional part of the IQN can have two forms (http://wikis.sun.com/display/StorageDev/IQN+Name+Format):

:01:<mac>.<timestamp>[.<extension>] (used by iSCSI initiators)

:02:<uuid>.<target name> (used by iSCSI targets)

Where:

<mac>: the 12 lowercase hexadecimal characters composing a 6 bytes MAC address (MAC = EFGHIJ --> 45464748494a)

<timestamp>: the lowercase hexadecimal representation of the number of seconds elapsed since January 1, 1970 at the time the name is created

<extension>: an extension that provides a name that is meant to be meaningful to users.

<uuid>: a unique identifier consisting of 36 hexadecimal characters in the format xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx

<target name>: a name used by the target.
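For instance, decomposing the Sun-style initiator IQN given in the examples above (the split is our own reading of the format just described, not an official notation):

iqn . 1986-03 . com.sun : 01 : 0003ba2d0f67 . 46e47e5d
      (date)   (domain)        (<mac>)      (<timestamp>)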

The following has been tested under Solaris 11 Express; various tutorials around give
explanations for Solaris 10. Both versions have important differences (e.g. Solaris 11
no longer supports the shareiscsi property and uses a new COMSTAR implementation).

    Installing pre-requisites

    The first step is to install the iSCSI COMSTAR packages (they are not present in Solaris 11 Express, you must install them):

# pkg install iscsi/target
           Packages to install:     1
       Create boot environment:    No
            Services to restart:     1
DOWNLOAD                            PKGS       FILES    XFER (MB)
Completed                            1/1       14/14      0.2/0.2

PHASE                                   ACTIONS
Install Phase                             48/48

PHASE                                     ITEMS
Package State Update Phase                  1/1
Image State Update Phase                    2/2

Now enable the iSCSI Target service; if you have a look at its dependencies, you will see it depends on the SCSI Target Mode Framework (STMF):

# svcs -l iscsi/target
fmri         svc:/network/iscsi/target:default
name         iscsi target
enabled      false
state        disabled
next_state   none
state_time   June 19, 2011 06:42:56 PM EDT


restarter    svc:/system/svc/restarter:default
dependency   require_any/error svc:/milestone/network (online)
dependency   require_all/none svc:/system/stmf:default (disabled)

# svcadm -r enable iscsi/target
# svcs -l iscsi/target
fmri         svc:/network/iscsi/target:default
name         iscsi target
enabled      true
state        online
next_state   none
state_time   June 19, 2011 06:56:57 PM EDT
logfile      /var/svc/log/network-iscsi-target:default.log
restarter    svc:/system/svc/restarter:default
dependency   require_any/error svc:/milestone/network (online)
dependency   require_all/none svc:/system/stmf:default (online)

Are the services online? Good! Notice the -r: it just asks to "enable all of the dependencies as well".

    Creating the Logical Units (LU)

    You have two ways of creating LUs:

    stmfadm (used here, provides more control)

sbdadm

The very first step of exporting our zvols is to create their corresponding LUs. This is accomplished by:

# stmfadm create-lu /dev/zvol/rdsk/san/os
Logical unit created: 600144F0200ACB0000004E0505940007
# stmfadm create-lu /dev/zvol/rdsk/san/home-host1
Logical unit created: 600144F0200ACB0000004E05059A0008
# stmfadm create-lu /dev/zvol/rdsk/san/test
Logical unit created: 600144F0200ACB0000004E0505A40009

Now, just for the sake of demonstration, we will check the result with two different commands:

# sbdadm list-lu

Found 3 LU(s)

              GUID                    DATA SIZE      SOURCE
--------------------------------  ---------------  -----------------------------
600144f0200acb0000004e0505940007    2748779069440  /dev/zvol/rdsk/san/os
600144f0200acb0000004e05059a0008     536870912000  /dev/zvol/rdsk/san/home-host1
600144f0200acb0000004e0505a40009      21474836480  /dev/zvol/rdsk/san/test

# stmfadm list-lu
LU Name: 600144F0200ACB0000004E0505940007
LU Name: 600144F0200ACB0000004E05059A0008
LU Name: 600144F0200ACB0000004E0505A40009

# stmfadm list-lu -v
LU Name: 600144F0200ACB0000004E0505940007
    Operational Status: Online
    Provider Name     : sbd
    Alias             : /dev/zvol/rdsk/san/os
    View Entry Count  : 1
    Data File         : /dev/zvol/rdsk/san/os
    Meta File         : not set
    Size              : 2748779069440
    Block Size        : 512
    Management URL    : not set
    Vendor ID         : SUN
    Product ID        : COMSTAR
    Serial Num        : not set
    Write Protect     : Disabled


    Writeback Cache   : Enabled
    Access State      : Active
LU Name: 600144F0200ACB0000004E05059A0008
    Operational Status: Online
    Provider Name     : sbd
    Alias             : /dev/zvol/rdsk/san/home-host1
    View Entry Count  : 1
    Data File         : /dev/zvol/rdsk/san/home-host1
    Meta File         : not set
    Size              : 536870912000
    Block Size        : 512
    Management URL    : not set
    Vendor ID         : SUN
    Product ID        : COMSTAR
    Serial Num        : not set
    Write Protect     : Disabled
    Writeback Cache   : Enabled
    Access State      : Active
LU Name: 600144F0200ACB0000004E0505A40009
    Operational Status: Online
    Provider Name     : sbd
    Alias             : /dev/zvol/rdsk/san/test
    View Entry Count  : 1
    Data File         : /dev/zvol/rdsk/san/test
    Meta File         : not set
    Size              : 21474836480
    Block Size        : 512
    Management URL    : not set
    Vendor ID         : SUN
    Product ID        : COMSTAR
    Serial Num        : not set
    Write Protect     : Disabled
    Writeback Cache   : Enabled
    Access State      : Active

You can change some properties, like the alias name, if you wish:

# stmfadm modify-lu -p alias=operating-systems 600144F0200ACB0000004E0505940007
# stmfadm list-lu -v 600144F0200ACB0000004E0505940007
LU Name: 600144F0200ACB0000004E0505940007
    Operational Status: Online
    Provider Name     : sbd
    Alias             : operating-systems
    View Entry Count  : 0
    Data File         : /dev/zvol/rdsk/san/os
    Meta File         : not set
    Size              : 2748779069440
    Block Size        : 512
    Management URL    : not set
    Vendor ID         : SUN
    Product ID        : COMSTAR
    Serial Num        : not set
    Write Protect     : Disabled
    Writeback Cache   : Enabled
    Access State      : Active

    GUID is the key you will use to manipulate the LUs (if you want to delete one for example).
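For example, deleting a LU by its GUID would look like this (a sketch; the GUID is the test LU created above, do not run it if you want to keep that LU):

# stmfadm delete-lu 600144F0200ACB0000004E0505A40009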

    Mapping concepts

To be seen by iSCSI initiators, we must do an operation called "mapping". To map our brand new LUs we can adopt two strategies:

Make a LU visible to ALL iSCSI initiators on EVERY port (simple to set up but poor in terms of security)

Make a LU visible to certain iSCSI initiators via certain ports (selective mapping)

Under COMSTAR, a LU mapping is done via something called a view. A view is a table consisting of several view entries, each one of those entries being the association of a unique {target group (tg), initiator group (ig), Logical Unit Number (LUN)} triplet to a Logical Unit (of course, a single LU can be used by several triplets). The Solaris SCSI target mode framework command line utilities are smart enough to detect and refuse duplicate {tg,ig,lun} entries, so you need not worry about them. The terminology used in the COMSTAR manual pages is a bit confusing because it uses "host group" instead of "initiator group", but one and the other are the same if you read stmfadm(1M) carefully.

Is a target group a set of several targets, and a host (initiator) group a set of initiators? Yes, absolutely, with the following nuance: an initiator cannot be a member of two host groups. The LUN is just an arbitrary ordinal reference to a LU.


How will the view work? Let's say we have a view defined like this (this is technically not exact, because we use targets and initiators and not groups of them):

Target                                     Initiator                     LUN  GUID of Logical Unit
-----------------------------------------  ----------------------------  ---  --------------------------------
iqn.2015-01.com.myiscsi.server01:storageC  iqn.2015-01.com.myorg.host04  13   600144F0200ACB0000004E05059A0008
iqn.2015-01.com.myiscsi.server01:storageC  iqn.2015-01.com.myorg.host04  17   600144F0200ACB0000004E0505940007
iqn.2015-01.com.myiscsi.server03:storageA  iqn.2015-01.com.myorg.host01  12   600144F0200ACB0000004E05059A0008

Internally, this iSCSI server knows what the real devices (or zvols) hidden behind 600144F0200ACB0000004E0505940007 and 600144F0200ACB0000004E05059A0008 are (see the previous paragraph).

Suppose the initiator iqn.2015-01.com.myorg.host04 establishes a connection to the target iqn.2015-01.com.myiscsi.server01:storageC: it will be presented 2 LUNs (13 and 17). On the iSCSI client machine each one of those will appear as a distinct SCSI drive (Linux would show them as, for example, /dev/sdg and /dev/sdh, whereas Solaris would present their logical paths as /dev/c4t1d0 and /dev/c4t2d0). However, the same initiator connecting to the target iqn.2015-01.com.myiscsi.server03:storageA will have no LUNs in its line of sight.

    Setting up the LU mapping

We start by creating a target; this is simply accomplished by:

# itadm create-target
Target iqn.1986-03.com.sun:02:2e5faacb-4bdf-4f7f-e643-ebc8bf856603 successfully created
# stmfadm list-target -v
Target: iqn.1986-03.com.sun:02:2e5faacb-4bdf-4f7f-e643-ebc8bf856603
    Operational Status: Online
    Provider Name     : iscsit
    Alias             : -
    Protocol          : iSCSI
    Sessions          : 0

The trailing UUIDs will be different for you.

Second, we must create the target group (view entries are composed of target groups and initiator groups), arbitrarily named sbtg-01:

# stmfadm create-tg sbtg-01

    Third, we add the target to the brand new target group:

# stmfadm add-tg-member -g sbtg-01 iqn.1986-03.com.sun:02:2e5faacb-4bdf-4f7f-e643-ebc8bf856603
stmfadm: STMF target must be offline

Oops... to be added, a target must first be put offline:

# stmfadm offline-target iqn.1986-03.com.sun:02:2e5faacb-4bdf-4f7f-e643-ebc8bf856603
# stmfadm add-tg-member -g sbtg-01 iqn.1986-03.com.sun:02:2e5faacb-4bdf-4f7f-e643-ebc8bf856603
# stmfadm online-target iqn.1986-03.com.sun:02:2e5faacb-4bdf-4f7f-e643-ebc8bf856603
# stmfadm list-tg -v
Target Group: sbtg-01
    Member: iqn.1986-03.com.sun:02:2e5faacb-4bdf-4f7f-e643-ebc8bf856603


Now comes the turn of the initiator group (aka the host group; the terminology is a bit confusing). To keep things simple we will use two initiators:

1. One located on the SAN Box itself
2. One located on a remote Funtoo Linux box

The first initiator is already present on the SAN box because it was automatically created when the iSCSI packages were installed. To see it:

# iscsiadm list initiator-node
Initiator node name: iqn.1986-03.com.sun:01:809a71be02ff.4df93a1e
Initiator node alias: uranium
    Login Parameters (Default/Configured):
        Header Digest: NONE/-
        Data Digest: NONE/-
    Authentication Type: NONE
    RADIUS Server: NONE
    RADIUS Access: disabled
    Tunable Parameters (Default/Configured):
        Session Login Response Time: 60/-
        Maximum Connection Retry Time: 180/-
        Login Retry Time Interval: 60/-
    Configured Sessions: 1

It is possible to create additional initiators on Solaris with itadm create-initiator.
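A minimal sketch, with a hypothetical IQN (this mostly becomes useful later, when you want to attach per-initiator CHAP parameters):

# itadm create-initiator iqn.2011-06.lan.mydomain.host02
# itadm list-initiator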

The IQN of the second initiator (Open iSCSI is used here; the location can differ in your case if you use something else) can be found on the Funtoo box in the file /etc/iscsi/initiatorname.iscsi:

# cat /etc/iscsi/initiatorname.iscsi | grep -e "InitiatorName"
InitiatorName=iqn.2011-06.lan.mydomain.worsktation01:openiscsi-a902bcc1d45e4795580c06b1d66b2eaf

    To create a group encompassing both initiators (again the name sbhg-01 is arbitrary):

# stmfadm create-hg sbhg-01
# stmfadm add-hg-member -g sbhg-01 iqn.1986-03.com.sun:01:809a71be02ff.4df93a1e iqn.2011-06.lan.mydomain.worsktation01:openiscsi-a902bcc1d45e4795580c06b1d66b2eaf
# stmfadm list-hg -v
Host Group: sbhg-01
    Member: iqn.1986-03.com.sun:01:809a71be02ff.4df93a1e
    Member: iqn.2011-06.lan.mydomain.worsktation01:openiscsi-a902bcc1d45e4795580c06b1d66b2eaf

In this simple case we have only one initiator group and one target group, so we will expose all of our LUs through them:

# stmfadm add-view -n 10 -t sbtg-01 -h sbhg-01 600144F0200ACB0000004E0505940007
# stmfadm add-view -n 11 -t sbtg-01 -h sbhg-01 600144F0200ACB0000004E05059A0008
# stmfadm add-view -n 12 -t sbtg-01 -h sbhg-01 600144F0200ACB0000004E0505A40009

If -n is not specified, stmfadm will automatically assign a LU number (LUN). Here again, 10, 11 and 12 are arbitrary numbers (we didn't start at 0, just for the sake of the demonstration).

    Checking on one LU gives:

# stmfadm list-view -l 600144F0200ACB0000004E0505940007


View Entry: 0
    Host group   : sbhg-01
    Target group : sbtg-01
    LUN          : 10

It is absolutely normal to get a single result here, because a single entry concerning that LU has been created. If the LU were referenced in another {tg,ig,lun} triplet, you would have seen it twice.

Good news: everything is in order! The bad news: we have not finished (yet) :-)

    Restricting the listening interface of a target (portals and portal groups)

As you may have noticed in the previous paragraph, once created, a target listens for incoming connections on ALL available interfaces, and this is not always suitable. Imagine your server has a 1 Gbit/s NIC and a 10 Gbit/s NIC: you may want to bind certain targets to the 1 Gbit/s NIC and the others to the 10 Gbit/s one. Is it possible to gain control over the NIC a target binds to? The answer is: YES!

This is accomplished via target portal groups. The mechanics are very similar to what was done before: we create a target portal group (named tpg-localhost here):

# itadm create-tpg tpg-localhost 127.0.0.1:3625
# itadm list-tpg -v
TARGET PORTAL GROUP           PORTAL COUNT
tpg-localhost                 1
    portals: 127.0.0.1:3625
# itadm modify-target -t tpg-localhost iqn.1986-03.com.sun:02:a02da0f0-195f-c9ea-ce9a-82ec96fa36cb
# itadm list-target -v iqn.1986-03.com.sun:02:a02da0f0-195f-c9ea-ce9a-82ec96fa36cb
TARGET NAME                                                 STATE   SESSIONS
iqn.1986-03.com.sun:02:a02da0f0-195f-c9ea-ce9a-82ec96fa36cb online  0
    alias:            -
    auth:             none (defaults)
    targetchapuser:   -
    targetchapsecret: unset
    tpg-tags:         tpg-localhost = 2

    Notice what lies in the field tpg-tags...
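If you later change your mind, the binding can be undone by deleting the portal group; a sketch (the -f flag should be needed here because itadm refuses to delete a portal group still associated with a target):

# itadm delete-tpg -f tpg-localhost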

    Testing from the SAN Box

All the bits are in place; now it is time to do some tests. Before an initiator can exchange I/O commands with a LUN located in a target, it must learn what the given target contains. This process is known as discovery.

Static discovery: this one is not really a discovery mechanism, it is just a manual binding of an iSCSI target to a particular NIC (IP address + TCP port)

SendTargets (dynamic mechanism): SendTargets is a simple iSCSI discovery protocol integrated into the iSCSI specification (see appendix D of RFC 3720 (http://www.ietf.org/rfc/rfc3720.txt)); a configuration sketch follows this list

Internet Storage Name Service - iSNS (dynamic mechanism): defined in RFC 4171 (http://www.ietf.org/rfc/rfc4171.txt), iSNS is a bit more sophisticated than SendTargets (the protocol can handle state modifications of an iSCSI node, such as when a target goes offline). Quoting RFC 4171 (http://www.ietf.org/rfc/rfc4171.txt), "iSNS facilitates a seamless integration of IP and Fibre Channel networks due to its ability to emulate Fibre Channel fabric services and to manage both iSCSI and Fibre Channel devices."
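Although we will not use it in this lab, configuring SendTargets on the Solaris initiator side would look like the following sketch (the discovery address is assumed to be one of the SAN box portals):

# iscsiadm modify discovery --sendtargets enable
# iscsiadm add discovery-address 192.168.1.13:3260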

As a first move, we will use static discovery to test the zvols' accessibility through a loopback connection from the SAN box to itself. First we enable static discovery:

# iscsiadm modify discovery --static enable
# iscsiadm list discovery
Discovery:
    Static: enabled
    Send Targets: disabled
    iSNS: disabled

    Second, we manually trace a path to the "remote" iSCSI target located on the SAN box itself:

    Box used via iSCSI - Funtoo Linux http://www.funtoo.org/wiki/SAN_Box_used_v

    34 6/29/2011

  • 7/27/2019 SAN Box Used via iSCSI - Funtoo Linux

    24/34

# iscsiadm add static-config iqn.1986-03.com.sun:02:2e5faacb-4bdf-4f7f-e643-ebc8bf856603,127.0.0.1

A bit silent at first glance, but let's be curious and check what the kernel said at the exact moment the static discovery configuration was entered:

Jun 24 17:53:38 **** scsi: [ID 583861 kern.info] sd10 at scsi_vhci0: unit-address g600144f0200acb0000004e0505940007:f_tpgs
Jun 24 17:53:38 **** genunix: [ID 936769 kern.info] sd10 is /scsi_vhci/disk@g600144f0200acb0000004e0505940007
Jun 24 17:53:38 **** genunix: [ID 408114 kern.info] /scsi_vhci/disk@g600144f0200acb0000004e0505940007 (sd10) online
Jun 24 17:53:38 **** genunix: [ID 483743 kern.info] /scsi_vhci/disk@g600144f0200acb0000004e0505940007 (sd10) multipath stat
Jun 24 17:53:38 **** scsi: [ID 583861 kern.info] sd11 at scsi_vhci0: unit-address g600144f0200acb0000004e05059a0008:f_tpgs
Jun 24 17:53:38 **** genunix: [ID 936769 kern.info] sd11 is /scsi_vhci/disk@g600144f0200acb0000004e05059a0008
Jun 24 17:53:38 **** genunix: [ID 408114 kern.info] /scsi_vhci/disk@g600144f0200acb0000004e05059a0008 (sd11) online
Jun 24 17:53:38 **** genunix: [ID 483743 kern.info] /scsi_vhci/disk@g600144f0200acb0000004e05059a0008 (sd11) multipath stat
Jun 24 17:53:38 **** scsi: [ID 583861 kern.info] sd12 at scsi_vhci0: unit-address g600144f0200acb0000004e0505a40009:f_tpgs
Jun 24 17:53:38 **** genunix: [ID 936769 kern.info] sd12 is /scsi_vhci/disk@g600144f0200acb0000004e0505a40009
Jun 24 17:53:38 **** genunix: [ID 408114 kern.info] /scsi_vhci/disk@g600144f0200acb0000004e0505a40009 (sd12) online
Jun 24 17:53:38 **** genunix: [ID 483743 kern.info] /scsi_vhci/disk@g600144f0200acb0000004e0505a40009 (sd12) multipath stat

Wow, a lot of useful information here:

1. it says that the target iqn.1986-03.com.sun:02:2e5faacb-4bdf-4f7f-e643-ebc8bf856603 is online
2. it shows the 3 LUNs (disks) for that target (did you notice the numbers 10, 11 and 12? They are the same ones we used when defining the view)
3. it says the "multipath" is in a "degraded" state; at this point this is absolutely normal, as we have only one path to the 3 "remote" disks

And the ultimate test: what would format say? Test it!

# format
Searching for disks... done

c0t600144F0200ACB0000004E0505940007d0: configured with capacity of 2560.00GB

AVAILABLE DISK SELECTIONS:
    0. c0t600144F0200ACB0000004E0505A40009d0
       /scsi_vhci/disk@g600144f0200acb0000004e0505a40009
    1. c0t600144F0200ACB0000004E05059A0008d0
       /scsi_vhci/disk@g600144f0200acb0000004e05059a0008
    2. c0t600144F0200ACB0000004E0505940007d0
       /scsi_vhci/disk@g600144f0200acb0000004e0505940007
    3. c7t0d0
       /pci@0,0/pci1002,4393@11/disk@0,0
    4. c7t1d0
       /pci@0,0/pci1002,4393@11/disk@1,0
    5. c7t2d0
       /pci@0,0/pci1002,4393@11/disk@2,0
    6. c7t3d0
       /pci@0,0/pci1002,4393@11/disk@3,0
    7. c7t4d0
       /pci@0,0/pci1002,4393@11/disk@4,0
Specify disk (enter its number):

    Perfect! You can try to select a disk then partition it if you wish.

When selecting an iSCSI disk, you will have to use the fdisk option first if Solaris complains with: "WARNING - This disk may be in use by an application that has modified the fdisk table. Ensure that this disk is not currently in use before proceeding to use fdisk." For some reason (not related to thin provisioning), some iSCSI drives appeared with a corrupted partition table.


Now we have a functional Solaris box with several emulated volumes accessed just as if they were DAS (Direct Attached Storage) SCSI hard disks. Now the $100 question: "Once a filesystem has been created on the remote iSCSI volume, can I mount it from several hosts of my network?". The answer is: yes you can, but only one of them may have the volume mounted read-write if you intend to use a traditional (ZFS, UFS, EXT2/3/4, Reiser, BTRFS, JFS...) filesystem on it. Things are different if you intend to use the remote iSCSI disks for a cluster and put a filesystem like GFS/OCFS2 or Lustre on them. Cluster-dedicated filesystems like GFS/OCFS2/Lustre are designed from the ground up to support concurrent read-write access (distributed locking) from several hosts and are smart enough to avoid data corruption.

    Just to be sure you are aware:

    Do not mount an iSCSI remote disk in read-write mode from several hosts unless you

    intend to use the disk with a filesystem dedicated to clusters like GFS/OCFS2 or

Lustre. Not respecting this principle will corrupt your data.

If you need to share the content of your newly created volume between several hosts of your network, the safest way is to connect to the iSCSI volume from one of your hosts and then configure an NFS share on that host. An alternative strategy is to set, on the SAN Box, the sharenfs property (zfs set sharenfs=on /path/of/zvol) on the zvol and mount this NFS share from all of your client hosts. This technique supposes you have formatted the zvol with a filesystem readable by Solaris (UFS or ZFS). Remember: UFS is not endianness neutral; a UFS filesystem created on a SPARC machine won't be readable on an x86 machine and vice-versa. NFS has limited interest in the present context, however :-)
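For completeness, a minimal sketch of the first strategy on the Linux host that mounted the iSCSI volume (the path and network below are hypothetical):

# cat /etc/exports
/mnt/iscsi-volume 192.168.1.0/24(rw,sync,no_subtree_check)
# exportfs -ra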

Phase 4: Mounting a remote volume from a Funtoo Linux box

You have at least two choices for setting up an iSCSI initiator:

sys-block/iscsi-initiator-core-tools
sys-block/open-iscsi (use at least version 2.0.872; it contains several bug fixes and includes the changes required for use with Linux 2.6.39 and Linux 3.0). We will use this alternative here.

Quoting the Open iSCSI README (http://www.open-iscsi.org/docs/README), Open iSCSI is composed of two parts:

some kernel modules (included in the Linux kernel sources)
several userland bits:
    a management tool (iscsiadm) for the persistent Open iSCSI database
    a daemon (iscsid) that handles all of the initiator iSCSI magic in the background for you; its role is to implement the control path of the iSCSI protocol, plus some management facilities like automatically restarting discovery at startup based on the contents of the persistent iSCSI database
    an IQN identifier generator (iscsi-iname); a usage sketch follows this list
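A quick sketch of the IQN generator (the prefix is passed with -p; the generated suffix shown here is purely illustrative since it is random):

# iscsi-iname -p iqn.2011-06.lan.mydomain
iqn.2011-06.lan.mydomain:9f2c4a8b7d3e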

We suppose your Funtoo box is correctly set up and has operational networking. The very first thing to do is to reconfigure your Linux kernel to enable iSCSI:

under SCSI low-level drivers, activate iSCSI Initiator over TCP/IP (CONFIG_ISCSI_TCP). This is the Open iSCSI kernel part
under Cryptographic options, activate Cryptographic API; and under Library routines, CRC32c (Castagnoli, et al) Cyclic Redundancy-Check should already have been enabled (a sample .config fragment follows)
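As a reference, a hypothetical .config fragment matching those menu entries (option names as found in kernels of that era; verify them against your own kernel version):

CONFIG_ISCSI_TCP=m        # SCSI low-level drivers -> iSCSI Initiator over TCP/IP
CONFIG_CRYPTO=y           # Cryptographic options -> Cryptographic API
CONFIG_CRYPTO_CRC32C=y    # CRC32c used for iSCSI header/data digests
CONFIG_LIBCRC32C=y        # Library routines -> CRC32c Cyclic Redundancy-Check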


Open iSCSI does not use the term node as defined by the iSCSI RFC, where a node is a single iSCSI initiator or target. Open iSCSI uses the term node to refer to a portal on a target, so tools like iscsiadm require that the --targetname and --portal arguments be used when in node mode.

Parameters of the Open iSCSI initiator lie in the /etc/iscsi directory:

# ls -l /etc/iscsi
total 14
drwxr-xr-x 1 root root    26 Jun 24 22:09 ifaces
-rw-r--r-- 1 root root  1371 Jun 12 21:03 initiatorname.iscsi
-rw-r--r-- 1 root root  1282 Jun 24 22:08 initiatorname.iscsi.example
-rw------- 1 root root 11242 Jun 24 22:08 iscsid.conf
drw------- 1 root root   236 Jun 13 00:48 nodes
drw------- 1 root root    70 Jun 13 00:48 send_targets

ifaces: used to bind the Open iSCSI initiator to a specific NIC (created for you by iscsid on its first execution)

initiatorname.iscsi.example: stores a template to configure the name (IQN) of the iSCSI initiator (copy it as initiatorname.iscsi if the latter does not exist)

iscsid.conf: various configuration parameters (leave the defaults for now)

The following directories store what is called the persistent database in Open iSCSI terminology:

nodes (aka the nodes database): stores node connection parameters per remote iSCSI target and portal (created for you; you can then change the values)

send_targets (aka the discovery database): stores what the iscsi daemon discovers using a discovery protocol, per portal and remote iSCSI target. Most of the files there are symlinks to the files lying in the /etc/iscsi/nodes directory, plus a file named st_config which stores various parameters.

Out of the box, you don't have to do heavy tweaking. Just make sure that initiatorname.iscsi exists and contains the exact same initiator identifier you specified when defining the Logical Units (LUs) mapping.

    The next step consists of starting iscsid:

# /etc/init.d/iscsid start
iscsid | * Checking open-iSCSI configuration ...
iscsid | * Loading iSCSI modules ...
iscsid | *   Loading libiscsi ...                                   [ ok ]
iscsid | *   Loading scsi_transport_iscsi ...                       [ ok ]
iscsid | *   Loading iscsi_tcp ...                                  [ ok ]
iscsid | * Starting iscsid ...
iscsid | * Setting up iSCSI targets ...
iscsid |   iscsiadm: No records found!                              [ !! ]

At this point iscsid complains about not having any records about iSCSI targets in its persistent database; this is absolutely normal.

What will the Funtoo box see when initiating a discovery session on the SAN box? Simple, just ask :-)

# iscsiadm -m discovery -t sendtargets -p 192.168.1.14
192.168.1.14:3260,1 iqn.1986-03.com.sun:02:2e5faacb-4bdf-4f7f-e643-ebc8bf856603


192.168.1.13:3260,1 iqn.1986-03.com.sun:02:2e5faacb-4bdf-4f7f-e643-ebc8bf856603
[2607:fa48:****:****:****:****:****:****]:3260,1 iqn.1986-03.com.sun:02:2e5faacb-4bdf-4f7f-e643-ebc8bf856603

Here we get 3 results because the connection listener used by the remote target listens on the two NICs of the SAN box using its IPv4 stack, and also on a public IPv6 address (numbers masked for privacy reasons). In that case, querying one of the other addresses (192.168.1.13 or 2607:fa48:****:****:****:****:****:****) would give identical results. iscsiadm will record the returned information in its persistent database. The ",1" in the above results is just a target ordinal number; it is not related to the LUNs living inside the target.
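To see what was actually recorded in the nodes database, you can list the known nodes; a sketch (output abbreviated):

# iscsiadm -m node
192.168.1.13:3260,1 iqn.1986-03.com.sun:02:2e5faacb-4bdf-4f7f-e643-ebc8bf856603
192.168.1.14:3260,1 iqn.1986-03.com.sun:02:2e5faacb-4bdf-4f7f-e643-ebc8bf856603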

We have used the SendTargets discovery protocol; it is however possible to use iSNS or static discovery (the latter is called a custom iSCSI portal in Open iSCSI terminology).

Now the most exciting part! We will initiate an iSCSI session on the remote target ("logging in" to the remote target). Since we didn't activate CHAP authentication on the remote target, the process is straightforward:

# iscsiadm -m node -T iqn.1986-03.com.sun:02:2e5faacb-4bdf-4f7f-e643-ebc8bf856603 -p 192.168.1.13 --login
Logging in to [iface: default, target: iqn.1986-03.com.sun:02:2e5faacb-4bdf-4f7f-e643-ebc8bf856603, portal: 192.168.1.13,32
Login to [iface: default, target: iqn.1986-03.com.sun:02:2e5faacb-4bdf-4f7f-e643-ebc8bf856603, portal: 192.168.1.13,3260] s

You must use the portal reference exactly as shown by the discovery command above, because Open iSCSI recorded it and will look it up in its persistent database. Using, for example, a hostname where an IP address was recorded will make Open iSCSI complain ("iscsiadm: no records found!").

    The interesting part is in the system log:

# dmesg
...
[263902.476582] Loading iSCSI transport class v2.0-870.
[263902.486357] iscsi: registered transport (tcp)
[263939.655539] scsi7 : iSCSI Initiator over TCP/IP
[263940.163643] scsi 7:0:0:10: Direct-Access     SUN      COMSTAR          1.0  PQ: 0 ANSI: 5
[263940.163772] sd 7:0:0:10: Attached scsi generic sg4 type 0
[263940.165029] scsi 7:0:0:11: Direct-Access     SUN      COMSTAR          1.0  PQ: 0 ANSI: 5
[263940.165129] sd 7:0:0:11: Attached scsi generic sg5 type 0
[263940.165482] sd 7:0:0:10: [sdc] 5368709120 512-byte logical blocks: (2.74 TB/2.50 TiB)
[263940.166249] sd 7:0:0:11: [sdd] 1048576000 512-byte logical blocks: (536 GB/500 GiB)
[263940.166254] scsi 7:0:0:12: Direct-Access     SUN      COMSTAR          1.0  PQ: 0 ANSI: 5
[263940.166355] sd 7:0:0:12: Attached scsi generic sg6 type 0
[263940.167018] sd 7:0:0:10: [sdc] Write Protect is off
[263940.167021] sd 7:0:0:10: [sdc] Mode Sense: 53 00 00 00
[263940.167474] sd 7:0:0:11: [sdd] Write Protect is off
[263940.167476] sd 7:0:0:11: [sdd] Mode Sense: 53 00 00 00
[263940.167488] sd 7:0:0:12: [sde] 41943040 512-byte logical blocks: (21.4 GB/20.0 GiB)
[263940.167920] sd 7:0:0:10: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[263940.168453] sd 7:0:0:11: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[263940.169133] sd 7:0:0:12: [sde] Write Protect is off
[263940.169137] sd 7:0:0:12: [sde] Mode Sense: 53 00 00 00
[263940.170074] sd 7:0:0:12: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[263940.171402] sdc: unknown partition table
[263940.172226] sdd: sdd1
[263940.174295] sde: sde1
[263940.175320] sd 7:0:0:10: [sdc] Attached SCSI disk
[263940.175991] sd 7:0:0:11: [sdd] Attached SCSI disk
[263940.177275] sd 7:0:0:12: [sde] Attached SCSI disk


    If you don't see the logical units appearing, check on the remote side that you have added

    the initiator to the host group when defining the view entries.
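A quick way to confirm that the login succeeded is to list the active sessions; a sketch (the session number in brackets is illustrative):

# iscsiadm -m session
tcp: [1] 192.168.1.13:3260,1 iqn.1986-03.com.sun:02:2e5faacb-4bdf-4f7f-e643-ebc8bf856603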

Something we have not dived into so far: if you explore the /dev directory of your Funtoo box, you will notice a sub-directory named disk. The content of this directory is not maintained by Open iSCSI but by udev, and describes what the latter knows about all "SCSI" (iSCSI, SATA, SCSI...) disks:

# ls -l /dev/disk
total 0
drwxr-xr-x 2 root root 700 Jun 25 09:24 by-id
drwxr-xr-x 2 root root  60 Jun 22 04:04 by-label
drwxr-xr-x 2 root root 320 Jun 25 09:24 by-path
drwxr-xr-x 2 root root 140 Jun 22 04:04 by-uuid

The directories are self-explanatory:

by-id: stores the devices by their hardware identifier. You will notice the devices have been identified in several manners, notably just as if they were Fibre Channel/SAS devices (presence of WWN lines). For local disks, the WWN is not random but corresponds to the WWN shown whenever you query the disk with hdparm -I (see the "Logical Unit WWN Device Identifier" section in the hdparm output). For iSCSI disks, the WWN corresponds to the GUID attributed by COMSTAR when the logical unit was created (see the Creating the Logical Units section in the paragraphs above).

# ls -l /dev/disk/by-id
total 0
lrwxrwxrwx 1 root root  9 Jun 22 04:04 ata-HL-DT-ST_BD-RE_BH10LS30_K9OA4DH1606 -> ../../sr0
lrwxrwxrwx 1 root root  9 Jun 22 04:04 ata-LITE-ON_DVD_SHD-16S1S -> ../../sr1
...
lrwxrwxrwx 1 root root  9 Jun 25 09:24 scsi-3600144f0200acb0000004e0505940007 -> ../../sdc
lrwxrwxrwx 1 root root  9 Jun 25 09:24 scsi-3600144f0200acb0000004e05059a0008 -> ../../sdd
lrwxrwxrwx 1 root root  9 Jun 25 09:24 scsi-3600144f0200acb0000004e0505a40009 -> ../../sde
...
lrwxrwxrwx 1 root root  9 Jun 22 04:04 wwn-0x**************** -> ../../sda
lrwxrwxrwx 1 root root 10 Jun 22 08:04 wwn-0x****************-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 Jun 22 04:04 wwn-0x****************-part2 -> ../../sda2
lrwxrwxrwx 1 root root 10 Jun 22 04:04 wwn-0x****************-part4 -> ../../sda4
lrwxrwxrwx 1 root root  9 Jun 25 09:24 wwn-0x600144f0200acb0000004e0505940007 -> ../../sdc
lrwxrwxrwx 1 root root  9 Jun 25 09:24 wwn-0x600144f0200acb0000004e05059a0008 -> ../../sdd
lrwxrwxrwx 1 root root  9 Jun 25 09:24 wwn-0x600144f0200acb0000004e0505a40009 -> ../../sde

    by-label: stores the storage devices by their label (volume name). Optical media and ext2/3/4 partitions (tune2fs -L ...) are

    typically labelled. If a device has not been labelled with a name it won't show up there:

# ls -l /dev/disk/by-label
total 0
lrwxrwxrwx 1 root root  9 Jun 22 04:04 DATA_20110621 -> ../../sr1
lrwxrwxrwx 1 root root 10 Jun 25 10:41 boot-partition -> ../../sda1

by-path: stores the storage devices by their physical path. For DAS (Direct Attached Storage) devices, paths refer to something under /sys. For iSCSI devices, they are referred to by the name of the target where the logical unit resides plus the LUN of the latter (notice the LUN number; it is the same one you attributed when creating the view on the remote iSCSI node):

# ls -l /dev/disk/by-path
total 0
lrwxrwxrwx 1 root root  9 Jun 25 09:24 ip-192.168.1.13:3260-iscsi-iqn.1986-03.com.sun:02:2e5faacb-4bdf-4f7f-e643-ebc8bf8566
lrwxrwxrwx 1 root root  9 Jun 25 09:24 ip-192.168.1.13:3260-iscsi-iqn.1986-03.com.sun:02:2e5faacb-4bdf-4f7f-e643-ebc8bf8566
lrwxrwxrwx 1 root root  9 Jun 25 09:24 ip-192.168.1.13:3260-iscsi-iqn.1986-03.com.sun:02:2e5faacb-4bdf-4f7f-e643-ebc8bf8566
lrwxrwxrwx 1 root root  9 Jun 22 04:04 pci-0000:00:1f.2-scsi-0:0:0:0 -> ../../sda
lrwxrwxrwx 1 root root 10 Jun 25 10:41 pci-0000:00:1f.2-scsi-0:0:0:0-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 Jun 22 04:04 pci-0000:00:1f.2-scsi-0:0:0:0-part2 -> ../../sda2
lrwxrwxrwx 1 root root 10 Jun 22 04:04 pci-0000:00:1f.2-scsi-0:0:0:0-part4 -> ../../sda4
lrwxrwxrwx 1 root root  9 Jun 22 04:04 pci-0000:00:1f.2-scsi-0:0:1:0 -> ../../sr0


lrwxrwxrwx 1 root root  9 Jun 22 04:04 pci-0000:00:1f.2-scsi-1:0:0:0 -> ../../sdb
...

by-uuid: a bit similar to /dev/disk/by-label, but disks are reported by their UUID (if defined). In the following example only /dev/sda has some UUIDs defined (this disk holds several BTRFS partitions which were automatically given a UUID at creation time):

# ls -l /dev/disk/by-uuid
total 0
lrwxrwxrwx 1 root root 10 Jun 22 04:04 01178c43-7392-425e-8acf-3ed16ab48813 -> ../../sda4
lrwxrwxrwx 1 root root 10 Jun 22 04:04 1701af39-8ea3-4463-8a77-ec75c59e716a -> ../../sda2
lrwxrwxrwx 1 root root 10 Jun 25 10:41 5432f044-faa2-48d3-901d-249275fa2976 -> ../../sda1

For storage devices over 2.19 TB capacity, do not use traditional partitioning tools like fdisk/cfdisk; use a GPT partitioning tool like gptfdisk (sys-apps/gptfdisk).

At this point the most difficult part is done; you can now use the remote iSCSI disks just as if they were DAS devices. Let's demonstrate with one:

# gdisk /dev/sdc
GPT fdisk (gdisk) version 0.7.1

Partition table scan:
  MBR: not present
  BSD: not present
  APM: not present
  GPT: not present

Creating new GPT entries.
Command (? for help): n
Partition number (1-128, default 1):
First sector (34-5368709086, default = 34) or {+-}size{KMGTP}:
Information: Moved requested sector from 34 to 2048 in
order to align on 2048-sector boundaries.
Use 'l' on the experts' menu to adjust alignment
Last sector (2048-5368709086, default = 5368709086) or {+-}size{KMGTP}:
Current type is 'Linux/Windows data'
Hex code or GUID (L to show codes, Enter = 0700):
Changed type of partition to 'Linux/Windows data'

Command (? for help): p
Disk /dev/sdc: 5368709120 sectors, 2.5 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): 6AEC03B6-0047-44E1-B6C7-1C0CBC7C4CE6
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 5368709086
Partitions will be aligned on 2048-sector boundaries
Total free space is 2014 sectors (1007.0 KiB)

Number  Start (sector)  End (sector)  Size     Code  Name
   1    2048            5368709086    2.5 TiB  0700  Linux/Windows data

Command (? for help): w

Final checks complete. About to write GPT data. THIS WILL OVERWRITE EXISTING
PARTITIONS!!

Do you want to proceed? (Y/N): yes
OK; writing new GUID partition table (GPT).

The operation has completed successfully.

# gdisk -l /dev/sdc
GPT fdisk (gdisk) version 0.7.1

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present


  GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sdc: 5368709120 sectors, 2.5 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): 6AEC03B6-0047-44E1-B6C7-1C0CBC7C4CE6
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 5368709086
Partitions will be aligned on 2048-sector boundaries
Total free space is 2014 sectors (1007.0 KiB)

Number  Start (sector)  End (sector)  Size     Code  Name
   1    2048            5368709086    2.5 TiB  0700  Linux/Windows data

Notice that COMSTAR emulates a disk with the traditional 512 bytes/sector division.

    Now have a look again in /dev/disk/by-id:

# ls -l /dev/disk/by-id
...
lrwxrwxrwx 1 root root  9 Jun 25 13:23 wwn-0x600144f0200acb0000004e0505940007 -> ../../sdc
lrwxrwxrwx 1 root root 10 Jun 25 13:33 wwn-0x600144f0200acb0000004e0505940007-part1 -> ../../sdc1
...

    What would be the next step? Creating a filesystem on the brand new partition of course!

# mkfs.ext4 /dev/sdc1
mke2fs 1.41.14 (22-Dec-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
134217728 inodes, 536870911 blocks
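Once mkfs.ext4 finishes (its output is truncated above), the new filesystem mounts like any local one; a minimal sketch with a hypothetical mount point:

# mkdir -p /mnt/iscsi
# mount /dev/sdc1 /mnt/iscsi
# df -h /mnt/iscsi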