See also: http://lsky.info/?id=MultipathIBMDS4000
We want to determine what is currently visible via SCSI. (Fibre Channel devices are included here too, i.e. simply everything that shows up under /proc/scsi.)
lsscsi
[0:0:1:0] no dev DataCore SANsymphony -
[1:0:0:0] no dev DataCore SANsymphony -
The qlogic driver included in the kernel mentioned above is version 8.01.04-d7.
/etc/modprobe.conf
alias eth0 tg3
alias eth1 tg3
alias eth2 e1000
alias scsi_hostadapter cciss
alias scsi_hostadapter1 qla2300
options qla2xxx ql2xextended_error_logging=1 displayConfig=1 ql2xretrycount=30 ql2xfailover=0 ql2xlbType=0
alias usb-controller ehci-hcd
alias usb-controller1 uhci-hcd
Because failover is to be handled by multipathd, ql2xfailover must NOT be used. (Alternatively, the qlogic non-failover driver could be used.)
See also the qlogic README.
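A quick check along the following lines can confirm the setting before the initrd is rebuilt. This is only a minimal sketch: it assumes the modprobe configuration keeps one directive per line, and the function name failover_disabled is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if an "options qla2xxx" line in the given
# modprobe.conf-style file sets ql2xfailover=0.
failover_disabled() {
    conf_file=$1
    # First grep keeps only option lines for the qla2xxx module,
    # second grep looks for the exact setting as a whole word.
    grep -E '^[ 	]*options[ 	]+qla2xxx[ 	]' "$conf_file" \
        | grep -Eq '(^|[ 	])ql2xfailover=0([ 	]|$)'
}
```

Calling `failover_disabled /etc/modprobe.conf && echo ok` would then print "ok" only when the driver's own failover is switched off.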
modinfo qla2xxx
filename: /lib/modules/2.6.9-42.0.10.ELsmp/kernel/drivers/scsi/qla2xxx/qla2xxx.ko
parm: ql2xmaxqdepth:Maximum queue depth to report for target devices.
parm: ql2xlogintimeout:Login timeout value in seconds.
parm: qlport_down_retry:Maximum number of command retries to a port that returnsa PORT-DOWN status.
parm: ql2xretrycount:Maximum number of mid-layer retries allowed for a command. Default value is 20,
parm: displayConfig:If 1 then display the configuration used in /etc/modprobe.conf.
parm: ql2xplogiabsentdevice:Option to enable PLOGI to devices that are not present after a Fabric scan. This is needed for several broken switches.Default is 0 - no PLOGI. 1 - perfom PLOGI.
parm: ql2xenablezio:Option to enable ZIO:If 1 then enable it otherwise use the default set in the NVRAM. Default is 0 : disabled
parm: ql2xintrdelaytimer:ZIO: Waiting time for Firmware before it generates an interrupt to the host to notify completion of request.
parm: ConfigRequired:If 1, then only configured devices passed in through theql2xopts parameter will be presented to the OS
parm: Bind:Target persistent binding method: 0 by Portname (default); 1 by PortID; 2 by Nodename.
parm: ql2xsuspendcount:Number of 6-second suspend iterations to perform while a target returns a <NOT READY> status. Default is 10 iterations.
parm: ql2xdoinitscan:Signal mid-layer to perform scan after driver load: 0 -- no signal sent to mid-layer.
parm: ql2xloginretrycount:Specify an alternate value for the NVRAM login retry count.
parm: ql2xprocessnotready:Option to disable handling of NOT-READY in the driver. Default is 1 - Handled by the driver. Set to 0 - Disable the handling inside the driver
parm: ql2xprocessrscn:Option to enable port RSCN handling via a series of lessfabric intrusive ADISCs and PLOGIs.
parm: extended_error_logging:Option to enable extended error logging, Default is 0 - no logging. 1 - log errors.
parm: ql2xfwloadbin:Option to specify location in which to load ISP24xx firmware: 2 -- load firmware via the request_firmware() (hotplug) interface. A file, ql2400_fw.bin, (containing the firmware image) should be hotplug accessible. 1 -- load firmware from flash. 0 -- load firmware embedded with driver (default).
parm: ql2xfdmienable:Enables FDMI registratons Default is 0 - no FDMI. 1 - perfom FDMI.
author: QLogic Corporation
description: QLogic Fibre Channel HBA Driver
license: GPL
version: 8.01.04-d7 1036BB22AA28C8D7180B7E1
vermagic: 2.6.9-42.0.10.ELsmp SMP 686 REGPARM 4KSTACKS gcc-3.4
depends: scsi_mod,scsi_transport_fc
mkinitrd -v --with=dm-multipath /boot/initrd-2.6.9-42.0.10.ELsmp-qlx.img 2.6.9-42.0.10.ELsmp
Creating initramfs
Looking for deps of module scsi_mod
Looking for deps of module sd_mod scsi_mod
Looking for deps of module scsi_mod
Looking for deps of module unknown
Looking for deps of module cciss scsi_mod
Looking for deps of module scsi_mod
Looking for deps of module qla2300 qla2xxx scsi_transport_fc scsi_mod
Looking for deps of module qla2xxx scsi_transport_fc scsi_mod
Looking for deps of module scsi_transport_fc scsi_mod
Looking for deps of module scsi_mod
Looking for deps of module scsi_mod
Looking for deps of module scsi_transport_fc scsi_mod
Looking for deps of module scsi_mod
Looking for deps of module scsi_mod
Looking for deps of module ide-disk
Looking for deps of module ext3 jbd
Looking for deps of module jbd
Looking for deps of module dm-multipath dm-mod
Looking for deps of module dm-mod
Using modules: ./kernel/drivers/scsi/scsi_mod.ko ./kernel/drivers/scsi/sd_mod.ko ./kernel/drivers/block/cciss.ko ./kernel/drivers/scsi/scsi_transport_fc.ko ./kernel/drivers/scsi/qla2xxx/qla2xxx.ko ./kernel/drivers/scsi/qla2xxx/qla2300.ko ./kernel/fs/jbd/jbd.ko ./kernel/fs/ext3/ext3.ko ./kernel/drivers/md/dm-mod.ko ./kernel/drivers/md/dm-multipath.ko
/sbin/nash -> /tmp/initrd.nu4456/bin/nash
/sbin/insmod.static -> /tmp/initrd.nu4456/bin/insmod
/sbin/udev.static -> /tmp/initrd.nu4456/sbin/udev
/etc/udev/udev.conf -> /tmp/initrd.nu4456/etc/udev/udev.conf
copy from /lib/modules/2.6.9-42.0.10.ELsmp/./kernel/drivers/scsi/scsi_mod.ko(elf32-i386) to /tmp/initrd.nu4456/lib/scsi_mod.ko(elf32-i386)
copy from /lib/modules/2.6.9-42.0.10.ELsmp/./kernel/drivers/scsi/sd_mod.ko(elf32-i386) to /tmp/initrd.nu4456/lib/sd_mod.ko(elf32-i386)
copy from /lib/modules/2.6.9-42.0.10.ELsmp/./kernel/drivers/block/cciss.ko(elf32-i386) to /tmp/initrd.nu4456/lib/cciss.ko(elf32-i386)
copy from /lib/modules/2.6.9-42.0.10.ELsmp/./kernel/drivers/scsi/scsi_transport_fc.ko(elf32-i386) to /tmp/initrd.nu4456/lib/scsi_transport_fc.ko(elf32-i386)
copy from /lib/modules/2.6.9-42.0.10.ELsmp/./kernel/drivers/scsi/qla2xxx/qla2xxx.ko(elf32-i386) to /tmp/initrd.nu4456/lib/qla2xxx.ko(elf32-i386)
copy from /lib/modules/2.6.9-42.0.10.ELsmp/./kernel/drivers/scsi/qla2xxx/qla2300.ko(elf32-i386) to /tmp/initrd.nu4456/lib/qla2300.ko(elf32-i386)
copy from /lib/modules/2.6.9-42.0.10.ELsmp/./kernel/fs/jbd/jbd.ko(elf32-i386) to /tmp/initrd.nu4456/lib/jbd.ko(elf32-i386)
copy from /lib/modules/2.6.9-42.0.10.ELsmp/./kernel/fs/ext3/ext3.ko(elf32-i386) to /tmp/initrd.nu4456/lib/ext3.ko(elf32-i386)
copy from /lib/modules/2.6.9-42.0.10.ELsmp/./kernel/drivers/md/dm-mod.ko(elf32-i386) to /tmp/initrd.nu4456/lib/dm-mod.ko(elf32-i386)
copy from /lib/modules/2.6.9-42.0.10.ELsmp/./kernel/drivers/md/dm-multipath.ko(elf32-i386) to /tmp/initrd.nu4456/lib/dm-multipath.ko(elf32-i386)
Loading module scsi_mod
Loading module sd_mod
Loading module cciss
Loading module scsi_transport_fc
Loading module qla2xxx
Loading module qla2300
Loading module jbd
Loading module ext3
Loading module dm-mod
Loading module dm-multipath
and reboot....
Assign to Linux or VMware.
rescan-scsi-bus.sh - syslog
Apr 18 20:00:23 localhost kernel: qla2300 0000:07:01.0: scsi(0:0:0:0): Enabled tagged queuing, queue depth 32.
Apr 18 20:00:23 localhost kernel: Vendor: DataCore Model: SANsymphony Rev: DCS
Apr 18 20:00:23 localhost kernel: Type: Direct-Access ANSI SCSI revision: 04
Apr 18 20:00:23 localhost kernel: qla2300 0000:07:01.0: scsi(0:0:1:0): Enabled tagged queuing, queue depth 32.
Apr 18 20:00:23 localhost kernel: SCSI device sda: 50331648 512-byte hdwr sectors (25770 MB)
Apr 18 20:00:23 localhost kernel: SCSI device sda: drive cache: write back
Apr 18 20:00:23 localhost kernel: SCSI device sda: 50331648 512-byte hdwr sectors (25770 MB)
Apr 18 20:00:23 localhost kernel: SCSI device sda: drive cache: write back
Apr 18 20:00:23 localhost kernel: sda: sda1
Apr 18 20:00:23 localhost kernel: Attached scsi disk sda at scsi0, channel 0, id 1, lun 0
Apr 18 20:00:23 localhost kernel: qla2300 0000:07:01.0: scsi(0:0:2:0): Enabled tagged queuing, queue depth 32.
Apr 18 20:00:23 localhost scsi.agent[4850]: disk at /devices/pci0000:00/0000:00:04.0/0000:06:00.0/0000:07:01.0/host0/target0:0:1/0:0:1:0
Apr 18 20:00:23 localhost kernel: Vendor: DataCore Model: SANsymphony Rev: DCS
Apr 18 20:00:23 localhost kernel: Type: Direct-Access ANSI SCSI revision: 04
Apr 18 20:00:23 localhost kernel: qla2300 0000:07:01.1: scsi(1:0:0:0): Enabled tagged queuing, queue depth 32.
Apr 18 20:00:23 localhost kernel: SCSI device sdb: 50331648 512-byte hdwr sectors (25770 MB)
Apr 18 20:00:23 localhost kernel: SCSI device sdb: drive cache: write back
Apr 18 20:00:23 localhost kernel: SCSI device sdb: 50331648 512-byte hdwr sectors (25770 MB)
Apr 18 20:00:23 localhost kernel: SCSI device sdb: drive cache: write back
Apr 18 20:00:23 localhost scsi.agent[5021]: disk at /devices/pci0000:00/0000:00:04.0/0000:06:00.0/0000:07:01.1/host1/target1:0:0/1:0:0:0
Apr 18 20:00:23 localhost kernel: sdb: sdb1
Apr 18 20:00:23 localhost kernel: Attached scsi disk sdb at scsi1, channel 0, id 0, lun 0
scsi_id
[root@localhost bin]# scsi_id -g -u -s /block/sda
360030d904d4952524f525f4150000000
[root@localhost bin]# scsi_id -g -u -s /block/sdb
360030d904d4952524f525f4150000000
As expected, sda and sdb report the same ID: they are the same disk, reached via 2 paths.
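The comparison above can be automated by grouping devices on the ID they report. The following is a rough sketch, not part of the original setup: group_by_wwid is a hypothetical helper that reads "device wwid" pairs on stdin and prints one line per WWID with all devices behind it.

```shell
#!/bin/sh
# Hypothetical helper: group "device wwid" pairs (one pair per line on stdin)
# by WWID, keeping the order in which each WWID was first seen.
group_by_wwid() {
    awk '
        NF >= 2 {
            if ($2 in devs) {
                devs[$2] = devs[$2] " " $1
            } else {
                order[++n] = $2
                devs[$2] = $1
            }
        }
        END {
            for (i = 1; i <= n; i++)
                print order[i] ": " devs[order[i]]
        }
    '
}

# Possible use with the scsi_id call shown above (needs the real devices):
#   for d in sda sdb; do echo "$d $(scsi_id -g -u -s /block/$d)"; done | group_by_wwid
```

Any WWID listed with more than one device is a multipathed LUN.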
Excerpt from the man page
In order to generate unique values for either page 0x80 or page 0x83, the serial numbers or world wide names are prefixed as follows.
Identifiers based on page 0x80 are prefixed by the character 'S', the SCSI vendor, the SCSI product (model) and then the serial number returned by page 0x80. For example:
$ multipath -ll
mpathDS (3600a0b800017738900000008436a049c)
[size=100 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [active]
 \_ 2:0:0:0 sdb 8:16 [active][ready]
\_ round-robin 0 [enabled]
 \_ 3:0:0:0 sdc 8:32 [active][ready]
$ scsi_id -p 0x83 -g -u -s /block/sdb
3600a0b800017738900000008436a049c
$ scsi_id -p 0x83 -g -u -s /block/sdc
3600a0b800017738900000008436a049c
$ scsi_id -p 0x80 -g -u -s /block/sdb
SIBM_____1722-600_______1T45243281______
$ scsi_id -p 0x80 -g -u -s /block/sdc
SIBM_____1722-600_______1T53681398______
This clearly shows which option should be used for grouping. In this case choose "-p 0x83": it returns the same ID on both paths, whereas "-p 0x80" returns the serial number of the DS4300 controller, which differs per path. The latter can still be helpful for locating a particular path.
Identifiers based on page 0x83 are prefixed by the identifier type followed by the page 0x83 identifier. For example, a device with a NAA (Name Address Authority) type of 3 (also in this case the page 0x83 identifier starts with the NAA value of 6):
# /sbin/scsi_id -p 0x83 -s /block/sdg
3600a0b80000b174b000000d63efc5c8c
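To make the prefix rule from the man page excerpt concrete, the two leading characters of such an ID can be split off. This is just an illustrative sketch; the function name split_page83_id is invented here.

```shell
#!/bin/sh
# Hypothetical helper: show the identifier-type prefix (1st character) and
# the leading NAA value (2nd character) of a page 0x83 based ID.
split_page83_id() {
    id=$1
    id_type=$(printf '%s' "$id" | cut -c1)
    naa=$(printf '%s' "$id" | cut -c2)
    echo "type=$id_type naa=$naa"
}
```

For the ID above, `split_page83_id 3600a0b80000b174b000000d63efc5c8c` prints `type=3 naa=6`, matching the description in the excerpt.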
Example:
scsi_id -g -p 0x80 -u -s /block/sda
| path_grouping_policy | Description |
| failover | 1 path per priority group (active/passive) |
| multibus | all paths in 1 priority group (active/active) |
| group_by_serial | 1 priority group per serial |
| group_by_prio | 1 priority group per priority value. Priorities are determined by callout programs specified as a global, per-controller or per-multipath option in the configuration file |
| group_by_node_name | 1 priority group per target node name. Target node names are fetched from /sys/class/fc_transport/target*/node_name. |
/var/lib/multipath/bindings contains the mapping of user-friendly names to WWIDs. The file is maintained by the multipath program itself, but it can also be edited by hand so that every node always uses the same mapping.
bindings
# alias wwid
#
3pap0 360030d90335041500000000000000000
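When the bindings file is edited by hand on several nodes, it helps to check which alias a given WWID resolves to. A minimal sketch, assuming the two-column "alias wwid" layout shown above; alias_for_wwid is a hypothetical helper, not part of the multipath tools.

```shell
#!/bin/sh
# Hypothetical helper: print the alias bound to a WWID in a bindings file.
# Returns non-zero if the WWID has no entry. Comment lines are skipped.
alias_for_wwid() {
    wwid=$1
    bindings_file=$2
    awk -v w="$wwid" '
        /^[ \t]*#/ { next }
        $2 == w    { print $1; found = 1; exit }
        END        { exit !found }
    ' "$bindings_file"
}
```

Running `alias_for_wwid 360030d90335041500000000000000000 /var/lib/multipath/bindings` on each node and comparing the results shows whether the mapping is consistent.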
multipath -v 2 -d
[size=24 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [active]
 \_ 0:0:1:0 sda 8:0  [active][ready]
\_ round-robin 0 [enabled]
 \_ 1:0:0:0 sdb 8:16 [active][ready]
Because the multipath devices are created at system startup and multipath.static has to write to the filesystem, the file /etc/rc.sysinit (or alternatively the init script) must be adapted.
Caution: At the point where "Setting up LVM" runs, only / (the root filesystem) is mounted. Therefore, if the bound names are used in /etc/fstab, the bindings file must be placed in the root filesystem (not in the default location, /var/lib/multipath/bindings).
touch /etc/multipath.bindings
ln -s /etc/multipath.bindings /var/lib/multipath/bindings
/etc/rc.sysinit
# Multipath initialization
if [ -x /sbin/multipath.static ]; then
    # Load the dm-multipath module unless it is already listed in /proc/modules.
    if ! LC_ALL=C fgrep -q "dm_multipath" /proc/modules 2>/dev/null ; then
        modprobe dm-multipath >/dev/null 2>&1
    fi
    # Run multipath once quietly; if that succeeds, run it again via "action"
    # so that the boot messages show the status line.
    if /sbin/multipath.static -v0 -b /etc/multipath.bindings > /dev/null 2>&1 ; then
        action $"Setting up Multipathing:" /sbin/multipath.static -v0 -b /etc/multipath.bindings
    fi
fi
multipath -ll
mpath0 (360030d904d4952524f525f4150000000)
[size=24 GB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [active]
 \_ 0:0:1:0 sda 8:0  [active][ready]
\_ round-robin 0 [enabled]
 \_ 1:0:0:0 sdb 8:16 [active][ready]
LC_ALL=C fdisk -l /dev/mapper/mpath0
Disk /dev/mapper/mpath0: 25.7 GB, 25769803776 bytes
255 heads, 63 sectors/track, 3133 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/mapper/mpath0p1 1 3132 25157758+ 7 HPFS/NTFS
For this purpose, one SDS is taken offline, or a "Local Stop" is performed on it.
Throughout the entire test, data is written recursively to the mounted multipath partition. At NO time was the write process interrupted or aborted. Data integrity was preserved.
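The notes do not say which tool generated the write load. Purely as an assumption, a loop like the following could serve: it writes small files into a target directory and compares a checksum of each file against the checksum of what was written. The function name write_and_verify and the file naming are invented, and because the read-back may be served from the page cache, this shows that writes were accepted rather than proving on-disk integrity.

```shell
#!/bin/sh
# Hypothetical load generator: write COUNT small files into TARGET_DIR and
# verify each one by comparing cksum of the payload with cksum of the file.
# Returns non-zero on the first write error or checksum mismatch.
write_and_verify() {
    target_dir=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        f="$target_dir/testfile_$i"
        payload="block $i"
        printf '%s\n' "$payload" > "$f" || return 1
        expected=$(printf '%s\n' "$payload" | cksum)
        actual=$(cksum < "$f")
        [ "$expected" = "$actual" ] || return 1
        i=$((i + 1))
    done
}
```

Run in a loop against the mount point while an SDS is stopped, a non-zero return would indicate an interrupted or failed write.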
multipath_checker.sh
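#!/bin/sh
# Snapshot "multipath -ll" every 10 seconds. Whenever two consecutive
# snapshots differ, keep that pair (A_n.log before, B_n.log after)
# and continue with the next number.
A=0
while :
do
    multipath -ll > A_$A.log
    sleep 10
    multipath -ll > B_$A.log
    # diff prints the change and exits non-zero if the snapshots differ.
    if ! diff B_$A.log A_$A.log ; then
        A=$((A + 1))
    fi
done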
**initially everything OK**
3pap0 (360030d90335041500000000000000000)
[size=6 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [active]
 \_ 0:0:1:0 sda 8:0  [active][ready]
\_ round-robin 0 [enabled]
 \_ 1:0:0:0 sdb 8:16 [active][ready]
**mirror failed**
3pap0 (360030d90335041500000000000000000)
[size=6 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [enabled]
 \_ 0:0:1:0 sda 8:0  [failed][faulty]
\_ round-robin 0 [active]
 \_ 1:0:0:0 sdb 8:16 [active][ready]
**back again**
3pap0 (360030d90335041500000000000000000)
[size=6 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [active]
 \_ 0:0:1:0 sda 8:0  [active][ready]
\_ round-robin 0 [enabled]
 \_ 1:0:0:0 sdb 8:16 [active][ready]
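Snapshots like the ones above can be reduced to just the failed devices. The following sketch assumes the multi-line `multipath -ll` layout with one path per line; failed_paths is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read "multipath -ll" output on stdin and print the
# device name of every path line marked [failed]. The device name is taken
# as the field right after the host:channel:id:lun tuple.
failed_paths() {
    awk '
        /\[failed\]/ {
            for (i = 1; i < NF; i++)
                if ($i ~ /^[0-9]+:[0-9]+:[0-9]+:[0-9]+$/)
                    print $(i + 1)
        }
    '
}
```

`multipath -ll | failed_paths` prints nothing while all paths are healthy, so it can be used as a simple alert condition.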
syslog
Apr 19 15:22:07 localhost multipathd: 8:0: tur checker reports path is up
Apr 19 15:22:07 localhost multipathd: 8:0: reinstated
Apr 19 15:22:07 localhost multipathd: 3pap0: remaining active paths: 2
Apr 19 15:22:17 localhost multipathd: 3pap0: switch to path group #1
Now the other SDS is stopped.
syslog
Apr 19 15:35:19 localhost kernel: device-mapper: dm-multipath: Failing path 8:16.
Apr 19 15:35:19 localhost multipathd: 8:16: tur checker reports path is down
Apr 19 15:35:19 localhost multipathd: checker failed path 8:16 in map 3pap0
Apr 19 15:35:19 localhost multipathd: 3pap0: remaining active paths: 1
The second path is gone
3pap0 (360030d90335041500000000000000000)
[size=6 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [active]
 \_ 0:0:1:0 sda 8:0  [active][ready]
\_ round-robin 0 [enabled]
 \_ 1:0:0:0 sdb 8:16 [active][faulty]

3pap0 (360030d90335041500000000000000000)
[size=6 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [active]
 \_ 0:0:1:0 sda 8:0  [active][ready]
\_ round-robin 0 [enabled]
 \_ 1:0:0:0 sdb 8:16 [failed][faulty]
Everything is OK again
3pap0 (360030d90335041500000000000000000)
[size=6 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [active]
 \_ 0:0:1:0 sda 8:0  [active][ready]
\_ round-robin 0 [enabled]
 \_ 1:0:0:0 sdb 8:16 [active][ready]
syslog
Apr 19 15:44:27 localhost multipathd: 8:16: tur checker reports path is up
Apr 19 15:44:27 localhost multipathd: 8:16: reinstated
Apr 19 15:44:27 localhost multipathd: 3pap0: remaining active paths: 2
... and once more the other server is taken down ...
syslog
Apr 19 15:48:55 localhost kernel: SCSI error : <0 0 1 0> return code = 0x20000
Apr 19 15:48:55 localhost kernel: end_request: I/O error, dev sda, sector 9165679
Apr 19 15:48:55 localhost kernel: device-mapper: dm-multipath: Failing path 8:0.
Apr 19 15:48:55 localhost kernel: end_request: I/O error, dev sda, sector 9165687
Apr 19 15:48:55 localhost kernel: SCSI error : <0 0 1 0> return code = 0x20000
...
Apr 19 15:49:39 localhost kernel: SCSI error : <0 0 1 0> return code = 0x10000
Apr 19 15:49:39 localhost kernel: end_request: I/O error, dev sda, sector 10055935
Apr 19 15:49:39 localhost kernel: end_request: I/O error, dev sda, sector 10055943
Apr 19 15:50:04 localhost multipathd: 3pap0: switch to path group #1
multipath -ll
3pap0 (360030d90335041500000000000000000)
[size=6 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [enabled]
 \_ 0:0:1:0 sda 8:0  [failed][faulty]
\_ round-robin 0 [active]
 \_ 1:0:0:0 sdb 8:16 [active][ready]
SDS is back in production...
syslog
Apr 19 15:59:06 localhost multipathd: 8:0: tur checker reports path is up
Apr 19 15:59:06 localhost multipathd: 8:0: reinstated
Apr 19 15:59:06 localhost multipathd: 3pap0: remaining active paths: 2
Apr 19 15:59:16 localhost multipathd: 3pap0: switch to path group #1
**everything fine again**
3pap0 (360030d90335041500000000000000000)
[size=6 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [active]
 \_ 0:0:1:0 sda 8:0  [active][ready]
\_ round-robin 0 [enabled]
 \_ 1:0:0:0 sdb 8:16 [active][ready]
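To get a compact timeline of the failover tests from syslog, the checker messages quoted in the syslog excerpts can be filtered out. This is a small sketch that assumes the syslog line layout shown in those excerpts (timestamp in the third field, major:minor in the sixth); path_events is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: read syslog lines on stdin and print
# "time major:minor state" for every multipathd checker up/down report.
path_events() {
    awk '
        /multipathd: .* checker reports path is (up|down)/ {
            dev = $6
            sub(/:$/, "", dev)   # strip the trailing colon from "8:16:"
            print $3, dev, $NF
        }
    '
}
```

For example, `path_events < /var/log/messages` would list each path transition in order, which makes it easy to measure how long a path stayed down.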