FibreChannel and Linux at CDF
Background
The CDF project uses FibreChannel extensively, both for moving data from the detector and for subsequent access to it. We have recently begun testing FC with Linux to see what level of support and performance we can attain.
Note that I've moved the links for vendors, software and so forth to the
Resources page.
Current Status
As of 2/01 or so, we have settled on the 2.4.2 kernel, and are quite happy
with the system. We are still experimenting with hot-plug on the FC loop, but
other than that, things are fast and stable.
We still have not had the opportunity to test the Emulex cards, so all
experience has been with the QLogic cards.
Hardware
The main test PC is an SGI 1450 (quad Xeon 700s, 4GB memory, very spiffy.)
For FibreChannel cards, we are using the
QLogic 2200 and 2100 cards purchased from
TeamExcess.
We have some Emulex
cards (LP7000 and LP8000) that now have released Linux drivers; we have not
yet tested them.
Disks are assorted: plain FC disks as well as disks behind Chaparral and Adaptec FC RAID controllers.
Software
Linux drivers are included with 2.2 and newer kernels as the "qlogicfc" module in the SCSI category. We have used kernels 2.4.0test8 through 2.4.1.
Distributions are RedHat 6.2 and Debian 2.2.
The scsiadd program is useful for re-scanning the FC loop if something is added after the driver is loaded. By default, Linux does not add any disks that weren't present at driver load. Most annoying, but scsiadd does an OK job.
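For reference, a rescan of this kind can also be driven directly through the kernel's /proc/scsi/scsi interface, which accepts a "scsi add-single-device" command on 2.4-era kernels. The Python sketch below is only an illustration: the function names are made up for this example, the host/channel/target/LUN numbers are placeholders, and we assume (without having checked its source) that scsiadd works along similar lines. Writing to /proc/scsi/scsi requires root.

```python
"""Sketch: ask the Linux SCSI layer to probe for newly attached devices."""

PROC_SCSI = "/proc/scsi/scsi"  # present on 2.4 kernels; root is needed to write


def add_device_command(host, channel, target, lun):
    """Build the command string the kernel expects for one device address."""
    return "scsi add-single-device %d %d %d %d" % (host, channel, target, lun)


def rescan_targets(host, channel, targets, lun=0, proc_path=PROC_SCSI):
    """Send one add-single-device command per target ID.

    Returns the list of target IDs whose write was rejected (for example
    because the kernel refused the address or the file was not writable).
    """
    rejected = []
    for target in targets:
        try:
            # One command per write, so open the file fresh each time.
            with open(proc_path, "w") as proc:
                proc.write(add_device_command(host, channel, target, lun) + "\n")
        except OSError:
            rejected.append(target)
    return rejected
```

A hypothetical call such as rescan_targets(host=1, channel=0, targets=range(16)) would probe target IDs 0 through 15 on the second host adapter; the adapter number on a given machine has to be read from /proc/scsi/scsi first.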
The bonnie benchmark from Tim Bray
is useful for performance testing; we run it with a 2GB test file size: 'bonnie -s 2047'.
See the notes below for discussion of the results.
The revised bonnie++ benchmark has many improvements, including removal of the 2G limit. We'll start posting results using the updated version as time permits.
Caveats and Discussion
Memory Size Effects and the 2G Limit
There is a problem with testing on the 1450 - it has 4GB of memory, more than enough to cache the test file in memory, rendering the results worthless. Normally, one just specifies a file larger than the memory and that ensures that the cache will overflow. However, in this case, the current file size limit is 2GB, mostly due to the limitations of routines like fseek().
Due to this, the bonnie results should be taken skeptically. We hope that iozone will have a workaround for the problem. Short of physically reducing the amount of memory present, or rewriting our benchmarks, we've not found a simple workaround.
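The arithmetic behind the problem is easy to check. The Python sketch below assumes the 2GB limit comes from a signed 32-bit file offset (the type fseek() takes on 32-bit Linux) and uses the 1450's 4GB of memory; the function names are ours, for illustration only.

```python
"""Sketch: why a 2GB-limited benchmark file cannot defeat a 4GB page cache."""

MIB = 1024 * 1024
MAX_OFFSET_32BIT = 2**31 - 1  # assumed limit: largest signed 32-bit offset


def max_test_size_mib(max_offset=MAX_OFFSET_32BIT):
    """Largest whole-megabyte test file that stays under the offset limit."""
    return max_offset // MIB


def fits_in_memory(file_mib, ram_mib):
    """True if the whole test file could sit in the buffer cache."""
    return file_mib <= ram_mib


largest = max_test_size_mib()
print(largest)                        # 2047, hence 'bonnie -s 2047'
print(fits_in_memory(largest, 4096))  # True: the 1450 can cache the whole file
```

This is only an upper bound on what the cache could hold, since the kernel and other processes use some of the memory, but a 2047MB file is well within reach of a 4GB machine.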
Note that, as of 4/02, I have email from the iozone author that this has been
fixed in newer versions of iozone. See the URL below to get them.
Bonnie++ (an updated version of Bonnie with a new maintainer) is addressing
these problems; however, we don't know how its results compare to Bonnie's.
Driver Issues, Logistics and Utilities
In general, the QLogic driver for Linux seems to work well, but breaking the FC loop causes problems that range from new devices not being found to complete machine crashes. For example, if you power up a new device while writing a filesystem on a different disk, the loop reinitializes and the kernel panics. Given that one of the advantages of FC is hot-pluggability, we are looking into the difficulty of modifying the driver to work around the problem.
SGI Irix boxes have a useful program called 'scsiha' that can be used to reset and probe a SCSI or FC bus; the 'scsiadd' program is useful but nowhere near as complete.
On Solaris, the 'devfsadm' program will re-probe all devices and rebuild the /dev and /devices entries; quite convenient. Both of these are more polished than the Linux equivalent; how much
this affects operations depends on how dynamic the FC loop or fabric is. For static configurations, it is a non-issue.
Problems
We had a problem with the FC loop hanging repeatably; this was fixed by a patch to the Chaparral firmware. If it recurs we'll investigate further.
Performance Results
For reference purposes, here are the results of running the
bonnie benchmark from Tim Bray on one of the internal IBM 9GB SCSI disks, with a 1900MB test size:
                 -------- Sequential Output ---------   ---- Sequential Input ----   -- Random ---
                 -Per Char-   --Block---   -Rewrite--   -Per Char--   ---Block----   ----Seeks----
Machine     MB   K/sec %CPU   K/sec %CPU   K/sec %CPU   K/sec  %CPU    K/sec  %CPU      /sec  %CPU
          1900   13156 89.1   16934 10.1   15135 10.4   14926 100.0   373872 100.1   58656.2 293.3
(This is the SGI 1450, quad Xeon 700s, 4G memory, Adaptec 7899 onboard controller, RedHat 6.2, with kernel 2.4.0test8 with the sd.c module patch.)
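To make the caching effect visible, the following Python sketch splits the result line above into named fields and converts the block rates to MB/sec. The field names are our own labels for bonnie's columns, and the interpretation (a block read rate some twenty times the block write rate, at 100% CPU, is coming from memory rather than from the disk) is our reading of the numbers, not something bonnie reports.

```python
"""Sketch: parse one bonnie result line and compare block read/write rates."""

RESULT_LINE = ("1900 13156 89.1 16934 10.1 15135 10.4 "
               "14926 100.0 373872 100.1 58656.2 293.3")

# Our own names for bonnie's columns, in the order they are printed.
FIELDS = [
    "size_mb",
    "out_char_kps", "out_char_cpu",
    "out_block_kps", "out_block_cpu",
    "rewrite_kps", "rewrite_cpu",
    "in_char_kps", "in_char_cpu",
    "in_block_kps", "in_block_cpu",
    "seeks_per_sec", "seeks_cpu",
]


def parse_bonnie(line):
    """Turn a whitespace-separated result line into a dict of floats."""
    values = [float(v) for v in line.split()]
    if len(values) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(values)))
    return dict(zip(FIELDS, values))


def kps_to_mbps(kps):
    """Convert K/sec to MB/sec (1 MB = 1024 K)."""
    return kps / 1024.0


result = parse_bonnie(RESULT_LINE)
write_mbps = kps_to_mbps(result["out_block_kps"])
read_mbps = kps_to_mbps(result["in_block_kps"])
ratio = read_mbps / write_mbps

print("block write: %.1f MB/sec" % write_mbps)
print("block read:  %.1f MB/sec" % read_mbps)
print("read/write ratio: %.1f" % ratio)
```

On this run the block read rate works out to about 365 MB/sec against about 16.5 MB/sec for block writes, which is consistent with the reads being served from the buffer cache as described under Memory Size Effects.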
We are also looking at the
iozone benchmark to get a more complete picture. We also have internally developed RAID benchmarks that do large concurrent sequential reads and writes to measure scalability for CDF-type applications; results from those are on the way.