Antares-RAID-sparcLinux-HOWTO

Thom Coates (tdc3@psu.edu), Carl Munio, Jim Ludemann
v0.1, 28 April 2000

This document describes how to install, configure, and maintain a hardware RAID built around the 5070 SBUS host based RAID controller by Antares Microsystems. Other topics of discussion include RAID levels, the 5070 controller GUI, and 5070 command line. A complete command reference for the 5070's K9 kernel and Bourne-like shell is included.
______________________________________________________________________

Table of Contents

1. Preamble
2. Acknowledgements and Thanks
3. New Versions
4. Introduction
   4.1 5070 Main Features
5. Background
   5.1 Raid Levels
   5.2 RAID Linear
      5.2.0.0.1 SUMMARY
   5.3 Level 1
      5.3.0.0.1 SUMMARY
   5.4 Striping
   5.5 Level 0
      5.5.0.0.1 SUMMARY:
   5.6 Level 2 and 3
      5.6.0.0.1 SUMMARY
   5.7 Level 4
      5.7.0.0.1 SUMMARY
   5.8 Level 5
      5.8.0.0.1 SUMMARY
6. Installation
   6.1 SBUS Controller Compatibility
   6.2 Hardware Installation Procedure
      6.2.0.0.1 GNOME:
      6.2.0.0.2 KDE:
      6.2.0.0.3 XDM:
      6.2.0.0.4 Console Login (systems without X windows):
      6.2.0.0.5 All Systems:
      6.2.0.0.6 SPARCstation 4, 5, 10, 20 & UltraSPARC Systems:
      6.2.0.0.7 Ultra Enterprise Servers, SPARCserver 1000 & 2000 Systems, SPARCserver 6XO MP Series:
      6.2.0.0.8 All Systems:
      6.2.0.0.9 Verifying the Hardware Installation:
   6.3 Serial Terminal
   6.4 Hard Drive Plant
7. 5070 Onboard Configuration
   7.1 Main Screen Options
      7.1.0.0.1
   7.2 [Q]uit
   7.3 [R]aidSets:
      7.3.0.0.1
   7.4 [H]ostports:
      7.4.0.0.1
   7.5 [S]pares:
      7.5.0.0.1
   7.6 [M]onitor:
      7.6.0.0.1
   7.7 [G]eneral:
      7.7.0.0.1
   7.8 [P]robe
   7.9 Example RAID Configuration Session
8. Linux Configuration
   8.1 Existing Linux Installation
      8.1.1 QLogic SCSI Driver
      8.1.2 Device mappings
      8.1.3 Partitioning
      8.1.4 Installing a filesystem
      8.1.5 Mounting
   8.2 New Linux Installation
9. Maintenance
   9.1 Activating a spare
   9.2 Re-integrating a repaired drive into the RAID (levels 3 and 5)
10. Troubleshooting / Error Messages
   10.1 Out of band temperature detected...
   10.2 ... failed ... cannot have more than 1 faulty backend.
   10.3 When booting I see: ... Sun disklabel: bad magic 0000 ... unknown partition table.
11. Bugs
12. Frequently Asked Questions
   12.1 How do I reset/erase the onboard configuration?
   12.2 How can I tell if a drive in my RAID has failed?
13. Advanced Topics: 5070 Command Reference
   13.1 AUTOBOOT - script to automatically create all raid sets and scsi monitors
   13.2 AUTOFAULT - script to automatically mark a backend faulty after a drive failure
   13.3 AUTOREPAIR - script to automatically allocate a spare and reconstruct a raid set
   13.4 BIND - combine elements of the namespace
   13.5 BUZZER - get the state or turn on or off the buzzer
   13.6 CACHE - display information about and delete cache ranges
   13.7 CACHEDUMP - Dump the contents of the write cache to battery backed-up ram
   13.8 CACHERESTORE - Load the cache with data from battery backed-up ram
   13.9 CAT - concatenate files and print on the standard output
   13.10 CMP - compare the contents of 2 files
   13.11 CONS - console device for Husky
   13.12 DD - copy a file (disk, etc)
   13.13 DEVSCMP - Compare a file's size against a given value
   13.14 DFORMAT - Perform formatting functions on a backend disk drive
   13.15 DIAGS - script to run a diagnostic on a given device
   13.16 DPART - edit a scsihd disk partition table
   13.17 DUP - open file descriptor device
   13.18 ECHO - display a line of text
   13.19 ENV - environment variables file system
   13.20 ENVIRON - RaidRunner Global environment variables - names and effects
   13.21 EXEC - cause arguments to be executed in place of this shell
   13.22 EXIT - exit a K9 process
   13.23 EXPR - evaluation of numeric expressions
   13.24 FALSE - returns the K9 false status
   13.25 FIFO - bi-directional fifo buffer of fixed size
   13.26 GET - select one value from list
   13.27 GETIV - get the value of an internal RaidRunner variable
   13.28 HELP - print a list of commands and their synopses
   13.29 HUSKY - shell for K9 kernel
   13.30 HWCONF - print various hardware configuration details
   13.31 HWMON - monitoring daemon for temperature, fans, PSUs
   13.32 INTERNALS - Internal variables used by RaidRunner to change dynamics of running kernel
   13.33 KILL - send a signal to the nominated process
   13.34 LED - turn on/off LEDs on RaidRunner
   13.35 LFLASH - flash a LED on RaidRunner
   13.36 LINE - copies one line of standard input to standard output
   13.37 LLENGTH - return the number of elements in the given list
   13.38 LOG - like zero with additional logging of accesses
   13.39 LRANGE - extract a range of elements from the given list
   13.40 LS - list the files in a directory
   13.41 LSEARCH - find a pattern in a list
   13.42 LSUBSTR - replace a character in all elements of a list
   13.43 MEM - memory mapped file (system)
   13.44 MDEBUG - exercise and display statistics about memory allocation
   13.45 MKDIR - create directory (or directories)
   13.46 MKDISKFS - script to create a disk filesystem
   13.47 MKHOSTFS - script to create a host port filesystem
   13.48 MKRAID - script to create a raid given a line of output of rconf
   13.49 MKRAIDFS - script to create a raid filesystem
   13.50 MKSMON - script to start the scsi monitor daemon smon
______________________________________________________________________

1. Preamble

Copyright 2000 by Thomas D. Coates, Jr. This document's source is licensed under the terms of the GNU General Public License.
Permission to use, copy, modify, and distribute this document without fee for any purpose, commercial or non-commercial, is hereby granted, provided that the authors' names and this notice appear in all copies and/or supporting documents, and that the location where a freely available unmodified version of this document may be obtained is given. This document is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY, either expressed or implied. While every effort has been taken to ensure the accuracy of the information documented herein, the author(s)/editor(s)/maintainer(s)/contributor(s) assume NO RESPONSIBILITY for any errors, or for any damages, direct or consequential, as a result of the use of the information documented herein.

A complete copy of the GNU General Public License may be obtained from: Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

Portions of this document are adapted and/or re-printed from the 5070 installation guide and man pages with permission of Antares Microsystems, Inc., Campbell, CA.

2. Acknowledgements and Thanks

· Carl and Jim at Antares for the hardware, man pages, and other support/contributions they provided during the writing of this document.

· Penn State University - Hershey Medical Center, Department of Radiology, Section of Clinical Image Management (my home away from my home away from home).

· The software-raid-HOWTO, Copyright 1997 by Linas Vepstas, under the GNU General Public License. The software-raid-HOWTO is available from http://www.linuxdoc.org

3. New Versions

· The most recent version of this document can be found at my homepage: http://www.xray.hmc.psu.edu/~tcoates/

· Other versions may be found in different formats at the LDP homepage, http://www.linuxdoc.org, and mirror sites.

4. Introduction

The Antares 5070 is a high performance, versatile, yet relatively inexpensive host based RAID controller.
Its embedded operating system (K9 kernel) is modelled on the Plan 9 operating system, whose design is discussed in several papers from AT&T (see the "Further Reading" section). K9 is a kernel targeted at embedded controllers of small to medium complexity (e.g. ISDN-ethernet bridges, RAID controllers, etc.). It supports multiple lightweight processes (i.e. without memory management) on a single CPU with a non-pre-emptive scheduler. Device driver architecture is based on Plan 9 (and Unix SVR4) streams. Concurrency control mechanisms include semaphores and signals.

The 5070 has three single ended Ultra 1 SCSI channels and two onboard serial interfaces, one of which provides command line access via a connected serial terminal or modem. The other is used to upgrade the firmware. The command line is robust, implementing many of the essential Unix commands (e.g. dd, ls, cat, etc.) and a scaled down Bourne shell for scripting. The Unix command set is augmented with RAID specific configuration commands and scripts. In addition to the command line interface, an ASCII text based GUI is provided to permit easy configuration of level 0, 1, 3, 4, and 5 RAIDs.

4.1. 5070 Main Features

· RAID levels 0, 1, 3, 4, and 5 are supported.

· Text based GUI for easy configuration of all supported RAID levels.

· A multidisk RAID volume appears as an individual SCSI drive to the operating system and can be managed with the standard utilities (fdisk, mkfs, fsck, etc.). RAID volumes may be assigned to different SCSI IDs, or to the same SCSI ID but different LUNs.

· No special RAID drivers are required for the host operating system.

· Multiple RAID volumes of different levels can be mixed among the drives forming the physical plant.
For example, in a hypothetical drive plant consisting of 9 drives:

· 2 drives form a level 3 RAID assigned to SCSI ID 5, LUN 0
· 2 drives form a level 0 RAID assigned to SCSI ID 5, LUN 1
· 5 drives form a level 5 RAID assigned to SCSI ID 6, LUN 0

· Three single ended SCSI channels which can accommodate 6 drives each (18 drives total).

· Two serial interfaces. The first permits configuration/control/monitoring of the RAID from a local serial terminal. The second serial port is used to upload new programming into the 5070 (using PPP and TFTP).

· Robust Unix-like command line and NVRAM based file system.

· Configurable ASCII SCSI communication channel for passing commands to the 5070's command line interpreter. This allows programs running on the host OS to directly configure/control/monitor all parameters of the 5070.

5. Background

Much of the information/knowledge pertaining to RAID levels in this section is adapted from the software-raid-HOWTO by Linas Vepstas. See the acknowledgements section for the URL where the full document may be obtained.

RAID is an acronym for "Redundant Array of Inexpensive Disks" and is used to create large, reliable disk storage systems out of individual hard disk drives. There are two basic ways of implementing a RAID: software or hardware. The main advantage of a software RAID is low cost. However, since the OS of the host system must manage the RAID directly, there is a substantial penalty in performance. Furthermore, if the RAID is also the boot device, a drive failure could prove disastrous since the operating system and utility software needed to perform the recovery are located on the RAID. The primary advantages of hardware RAID are performance and improved reliability. Since all RAID operations are handled by a dedicated CPU on the controller, the host system's CPU is never bothered with RAID related tasks. In fact the host OS is completely oblivious to the fact that its SCSI drives are really virtual RAID drives.
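Because the host sees only ordinary SCSI disks, nothing RAID-specific is needed to check what the kernel has attached. A minimal sketch of the idea, using a fabricated /proc/scsi/scsi excerpt (the vendor/model strings and target IDs here are illustrative assumptions, not output from real hardware):

```shell
# Each RAID volume exported by the 5070 shows up to Linux as a plain
# direct-access SCSI device. The listing below is HYPOTHETICAL.
listing='Host: scsi0 Channel: 00 Id: 05 Lun: 00
  Vendor: ANTARES  Model: RAID5070  Rev: 0001
  Type:   Direct-Access
Host: scsi0 Channel: 00 Id: 05 Lun: 01
  Vendor: ANTARES  Model: RAID5070  Rev: 0001
  Type:   Direct-Access'

# Each "Host:" line is one virtual drive the kernel has attached:
count=$(printf '%s\n' "$listing" | grep -c '^Host:')
echo "virtual drives seen: $count"
```

On a live system you would simply `cat /proc/scsi/scsi` and then partition the volume like any other disk (e.g. with fdisk); the point is only that no RAID-aware tooling is involved.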
When a drive fails on the 5070 it can be replaced on-the-fly with a drive from the spares pool and its data reconstructed without the host's OS ever knowing anything has happened.

5.1. Raid Levels

The different RAID levels have different performance, redundancy, storage capacity, reliability and cost characteristics. Most, but not all, levels of RAID offer redundancy against drive failure. There are many different levels of RAID which have been defined by various vendors and researchers. The following describes the first 7 RAID levels in the context of the Antares 5070 hardware RAID implementation.

5.2. RAID Linear

RAID-linear is a simple concatenation of drives to create a larger virtual drive. It is handy if you have a number of small drives, and wish to create a single, large drive. This concatenation offers no redundancy, and in fact decreases the overall reliability: if any one drive fails, the combined drive will fail.

5.2.0.0.1. SUMMARY

· Enables construction of a large virtual drive from a number of smaller drives
· No protection, less reliable than a single drive
· RAID 0 is a better choice due to better I/O performance

5.3. Level 1

Also referred to as "mirroring". Two (or more) drives, all of the same size, each store an exact copy of all data, disk-block by disk-block. Mirroring gives strong protection against drive failure: if one drive fails, there is another with an exact copy of the same data. Mirroring can also help improve performance in I/O-laden systems, as read requests can be divided up between several drives. Unfortunately, mirroring is also one of the least efficient in terms of storage: two mirrored drives can store no more data than a single drive.

5.3.0.0.1. SUMMARY

· Good read/write performance
· Inefficient use of storage space (half the total space available for data)
· RAID 6 may be a better choice due to better I/O performance.

5.4. Striping

Striping is the underlying concept behind all of the other RAID levels.
A stripe is a contiguous sequence of disk blocks. A stripe may be as short as a single disk block, or may consist of thousands. The RAID drivers split up their component drives into stripes; the different RAID levels differ in how they organize the stripes, and what data they put in them. The interplay between the size of the stripes, the typical size of files in the file system, and their location on the drive is what determines the overall performance of the RAID subsystem.

5.5. Level 0

Similar to RAID-linear, except that the component drives are divided into stripes and then interleaved. Like RAID-linear, the result is a single larger virtual drive. Also like RAID-linear, it offers no redundancy, and therefore decreases overall reliability: a single drive failure will knock out the whole thing. However, the 5070 hardware RAID 0 is the fastest of any of the schemes listed here.

5.5.0.0.1. SUMMARY:

· Use RAID 0 to combine smaller drives into one large virtual drive.
· Best read/write performance of all the schemes listed here.
· No protection from drive failure.
· ADVICE: Buy very reliable hard disk drives if you plan to use this scheme.

5.6. Level 2 and 3

RAID-2 is seldom used anymore, and to some degree has been made obsolete by modern hard disk technology. RAID-2 is similar to RAID-4, but stores ECC information instead of parity. Since all modern disk drives incorporate ECC under the covers, this offers little additional protection. RAID-2 can offer greater data consistency if power is lost during a write; however, battery backup and a clean shutdown can offer the same benefits. RAID-3 is similar to RAID-4, except that it uses the smallest possible stripe size.

5.6.0.0.1. SUMMARY

· RAID 2 is largely obsolete.
· Use RAID 3 to combine separate drives into one large virtual drive.
· Protection against single drive failure.
· Good read/write performance.

5.7.
Level 4

RAID-4 interleaves stripes like RAID-0, but it requires an additional drive to store parity information. The parity is used to offer redundancy: if any one of the drives fails, the data on the remaining drives can be used to reconstruct the data that was on the failed drive. Given N data disks, and one parity disk, the parity stripe is computed by taking one stripe from each of the data disks, and XOR'ing them together. Thus, the storage capacity of an (N+1)-disk RAID-4 array is N, which is a lot better than mirroring (N+1) drives, and is almost as good as a RAID-0 setup for large N. Note that for N=1, where there is one data disk and one parity disk, RAID-4 is a lot like mirroring, in that each of the two disks is a copy of the other. However, RAID-4 does NOT offer the read performance of mirroring, and offers considerably degraded write performance. In brief, this is because updating the parity requires a read of the old parity before the new parity can be calculated and written out. In an environment with lots of writes, the parity disk can become a bottleneck, as each write must access the parity disk.

5.7.0.0.1. SUMMARY

· Similar to RAID 0.
· Protection against single drive failure.
· Poorer I/O performance than RAID 3.
· Less of the combined storage space is available for data [than RAID 3] since an additional drive is needed for parity information.

5.8. Level 5

RAID-5 avoids the write bottleneck of RAID-4 by alternately storing the parity stripe on each of the drives. However, write performance is still not as good as for mirroring, as the parity stripe must still be read and XOR'ed before it is written. Read performance is also not as good as it is for mirroring, as, after all, there is only one copy of the data, not two or more. RAID-5's principal advantage over mirroring is that it offers redundancy and protection against single-drive failure, while offering far more storage capacity when used with three or more drives.

5.8.0.0.1.
SUMMARY

· Use RAID 5 if you need to make the best use of your available storage space while gaining protection against single drive failure.
· Slower I/O performance than RAID 3.

6. Installation

NOTE: The installation procedure given here for the SBUS controller is similar to that found in the manual. It has been modified so minor variations in the SPARCLinux installation may be included.

6.1. SBUS Controller Compatibility

The 5070 / Linux 2.2 combination was tested on SPARCstation (5, 10, & 20), Ultra 1, and Ultra 2 Creator. The 5070 was also tested on Linux with Symmetric Multiprocessing (SMP) support on a dual processor Ultra 2 Creator 3D with no problems. Other 5070 / Linux / hardware combinations may work as well.

6.2. Hardware Installation Procedure

If your system is already up and running, you must halt the operating system.

6.2.0.0.1. GNOME:

1. From the login screen right click the "Options" button.
2. On the popup menu select System -> Halt.
3. Click "Yes" when the verification box appears.

6.2.0.0.2. KDE:

1. From the login screen right click shutdown.
2. On the popup menu select shutdown by right clicking its radio button.
3. Click OK.

6.2.0.0.3. XDM:

1. Login as root.
2. Left click on the desktop to bring up the pop-up menu.
3. Select "New Shell".
4. When the shell opens type "halt" at the prompt and press return.

6.2.0.0.4. Console Login (systems without X windows):

1. Login as root.
2. Type "halt".

6.2.0.0.5. All Systems:

Wait for the message "power down" or "system halted" before proceeding. Turn off your SPARCstation system (note: your system may have turned itself off following the power down directive), its video monitor, external disk expansion boxes, and any other peripherals connected to the system. Be sure to check that the green power LED on the front of the system enclosure is not lit and that the fans inside the system are not running. Do not disconnect the system power cord.

6.2.0.0.6. SPARCstation 4, 5, 10, 20 & UltraSPARC Systems:

1.
Remove the top cover on the CPU enclosure. On a SPARCstation 10, this is done by loosening the captive screw at the top right corner of the back of the CPU enclosure, then tilting the top of the enclosure forward while using a Phillips screwdriver to press the plastic tab on the top left corner.

2. Decide which SBUS slot you will use. Any slot will do. Remove the filler panel for that slot by removing the two screws and rectangular washers that hold it in.

3. Remove the SBUS retainer (commonly called the handle) by pressing outward on one leg of the retainer while pulling it out of the hole in the printed circuit board.

4. Insert the board into the SBUS slot you have chosen. To insert the board, first engage the top of the 5070 RAIDium backpanel into the backpanel of the CPU enclosure, then rotate the board into a level position and mate the SBUS connectors. Make sure that the SBUS connectors are completely engaged.

5. Snap the nylon board retainers inside the SPARCstation over the 5070 RAIDium board to secure it inside the system.

6. Secure the 5070 RAIDium SBUS backpanel to the system by replacing the rectangular washers and screws that held the original filler panel in place.

7. Replace the top cover by first mating the plastic hooks on the front of the cover to the chassis, then rotating the cover down over the unit until the plastic tab in back snaps into place. Tighten the captive screw on the upper right corner.

6.2.0.0.7. Ultra Enterprise Servers, SPARCserver 1000 & 2000 Systems, SPARCserver 6XO MP Series:

1. Remove the two Allen screws that secure the CPU board to the card cage. These are located at each end of the CPU board backpanel.

2. Remove the CPU board from the enclosure and place it on a static-free surface.

3. Decide which SBUS slot you will use. Any slot will do. Remove the filler panel for that slot by removing the two screws and rectangular washers that hold it in. Save these screws and washers.

4.
Remove the SBUS retainer (commonly called the handle) by pressing outward on one leg of the retainer while pulling it out of the hole in the printed circuit board.

5. Insert the board into the SBUS slot you have chosen. To insert the board, first engage the top of the 5070 RAIDium backpanel into the backpanel of the CPU enclosure, then rotate the board into a level position and mate the SBUS connectors. Make sure that the SBUS connectors are completely engaged.

6. Secure the 5070 RAIDium board to the CPU board with the nylon screws and standoffs provided on the CPU board. The standoffs may have to be moved so that they match the holes used by the SBUS retainer, as the standoffs are used in different holes for an MBus module. Replace the screws and rectangular washers that originally held the filler panel in place, securing the 5070 RAIDium SBus backpanel to the system enclosure.

7. Re-insert the CPU board into the CPU enclosure and re-install the Allen-head retaining screws that secure the CPU board.

6.2.0.0.8. All Systems:

1. Mate the external cable adapter box to the 5070 RAIDium and gently tighten the two screws that extend through the cable adapter box.

2. Connect the three cables from your SCSI devices to the three 68-pin SCSI-3 connectors on the Antares 5070 RAIDium. The three SCSI cables must always be reconnected in the same order after a RAID set has been established, so you should clearly mark the cables and disk enclosures for future disassembly and reassembly.

3. Configure the attached SCSI devices to use SCSI target IDs other than 7, as that is taken by the 5070 RAIDium itself. Configuring the target number is done differently on various devices. Consult the manufacturer's installation instructions to determine the method appropriate for your device.

4. As you are likely to be installing multiple SCSI devices, make sure that all SCSI buses are properly terminated. This means a terminator is installed only at each end of each SCSI bus daisy chain.
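Step 3's requirement (no drive on target ID 7) is worth sanity-checking on paper before cabling. A throwaway sketch; the ID list here is a hypothetical cabling plan, not something read from hardware:

```shell
# The 5070 claims SCSI target ID 7 on each channel, so every attached
# drive must use some other ID. Edit planned_ids to match your plan.
controller_id=7
planned_ids="0 1 2 3 4 5"   # hypothetical example plan

status=ok
for id in $planned_ids; do
    if [ "$id" -eq "$controller_id" ]; then
        echo "conflict: target ID $id is reserved by the controller"
        status=conflict
    fi
done
echo "plan check: $status"
```

The same loop could also flag duplicate IDs within one channel; the point is simply to catch conflicts before the enclosure is screwed shut.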
6.2.0.0.9. Verifying the Hardware Installation:

These steps are optional but recommended. First, power-on your system and interrupt the booting process by pressing the "Stop" and "a" keys (or the "break" key if you are on a serial terminal) simultaneously as soon as the Solaris release number is shown on the screen. This will force the system to run the Forth Monitor in the system EPROM, which will display the "ok" prompt. This gives you access to many useful low-level commands, including:

ok show-devs
. . .
/iommu@f,e0000000/sbus@f,e0001000/SUNW,isp@1,8800000
. . .

The line in the response shown above means that the 5070 RAIDium host adapter has been properly recognized. If you don't see a line like this, you may have a hardware problem.

Next, to see a listing of all the SCSI devices in your system, you can use the probe-scsi-all command, but first you must prepare your system as follows:

ok setenv auto-boot? false
ok reset
ok probe-scsi-all

This will tell you the type, target number, and logical unit number of every SCSI device recognized in your system. The 5070 RAIDium board will report itself attached to an ISP controller at target 0 with two Logical Unit Numbers (LUNs): 0 for the virtual hard disk drive, and 7 for the connection to the Graphical User Interface (GUI). Note: the GUI communication channel on LUN 7 is currently unused under Linux. See the discussion under "SCSI Monitor Daemon (SMON)" in the "Advanced Topics" section for more information.

REQUIRED: Perform a reconfiguration boot of the operating system:

ok boot -r

If no image appears on your screen within a minute, you most likely have a hardware installation problem. In this case, go back and check each step of the installation procedure. This completes the hardware installation procedure.

6.3. Serial Terminal

If you have a serial terminal at your disposal (e.g. DEC-VT420) it may be connected to the controller's serial port using a 9 pin DIN male to DB25 male serial cable.
Otherwise you will need to supplement the above cable with a null modem adapter to connect the RAID controller's serial port to the serial port on either the host computer or a PC. The terminal emulators I have successfully used include Minicom (on Linux), Kermit (on Caldera's DR-DOS), and HyperTerminal (on a Windows CE palmtop); however, any decent terminal emulation software should work. The basic settings are 9600 baud, no parity, 8 data bits, and 1 stop bit.

6.4. Hard Drive Plant

Choosing the brand and capacity of the drives that will form the hard drive physical plant is up to you. I do have some recommendations:

· Remember, you generally get what you pay for. I strongly recommend paying the extra money for better (i.e. more reliable) hardware, especially if you are setting up a RAID for a mission critical project. For example, consider purchasing drive cabinets with redundant hot-swappable power supplies, etc.

· You will also want a UPS for your host system and drive cabinets. Remember, RAID levels 3 and 5 protect you from data loss due to drive failure, NOT power failure.

· The drive cabinet you select should have hot swappable drive bays; these cost more but are definitely worth it when you need to add/change drives.

· Make sure the cabinet(s) have adequate cooling when fully loaded with drives.

· Keep your SCSI cables (internal and external) as short as possible.

· Mark the drives/cabinet(s) in such a way that you will be able to reconnect them to the controller in their original configuration. Once the RAID is configured you cannot re-organize your drives without re-configuring the RAID (and subsequently erasing the data stored on it).

· Keep in mind that although it is physically possible to connect/configure up to 6 drives per channel, performance will sharply decrease for RAIDs with more than three drives per channel. This is due to the 25 MHz bandwidth limitation of the SBUS.
Therefore, if read/write performance is an issue, go with a small number of large drives. If you need a really large RAID (~ 1 terabyte) then you will have no other choice but to load the channels to capacity and pay the performance penalty. NOTE: if you are serving files over a 10/100 Base T network you may not notice the performance decrease, since the network is usually the bottleneck, not the SBUS.

7. 5070 Onboard Configuration

Before diving into the RAID configuration I need to define a few terms.

· "RaidRunner" is the name given to the 5070 controller board.

· "Husky" is the name given to the shell which produces the ":raid;" command prompt. It is a command language interpreter that executes commands read from the standard input or from a file. Husky is a scaled down model of Unix's Bourne shell (sh). One major difference is that husky has no concept of current working directory. For more information on the husky shell and command prompt see the "Advanced Topics" section.

· The "host port" is the SCSI ID assigned to the controller card itself. This is usually ID 7.

· A "backend" is a drive attached to the controller on a given channel.

· A "rank" is a collection of all the backends from each channel with the same SCSI ID (i.e. rank 0 would consist of all the drives with SCSI ID 0 on each channel).

· Each of the backends is identified by a three digit number where the first digit is the channel, the second the SCSI ID of the drive, and the third the LUN of the drive. The numbers are separated by a period. The identifier is prefixed with a "D" if it is a disk or "T" if it is a tape (e.g. D0.1.0). This scheme is referred to as c.s.l in the following documentation.

· A "RAID set" consists of a given number of backends (there are certain requirements which I'll come to later).

· A "spare" is a drive which is unused until there is a failure in one of the RAID drives. At that time the damaged drive is automatically taken offline and replaced with the spare.
The data is then reconstructed on the spare and the RAID resumes normal operation.

· Spares may either be "hot" or "warm" depending on user configuration. Hot spares are spun up when the RAID is started, which shortens the replacement time when a drive failure occurs. Warm spares are spun up when needed, which saves wear on the drive.

The text based GUI can be started by typing "agui" at the husky prompt on the serial terminal (or emulator):

:raid; agui

Agui is a simple ASCII based GUI that can be run on the RaidRunner console port which enables one to configure the RaidRunner. The only argument agui takes is the terminal type that is connected to the RaidRunner console. Currently supported terminals are dtterm, vt100 and xterm. The default is dtterm.

Each agui screen is split into two areas, data and menu. The data area, which generally uses all but the last line of the screen, displays the details of the information under consideration. The menu area, which generally is the bottom line of the screen, displays a strip menu with a title then a list of options or sub-menus. Each option has one character enclosed in square brackets (e.g. [Q]uit) which is the character to type to select that option. Each menu line allows you to refresh the screen data (in case another process on the RaidRunner writes to the console). The refresh character may also be used during data entry if the screen is overwritten. The refresh character is either or .

When agui starts, it reads the configuration of the RaidRunner and probes for every possible backend. As it probes for each backend, its "name" is displayed in the bottom left corner of the screen.

7.1. Main Screen Options

7.1.0.0.1.
The Main screen is the first screen displayed. It provides a summary of the RaidRunner configuration. At the top is the RaidRunner model, version and serial number. Next is a line displaying, for each controller, the SCSI IDs for each host port (labeled A, B, C, etc.) and the total and currently available amounts of memory. The next set of lines displays the ranks of devices on the RaidRunner. Each device follows the nomenclature device_type c.s.l, where device_type can be D for disk or T for tape, c is the internal channel the device is attached to, s is the SCSI ID (rank) of the device on that channel, and l is the SCSI LUN of the device (typically 0).

The next set of lines provides a summary of the raid sets configured on the RaidRunner. The summary includes the raid set name, its type, its size, the amount of cache allocated to it and a comma separated list of its backends. See rconf in the "Advanced Topics" section for a full description of the above. Next are the spare devices configured. Each spare is named (device_type c.s.l format), followed by its size (in 512-byte blocks), its spin state (Hot or Warm), its controller allocation, and finally its current status (Used/Unused, Faulty/Working). If used, the raid set that uses it is nominated. At the bottom of the data area, the number of controllers, channels, ranks and devices are displayed.

The menu line allows one to quit agui or select further actions or sub-menus.

· [Q]uit: Exit the main screen and return to the husky prompt.

· [R]aidSets: Enter the RaidSet configuration screen.

· [H]ostports: Enter the Host Port configuration screen.

· [S]pares: Enter the Spare Device configuration screen.

· [M]onitor: Enter the SCSI Monitor configuration screen.

· [G]eneral: Enter the General configuration/information screen.

· [P]robe: Re-probe the device backends on the RaidRunner. As each backend is probed its "name" (c.s.l format) is displayed in the bottom left corner of the screen.

These selections are described in detail below.
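The c.s.l backend names shown on this screen can be taken apart with ordinary shell parameter expansion, which is handy when scripting against captured console output. A small sketch; the backend name here is just an example:

```shell
# Split a backend identifier such as D0.1.0 into its parts:
# a leading letter for the device type (D disk, T tape), then
# channel.SCSI-ID.LUN separated by periods.
backend="D0.1.0"              # example name, as displayed by agui

dev_type=${backend%%[0-9]*}   # strip everything from the first digit -> "D"
rest=${backend#?}             # drop the type letter -> "0.1.0"
channel=${rest%%.*}           # "0"
scsi_id=${rest#*.}
scsi_id=${scsi_id%%.*}        # "1"
lun=${rest##*.}               # "0"

echo "type=$dev_type channel=$channel id=$scsi_id lun=$lun"
```

This runs in any POSIX shell on the host; whether the husky shell's own parameter expansion supports all of these forms is not documented here, so treat it as a host-side helper.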
7.2. [Q]uit

Exit the agui main screen and return to the husky ( :raid; ) prompt.

7.3. [R]aidSets:
The Raid Set Configuration screen displays a Raid Set in the data area and provides a menu which allows you to Add, Delete, Modify, Install (changes) and Scroll through all other raid sets (First, Last, Next and Previous). If no raid sets have been configured, only the screen title and menu are displayed. All attributes of the raid set are displayed. For information on each attribute of the raid set, see the rconf command in the "Advanced Topics" section. The menu line allows one to leave the Raid Set Configuration screen or select further actions:

· [Q]uit: Exit the Raid Set Configuration screen and return to the Main screen. If you have modified, deleted or added a raid set and have not installed the changes, you will be asked to confirm this. If you select Yes to continue the exit, all changes made since the last install action will be discarded.

· [I]nst: This action installs (into the RaidRunner configuration area) any changes that may have been made to raid sets, be that deletion, addition or modification. If you exit prior to installing, all changes made since the last installation will be discarded. The installation process takes time. It is complete once the typed "i" character is cleared from the menu line.

· [M]od: This action allows you to modify the displayed raid set. You will be prompted for each Raid Set attribute that can be changed. The prompt includes allowable options or formats required. If you don't wish to change a particular attribute, then press the RETURN or TAB key. The attributes you can change are the raid set name, I/O mode, status (Active to Inactive), bootmode, spares usage, backend zone table usage, IO size (if the raid set has never been used - i.e. just added), cache size, I/O queue length, host interfaces and additional stargd arguments. If you wish to change a single attribute then use the RETURN or TAB key to skip all other options. The changed attribute will be re-displayed as soon as you press the RETURN key.
When specifying cache size, you may suffix the number with 'm' or 'M' to indicate the number is in Megabytes or with 'k' or 'K' to indicate the number is in Kilobytes. Note you can only enter whole integer values. When specifying io size, you may suffix the number with 'k' or 'K' to indicate the number is in Kilobytes. When you enter data, it is checked for correctness; if incorrect, a message is displayed, all changes are discarded and you will have to start again. Remember you must install ([I]nst.) any changes.

· [A]dd: When this option is selected you will be prompted for various attributes of the new raid set. These attributes are the raid set name, the raid set type, the initial host interface the raid set is to appear on (in c.h.l format where c is the controller number, h is the host port (0, 1, 2 etc) and l is the SCSI LUN) and finally a list of backends. When backends are to be entered, the screen displays a list of available backends, each with a numeric index (commencing at 0). You select each backend by entering the index and once complete enter q for Quit. As each backend index is entered, its backend name is displayed in a comma-separated list. When you enter data, it is checked for correctness; if incorrect, a message is displayed, the addition is ignored and you will have to start again. Once the backends are complete, the newly created raid set will be displayed on the screen with supplied and default attributes. You can then modify the raid set to change other attributes. Remember you must install ([I]nst.) any new raid sets.

· [D]elete: This action will delete the currently displayed raid set. If this raid set is Active, then you will not be allowed to delete it. You will have to make it Inactive (via the [M]od. option) then delete it. You will be prompted to confirm the deletion. Once you confirm the deletion, the screen will be cleared and the next raid set will be displayed, if configured. Remember you must install ([I]nst.)
any changes.

· [F]irst, [L]ast, [N]ext and [P]rev allow you to scroll through the configured raid sets.

7.4. [H]ostports:
The Host Port Configuration screen displays, for each controller, each host port (labelled A, B, C, etc for port number 0, 1, 2, etc) and the assigned SCSI ID. If the RaidRunner you use has external switches for host port SCSI ID selection, you may only exit ([Q]uit) from this screen. If the RaidRunner you use does NOT have external switches for host port SCSI ID selection, then you may modify (and hence install) the SCSI ID for any host port. The menu line allows one to leave the Host Port Configuration screen or select further actions (if NO external host port SCSI ID switches):

· [Q]uit: Exit the Host Port Configuration screen and return to the Main screen. If you have modified a host port SCSI ID assignment and have not installed the changes, you will be asked to confirm this. If you select Yes to continue the exit, all changes made since the last install action will be discarded.

· [I]nstall: This action installs (into the RaidRunner configuration area) any changes that may have been made to host port SCSI ID assignments. If you exit prior to installing, all changes made since the last installation will be discarded. The installation process takes time. It is complete once the typed "i" character is cleared from the menu line.

· [M]odify: This action allows you to modify the host port SCSI ID assignments for each host port on each controller (if NO external host port SCSI ID switches). You will be prompted for the SCSI ID for each host port. You can enter either a SCSI ID (0 thru 15), the minus "-" character to clear the SCSI ID assignment, or RETURN to skip. As you enter data, it is checked for correctness; if incorrect, a message will be printed, although previously correctly entered data will be retained. Remember you must install ([I]nst.) any changes.

7.5. [S]pares:
The Spare Device Configuration screen displays all configured spare devices in the data area and provides a menu which allows you to Add, Delete, Modify and Install (changes) spare devices. If no spare devices have been configured, only the screen title and menu are displayed. Each spare device displayed shows its name (in device_type c.s.l format), its size in 512-byte blocks, its spin status (Hot or Warm), its controller allocation, and finally its current status (Used/Unused, Faulty/Working). If used, the raid set that uses it is nominated. For information on each attribute of a spare device, see the rconf command in the "Advanced Topics" section. The menu line allows one to leave the Spare Device Configuration screen or select further actions:

· [Q]uit: Exit the Spare Device Configuration screen and return to the Main screen. If you have modified, deleted or added a spare device and have not installed the changes, you will be asked to confirm this. If you select Yes to continue the exit, all changes made since the last install action will be discarded.

· [I]nstall: This action installs (into the RaidRunner configuration area) any changes that may have been made to the spare devices, be that deletion, addition or modification. If you exit prior to installing, all changes made since the last installation will be discarded. The installation process takes time. It is complete once the typed "i" character is cleared from the menu line.

· [M]odify: This action allows you to modify the unused spare devices. You will be prompted for each spare device attribute that can be changed. The prompt includes allowable options or formats required. If you don't wish to change a particular attribute, then press the RETURN key. The attributes you can change are the new size (in 512-byte blocks), the spin state (H for Hot or W for Warm), and the controller allocation (A for any, 0 for controller 0, 1 for controller 1, etc).
If you wish to change a single attribute of a spare device, then use the RETURN key to skip all other attributes for each spare device. The changed attribute will not be re-displayed until the last prompted attribute is entered (or skipped). When you enter data, it is checked for correctness; if incorrect, a message is displayed, all changes are discarded and you will have to start again. Remember you must install ([I]nstall) any changes.

· [A]dd: When adding a spare device, the list of available devices is displayed and you are required to type in the device name. Once entered, the spare is added with defaults which you can change, if required, via the [M]odify option. Remember you must install ([I]nstall) any changes.

· [D]elete: When deleting a spare device, the list of spare devices allowed to be deleted is displayed and you are required to type in the required device name. Once entered, the spare is deleted from the screen. Remember you must install ([I]nstall) any changes.

7.6. [M]onitor:
The SCSI Monitor Configuration screen displays a table of SCSI monitors configured for the RaidRunner. Up to four SCSI monitors may be configured. The table columns are entitled Controller, Host Port, SCSI LUN and Protocol, and each line of the table shows the appropriate SCSI Monitor attribute. For details on SCSI Monitor attributes, see the rconf command in the "Advanced Topics" section. The menu line allows one to leave the SCSI Monitor Configuration screen or modify and install the table.

· [Q]uit: Exit the SCSI Monitor Configuration screen and return to the Main screen. If you have made changes and have not installed them, you will be asked to confirm this. If you select Yes to continue the exit, all changes made since the last install action will be discarded.

· [I]nstall: This action installs (into the RaidRunner configuration area) any changes that may have been made to the SCSI Monitor configuration. If you exit prior to installing, all changes made since the last installation will be discarded. The installation process takes time. It is complete once the typed "i" character is cleared from the menu line.

· [M]odify: This action allows you to modify the SCSI Monitor configuration. The cursor will be moved around the table, prompting you for input. If you do not want to change an attribute, enter RETURN to skip. If you want to delete a SCSI monitor, enter the minus "-" character when prompted for the controller number. If you want to use the default protocol list, then enter RETURN at the Protocol List prompt. As you enter data, it is checked for correctness; if incorrect, a message will be printed and any previously entered data is discarded, and you will have to re-enter the data again. Remember you must install ([I]nstall) any changes.

7.7. [G]eneral:
The General screen has a blank data area and a menu which allows one to Quit and return to the main screen, or to select further sub-menus which provide information about Devices, the System Message Logger, Global Environment variables and throughput Statistics.

· [Q]uit: Exit the General screen and return to the Main screen.

· [D]evices: Enter the Device information screen. The Devices screen displays the names of all devices on the RaidRunner. The menu line allows one to Quit and return to the General screen or display information about the devices.
· [Q]uit: Exit the Devices screen and return to the General screen.

· [I]nformation: The Device Information screen displays information about each device. You can scroll through the devices. For disks, the information displayed includes the device name, serial number, vendor name, product id, speed, version, sector size, sector count, total device size in MB, number of cylinders, heads and sectors per track, and the zone/notch partitions. The menu line allows one to leave the Device Information screen or browse through devices.
· [Q]uit: Exit the Device Information screen and return to the Devices screen.

· [F]irst, [L]ast, [N]ext and [P]rev allow you to scroll through the devices and hence display their current data.

· Sys[L]og: Enter the System Logger Messages screen.
· [Q]uit: Exit the System Logger Messages screen and return to the General screen.

· [F]irst, [L]ast, [N]ext and [P]rev allow you to scroll through the system log.

· [E]nvironment: Enter the Global Environment Variable configuration screen. The Environment Variable Configuration screen displays all configured Global Environment Variables and provides a menu which allows you to Add, Delete, Modify and Install (changes) variables. Each variable name is displayed followed by an equals "=" and the value assigned to that variable enclosed in braces - "{" .. "}". The menu line allows you to Quit and return to the General screen or select further actions.
· [Q]uit: Exit the Environment Variable Configuration screen and return to the General screen. If you have modified, deleted or added an environment variable and have not installed the changes, you will be asked to confirm this. If you select Yes to continue the exit, all changes made since the last install action will be discarded.

· [I]nst: This action installs (into the RaidRunner configuration area) any changes that may have been made to environment variables, be that deletion, addition or modification. If you exit prior to installing, all changes made since the last installation will be discarded. The installation process takes time. It is complete once the typed "i" character is cleared from the menu line.

· [M]od: This action allows you to modify an environment variable's value. You will be prompted for the name of the environment variable and then prompted for its new value. If the environment variable entered is not found, a message will be printed and you will not be prompted for a new value. If you do not enter a new value (i.e. just press RETURN), no change will be made. Remember you must install ([I]nstall) any changes.

· [A]dd: When adding a new environment variable, you will be prompted for its name and value. Providing the variable name is not already used and you enter a value, the new variable will be added and displayed. Remember you must install ([I]nstall) any changes.

· [D]elete: When deleting an environment variable, you will be prompted for the variable name and, if valid, the environment variable will be deleted. Remember you must install ([I]nstall) any changes.

· [S]tats: Enter the Statistics monitoring screen. The Statistics screen displays various general and specific statistics about raid sets configured and running on the RaidRunner. The first section of the data area displays the current temperature in degrees Celsius and the current speed of the fans in the RaidRunner.
The next section of the data area displays various statistics about the named raid set. The statistics are: the current cache hit rate; the cumulative number of reads, read failures, writes and write failures for each backend of the raid set; and finally the read and write throughput for each stargd process (indicated by its process id) that fronts the raid set. The menu line allows one to leave the Statistics screen or select further actions.
· [Q]uit: Exit the Statistics screen and return to the General screen.

· [F]irst, [L]ast, [N]ext and [P]rev allow you to scroll through the statistics.

· [R]efresh: This option will get the statistics for the given raid set and re-display the current statistics on the screen.

· [Z]ero: This option will zero the cumulative statistics for the currently displayed raid set.

· [C]ontinuous: This option will start a background process that will update the statistics of the currently displayed raid set every 2 seconds. A loop counter is created and updated every 2 seconds also. To interrupt this continuous mode of gathering statistics, just press any character. If you need to re-fresh the display, then press the refresh characters - or .

7.8. [P]robe

The probe option re-scans the SCSI channels and updates the backend list with the hardware it finds.

7.9. Example RAID Configuration Session

The generalized procedure for configuration consists of three steps arranged in the following order:

1. Configuring the Host Port(s)

2. Assigning Spares

3. Configuring the RAID set

Note that there is a minimum number of backends required for the various supported RAID levels:

· Level 0 : 2 backends

· Level 3 : 2 backends

· Level 5 : 3 backends

In this example we will configure a RAID 5 using six 2.04 gigabyte drives. The total capacity of the virtual drive will be 10 gigabytes (the equivalent of one drive is used for redundancy). This same configuration procedure can be used to configure other levels of RAID sets by changing the type parameter.

1. Power on the computer with the serial terminal connected to the RaidRunner's serial port.

2. When the husky ( :raid; ) prompt appears, start the GUI by typing "agui" and pressing return.

3. When the main screen appears, select "H" for [H]ostport configuration.

4. On some models of RaidRunner the host port is not configurable.
If you have only a [Q]uit option here then there is nothing further to be done for the host port configuration; note the values and skip to step 6. If you have add/modify options then your host port is software configurable.

5. If there is no entry for a host port on this screen, add an entry with the parameters: controller=0, hostport=0, SCSI ID=0. Don't forget to [I]nstall your changes. If there is already an entry present, note the values (they will be used in a later step).

6. From this point onward I will assume the following hardware configuration:

a. There are 7 2.04 gig drives connected as follows:

i. 2 drives on SCSI channel 0 with SCSI IDs 0 and 1 (backends 0.0.0 and 0.1.0, respectively).

ii. 3 drives on SCSI channel 1 with SCSI IDs 0, 1 and 5 (backends 1.0.0, 1.1.0, and 1.5.0).

iii. 2 drives on SCSI channel 2 with SCSI IDs 0 and 1 (backends 2.0.0 and 2.1.0).

b. Therefore:

i. Rank 0 consists of backends 0.0.0, 1.0.0, 2.0.0

ii. Rank 1 consists of backends 0.1.0, 1.1.0, 2.1.0

iii. Rank 5 contains only the backend 1.5.0

c. The RaidRunner is assigned to controller 0, hostport 0

7. Press Q to [Q]uit the hostports screen and return to the Main screen.

8. Press S to enter the [S]pares screen.

9. Select A to [A]dd a new spare to the spares pool. A list of available backends will be displayed and you will be prompted for the following information:

Enter the device name to add to spares - from above: enter D1.5.0

Select I to [I]nstall your changes. Select Q to [Q]uit the spares screen and return to the Main screen. Select R from the Main screen to enter the [R]aidsets screen. Select A to [A]dd a new RAID set. You will be prompted for each of the RAID set parameters. The prompts and responses are given below.

1. Enter the name of Raid Set: cim_homes (or whatever you want to call it).

2. Raid set type [0,1,3,5]: 5

3.
Enter initial host interface - ctlr,hostport,scsilun: 0.0.0

Now a list of the available backends will be displayed in the form:

0 - D0.0.0
1 - D1.0.0
2 - D2.0.0
3 - D0.1.0
4 - D1.1.0
5 - D2.1.0

4. Enter index from above - Q to Quit:

1 (press return)
2 (press return)
3 (press return)
4 (press return)
5 (press return)
Q

After pressing Q you will be returned to the Raid Sets screen. You should see the newly configured Raid set displayed in the data area. Press I to [I]nstall the changes.
Press Q to exit the RaidSet screen and return to the Main screen. Press Q to [Q]uit agui and exit to the husky prompt. Type "reboot" then press enter. This will reboot the RaidRunner (not the host machine). When the RaidRunner reboots it will prepare the drives for the newly configured RAID. NOTE: Depending on the size of the RAID this could take a few minutes to a few hours. For the above example it takes the 5070 approximately 10 - 20 minutes to stripe the RAID set. Once you see the husky prompt again the RAID is ready for use. You can then proceed with the Linux configuration.

8. Linux Configuration

These instructions cover setting up the virtual RAID drives on RedHat Linux 6.1. Setting it up under other Linux distributions should not be a problem; the same general instructions apply. If you are new to Linux you may want to consider installing Linux from scratch, since the RedHat installer will do most of the configuration work for you. If so, skip to the section titled "New Linux Installation." Otherwise go to the "Existing Linux Installation" section (next).

8.1. Existing Linux Installation

Follow these instructions if you already have RedHat Linux installed on your system and you do not want to re-install. If you are installing the RAID as part of a new RedHat Linux installation (or are re-installing) skip to the "New Linux Installation" section.

8.1.1. QLogic SCSI Driver

The driver can either be loaded as a module or compiled into your kernel. If you want to boot from the RAID then you may want to use a kernel with compiled-in QLogic support (see the kernel-HOWTO available from http://www.linuxdoc.org). To use the modular driver become the superuser and add the following lines to /etc/conf.modules:

alias qlogicpti /lib/modules/preferred/scsi/qlogicpti

Change the above path to wherever your SCSI modules live.
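A small sketch of making that edit safely and idempotently (the CONF variable and its /tmp default are assumptions for this illustration, so you can experiment before touching the real /etc/conf.modules as root):

```shell
#!/bin/sh
# Append the qlogicpti line to a conf.modules-style file only if it is
# not already there. CONF defaults to a scratch file; point it at
# /etc/conf.modules (as root) for real use.
CONF=${CONF:-/tmp/conf.modules.test}
LINE="alias qlogicpti /lib/modules/preferred/scsi/qlogicpti"

touch "$CONF"
grep -qF "$LINE" "$CONF" || echo "$LINE" >> "$CONF"
```

Running it twice leaves exactly one copy of the line, so it is safe to call from an install script.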
Then add the following line to your /etc/fstab (with the appropriate changes for device and mount point; see the fstab man page if you are unsure):

/dev/sdc1 /home ext2 defaults 1 2

Or, if you prefer to use a SYSV initialization script, create a file called "raid" in the /etc/rc.d/init.d directory with the following contents (NOTE: while there are a few good reasons to start the RAID using a script, one of the aforementioned methods would be preferable):

#!/bin/bash

case "$1" in
  start)
        echo "Loading raid module"
        /sbin/modprobe qlogicpti
        echo
        echo "Checking and Mounting raid volumes..."
        mount -t ext2 -o check /dev/sdc1 /home
        touch /var/lock/subsys/raid
        ;;
  stop)
        echo "Unmounting raid volumes"
        umount /home
        echo "Removing raid module(s)"
        /sbin/rmmod qlogicpti
        rm -f /var/lock/subsys/raid
        echo
        ;;
  restart)
        $0 stop
        $0 start
        ;;
  *)
        echo "Usage: raid {start|stop|restart}"
        exit 1
esac

exit 0

You will need to edit this example and substitute your device name(s) in place of /dev/sdc1 and mount point(s) in place of /home. The next step is to make the script executable by root by doing:

chmod 0700 /etc/rc.d/init.d/raid

Now use your run level editor of choice (tksysv, ksysv, etc.) to add the script to the appropriate run level.

8.1.2. Device mappings

Linux uses dynamic device mappings. You can determine if the drives were found by typing:

more /proc/scsi/scsi

One or more of the entries should look something like this:

Host: scsi1 Channel: 00 Id: 00 Lun: 00
  Vendor: ANTARES Model: CX106 Rev: 0109
  Type: Direct-Access ANSI SCSI revision: 02

There may also be one which looks like this:

Host: scsi1 Channel: 00 Id: 00 Lun: 07
  Vendor: ANTARES Model: CX106-SMON Rev: 0109
  Type: Direct-Access ANSI SCSI revision: 02

This is the SCSI monitor communications channel which is currently unused under Linux (see SMON in the advanced topics section below). To locate the drives (following reboot) type:

dmesg | more

Locate the section of the boot messages pertaining to your SCSI devices.
You should see something like this:

qpti0: IRQ 53 SCSI ID 7 (Firmware v1.31.32)(Firmware 1.25 96/10/15) [Ultra Wide, using single ended interface]
QPTI: Total of 1 PTI Qlogic/ISP hosts found, 1 actually in use.
scsi1 : PTI Qlogic,ISP SBUS SCSI irq 53 regs at fd018000 PROM node ffd746e0

which indicates that the SCSI controller was properly recognized. Below this, look for the disk section:

Vendor: ANTARES Model: CX106 Rev: 0109
Type: Direct-Access ANSI SCSI revision: 02
Detected scsi disk sdc at scsi1, channel 0, id 0, lun 0
SCSI device sdc: hdwr sector= 512 bytes. Sectors= 20971200 [10239 MB] [10.2 GB]

Note the line that reads "Detected scsi disk sdc ..."; this tells you that this virtual disk has been mapped to device /dev/sdc. Following partitioning, the first partition will be /dev/sdc1, the second will be /dev/sdc2, etc. There should be one of the above disk sections for each virtual disk that was detected. There may also be an entry like the following:

Vendor: ANTARES Model: CX106-SMON Rev: 0109
Type: Direct-Access ANSI SCSI revision: 02
Detected scsi disk sdd at scsi1, channel 0, id 0, lun 7
SCSI device sdd: hdwr sector= 512 bytes. Sectors= 20971200 [128 MB] [128.2 MB]

BEWARE: this is not a drive. DO NOT try to fdisk, mkfs, or mount it!! Doing so WILL hang your system.

8.1.3. Partitioning

A virtual drive appears to the host operating system as a large but otherwise ordinary SCSI drive. Partitioning is performed using fdisk or your favorite utility. You will have to give the virtual drive a disk label when fdisk is started. Using the choice "Custom with autoprobed defaults" seems to work well. See the man page for the given utility for details.

8.1.4. Installing a filesystem

Installing a filesystem is no different from any other SCSI drive:

mkfs -t <filesystem type> /dev/<partition>

for example:

mkfs -t ext2 /dev/sdc1

8.1.5.
Mounting

If QLogic SCSI support is compiled into your kernel OR you are loading the "qlogicpti" module at boot from /etc/conf.modules, then add the following line(s) to /etc/fstab:

/dev/<partition> <mount point> ext2 defaults 1 1

If you are using a SystemV initialization script to load/unload the module, you must mount/unmount the drives there as well. See the example script above.

8.2. New Linux Installation

This is the easiest way to install the RAID since the RedHat installer program will do most of the work for you.

1. Configure the host port, RAID sets, and spares as outlined in "Onboard Configuration." Your computer must be on to perform this step since the 5070 is powered from the SBUS. It does not matter if the computer has an operating system installed at this point; all we need is power to the controller card.

2. Begin the RedHat SparcLinux installation.

3. The installation program will auto-detect the 5070 controller and load the QLogic driver.

4. Your virtual RAID drives will appear as ordinary SCSI hard drives to be partitioned and formatted during the installation. NOTE: When using the graphical partitioning utility during the RedHat installation DO NOT designate any partition on the virtual drives as type RAID, since they are already hardware-managed virtual RAID drives. The RAID selection on the partitioning utility's screen is for setting up a software RAID. IMPORTANT NOTE: you may see a small SCSI drive (usually ~128 MB) on the list of available drives. DO NOT select this drive for use. It is the SMON communication channel, NOT a drive. If setup tries to use it, it will hang the installer.

5. That's it, the installation program takes care of everything else!

9. Maintenance

9.1. Activating a spare

When running a RAID 3 or 5 (if you configured one or more drives to be spares) the 5070 will detect when a drive goes offline and automatically select a spare from the spares pool to replace it. The data will be rebuilt on-the-fly.
The RAID will continue operating normally during the re-construction process (i.e. it can be read from and written to just as if nothing has happened). When a backend fails you will see messages similar to the following displayed on the 5070 console:

930 secs: Redo:1:1 Retry:1 (DIO_cim_homes_D1.1.0_q1) CDB=28(Read_10)Re-/Selection Time-out @682400+16
932 secs: Redo:1:1 Retry:2 (DIO_cim_homes_D1.1.0_q1) CDB=28(Read_10)Re-/Selection Time-out @682400+16
933 secs: Redo:1:1 Retry:3 (DIO_cim_homes_D1.1.0_q1) CDB=28(Read_10)Re-/Selection Time-out @682400+16
934 secs: CIO_cim_homes_q3 R5_W(3412000, 16): Pre-Read drive 4 (D1.1.0) fails with result "Re-/Selection Time-out"
934 secs: CIO_cim_homes_q2 R5: Drained alternate jobs for drive 4 (D1.1.0)
934 secs: CIO_cim_homes_q2 R5: Drained alternate jobs for drive 4 (D1.1.0) RPT 1/0
934 secs: CIO_cim_homes_q2 R5_W(524288, 16): Initial Pre-Read drive 4 (D1.1.0) fails with result "Re-/Selection Time-out"
935 secs: Redo:1:0 Retry:1 (DIO_cim_homes_D1.0.0_q1) CDB=28(Read_10)SCSI Bus ~Reset detected @210544+16
936 secs: Failed:1:1 Retry:0 (rconf) CDB=2A(Write_10)Re-/Selection Time-out @4194866+128

Then you will see the spare being pulled from the spares pool, spun up, tested, engaged, and the data reconstructed.

937 secs: autorepair pid=1149 /raid/cim_homes: Spinning up spare device
938 secs: autorepair pid=1149 /raid/cim_homes: Testing spare device /dev/hd/1.5.0/data
939 secs: autorepair pid=1149 /raid/cim_homes: engaging hot spare ...
939 secs: autorepair pid=1149 /raid/cim_homes: reconstructing drive 4 ...
939 secs: 1054
939 secs: Rebuild on /raid/cim_homes/repair: Max buffer 2800 in 7491 reads, priority 6 sleep 500

The rebuild script will print out its progress every 10% of the job completed:

939 secs: Rebuild on /raid/cim_homes/repair @ 0/7491
1920 secs: Rebuild on /raid/cim_homes/repair @ 1498/7491
2414 secs: Rebuild on /raid/cim_homes/repair @ 2247/7491
2906 secs: Rebuild on /raid/cim_homes/repair @ 2996/7491

9.2.
Re-integrating a repaired drive into the RAID (levels 3 and 5)

After you have replaced the bad drive you must re-integrate it into the RAID set using the following procedure.

1. Start the text GUI.

2. Look at the list of backends for the RAID set(s).

3. Backends that have been marked faulty will have a (-) to the right of their ID (e.g. D1.1.0- ).

4. If you set up spares, the ID of the faulty backend will be followed by the ID of the spare that has replaced it (e.g. D1.1.0-D1.5.0 ).

5. Write down the ID(s) of the faulty backend(s) (NOT the spares).

6. Press Q to exit agui.

7. At the husky prompt type:

replace <raid set name> <backend ID>

where <raid set name> is whatever you named the raid set and <backend ID> is the ID of the backend that is being re-integrated into the RAID. If a spare was in use it will be automatically returned to the spares pool. Be patient; reconstruction can take a few minutes to several hours depending on the RAID level and the size. Fortunately, you can use the RAID as you normally would during this process.

10. Troubleshooting / Error Messages

10.1. Out of band temperature detected...

· Probable Cause: The 5070 SBUS card is not adequately cooled.

· Solution: Try to improve cooling inside the case. Clean dust from the fans, re-organize the cards so the raid card is closest to the fan, etc. On some of the "pizza box" Sun cases (e.g. SPARC 20) you may need to add supplementary cooling fans, especially if you have it loaded with cards.

10.2. ... failed ... cannot have more than 1 faulty backend.

· Cause: More than one backend in the RAID 3/4/5 has failed (i.e. there is no longer sufficient redundancy to enable the lost data to be reconstructed).

· Solution: You're hosed ... Sorry. If you did not assign spares when you configured your RAID 3/4/5, now may be a good time to re-consider the wisdom of that decision, since now you will have to replace the defective drives, re-configure the RAID, and restore the data from a secondary source. Hopefully you have been making regular backups.

10.3. When booting I see: ... Sun disklabel: bad magic 0000 ... unknown partition table.

· Suspected Cause: Incorrect settings in the disk label set by fdisk (or whatever partitioning utility you used). This message seems to happen when you choose one of the preset disk labels rather than "Custom with autoprobed defaults."

· Solution: Since this error does not seem to affect the operation of the drive, you can choose to do nothing and be OK. If you want to correct it you can try re-labeling the disk, or re-partitioning the disk and choosing "Custom with autoprobed defaults." If you are installing RedHat Linux from scratch the installer will get all of this right for you.

11. Bugs

None yet! Please send bug reports to tdc3@psu.edu

12. Frequently Asked Questions

12.1. How do I reset/erase the onboard configuration?

At the husky prompt issue the following command:

rconf -init

This will delete all of the RAID configuration information but not the global variables and scsi monitors. To remove ALL configuration information type:

rconf -fullinit

Use these commands with caution!

12.2. How can I tell if a drive in my RAID has failed?

In the text GUI, faulty backends appear with a (-) to the right of their ID. For example the list of backends:

D0.0.0,D1.0.0-,D2.0.0,D0.1.0,D1.1.0,D2.1.0

indicates that backend (drive) D1.0.0 is either faulty or not present. If you assigned spares (RAID 3 or 5) then you should also see that one or more spares are in use. Both the Main and the RaidSets screens will show information on faulty/not-present drives in a RAID set.

13. Advanced Topics: 5070 Command Reference

In addition to the text based GUI, the RAID configuration may also be manipulated from the husky prompt (the :raid; prompt) of the onboard controller. This section describes commands that a user can input interactively or via a script file to the K9 kernel. Since K9 is an ANSI C Application Programming Interface (API), a shell is needed to interpret user input and form output.
Only one shell is currently available and it is called husky. The K9 kernel is modelled on the Plan 9 operating system, whose design is discussed in several papers from AT&T (see the "Further Reading" section for more information). K9 is a kernel targeted at embedded controllers of small to medium complexity (e.g. ISDN-ethernet bridges, RAID controllers, etc). It supports multiple lightweight processes (i.e. without memory management) on a single CPU with a non-pre-emptive scheduler. Device driver architecture is based on Plan 9 (and Unix SVR4) STREAMS. Concurrency control mechanisms include semaphores and signals. The husky shell is modelled on a scaled down Unix Bourne shell. Using the built-in commands the user can write new scripts, thus extending the functionality of the 5070. The commands (adapted from the 5070 man pages) are extensive and are described below.

13.1. AUTOBOOT - script to automatically create all raid sets and scsi monitors

· SYNOPSIS: autoboot

· DESCRIPTION: autoboot is a husky script which is typically executed when a RaidRunner boots. The following steps are taken -

1. Start all configured scsi monitor daemons (smon).

2. Test that the total cache required by all the raid sets that are to boot is not more than 90% of available memory.

3. Start all the scsi target daemons (stargd) and set each daemon's mode to "spinning-up", which enables it to respond to all non medium access commands from the host. This is done to allow hosts to gain knowledge about the RaidRunner's scsi targets as quickly as possible.

4. Bind into the root (ram) filesystem all unused spare backend devices.

5. Build all raid sets.

6. If battery backed-up ram is present, check for any saved writes and restore them into the just built raid sets.

7. Finally, set the state of all scsi target daemons to "spun-up", enabling hosts to fully access the raid sets behind them.

13.2.
AUTOFAULT - script to automatically mark a backend faulty after a drive failure

· SYNOPSIS: autofault raidset

· DESCRIPTION: autofault is a husky script which is typically executed by a raid file system upon the failure of a backend of that raid set, when that raid file system cannot use spare backends or has been configured not to use spare backends. After parsing its arguments (command and environment) autofault issues a rconf command to mark a given backend as faulty.

· OPTIONS:

· raidset: The bind point of the raid set whose backend failed.

· $DRIVE_NUMBER: The index of the backend that failed. The first backend in a raid set is 0. This option is passed as an environment variable.

· $BLOCK_SIZE: The raid set's io block size in bytes. (Ignored). This option is passed as an environment variable.

· $QUEUE_LENGTH: The raid set's queue length. (Ignored). This option is passed as an environment variable.

· SEE ALSO: rconf

13.3. AUTOREPAIR - script to automatically allocate a spare and reconstruct a raid set

· SYNOPSIS: autorepair raidset size

· DESCRIPTION: autorepair is a husky script which is typically executed by either a raid type 1, 3 or 5 file system upon the failure of a backend of that raid set. After parsing its arguments (command and environment) autorepair gets a spare device from the RaidRunner's spares pool. It then engages it in write-only mode and reads the complete raid device, which reconstructs the data on the spare. The read is from the raid file system's repair entrypoint. Reading from this entrypoint causes a read of a block immediately followed by a write of that block. The read/write sequence is atomic (i.e. is not interruptible). Once the reconstruction has completed, a check is made to ensure the spare did not fail during reconstruction and if not, the access mode of the spare device is set to the access mode of the raid set. The process that reads the repair entrypoint is rebuild.
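The same kind of reconstruction can also be started by hand with the replace command used in the maintenance section to re-integrate a repaired drive. In the hedged sketch below, the raid set name R and the backend ID D1.1.0 are illustrative only; substitute your own raid set name and the ID noted from the text GUI:

```
: raid; replace R D1.1.0
```

As with autorepair, if a spare was engaged at the time, it is automatically returned to the spares pool once the rebuild completes.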
This device reconstruction will take anywhere from 10 minutes to one and a half hours depending on both the size and speed of the backends and the amount of activity the host is generating. During device reconstruction, pairs of numbers will be printed indicating each 10% of data reconstructed. The pairs of numbers are separated by a slash character, the first number being the number of blocks reconstructed so far and the second being the number of blocks to be reconstructed. Further status about the rebuild can be gained from running rebuild. When the spare is allocated, both the number of spares currently used on the backend and the spare device name are printed. The number of spares on a backend is referred to as the depth of spares on the backend. Thus, prior to re-engaging the spare after a reconstruction, a check can be made to see if the depth is the same. If it is not, then the spare reconstruction failed and reconstruction using another spare is underway (or no spares are available), and hence we don't re-engage the drive.

· OPTIONS:

· raidset: The bind point of the raid set whose backend failed.

· size: The size of the raid set in 512 byte blocks.

· $DRIVE_NUMBER: The index of the backend that failed. The first backend in a raid set is 0. This option is passed as an environment variable.

· $BLOCK_SIZE: The raid set's io block size in bytes. This option is passed as an environment variable.

· $QUEUE_LENGTH: The raid set's queue length. This option is passed as an environment variable.

· SEE ALSO: rconf, rebuild

13.4. BIND - combine elements of the namespace

· SYNOPSIS: bind [-k] new old

· DESCRIPTION: Bind replaces the existing old file (or directory) with the new file (or directory). If the "-k" switch is given then new must be a kernel recognized device (file system). Section 7k of the manual pages documents the devices (sometimes called file systems) that can be bound using the "-k" switch.

13.5.
BUZZER - get the state or turn on or off the buzzer

· SYNOPSIS: buzzer or buzzer on|off|mute

· DESCRIPTION: buzzer will either print the state of the buzzer, turn the buzzer on or off, or mute it. If no arguments are given then the state of the buzzer is printed; that is, on or off will be printed if the buzzer is currently on or off respectively. If the buzzer has been muted, then you will be informed of this. If the buzzer has not been used since the RaidRunner booted then the special state, unused, is printed. If the argument on is given the buzzer is turned on; if off, the buzzer is turned off. If the argument mute is given then the muted state of the buzzer is changed.

· SEE ALSO: warble, sos

13.6. CACHE - display information about and delete cache ranges

· SYNOPSIS: cache [-D moniker] [-I moniker] [-F] [-g moniker first|last] lastoffset

· DESCRIPTION: cache will print (to standard output) information about the given cache range, delete a given cache range, flush the cache, or return the last offset of all cache ranges.

· OPTIONS

· -F: Flush all cache buffers to their backends (typically raid sets).

· -D moniker: Delete the cache range with moniker (name) moniker.

· -I moniker: Invalidate the cache for the given cache range (moniker). This is only useful for debugging or elaborate benchmarks.

· -g moniker first|last: Print either the first or last block number of a cache range with moniker (name) moniker.

· lastoffset: Print the last offset of all cache ranges. The last offset is the last block number of all cache ranges.

13.7. CACHEDUMP - Dump the contents of the write cache to battery backed-up ram

· SYNOPSIS: cachedump

· DESCRIPTION: cachedump causes all unwritten data in the RaidRunner's cache to be written out to the battery backed-up ram. No data will be written to battery backed-up ram if there is currently valid data already stored there.
This command is typically executed when there is something wrong with the data (or its organization) in battery backed-up ram and you need to re-initialize it. cachedump will always return a NULL status.

· SEE ALSO: showbat, cacherestore

13.8. CACHERESTORE - Load the cache with data from battery backed-up ram

· SYNOPSIS: cacherestore

· DESCRIPTION: cacherestore will check the RaidRunner's battery backed-up ram for any data it has stored as a result of a power failure. It will copy any data directly into the cache. This command is typically executed automatically at boot time, prior to the RaidRunner making its data available to a host. Having successfully copied any data from battery backed-up ram into the cache, it flushes the cache and then re-initializes battery backed-up ram to indicate it holds no data. cacherestore will return a NULL status on success or 1 if an error occurred during the loading (with a message written to standard error).

· SEE ALSO: showbat

13.9. CAT - concatenate files and print on the standard output

· SYNOPSIS: cat [ file... ]

· DESCRIPTION: cat writes the contents of each given file, or standard input if none are given or when a file named `-' is given, to standard output. If the nominated file is a directory then the filenames contained in that directory are sent to standard out (one per line). More information on a file (e.g. its size) can be obtained by using stat. The script file ls uses cat and stat to produce directory listings.

· SEE ALSO: echo, ls, stat

13.10. CMP - compare the contents of 2 files

· SYNOPSIS: cmp [-b blockSize] [-c count] [-e] [-x] file1 file2

· DESCRIPTION: cmp compares the contents of the 2 named files. If file1 is "-" then standard input is used for that file. If the files are the same length and contain the same values then nothing is written to standard output and the exit status NIL (i.e. true) is set.
Where the 2 files differ, the first bytes that differ and the position are output to standard out and the exit status is set to "differ" (i.e. false). The position is given by a block number (origin 0) followed by a byte offset within that block (origin 0). The optional "-b" switch allows the blockSize of each read operation to be set. The default blockSize is 512 (bytes). For big compares involving disks a relatively large blockSize may be useful (e.g. 64k). See suffix for allowable suffixes. The optional "-c" switch allows the count of blocks read to be fixed. A value of 0 for count is interpreted as read to the end of file (EOF). To compare the first 64 Megabytes of 2 files the switches "-b 64k -c 1k" could be used. See suffix for allowable suffixes. The optional "-e" switch instructs cmp to output to standard out (usually overwriting the same line) the count of blocks compared, each time a multiple of 100 is reached. The final block count is also output. The optional "-x" switch instructs cmp to continue after a comparison error (but not a file error) and keep a count of blocks in error. If any errors are detected only the last one will be output when the command exits. If the "-e" switch is also given then the current count of blocks in error is output to the right of the multiple of 100 blocks compared. This command is designed to compare very large files. Two buffers of blockSize are allocated dynamically, so their size is bounded by the amount of memory (i.e. RAM in the target) available at the time of command execution. The count could be up to 2G. The number of bytes compared is the product of blockSize and count (i.e. big enough).

· SEE ALSO: suffix

13.11. CONS - console device for Husky

· SYNOPSIS: bind -k cons bind_point

· DESCRIPTION: cons allows an interpreter (e.g. Husky) to route console input and output to an appropriate device. That console input and output is available at bind_point in the K9 namespace.
The special file cons should always be available.

· EXAMPLES: Husky does the following in its initialisation:

bind -k cons /dev/cons

On a Unix system this is equivalent to:

bind -k unixfd /dev/cons

On a DOS system this is equivalent to:

bind -k doscon /dev/cons

On target hardware using a SCN2681 chip this is equivalent to:

bind -k scn2681 /dev/cons

· SEE ALSO: unixfd, doscon, scn2681

13.12. DD - copy a file (disk, etc)

· SYNOPSIS: dd [if=file] [of=file] [ibs=bytes] [obs=bytes] [bs=bytes] [skip=blocks] [seek=blocks] [count=blocks] [flags=verbose]

· DESCRIPTION: dd copies a file (from the standard input to the standard output, by default) with a user-selectable blocksize.

· OPTIONS

· if=file: Read from file instead of the standard input.

· of=file: Write to file instead of the standard output.

· ibs=bytes: Read the given number of bytes at a time.

· obs=bytes: Write the given number of bytes at a time.

· bs=bytes: Read and write the given number of bytes at a time. Overrides ibs and obs.

· skip=blocks: Skip ibs-sized blocks at start of input.

· seek=blocks: By-pass obs-sized blocks at start of output.

· count=blocks: Copy only the given number of ibs-sized input blocks.

· flags=verbose: Print (to standard output) the number of blocks copied every ten percent of the copy. The output is of the form X/T where X is the number of blocks copied so far and T is the total number of blocks to copy. This option can only be used if both the count= and of= options are also given.

The decimal numbers given to "ibs", "obs", "bs", "skip", "seek" and "count" must not be negative. These numbers can optionally have a suffix (see suffix). dd outputs to standard out in all cases. A successful copy of 8 (full) blocks would cause the following output:

8+0 records in
8+0 records out

The number after the "+" is the number of fractional blocks (i.e. blocks that are less than the block size) involved. This number will usually be zero (and is otherwise when physical media with alignment requirements is involved).
A write failure when outputting the last block in the previous example would cause the following output:

Write failed
8+0 records in
7+0 records out

· SEE ALSO: suffix

13.13. DEVSCMP - Compare a file's size against a given value

· SYNOPSIS: devscmp filename size

· DESCRIPTION: devscmp will find the size of the given file and compare its size in 512-byte blocks to the given size (also in 512-byte blocks). If the size of the file is less than the given value then -1 is printed, if equal then 0 is printed, and if the size of the given file is greater than the given size then 1 is printed. This routine is used in internal scripts to ensure that backends of raid sets are of an appropriate size.

13.14. DFORMAT - Perform formatting functions on a backend disk drive

· SYNOPSIS

· dformat -p c.s.l -R bnum

· dformat -p c.s.l -pdA|-pdP|-pdG

· dformat -p c.s.l -S [-v] [-B firstbn]

· dformat -p c.s.l -F

· dformat -p c.s.l -D file

· DESCRIPTION: In its first form dformat will reassign a block on a nominated disk drive via the SCSI-2 REASSIGN BLOCKS command. The second form will allow you to print out the current manufacturer's defect list (-pdP), the grown defect list (-pdG) or both defect lists (-pdA). Each printed list is sorted, with one defect per line in Physical Sector Format - Cylinder Number, Head Number and Defect Sector Number. The third form causes the drive to be scanned in a destructive write/read/compare manner. If a read or write or data comparison error occurs then an attempt is made to identify the bad sector(s). Typically the drive is scanned from block 0 to the last block on the drive. You can optionally give an alternative starting block number. The fourth form causes a low level format on the specified device. The fifth form allows you to download a device's microcode into the device.

· OPTIONS:

· -R bnum: Specify a logical block number to reassign to the drive's grown defect list.

· -pdA: Print both the manufacturer's and grown defect list.
· -pdP: Print the manufacturer's defect list.

· -pdG: Print the grown defect list.

· -S: Perform a destructive scan of the disk reporting I/O errors.

· -B firstbn: Specify the first logical block number to start a scan from.

· -v: Turn on verbose mode - which prints the current block number being scanned.

· -F: Issue a low-level SCSI format command to the given device. This will take some time.

· -D file: Download the given file into the specified device. The download is effected by a single SCSI Write-Buffer command in save microcode mode. This allows users to update a device's microcode. Use this command carefully as you could destroy the device by loading an incorrect file.

· -p c.s.l: Identify the disk device by specifying its channel, SCSI ID (rank) and SCSI LUN in the format "c.s.l".

· SEE ALSO: Product manual for the disk drives used in your RAID.

13.15. DIAGS - script to run a diagnostic on a given device

· SYNOPSIS: diags disk -C count -L length -M io-mode -T io-type -D device

· DESCRIPTION: diags is a husky script which is used to run the randio diagnostic on a given device. When randio is executed, it is executed in verbose mode.

· OPTIONS:

· disk: This is the device type of diagnostic we are to run.

· -C count: Specify the number of times to execute the diagnostic.

· -L length: Specify the "length" of the diagnostic to execute. This can be either short, medium or long, specified with the letters s, m or l respectively. In the case of a disk, a short test will test the first 10% of the device, a medium test the first 50%, and a long test the whole (100%) of the disk.

· -M io-mode: Specify a destructive (read-write) or non-destructive (read-only) test. Use either read-write or read-only.

· -T io-type: Specify a type of io - either sequential or random.

· -D device: Specify the device to test.

· SEE ALSO: randio, scsihdfs

13.16.
DPART - edit a scsihd disk partition table

· SYNOPSIS:

· dpart -a|d|l|m -D file [-N name] [-F firstblock] [-L lastblock]

· dpart -a -D file -N name -F firstblock -L lastblock

· dpart -d -D file -N name

· dpart -l -D file

· dpart -m -D file -N name -F firstblock -L lastblock

· DESCRIPTION: Each scsihd device (typically a SCSI disk drive) can be divided up into eight logical partitions. By default, when a scsihd device is bound into the RaidRunner's file system it has four partitions: the whole device (raw), typically named bindpoint/raw; the partition file (bindpoint/partition); the RaidRunner backup configuration file (bindpoint/rconfig); and the "data" portion of the disk (bindpoint/data), which represents the whole device less the backup configuration area and partition file. For more information, see scsihdfs. If other partitions are added, then they will appear as bindpoint/partitionname. dpart allows you to edit or list the partition table on a scsihd device (typically a disk).

· OPTIONS:

· -a: Add a partition. When adding a partition, you need to specify the partition name (-N) and the partition range from the first block (-F) to the last block (-L).

· -d: Delete a named (-N) partition.

· -l: List all partitions.

· -m: Modify an existing partition. You will need to specify the partition name (-N) and BOTH its first (-F) and last (-L) block numbers, even if you are just modifying the last block number.

· -D file: Specify the partition file to be edited. Typically, this is the bindpoint/partition file.

· -N name: Specify the partition name.

· -F firstblock: Specify the first block number of the partition.

· -L lastblock: Specify the last block number of the partition.

· SEE ALSO: scsihd

13.17. DUP - open file descriptor device

· SYNOPSIS: bind -k dup bind_point

· DESCRIPTION: The dup device makes a one level directory with an entry in that directory for every open file descriptor of the invoking K9 process. These directory "entries" are the file descriptor numbers.
Thus a typical process (script) binding a dup device would at least make these files in the namespace: "bind_point/0", "bind_point/1" and "bind_point/2". These would correspond to its open standard in, standard out and standard error file descriptors. A dup device allows other K9 processes to access the open file descriptors of the invoking process. To do this the other processes simply "open" the required dup device directory entry whose name (a number) corresponds to the required file descriptor.

13.18. ECHO - display a line of text

· SYNOPSIS: echo [string ...]

· DESCRIPTION: echo writes each given string to the standard output, with a space between them and a newline after the last one. Note that all the string arguments are written in a single write kernel call. The following backslash-escaped characters in the strings are converted as follows:

\b backspace
\c suppress trailing newline
\f form feed
\n new line
\r carriage return
\t horizontal tab
\v vertical tab
\\ backslash
\nnn the character whose ASCII code is nnn (octal)

· SEE ALSO: cat

13.19. ENV - environment variables file system

· SYNOPSIS: bind -k env bind_point

· DESCRIPTION: The env file system associates a one level directory with the bind_point in the K9 namespace. Each file name in that directory is the name of an environment variable, while the contents of the file is that variable's current value. Conceptually each process sees its own copy of the env file system. This copy is either empty or inherited from the process's parent at spawn time (depending on the flags to spawn).

13.20. ENVIRON - RaidRunner Global environment variables - names and effects

· DESCRIPTION: The RaidRunner uses GLOBAL environment variables to control the functionality of automatic actions. GLOBAL environment variables are saved in the Raid configuration area so they retain their values between reboots/power downs. Certain RaidRunner internal run-time variables can also be set as GLOBAL environment variables.
See the internals manual entry for details. The table below describes those GLOBAL environment variables that are used by the RaidRunner in its normal operation.

· RebuildPri This variable, if set, controls the priority used when drive reconstruction occurs via the rebuild program. If the variable is not set then the default rebuild priority is used. The variable is to be a comma-separated list of raid set names and their associated rebuild priorities and sleep periods (colon separated). The form is

Rname_1:Pri_1:Sleep_1,Rname_2:Pri_2:Sleep_2,...,Rname_N:Pri_N:Sleep_N

where Pri_1 is the priority the rebuild program runs with when run on raid set Rname_1, Sleep_1 is the period, in milliseconds, to sleep between each rebuild action on the raid set, Pri_2 is the priority for raid set Rname_2, and so forth. For example, if the value of RebuildPri is R:5:30000 then when a rebuild occurs (via replace, repair or autorepair) on raid set R, the rebuild will run with priority 5 (via the -p rebuild option) and will sleep 30000 milliseconds (30 seconds) between each rebuild action (specified via the -S rebuild option). The priority given must be valid for the rebuild program.

· BackendRanks On certain RaidRunners where multiple controllers may exist, you can restrict a controller's access to the backend ranks of devices available. For example, you may have 2 controllers and 4 ranks of backend devices. You can specify that the first controller can only access the first two ranks and the second controller the second two ranks. This variable, along with other associated commands, allows you to set up this restriction. Additionally, you may only have a single controller RaidRunner which is in an enclosure with multiple ranks. By default the controller will attempt to probe for all devices on all ranks. If you have only populated the RaidRunner with, say, half its possible complement of backend devices, then the RaidRunner will still probe for the other half.
Setting this variable appropriately will prevent this un-needed (and on occasion time consuming) process. This variable takes the form

controller_id:ranklist controller_id:ranklist ...

where controller_id is the controller number (from 0 upwards) and ranklist is a comma-separated list of backend ranks which the given controller will access. Note that the backend rank is the scsi-id of that rank. For example, on a 2 rank (rank 1 and 2 - i.e. scsi id 1 for the first rank and scsi id 2 for the second), 1 controller RaidRunner where only the first rank has devices, you could prevent the controller from attempting to access the (empty) second rank by setting BackendRanks to 0:1 Typically, you would not set this variable directly, but use supporting commands to set it. These commands are pranks and sranks. See these manual entries for details.

· RAIDn_reference_PBUFS Raid types 3, 4 and 5 all make use of memory for temporary parity buffers when they need to create parity data. This memory is in addition to that allocated to a raid set's cache. When a raid set is created, it will also create a default number of parity buffers (which are the same size as the raid set's iosize). Sometimes, if the iosize of the raid set is large, there will not be enough memory to create this default number of parity buffers. To overcome this situation, you can set GLOBAL environment variables to over-ride the default number of parity buffers that all raid sets of a particular type, or a specific raid set, will use. You need to set these variables before you define the raid set via agui, and if you delete them and not the raid set, then the affected raid sets may not boot and hence will not be accessible by a host.
The variables are of the form RAIDn_reference_PBUFS where n is the raid type (3, 4 or 5), and reference is the raid set's name or the string 'Default'. You use the reference of 'Default' to specify all raid sets of a particular type. For example, to over-ride the number of parity buffers for a raid 5 named FRED:

: raid ; setenv RAID5_FRED_PBUFS 64

To over-ride the number of parity buffers for ALL raid 3's (and set 128 parity buffers) set

: raid ; setenv RAID3_Default_PBUFS 128

If you set a default for all raid sets of a particular type, but want ONE of them to be different, then set up a variable for that particular raid set, as its value will over-ride the default. In the above example, where all raid type 3's will have 128 parity buffers, you could set the variable

: raid ; setenv RAID3_Dbase_PBUFS 56

which will allow the raid 3 raid set named 'Dbase' to have 56 parity buffers, but all other raid 3's defined on the RaidRunner will have 128.

· SEE ALSO: setenv, printenv, rconf, rebuild, internals

13.21. EXEC - cause arguments to be executed in place of this shell

· SYNOPSIS: exec [ arg ... ]

· DESCRIPTION: exec causes the command specified by the first arg to be executed in place of this shell without creating a new process. Subsequent args are passed to the command specified by the first arg as its arguments. Shell redirection may appear and, if no other arguments are given, causes the shell input/output to be modified.

13.22. EXIT - exit a K9 process

· SYNOPSIS: exit [string]

· DESCRIPTION: exit has an optional string argument. If the optional argument is given the current K9 process is terminated with the given string as its exit value. (If the string has embedded spaces then the whole string should be a quoted_string). If no argument is given then the shell gets the string associated with the environment variable "status" and returns that string as the exit value. If the environment variable "status" is not found then the "true" exit status (i.e. NIL) is returned.
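As a small illustration, a husky script fragment might terminate its K9 process with an explanatory (and therefore non-NIL, i.e. false) exit value; the wording of the string below is illustrative only:

```
# braces quote the string because it contains embedded spaces
exit {backend not ready}
```

A script that calls exit with no argument instead hands back the current value of the "status" environment variable, as described above.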
· SEE ALSO: true, K9exit

13.23. EXPR - evaluation of numeric expressions

· SYNOPSIS: expr numeric_expr ...

· DESCRIPTION: expr evaluates each numeric_expr command line argument as a separate numeric expression. Thus a single expression cannot contain unescaped whitespace, or else it needs to be placed in a quoted string (i.e. between "{" and "}"). Arithmetic is performed on signed integers (currently numbers in the range from -2,147,483,648 to 2,147,483,647). Successful calculations cause no output (to either standard out/error or environment variables), so each useful numeric_expr needs to include an assignment (or op-assignment). Each numeric_expr argument supplied is evaluated in the order given (i.e. left to right) until they all evaluate successfully (returning a true status). If evaluating a numeric_expr fails (usually due to a syntax error) then the expr command fails with "error" as the exit status, and the error message is written to the environment variable "error".

· OPERATORS: The precedence of each operator is shown following the description in square brackets. "0" is the highest precedence. Within a single precedence group evaluation is left-to-right, except for assignment operators which are right-to-left. Parentheses have higher precedence than all operators and can be used to change the default precedence shown below.

UNARY OPERATORS

+ Does nothing to the expression/number to the right.
- Negates the expression/number to the right.
! Logically negate the expression/number to the right.
~ Bitwise negate the expression/number to the right.

BINARY ARITHMETIC OPERATORS

* Multiply enclosing expressions. [2]
/ Integer division of enclosing expressions.
% Modulus of enclosing expressions.
+ Add enclosing expressions.
- Subtract enclosing expressions.
<< Shift the left expression left by the number in the right expression. Equivalent to: left * (2 ** right)
>> Shift the left expression right by the number in the right expression. Equivalent to: left / (2 ** right)
& Bitwise AND of enclosing expressions.
^ Bitwise exclusive OR of enclosing expressions. [8]
| Bitwise OR of enclosing expressions. [9]

BINARY LOGICAL OPERATORS

These logical operators yield the number 1 for a true comparison and 0 for a false comparison. For logical ANDs and ORs their left and right expressions are assumed to be false if 0 and otherwise true. Both logical ANDs and ORs evaluate both their left and right expressions in all cases (cf. C's short-circuit action).

<= true when left is less than or equal to right. [5]
>= true when left is greater than or equal to right. [5]
< true when left is less than right. [5]
> true when left is greater than right. [5]
== true when left is equal to right. [6]
!= true when left is not equal to right. [6]
&& logical AND of enclosing expressions [10]
|| logical OR of enclosing expressions [11]

ASSIGNMENT OPERATORS

In the following descriptions "n" is an environment variable while "r_exp" is an expression to the right. All assignment operators have the same precedence, which is lower than all other operators. N.B. Multiple assignment operators group right-to-left (i.e. same as the C language).

= Assign right expression into environment variable on left.
*= n *= r_exp is equivalent to: n = n * r_exp
/= n /= r_exp is equivalent to: n = n / r_exp
%= n %= r_exp is equivalent to: n = n % r_exp
+= n += r_exp is equivalent to: n = n + r_exp
-= n -= r_exp is equivalent to: n = n - r_exp
<<= n <<= r_exp is equivalent to: n = n << r_exp
>>= n >>= r_exp is equivalent to: n = n >> r_exp
&= n &= r_exp is equivalent to: n = n & r_exp
|= n |= r_exp is equivalent to: n = n | r_exp

· NUMBERS: All numbers are signed integers in the range stated in the description above. Numbers can be input in base 2 through to base 36. Base 10 is the default base. The default base can be overridden by:

1. a leading "0": implies octal or hexadecimal

2. a number of the form _base_#_num_

Numbers prefixed with "0" are interpreted as octal.
Numbers prefixed with "0x" or "0X" are interpreted as hexadecimal. For numbers using the "#" notation the _base_ must be in the range 2 through to 36 inclusive. For bases greater than 10 the letters "a" through "z" are utilised for the extra "digits". Upper and lower case letters are acceptable. Any single digit that exceeds (or is equal to) the base is considered an error. Base 10 numbers only may have a suffix. See suffix for a list of valid suffixes. Also note that since expr uses signed integers, "1G" is the largest magnitude number that can be represented with the "Gigabyte" suffix (assuming 32 bit signed integers, -2G is invalid due to the order of evaluation).

· VARIABLES: The only symbolic variables allowed are K9 environment variables. Regardless of whether they are being read or written they should never appear preceded by a "$". Environment variables that didn't previously exist that appear as the left argument of an assignment are created. When a non-existent environment variable is read it is interpreted as the value 0.

· EXAMPLES: Some simple examples:

expr {n = 1 + 2} # create n
echo $n
3
expr {n*=2} # 3 * 2 result back into n
echo $n
6
expr { k = n > 5 } # 6 > 5 is true so create k = 1
echo $k
1

· NOTE: expr is a Husky "built-in" command. See the "Note" section in "set" to see the implications.

· SEE ALSO: husky, set, suffix, test

13.24. FALSE - returns the K9 false status

· SYNOPSIS: false

· DESCRIPTION: false does nothing other than return a K9 false status. K9 processes return a pointer to a C string (null terminated array of characters) on termination. If that pointer is NULL then a true exit value is assumed, while all other returned pointer values are interpreted as false (with the string being some explanation of what went wrong). This command returns a pointer to the string "false" as its return value.
· EXAMPLE: The following script fragment will print "got here" to standard out:
if false
then
echo impossible
else
echo got here
end
· SEE ALSO: true
13.25. FIFO - bi-directional fifo buffer of fixed size
· SYNOPSIS:
· bind -k {fifo size} bind_point
· cat bind_point
· bind_point/data
· bind_point/ctl
· DESCRIPTION: The fifo file system associates a one level directory with the bind_point in the K9 namespace, with a buffer size of size bytes. bind_point/data and bind_point/ctl are the data and control channels for the fifo. Data written to the bind_point/data file is available for reading from the same file on a first-in first-out basis. A write of x bytes to the bind_point/data file will either complete and transfer all the data, or will transfer sufficient bytes until the fifo buffer is full and then block until data is removed from the fifo buffer by reading. A read of x bytes from the bind_point/data file will transfer the lesser of the current amount of data in the fifo buffer and x bytes. A read from bind_point/ctl will return the size of the fifo buffer and the current usage. The number of opens (# Opens) is the number of processes that currently have the bind_point/data file open.
· EXAMPLE:
> /buffer
bind -k {fifo 2048} /buffer
ls -l /buffer
/buffer:
/buffer/ctl fifo 2 0x00000001 1 0
/buffer/data fifo 2 0x00000002 1 0
cat /buffer/ctl
Max: 2048 Cur: 0, # Opens: 0
echo hello > /buffer/data
cat /buffer/ctl
Max: 2048 Cur: 6, # Opens: 0
dd if=/buffer/data bs=512 count=1
hello
0+1 records in
0+1 records out
cat /buffer/ctl
Max: 2048 Cur: 0, # Opens: 0
· SEE ALSO: pipe
13.26. GET - select one value from list
· SYNOPSIS: get number [ value ... ]
· DESCRIPTION: get uses the given number to select one value from the given list. Indexing is origin 0 (e.g. "get 0 aaa bb c" returns "aaa"). If the number is out of range for an index on the given list of values then nothing is returned.
13.27.
GETIV - get the value of an internal RaidRunner variable
· SYNOPSIS:
· getiv
· getiv name
· DESCRIPTION: getiv prints the current value of an internal RaidRunner variable or prints a list of all variables. When a variable name is given its current value is printed. If no name is given then all available internal variables are listed.
· NOTES: As different models of RaidRunner have different internal variables, see your RaidRunner's Hardware Reference manual for a list of variables together with the meaning of their values. These variables are run-time variables and hence revert to their default values whenever the RaidRunner is booted.
· SEE ALSO: setiv
13.28. HELP - print a list of commands and their synopses
· SYNOPSIS: help or ?
· DESCRIPTION: help, or the question mark character (?), will print a list of all commands available to the command interpreter. Along with each command, its synopsis is printed.
13.29. HUSKY - shell for K9 kernel
· SYNOPSIS:
· husky [-c command] [ file [ arg ... ] ]
· hs [-c command] [ file [ arg ... ] ]
· DESCRIPTION: husky and hs are synonyms. husky is a command language interpreter that executes commands read from the standard input or from a file. husky is a scaled down model of Unix's Bourne shell (sh). One major difference is that husky has no concept of a current working directory. If the "-c" switch is present then the following command is interpreted by husky in a newly thrown shell nested in the current environment. This newly thrown shell exits back to the current environment when the command finishes. Otherwise, if arguments are given, the first one is assumed to be a file containing husky commands. Again a new shell is thrown to execute these commands. husky script files can access their command line arguments, and the 2nd and subsequent arguments to husky (if present) are passed to the file for that purpose. If no arguments are given to husky then commands are read from standard in (and the shell is considered interactive).
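The two invocation forms above can be sketched as follows (a hypothetical husky session based on the synopsis, not verbatim device output; "/scripts/report" is an invented script name used purely for illustration):
hs -c {echo hello}           # run one command in a newly thrown shell
hello
husky /scripts/report a b    # run a husky script, passing "a b" as its arguments
Inside the script, the arguments would be available via the argv, argc and arg0 environment variables described under ENVIRONMENT VARIABLES below.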
· RETURN STATUS: husky places the K9 return status of a process (NIL if OK, otherwise a string explaining the error) in the file "/env/status". An example:
dd if=/xx
dd: could not open /xx
cat /env/status
open failed
cat /env/status    # empty because previous "cat" worked
As the file "/env/status" is an environment variable, the return status of a command is also available in the variable $status. The exit status of a pipeline is the exit status of the last command in the pipeline.
· SIGNALS: If an interactive shell receives an interrupt signal (i.e. K9_SIGINT - usually a control-C on the console) then the shell exits. The "init" process will then start a new instance of the husky shell with all the previously running processes (with the exception of the just killed shell) still running. This allows the user to kill the process that caused the previous shell problems. Alternatively, a process that is accidentally run in the foreground is effectively put in the background by sending an interrupt signal to the shell. Note that this is quite different to Unix shells, which would forward the signal onto the foreground process.
· QUOTES, ESCAPING, STRING CONCATENATION, ETC: A quoted_string (as defined in the grammar) commences with a "{" and finishes with the matching "}". The term "matching" implies that all embedded "{" must have a corresponding embedded "}" before the final "}" is said to match the original "{". A quoted_string can be spread across several lines. No command line substitution occurs within quoted_strings. The character for escaping the following character is "\". If a "{" needs to be interpreted literally then it can be represented by "\{". If a string containing spaces (whitespace) needs to be interpreted as a single token then each space (whitespace) can be escaped (i.e. "\ "). If a "\" itself needs to be interpreted literally then it can be represented by "\\". The string concatenation character is "^".
This is useful when a token such as "/d4" needs to be built up by a script when "/d" is fixed and the "4" is derived from some variable:
set n 4
> /d^$n
This example would create the file "/d4". The output of another husky command or script can be made available inline by starting the sequence with "`" and finishing it with a "'". For example:
echo {ps output follows: } `ps'
This prints the string "ps output follows:" followed on the next line by the current output from the command "ps". That output from "ps" would have its embedded newlines replaced by whitespaces.
COMMAND LINE FILE REDIRECTION:
· Redirection should appear after a command and its arguments in a line to be interpreted by husky. A special case is a line that just contains "> filename", which creates filename with zero length if it didn't previously exist, or truncates it to zero length if it did.
· Redirection of standard in to come from a file uses the token "<" with the filename appearing to its right. The default source of standard in is the console.
· Redirection of standard out to go to a file uses the token ">" with the filename appearing to its right. The default destination of standard out is the console.
· Redirection of standard error to go to a file uses the token ">[2]" with the filename appearing to its right. The default destination of standard error is the console.
· Redirection of writes from within a command which uses a known file descriptor number (say "n") to go to a file uses the token ">[n]" with the filename appearing to its right.
· Redirection of reads from within a command which uses a known file descriptor number (say "n") to come from a file uses the token "<[n]" with the filename appearing to its right.
· Redirection of reads and writes from within a command which uses a known file descriptor number (say "n") to and from a file uses the token "<>[n]" with the filename appearing to its right.
In order to redirect both standard out and standard error to the one file, the form "> filename >[2=1]" can be used. This sequence first redirects standard out (i.e. file descriptor 1) to filename and then redirects what is written to file descriptor 2 (i.e. standard error) to file descriptor 1, which is now associated with filename.
ENVIRONMENT VARIABLES: Each process can access the name it was invoked by via the variable "arg0". The command line arguments (excluding the invocation name) can be accessed as a list in the variable "argv". The number of elements in the list "argv" is placed in "argc". The get command is useful for fetching individual arguments from this list. The pid of the current process can be fetched from the variable "pid". When a script launches a new process in the background, the child's pid can be accessed from the variable "child". The variable "ContollerId" is set to the number of the RaidRunner controller husky is running on. Environment variables are a separate "space" for each process. Depending on the way a process was created, its initial set of environment variables may be copied from its parent process at the "spawn" point.
SEE ALSO: intro
13.30. HWCONF - print various hardware configuration details
· SYNOPSIS: hwconf [-D] [-M] [-I] [-d [-n]] [-f] [-h] [-i -p c.s.l] [-m] [-p c.s.l] [-s] [-S] [-t] [-T] [-P] [-W]
· DESCRIPTION: hwconf prints details about the RaidRunner hardware and attached devices.
· OPTIONS:
· -h: Print the number of controllers, host interfaces per controller, the number of disk channels per controller, the number of ranks of disks and details of the memory (in bytes) on each controller. Four memory figures are printed: the first is the total memory in the controller, next is the amount of memory at boot time, next is the amount currently available, and lastly the largest available contiguous area of memory. This is the default option.
· -f: Print the number of fans in the RaidRunner and then the speed of each fan in the system. The speed values are in revolutions per minute (rpm). The fans in the system are labeled in the hardware specification sheet for your RaidRunner. The first speed printed by this command corresponds to fan number 0 on your specification sheet, the second to fan 1, and so forth.
· -d: Print out information on all the disk drives on the RaidRunner. For each disk on the RaidRunner, print out:
the device name, in the format c.s.l, where c is the channel, s is the SCSI ID (or rank) and l is the SCSI LUN of the device,
the manufacturer's name (vendor id),
the disk's model name (product id),
the disk's version id,
the disk serial number,
the disk geometry - number of cylinders, heads and sectors,
the last block number on the disk and the block size in bytes,
the disk revolution count per minute (rpm),
the number of notches/zones available on the drive (if any).
· -n: Print out the disk drive notch/zone tables if available. This is a sub-option to the -d option. Not all disks appear to correctly report the notch/zone partition tables. For each notch/zone, the following is printed:
the zone number,
the zone's starting cylinder,
the zone's starting head,
the zone's ending cylinder,
the zone's ending head,
the zone's starting logical block number,
the zone's ending logical block number,
the zone's number of sectors per track.
· -D: Print out the device names of all disk drives on the system.
· -I: Initialize the back-end NCR SCSI chips. This flag may be used in conjunction with any other option and will be done first. It has an effect only on the first call to hwconf that has not yet used the -d, -D or -I options, or on those chips that have not yet had a -p on the channel associated with that chip.
· -m: Print out the major flash and battery backed-up RAM addresses (in hex). Additionally print out the size of the RaidRunner configuration area.
Eight (8) addresses are printed, in order:
RaidRunner configuration area start and end addresses (FLASH RAM),
RaidRunner Husky Scripts area start and end addresses (FLASH RAM),
RaidRunner Binary Image area start and end addresses (FLASH RAM),
RaidRunner Battery Backed-up area start and end addresses.
The size of the RaidRunner configuration area (in bytes) is then printed.
· -p c.s.l: Probe a single device specified by the given channel, SCSI ID (rank) and SCSI LUN provided in the format "c.s.l". The output of this command is the same as the "-d" option but just for the given device. If the device is not present then nothing will be output and the exit status of the command will be 1.
· -i -p c.s.l: Re-initialize the SCSI device driver specified by the given channel, SCSI ID (rank) and SCSI LUN provided in the format "c.s.l". Typically this command is used when a new drive is plugged into a running RaidRunner, and it will be used prior to the RaidRunner's next reboot.
· -M: Set the boot-time memory. This option is executed internally by the controller at boot time and has no function (or effect) if executed at any other time.
· -s: Print the 12 character serial number of the RaidRunner.
· -S: Issue SCSI spin up commands to all backends as quickly as possible. This option is intended for use at the power-on stage only.
· -t: Probe the temperature monitor, returning the internal temperature of the RaidRunner in degrees Celsius.
· -T: Print the temperatures being recorded by the hardware monitoring daemon (hwmon).
· -P: For both AC and DC power supplies, print the number of each present and the state of each supply. The state will be printed as ok or flt depending on whether the PSU is working or faulty.
· -W: This option will wait until all possible backends have spun up. It is used in conjunction with the -S option.
· NOTES: The order of printing the disk information is by SCSI ID (rank), by channel, by SCSI LUN.
13.31. HWMON - monitoring daemon for temperature, fans, PSUs.
· SYNOPSIS: hwmon [-t seconds] [-d]
· DESCRIPTION: hwmon is a hardware monitoring daemon. It periodically probes the status of certain elements of a RaidRunner and, if an out-of-band occurrence happens, will cause the alarm to sound or fault LEDs to light up, as well as saving a message in the system log. Depending on the model of RaidRunner, the elements monitored are temperature, fans and power supplies. When an out-of-band occurrence is found, hwmon will reduce the time between probes to 5 seconds. If a buzzer is the alarm device, then the buzzer will turn on for 5 seconds, then off for 5 seconds, and repeat this cycle until the buzzer is muted or the occurrence is corrected. If the RaidRunner model supports a buzzer muting switch, then the buzzer will be muted if the switch is pressed during a cycle change as per the previous paragraph. When hwmon recognizes the mute switch it will beep twice. Certain out-of-band occurrences can be considered catastrophic, meaning that if the occurrence remains uncorrected, the RaidRunner's hardware is likely to be damaged. Occurrences such as total fan failure, or sustained high temperature along with total or partial fan failure, are considered catastrophic. hwmon has a means of automatically placing the RaidRunner into a "shutdown" or quiescent state where minimal power is consumed (and hence less heat is generated). This is done by the execution of the shutdown command after a period of time during which catastrophic out-of-band occurrences are sustained. This process is enabled via the AutoShutdownSecs internal variable. See the internals manual entry for use of this variable. hwmon can be prevented from starting at boot time by creating the global environment variable NoHwmon and setting any value to it. In that case a warning message will be stored in the syslog.
· OPTIONS:
· -t seconds: Specify the number of seconds to wait between probes of the hardware elements. If this option is not specified, the default period is 300 seconds.
· -d: Turn on debugging mode, which can produce debugging output.
· SEE ALSO: hwconf, pstatus, syslogd, shutdown, internals
13.32. INTERNALS - Internal variables used by RaidRunner to change dynamics of running kernel
· DESCRIPTION: Certain run-time features of the RaidRunner can be manipulated by changing internal variables via the setiv command. The table below describes each changeable variable, its effect, its default value and the range of values it can be set to. The variables below are run-time features of a RaidRunner and hence are always set to their default values when a RaidRunner boots. Certain variables can be stored as a global environment variable and will over-ride the defaults at boot time. If you create a global environment variable of that variable's name with an appropriate value, its default value will be over-ridden the next time the RaidRunner is re-booted. Note that the values of these variables ARE NOT CHECKED when set in the global environment variable tables and, if incorrectly set, will generate errors at boot until deleted or corrected. In the table below, any variable that can have a value stored as a global environment variable is marked with (GEnv).
· write_limit: This variable is the maximum number of 512-byte blocks the cache filesystem will buffer for writes. If this limit is reached, all writes to the cache filesystem will be blocked until the cache filesystem has written out (to its backend) enough blocks to reach a low water mark - write_low_tide. This variable cannot be changed if battery backed-up RAM is available, as it is tied to the amount of battery backed-up RAM available. The value of this variable is calculated when the cache is initialized. Its value is dependent on whether battery backed-up RAM is installed in the RaidRunner. If installed, the number of blocks of data that can be saved into the battery backed-up RAM is calculated.
If no battery backed-up RAM is present, its value is set to 75% of the RaidRunner's memory (expressed as a count of 512 byte blocks), then adjusted to reflect the amount of cache requested by configured raid sets. When write_limit is changed, both write_high_tide and write_low_tide are automatically changed to their default values (a function of the value of write_limit).
· write_high_tide: This variable is a high water mark for the number of written-to 512-byte blocks in the cache. When the number of data blocks exceeds this value, to prevent the cache filesystem from blocking its front end, the cache flushing mechanism continually flushes the cache buffer until the amount of unwritten (to the backend) cache buffers is below the low water mark (write_low_tide). This value defaults to 75% of write_limit. This variable can have values ranging from write_limit down to write_low_tide. It is recommended that this variable not be changed.
· write_low_tide: This variable is a low water mark for when the cache flushing mechanism is continually flushing data to its backend. Once the number of written-to cache blocks yet to be flushed equals or falls below this value, the sustained flushing is stopped. This value defaults to 25% of write_limit. This variable can have values ranging from write_high_tide-1 down to zero (0). It is recommended that this variable not be changed.
· cache_nflush: This variable is the number of cache buffers (not 512-byte data blocks) that the cache flushing mechanism will attempt to write out in one flush cycle. Adjusting this value may improve performance on writes depending on the size of the cache buffers and the type of disk drives used in the raid set backends. The default value is 128. Its value can range from 2 to 128.
· cache_nread: This variable is the number of cache buffers (not 512-byte data blocks) that the cache reading mechanism will attempt to read in one read cycle.
Adjusting this value may improve performance on reads depending on the size of the cache buffers and the type of disk drives used in the raid set backends. The default value is 128. Its value can range from 2 to 128.
· cache_wlimit: This variable is the number of cache buffers (not 512-byte data blocks) that the cache flushing mechanism will attempt to coalesce into a single sequential write. It differs from cache_nflush in that cache_nflush is the total number of cache buffers that can be written in a single cache flush cycle, and these buffers can be non-sequential, whereas cache_wlimit is a limit on the number of sequential cache buffers that can be written with one write. Adjusting this value may improve performance on writes depending on the size of the cache buffers and the type of disk drives used in the raid set backends. The default value is 128. Its value can range from 2 to 128.
· cache_fperiod (GEnv): By default, the cache flushes any data to be written every 1000 milliseconds (unless it is forced to earlier by the cache getting full, in which case it flushes the cache and resets the timer). You can vary this flushing period by setting this variable. If you have a large number of sustained reads and minimal writes, you may want to delay the writes out of cache to the backends as long as possible. Note that by setting this to a high value, you run the risk of losing what you have written. The default value is 1000 milliseconds (i.e. 1 second). Its value can range from 500ms to 300000ms.
· scsi_write_thru (GEnv): By default all writes (from a host) are buffered in the RaidRunner's cache and are flushed to the backend disks periodically. When battery backed-up RAM is available this results in the most efficient write throughput.
If no battery backed-up RAM is available, or you do not want to depend on writes being saved in battery backed-up RAM in the event of a power failure, you can force the RaidRunner to write data straight thru to the backends prior to returning an OK status to the host. This essentially provides a write-thru cache. The default value of this variable is 0 - write-thru mode is DISABLED. The values this variable can take are
· 0 - DISABLE write-thru mode, or
· 1 - ENABLE write-thru mode.
· scsi_write_fua (GEnv): This variable affects what is done when the FUA (Force Unit Access) bit is set on a SCSI WRITE-10 command. When this variable is enabled and a SCSI WRITE-10 command with the FUA bit set is processed, the data is written directly thru the cache to the backend disks. If the variable is disabled, then the setting of the FUA bit on SCSI WRITE-10 commands is ignored. The default value for this variable is disabled (0) if battery backed-up RAM is present, or enabled (1) if battery backed-up RAM is NOT present. The values this variable can take are
· 0 - IGNORE the FUA bit on SCSI WRITE-10 commands, or
· 1 - ACT on the FUA bit on SCSI WRITE-10 commands.
· scsi_ierror (GEnv): This variable controls what is done when the RaidRunner receives an Initiator Detected Error message on a SCSI host channel. If set (1), cause a Check Condition. If NOT set (0), follow the SCSI-2 standard and re-transmit the Data In / Out phase. The default value is 0. The values this variable can take are
· 0 - follow the SCSI-2 standard
· 1 - ignore the SCSI-2 standard and cause a Check Condition.
· scsi_sol_reboot (GEnv): Determines whether to auto-detect a Solaris reboot and then clear any wide mode negotiations. If set (1), detect a Solaris reboot and clear wide mode. If NOT set (0), follow the SCSI-2 standard and do not clear wide mode. The default value is 0. The values this variable can take are
· 0 - follow the SCSI-2 standard
· 1 - ignore the SCSI-2 standard and clear wide mode.
· scsi_hreset (GEnv): Determines whether to issue a SCSI bus reset on host ports after power-on. If set (1), then a SCSI bus reset is done on the host port when starting the first smon/stargd process on that port. If NOT set (0), nothing is done. The default value is 0. The values this variable can take are
· 0 - don't issue SCSI bus resets on power-on.
· 1 - issue SCSI bus resets on power-on when the first smon/stargd process is started.
· scsi_full_log (GEnv): Determines whether or not stargd reports, via syslog, a Reset Check condition on Read, Write, Test Unit Ready and Start Stop commands. This reset check condition is always set when a RaidRunner boots or the raid detects a scsi-bus reset. Note that this variable only suppresses the logging of this Check condition into syslog; it does not affect the response to the host of this or any Check condition. If set (1), then all stargd detected reset Check condition error messages are logged. If NOT set (0), these messages are suppressed. The default value is 0. The values this variable can take are
· 0 - suppress logging these messages
· 1 - log all messages.
· scsi_ms_badpage (GEnv): Determines whether or not stargd reports, via syslog, that it has received a non-supported page number in a MODE SENSE or MODE SELECT command it receives from a host. Note that stargd will issue the appropriate Check condition to the host ("Invalid Field in CDB") irrespective of the value of this variable. If set (1), then all stargd detected non-supported page numbers in MODE SENSE and MODE SELECT commands will be logged. If NOT set (0), these messages are suppressed. The default value is 0. The values this variable can take are
· 0 - suppress logging these messages
· 1 - log all messages.
· scsi_bechng (GEnv): Determines whether or not the raid reports backend device parameter change errors. In a multi controller environment, backends are probed and some of their parameters are changed by a booting controller.
This will generate parameter change mode sense errors. If cleared (0), then all parameter change errors will NOT be logged. If set (1), these messages are logged like any other backend error. The default value is 0. The values this variable can take are
· 0 - suppress logging these messages
· 1 - log all messages.
· scsi_dnotch (GEnv): Some disk drives take an inordinate amount of time to perform mode select commands. One set of information a RaidRunner will obtain from a device backend is the disk notch pages (if present). As this is for information only, to reduce the boot time of a RaidRunner you can request that disk notches not be obtained. If cleared (0), backend disk notch information is not probed for. If set (1), then backend disk notch information is probed for. The default value is 1. The values this variable can take are:
· 0 - don't probe for notch pages
· 1 - probe for notch pages
· scsi_rw_retries (GEnv): Specify the number of read or write retries to perform on a device backend before effecting an error on the given operation. Note that ALL retries are reported via syslog. The default value is 3. Its value can range from 1 to 9.
· scsi_errpage_r (GEnv): Specify the number of internal read retries that a disk backend is to perform before reporting an error (to the raid). Setting this variable sets the Read Retry Count field in the Read-Write Error Recovery mode sense page. A value of -1 will cause the drive's default to be used. The default value is -1. Its value can range from -1 (use the disk's default) or from 0 to 255.
· scsi_errpage_w (GEnv): Specify the number of internal write retries that a disk backend is to perform before reporting an error (to the raid). Setting this variable sets the Write Retry Count field in the Read-Write Error Recovery mode sense page. A value of -1 will cause the drive's default to be used. The default value is -1. Its value can range from -1 (use the disk's default) or from 0 to 255.
· BackFrank: Specify the SCSI-ID of the first rank of backend disks on a RaidRunner. This variable should never be changed and is for informative purposes only. The default value is dependent on the model of RaidRunner being run. The values this variable can take are
· 0 - the first rank SCSI-ID will be 0
· 1 - the first rank SCSI-ID will be 1
· raid_drainwait (GEnv): Specify the number of milliseconds a raidset is to delay before draining all backend I/Os when a backend fails. Setting this variable to a lower value will speed up the commencement of any error recovery procedures that would be performed on a raid set when a backend fails. The default value is 500 milliseconds. Its value can range from 50 to 10000 milliseconds.
· EnclosureType: Specify the enclosure type a raid controller is running within. This variable should never be changed and is for informative purposes only. The default value is dependent on the model of RaidRunner being run. The values this variable can take are integers starting from 0.
· fmt_idisc_tmo (GEnv): Specify the SCSI command timeout (in milliseconds) when a SCSI FORMAT command is issued on a backend. Disk drives take different amounts of time to perform a SCSI FORMAT command and hence a timeout is required to be set when the command is issued. As certain drives may take longer to format than the default timeout allows, you can change it. The default value is 720000 milliseconds. Its value can range from 200000 to 1440000 milliseconds.
· AutoShutdownSecs (GEnv): Specify the number of seconds the RaidRunner should monitor catastrophic hardware failures before deciding to automatically shut down. A catastrophic failure is one which will cause damage to the RaidRunner's hardware if not fixed immediately. Failures like all fans failing would be considered catastrophic. A value of 0 seconds (the default) will disable this feature, that is, with the exception of logging the errors, no action will occur.
See the shutdown and hwmon commands for further details. The default value is 0 seconds. Its value can range from 20 to 125 seconds.
· SEE ALSO: setiv, getiv, syslog, setenv, printenv, hwmon, shutdown
13.33. KILL - send a signal to the nominated process
· SYNOPSIS: kill [-sig_name] pid
· DESCRIPTION: kill sends a signal to the process nominated by pid. If the pid is a positive number then only the nominated process is signaled. If the pid is a negative number then the signal is sent to all processes in the same process group as the process with the id of -pid. The switch is optional and if not given a SIGTERM (software termination signal) is sent. If the sig_name switch is given then it should be one of the following (lower case) abbreviations. Only the first 3 letters need to be given for the signal name to be recognized. Following each abbreviation is a brief explanation and the signal number in brackets:
null - unused signal [0]
hup - hangup [1]
int - interrupt (rubout) [2]
quit - quit (ASCII FS) [3]
kill - kill (cannot be caught or ignored) [4]
pipe - write on a pipe with no one to read it [5]
alrm - alarm clock [6]
term - software termination signal [7]
cld - child process has changed state [8]
nomem - could not obtain memory (from heap) [9]
You cannot kill processes whose process id is between 0 and 5 inclusive. These are considered sacrosanct - hyena, init and the console reader/writers.
· SEE ALSO: K9kill
13.34. LED - turn on/off LEDs on RaidRunner
· SYNOPSIS:
· led
· led led_id led_function
· DESCRIPTION: led uses the given led_id to identify the LED to manipulate based on the led_function. When no arguments are given, an internal LED register is printed along with the current function that the onboard LEDs, led1 and led2, are tracing. If an undefined led_id is given, the led command silently does nothing and returns NULL. If an incorrect number of arguments or an invalid led_function is given, a usage message is printed.
Depending on the RaidRunner model the led_id can be one of
· led1 - LED1 on the RaidRunner controller itself
· led2 - LED2 on the RaidRunner controller itself
· Dc.s.l - Device on channel c, scsi id s, scsi lun l
· status - the status LED on the RaidRunner
· io - the io LED on the RaidRunner
and led_function can be one of
· on - turn on the given LED
· off - turn off the given LED
· ok - set the given LED to the defined OK state
· faulty - set the given LED to the defined FAULTY state
· warning - set the given LED to the defined WARNING state
· rebuild - set the given LED to the defined REBUILD state
· tprocsw - set the given LED to trace kernel process switching
· tparity - set the given LED to trace I/O parity generation
· tdisconn - set the given LED to trace host interface disconnect activity
· pid - set the given LED to trace the process pid as it runs
Different models of RaidRunner differ in the number of LEDs and their functionality. Depending on the type of LED, the ok, faulty, warning and rebuild functions perform different functions. See your RaidRunner's Hardware Reference manual to see which LEDs exist and what the different functions do.
NOTES: Tracing activities can only occur on the `onboard` LEDs (LED1, LED2).
SEE ALSO: lflash
13.35. LFLASH - flash a LED on RaidRunner
· SYNOPSIS: lflash led_id period
· DESCRIPTION: lflash uses the given led_id to identify the LED to flash every period seconds. If an undefined led_id is given, the lflash command silently does nothing and returns NULL. Depending on the RaidRunner model the led_id can be one of:
· led1 - LED1 on the RaidRunner controller itself
· led2 - LED2 on the RaidRunner controller itself
· Dc.s.l - Device on channel c, scsi id s, scsi lun l
· status - the status LED on the RaidRunner
· io - the io LED on the RaidRunner
· NOTE: The number of seconds must be greater than or equal to 2.
· SEE ALSO: led
13.36.
LINE - copies one line of standard input to standard output

· SYNOPSIS: line

· DESCRIPTION: line accomplishes the one-line copy by reading up to a newline character followed by a single K9write.

· SEE ALSO: K9read, K9write

13.37. LLENGTH - return the number of elements in the given list

· SYNOPSIS: llength list

· DESCRIPTION: llength returns the number of elements in the given list.

· EXAMPLES: Some simple examples:

set list D1 D2 D3 D4 D5              # create the list
set len `llength $list`              # get its length
echo $len
5

set list {D1 D2 D3 D4 D5} {D6 D7}    # create the list
set len `llength $list`              # get its length
echo $len
2

set list {}                          # create an empty list
set len `llength $list`              # get its length
echo $len
0

13.38. LOG - like zero with additional logging of accesses

· SYNOPSIS: bind -k {log fd error_rate tag} bind_point

· DESCRIPTION: log is a special file that, when written to, is an infinite sink of data (i.e. anything can be written to it and it will be disposed of quickly). When log is read it is an infinite source of zeros (i.e. the byte value 0). The log file will appear in the K9 namespace at the bind_point. Additionally, ASCII log data is written to the file associated with file descriptor fd. error_rate should be a number between 0 and 100 and is the percentage of errors (randomly distributed) that will be reported (as an EIO error) to the caller. Each line written to fd will have tag appended to it. There is one line output to fd for each IO operation on the log special file. The first character output is "R" or "W" indicating a read or write. The second character is blank if no error was reported and "*" if one was reported. Next (after white space) is a (64 bit integer) offset into the file of the start of the operation, followed by the size (in bytes) of that operation. The line finishes with the tag.

· EXAMPLE: Bind a log special file at "/dev/log" that writes log information to standard error.
Each line written to standard error has the tag string "scsi" appended to it. Approximately 30% of reads and writes (randomly distributed) return an EIO error to the caller. This is done as follows:

bind "log 2 30 scsi" /dev/log
dd if=/dev/zero of=/dev/log count=5 bs=512
W  0000000000 512 scsi
W  0000000200 512 scsi
W  0000000400 512 scsi
W* 0000000600 512 scsi
Write failed.
4+0 records in
3+0 records out

· SEE ALSO: zero

13.39. LRANGE - extract a range of elements from the given list

· SYNOPSIS: lrange first last list

· DESCRIPTION: lrange returns a list consisting of elements first through last of list. 0 refers to the first element in the list. If first is greater than last then the list is extracted in reverse order.

· EXAMPLES: Some simple examples:

set list D1 D2 D3 D4 D5         # create the list
set subl `lrange 0 3 $list`     # extract from indices 0 to 3
echo $subl
D1 D2 D3 D4

set subl `lrange 3 1 $list`     # extract from indices 3 to 1
echo $subl
D4 D3 D2

set subl `lrange 4 4 $list`     # extract the element at index 4
echo $subl                      # equivalent to get 4 $list
D5

set subl `lrange 3 100 $list`
echo $subl
D4 D5

13.40. LS - list the files in a directory

· SYNOPSIS: ls [ -l ] [ directory... ]

· DESCRIPTION: ls lists the files in the given directory on standard out. If no directory is given then the root directory (i.e. "/") is listed. Each file name contained in a directory is put on a separate line. Each listing has a lead-in line stating which directory is being shown. If there is more than one directory then they are listed sequentially, separated by a blank line. If the "-l" switch is given then every listed file has data such as its length and the file system it belongs to shown on the same line as its name. See the stat command for more information. ls is not an inbuilt command but a husky script which utilizes cat and stat. The script source can be found in the file "/bin/ls".

· SEE ALSO: cat, stat

13.41.
LSEARCH - find a pattern in a list

· SYNOPSIS: lsearch pattern list

· DESCRIPTION: lsearch returns the index of the first element in list that matches pattern, or -1 if none. 0 refers to the first element in the list.

· EXAMPLES: Some simple examples:

set list D1 D2 D3 D4 D5         # create the list
set idx `lsearch D4 $list`      # get index of D4 in list
echo $idx
3

set idx `lsearch D1 $list`      # get index of D1 in list
echo $idx
0

set idx `lsearch D8 $list`      # get index of D8 in list
echo $idx
-1

13.42. LSUBSTR - replace a character in all elements of a list

· SYNOPSIS: lsubstr find_char replacement_char list

· DESCRIPTION: lsubstr returns a list, replacing every find_char character found in any element of the list with the replacement_char character. replacement_char can be NULL, which effectively deletes all find_char characters in the list.

· EXAMPLES: Some simple examples:

set list D1 D2 D3 D4 D5          # create the list
set subl `lsubstr D x $list`     # replace all D's with x's
echo $subl
x1 x2 x3 x4 x5

set subl `lsubstr D {} $list`    # delete all D's
echo $subl
1 2 3 4 5

set list {-L} {-16}              # create a list with embedded braces
set subl `lsubstr { {} $list`    # delete all open braces
set subl `lsubstr } {} $subl`    # delete all close braces
echo $subl
-L -16

13.43. MEM - memory mapped file (system)

· SYNOPSIS: bind -k {mem first last [ r ]} bind_point

· DESCRIPTION: mem allows machine memory to be accessed as a single K9 file (rather than a file system). The host system's memory is used starting at the first memory location up to and including the last memory location. Both first and last need to be given in hexadecimal. If successful, the mem file will appear in the K9 namespace at the bind_point. The stat command will show it as a normal file with the appropriate size (i.e. last - first + 1). If the optional "r" is given then only read-only access to the file is permitted.
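As an illustration of the size rule above, a bind might look like the following. This is only a sketch: the addresses and the bind point name are invented for the example, and real memory ranges must come from your RaidRunner's Hardware Reference manual.

```
# Hypothetical: expose 8 KB of battery backed-up RAM as a K9 file.
# 0xe0000000 and 0xe0001fff are invented addresses.
bind -k {mem 0xe0000000 0xe0001fff} /dev/bbram
stat /dev/bbram    # size shown: last - first + 1 = 0x2000 = 8192 bytes
```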
In a target environment mem can usefully associate battery backed-up RAM (or ROM) with the K9 namespace. In a Unix environment it is of limited use (see unixfd instead). In a DOS environment it may be useful to access memory directly (IO space), but for accessing the DOS console see doscon. When mem is associated with the partition of Flash RAM that stores the husky scripts (which is stored compressed), reading from that page will automatically decompress and return the data as it is read. When mem is associated with the writable partitions of Flash RAM (configuration partition, husky script partition and main binary partition), a write to the start of any partition will erase that partition.

· SEE ALSO: ram

· BUGS: Only a single file rather than a file system can be bound.

13.44. MDEBUG - exercise and display statistics about memory allocation

· SYNOPSIS: mdebug [off|on|trace|p|m size|f ptr|c nel elsize|r ptr size]

· DESCRIPTION: mdebug can be used to directly allocate and free memory. mdebug will also print (to standard output) information about the current state of memory allocation. Without any given options, a brief five line summary of memory usage is printed, e.g.:

: raid; mdebug
Mdebug is off
nreq-nfree=87096-82951=4145(13905745)
size=15956672/16150000
waste=1%/2%
list=4251/8396
: raid;

The first line indicates the debug mode, either off, on or trace. The second line indicates the number of times a request for memory is made (to Mmalloc() or Mcalloc() and related functions) and the number of times the memory allocator is called to free memory (via Mfree()). The difference between these first two numbers is the total number of currently allocated blocks of memory, with the number between the '(' and ')' being the total memory requested. Note that the amount of memory actually assigned may be more than requested. The third line indicates the amount of memory being managed. The second number is the total memory managed (i.e.
left over after loading the statically allocated text, data and bss space). The first number is that left over after various memory allocation tables have been subtracted out from that aforementioned number. The fourth line is the total amount of extra memory assigned to requests in excess of the actual requested memory, as compared with the totals on line 3. The fifth line relates to the list of currently allocated memory. The first number is the number of free entries left and the second is the maximum table size. Note that the number of currently allocated blocks (third number on line 2) when added to the first number on line 5 gives the second number on line 5.

· OPTIONS:

· p: Prints the above mentioned five line summary and then the free list.
· P: Prints all the above plus dumps the list of currently allocated memory.
· PP: Prints all the above plus the free bitmap.

The above three options can generate copious output and require a detailed knowledge of the source to understand their meaning.

· off: Turns off memory allocation debugging. This is the default condition after booting.
· on: Turns on memory allocation assertion checking.
· trace: Turns on memory allocation assertion checking and traces every memory allocation / deallocation.
· m: Uses Mmalloc() to allocate a block of memory of size bytes.
· f: Uses Mfree() to de-allocate a block of memory addressed by ptr.
· c: Uses Mcalloc() to allocate a contiguous block of memory consisting of nel elements each of elsize bytes.
· r: Uses Mrealloc() to re-allocate a block of previously allocated memory, ptr, changing the allocated size to be size bytes.

· SEE ALSO: Unix man pages on malloc()

13.45. MKDIR - create directory (or directories)

· SYNOPSIS: mkdir [ directory_name ... ]

· DESCRIPTION: mkdir creates the given directory (or directories).
If all the given directories can be created then NIL is returned as the status; otherwise the first directory that could not be created is returned (and this command will continue trying to create directories until the list is exhausted). A directory cannot be created with a file name that previously existed in the enclosing directory.

13.46. MKDISKFS - script to create a disk filesystem

· SYNOPSIS: mkdiskfs disk_directory_root disk_name

· DESCRIPTION: mkdiskfs is a husky script which is used to perform all the necessary commands to create a disk filesystem, given the root of the disk file system and the name of the disk.

· OPTIONS:

· disk_directory_root: Specify the directory root under which the disk filesystems are bound. This is typically /dev/hd.
· disk_name: Specify the name of the disk in the format Dc.s.l where c is the channel, s is the scsi id (or rank) and l is the scsi lun of the disk.

After parsing its arguments, mkdiskfs creates the disk filesystem's bind point and binds in the disk at that point.

· SEE ALSO: rconf, scsihdfs

13.47. MKHOSTFS - script to create a host port filesystem

· SYNOPSIS: mkhostfs controller_number host_port host_bus_directory

· DESCRIPTION: mkhostfs is a husky script which is used to perform all the necessary commands to create a host port filesystem on the given RaidRunner controller, given the root of the host port file systems and the host port number.

· OPTIONS:

· controller_number: Specify the controller on which the host port filesystem is to be created.
· host_port: Specify the host port number to create the filesystem for.
· host_bus_directory: Specify the directory root under which host filesystems are bound. This is typically /dev/hostbus.

After parsing its arguments, mkhostfs finds out what SCSI ID the host port is to present (see hconf) and then binds in the host filesystem.

· SEE ALSO: hconf, scsihpfs

13.48.
MKRAID - script to create a raid given a line of output of rconf

· SYNOPSIS: mkraid `rconf -list RaidSetName`

· DESCRIPTION: mkraid is a husky script which is used to perform all the necessary commands to create and enable host access to a given Raid Set. The arguments to mkraid are a line of output from a rconf -list command. After parsing its arguments, mkraid checks to see if a reconstruction was being performed when the RaidRunner was last operating, and if so, notes this. It then creates the raid filesystem (see mkraidfs) and adds a cache frontend to the raid filesystem. It then creates the required host filesystems (see mkhostfs) and finally, if a reconstruction had been taking place when the RaidRunner was last operating, it restarts the reconstruction.

· NOTE: This husky script DOES NOT enable target access (stargd) to the raid set it creates.

· SEE ALSO: rconf, mkraidfs, mkhostfs

13.49. MKRAIDFS - script to create a raid filesystem

· SYNOPSIS: mkraidfs -r raidtype -n raidname -b backends [-c chunk] [-i iomode] [-q qlen] [-v] [-C capacity] [-S]

· DESCRIPTION: mkraidfs is a husky script which is used to perform all the necessary commands to create a Raid filesystem.

· OPTIONS:

· -r raidtype: Specify the raid type as raidtype for the raid set. Must be 0, 1, 3 or 5.
· -n raidname: Specify the name of the raid set as raidname.
· -b backends: Specify the comma separated list of the raid set's backends in the format used by rconf.
· -c chunk: Optionally specify the IOSIZE (in bytes) of the raid set.
· -i iomode: Optionally specify the raid set's iomode - read-write, read-only or write-only.
· -q qlen: Optionally specify the raid set's queue length for each backend.
· -v: Enable verbose mode, which prints out the main actions (binding, engage commands) as they are performed.
· -C capacity: Optionally specify the raid set's size in 512-byte blocks.
· -S: Optionally specify that spares pool access is required should a backend fail.
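Putting the options together, an invocation might look like the sketch below. The raid name and the backend list are purely illustrative; in practice the backend string comes from rconf, and mkraidfs is normally driven indirectly via mkraid rather than typed by hand.

```
# Hypothetical: a verbose RAID 5 set named R5 built from three
# backends, with spares pool access enabled should a backend fail.
mkraidfs -r 5 -n R5 -b 0.1.0,0.2.0,0.3.0 -v -S
```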
After parsing its arguments, mkraidfs creates the Raid Set's backend filesystems, typically disks (see mkdiskfs), taking care of failed backends. It then binds in the raid filesystem and engages the backends into the filesystem. If spares access is requested, it enables the autorepair feature of the raid set.

· SEE ALSO: rconf, mkraid, mkhostfs, mkdiskfs, raid[0135]fs

13.50. MKSMON - script to start the scsi monitor daemon smon

· SYNOPSIS: mksmon controllerno hostport scsi_lun protocol_list

· DESCRIPTION: mksmon is a husky script which is used to perform all the necessary commands to start the scsi monitor daemon smon, given the controller number, host port, scsi lun, and the block protocol list. Typically, mksmon is run with its arguments taken from the output of a mconf -list command.

· OPTIONS:

· controllerno: Specify the controller on which the scsi monitor daemon is to be run.
· hostport: Specify the host port through which the scsi monitor daemon communicates.
· scsi_lun: Specify the SCSI LUN the scsi monitor daemon is to respond to.
· protocol_list: Specify the comma separated block protocol list the scsi monitor daemon is to implement.

After parsing its arguments, mksmon checks to see if it is already running; if so, it issues a message and exits. Otherwise, it creates the host filesystem (mkhostfs), creates a memory file and a set of FIFOs for smon to use, and finally starts smon.
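With arguments in the SYNOPSIS order, a hand-typed invocation might look like the sketch below. Every value shown is invented for the example (including the protocol list name); real argument sets come from the output of mconf -list.

```
# Hypothetical: start the scsi monitor daemon on controller 0,
# host port 0, responding at SCSI LUN 7, with an invented
# single-entry block protocol list.
mksmon 0 0 7 smonprotocol
```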