more storage, more performance or more reliability than...


Page 1: more storage, more performance or more reliability than angwdu111.gwdg.de/misc/dnews/dnews_0002.pdf · computer shop to carry a BSD-related item, to something as big as writing an


February 2000 Search Submit Article Contact Us How to Help Merchandise

THIS MONTH’S FEATURES

What Can Linux Learn From FreeBSD?
by Matthew Karim Borowski

As a network consultant, my clients often ask me which operating system they should run on their servers. But after eliminating Windows NT from the choices, which Unix should I recommend?

Planning Your New Box
by Brennan Stehling

Recommended hardware and installation configurations for your next BSD machine, whatever its role may be.

REGULAR COLUMNS

Newbies: Dealing with Disconnection
by Chris Coleman

Being disconnected from a terminal unexpectedly can be a real bummer. Here’s a program that can save your time and make your life easier...

Blueprints: Software RAID for BSD: Vinum
by Greg Lehey

Many BSD systems have storage needs which current generation disks can’t fulfill by themselves: they may want more storage, more performance or more reliability than an individual disk can provide...

The Answer Man: Help, I’ve Fallen
by Gary Kline and David Leonard

This month we’re discussing Time-and-the-single-computer.

Dæmon’s Advocate: Commercial BSD support
by Greg Lehey

One of the most unusual things about Open Source software is that it is free. You can legally pick up the software off the net, or pay a small charge for the convenience of having it on CD-ROM. But what about support for that software?

From the Editor

No time to rest
by Brett Taylor

Being a good advocate of BSD requires many things, not the least of which is being able to ignore these hopeless flamewars.

Daily Daemon News

BiTMICRO NETWORKS Redefines Capacity and Performance Limits for 2.5 Inch Solid State Flash Disks
FreeBSD Specific Search?!?
New issue of Daemon News
NIDS - Sniffers on Steroids
Mac OS X Update: Quartz and Aqua

Darby Daemon in...

Source Wars

Miscellaneous

Credits: The hard-working crew
Tarball: Download a tar.gz version of this issue

Copyright © 1998-2000 Dæmon News. All Rights Reserved.


No time to rest

by Brett Taylor

It’s now February and the year is moving right along. There were blessedly few Y2K glitches and now we just await the 29th. In addition, the BSDs have been getting more exposure to the public through increased media coverage. OpenBSD has continued to be lauded for its security efforts. FreeBSD has been fairly visible in the mass media. NetBSD is making public appearances as well, the latest at the Canadian Special Olympics 2000 Winter Games.

One thing that has been distressing amidst all of this good advocacy is the amount of bad advocacy, even if unintended by the “advocate.” Every action that you take as a user of BSD says something about yourself and also the BSDs as a whole. Take a look at the comments for any article related to BSD at Slashdot and you’ll see lots of this type of advocacy. Being a good advocate of BSD requires many things, not the least of which is being able to ignore these hopeless flamewars.

With this in mind, I’ve made up two short and sweet rules for being a good advocate of BSD.

Do one thing every day/week/month to promote BSD.
This is harder than it sounds. It’s like that New Year’s resolution you made to run one mile every day. You run every day for a couple of weeks, then you miss one day. The next thing you know, you haven’t run for a week. Your act of advocacy could be as small as answering one question on one of the support lists for any of the BSDs or asking your local bookstore or computer shop to carry a BSD-related item, or as big as writing an article for a magazine, newspaper, or ezine. Remember, helping one person eventually helps more than that one user, as he learns to help others with their difficulties. We will reap what we sow.

If you can’t say anything nice, don’t say anything at all.
This one is straight from mom. It’s hard to stay out of an emotional exchange, especially if the person taunting BSD users is using false information. The best thing to do is stay calm, correct the falsehood with facts, and let the rest lie. Focus on the positive aspects of BSD and not the “failings” of the other OS. Emotional exchanges lead to people talking past each other and bad feelings toward others. More importantly, that very exchange could be the one that turns potential users into non-users.

This quest for good advocacy has led to a few changes here at Dæmon News as well. Chris Coleman has found himself overseeing the growth of the various Dæmon News sites and doing a lot of behind-the-scenes work. Because of that, he has removed himself from his editor in chief position, leaving me as the sole editor in chief of the monthly site. Geoff Jukema, who does a lot of the work behind the monthly (sometimes it seems like all of the editing work), has been promoted to managing editor. Congrats to you, Geoff, and yes, it means you need to do more work. :-) Chris will still be writing the Newbies’ Column and the occasional editorial.


I’d also like to deeply thank Allen Briggs for his incredible proofing/editing work for the December and January issues. Without Allen we would have been hard pressed to get things done and the articles would certainly not have been as good, particularly Matt Dillon’s virtual memory article.

Have a good month and be sure to tell us about your acts of advocacy!

Author maintains all copyrights on this article.
Images and layout Copyright © 1998-2000 Dæmon News. All Rights Reserved.


What Can Linux Learn From FreeBSD?

by Matthew Karim Borowski

As a network consultant, I am often asked by my clients which operating system they should run on their servers. I start by telling them why NT should be avoided, often referring to the Kirch document at http://www.unix-vs-nt.org. It doesn’t take much convincing to get past this hurdle. It helps to tell the customer that many varieties of Unix can be acquired free of charge, compared to the high costs of Windows NT. Microsoft’s products may be somewhat easier to administer if you don’t know what you’re doing, but my clients don’t care about that. They pay me to handle the administrative functions.

The big decision lies in choosing which Unix to use. There are a number of Unix-like operating systems available on the market today. As you probably know, the most popular one in recent years has been Linux. Available in a variety of distributions, Linux is technically a clone of Unix in that it uses no source code from the original Unix system. Originally created by Linus Torvalds (a Finnish student of Computer Science) in 1991, Linux is licensed under the GNU General Public License, which makes its source code and binaries freely available to anyone who wishes to download them. Thanks to the contributions of many developers, Linux has grown to become a powerful and well-supported operating system.

The second most popular free Unix is FreeBSD. BSD (Berkeley Software Distribution) originated at the University of California, Berkeley, as a modified version of AT&T’s Unix. The Berkeley contributors pioneered many new features in Unix, including TCP/IP support and job control. By 1989, the group in charge of BSD development had rewritten so much of the original Unix code that they began releasing the non-AT&T portions to the public, leading in 1991 to the Net/2 distribution. BSD development was stifled by a damaging lawsuit filed by the owners of Unix. Unix System Laboratories (USL) claimed that BSD contained copyrighted Unix code. The lawsuit was settled following the sale of USL to Novell, and numerous parties began completing the missing portions of code in BSD’s new 4.4BSD-Lite release. The 386BSD port to Intel-based systems, completed by Bill Jolitz, was adopted and maintained by the FreeBSD and NetBSD groups. BSDI sells its own commercial system, BSD/OS.

The modern FreeBSD operating system still stays true to its predecessors. The latest release supports both Intel and Alpha processors. It supports symmetric multiprocessing (SMP), has a robust TCP stack, and has a fully integrated build system. Whereas Linux development is fragmented, FreeBSD development is centralized. Linux systems tend to use tools which are maintained by different people (as well as many GNU tools). FreeBSD has a central CVS repository where developers work on the entire system. There are advantages to both styles of development. With FreeBSD you can easily sync your source code to the latest version using the CVSup tool, and then rebuild the full operating system with a single command. In fact, this is the standard method of upgrading a FreeBSD installation.
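As a rough illustration of that sync-and-rebuild cycle, the shell sketch below prints the two steps rather than running them. The helper name `upgrade_steps`, the default supfile path and the choice of `-g -L 2` (no GUI, moderate verbosity) are assumptions made for illustration; check the FreeBSD Handbook for the exact procedure for your release before running anything as root.

```shell
#!/bin/sh
# Hypothetical helper: print the commands for a source upgrade.
# $1 = supfile telling CVSup which sources to fetch (default path is an assumption).
upgrade_steps() {
    supfile="${1:-/usr/share/examples/cvsup/standard-supfile}"
    # Step 1: bring /usr/src up to date with CVSup.
    echo "cvsup -g -L 2 $supfile"
    # Step 2: rebuild and install the whole system from the updated sources.
    echo "cd /usr/src && make world"
}

upgrade_steps
```

Printing the commands first makes it easy to review them, or to paste them into a root shell one at a time.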


Another nice feature in FreeBSD is the ports system. Currently containing over 3,000 ports, the ports system simplifies the installation of software packages. Programs are organized in different categories. Each program’s directory contains necessary patches and a Makefile. When you type make, the source code for the program is fetched from the Internet or from a FreeBSD CD-ROM. The ports system handles dependency checking and will recursively install any needed packages. I find the ports system easier to use and more convenient than Red Hat’s rpm or Debian’s dpkg (although dpkg itself is a very useful and convenient system). You can even install every single package in the ports system by typing "make install" in /usr/ports. Make sure you have a lot of spare disk space though!
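For a single program, the workflow looks roughly like the sketch below. The helper name `install_port` and its `DRY_RUN` switch are made up for this example, misc/screen is just a sample port, and /usr/ports is assumed to be where the ports tree lives.

```shell
#!/bin/sh
# Hypothetical helper: build and install one port, e.g. "misc/screen".
# With DRY_RUN=1 (the default here) it only prints what it would do.
install_port() {
    port_dir="${PORTSDIR:-/usr/ports}/$1"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "cd $port_dir && make install"
    else
        # make fetches the sources, applies the port's patches,
        # builds any missing dependencies and installs the result.
        (cd "$port_dir" && make install)
    fi
}

install_port misc/screen
```

Setting DRY_RUN=0 makes the helper actually change into the port's directory and run make install, which normally requires root.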

A lot of the charm of FreeBSD is in its classic feel. A lot of people (excluding myself) prefer the simple BSD-style initialization scripts to the more complex System V init. FreeBSD’s installer is quick and easy, yet powerful. The base install is small. It doesn’t clutter your hard drive with unneeded software. The userland programs are true to the original Unix versions, whereas Linux usually includes radically different versions (for example, Red Hat Linux comes with the vim editor instead of the standard vi). Linux also uses the GNU utilities, whose options often differ from those of the classic Unix versions. The differences between Linux and FreeBSD are all about freedom of choice.

What are the pros and cons of Linux and FreeBSD as servers? Both operating systems have been proven in mission critical settings. The largest FTP server on the Internet, ftp.cdrom.com, runs FreeBSD. So does Yahoo!, the most popular Internet search engine, and Hotmail, a large e-mail provider (ironically, Hotmail is owned by Microsoft). Linux is the force behind Deja.com, eBay, and many NASA servers. Both systems have robust TCP stacks. At one point, FreeBSD had much better networking performance. But with the 2.2 kernel, Linux has finally caught up. Which one should you choose for your own server? Try them both and decide.

About the author: Matthew Borowski does consulting work for businesses in the Washington, D.C. area through his company, WorldServe Consulting. He has been using Linux since 1996 and FreeBSD since 1998. In his spare time, he mountain bikes, listens to music and travels.


Planning Your New Box

by Brennan Stehling (brennan.offwhite.net)

If you are into computers and love working with them, the best feeling is planning your next box. What OS do you want to put on it? How much disk space will you get? What processor should you buy? If you are a home user you have a wide range of options. If you are building a file server for use at the office, your range of options becomes narrower.

To start out, you should decide what the primary use of the new computer will be. If it is a home box you will likely want X Windows with a web browser and the ability to play games, which requires compatible sound and video cards. If it is destined to be a server, you can leave out the sound card and stick to a basic video card.

I have personally been involved in planning and building several computers and have installed various versions of Linux and FreeBSD. They all had a different purpose. A couple were workstations and the rest were meant to be servers. After a couple of years of experience I have learned a few things which have helped me in planning my next box.

In my case I wanted to replace my old server, which I built back in early 1999 for less than $900. It is a Pentium II 350 with 64 megs of RAM and an 8.5 gig IDE hard drive. The machine is on a network and serves up websites with Apache. It has been a good little server, but as my needs have grown, it is not enough. I want a faster machine with a current version of the OS. My server OS of choice is FreeBSD. I want improved performance for Java and cryptography for the next server because I have been dabbling in OpenSSL and Java Servlets lately.

First off, let’s make a list of necessary items for a server and for a home computer. To be more specific, I will detail two servers, a file server and a web server, along with a home workstation used for playing games and mp3 music. I will explain the configuration of each one below.

File Server:
Fast Processor (Intel or AMD)
Motherboard
Memory (128 megs or more)
System Disk (SCSI)
RAID drive (hardware driven)
Video Card (built-in or generic)
Network Card (3Com)
CD-ROM
Floppy Drive

Web Server:
Fast Processor (Intel or AMD)
Motherboard
Memory (128 megs or more)
System Disk (SCSI)
Export Disk (IDE)
Video Card (built-in or generic)
Network Card (3Com)
CD-ROM
Floppy Drive

Home Workstation:
Decent Processor (Intel or AMD)
Motherboard
Memory (32 megs or more)
System Disk (IDE)
Export Disk (IDE)
Video Card (anything nice and compatible)
Sound Card (Sound Blaster)
Modem
CD-ROM
Floppy Drive

One key difference between the home and server machines is the internet connection. If you happen to be lucky enough to have DSL at home you may need a network card instead of the modem, but generally you will be going over the conventional phone line for dial-up internet access. If you do not plan on going online at all you do not need either.

The only other key difference is the drive configuration. For all systems I recommend a system disk for the base install of the system. That is where you will place your boot and swap partitions and the rest of your system files; that will be a busy drive. Then you will need much more space for your various other files. If you have that on a different physical disk you will notice a performance benefit because you will be using a different read/write head from the one on your system drive. The more heads you have reading and writing to your filesystem the better.

For the home system you can use all IDE drives, and those are cheap. You can easily get an IDE drive of over 20 gigs for well under $200 these days. You only need about 5 gigs for the system disk, while the export disk can be any size you like. This configuration should make you quite happy at home. If you want more space in the future you can always go out and buy the next largest IDE drive to come out and add that to your system.

The servers may require SCSI drives. If you are running a web server, which I assume will be Apache, you will do fine with lots of physical memory and swap space on the fast system disk. With a fast SCSI drive you will have great performance. The export disk does not need to be SCSI since the web server will hold most of the content in physical memory or on the SCSI swap partition. Since IDE drives are so cheap and SCSI drives are so expensive, you will do well using a fast system disk to do all your heavy lifting.

If your box is serving up files on a network to many client machines you cannot rely on a web server caching your content. Your disk heads will be very busy reading and writing to files all over the place. If they are not fast enough you will experience slow performance in the system. I recommend a RAID configuration because you want to have redundancy on such a system and you also have many heads reading and writing to your filesystem. You also want a hardware-driven RAID drive so your system disk is not doing all the work to run a software-driven RAID system. If you do not feel a RAID system is necessary, you may do fine with a few SCSI drives. At least you will have one head for the system disk and one for each SCSI drive.

Your exact hard-drive configuration could be any variation between these three examples and a bit beyond. The most important thing is to give your system disk room to do its work. If you are running a file server based on a large IDE drive you may see your server die due to kernel panics because it cannot swap out memory to disk fast enough as the disk head is busy on another part of the disk. Choosing your disk configuration is a science of its own and it is best to just get more than you expect to need, since your usage will likely grow anyway.

In order to install your OS of choice you will need at least a CD-ROM and possibly a floppy drive. Get any floppy drive that is cheap and get a well known and compatible brand for the CD-ROM. These two devices are well supported and should offer few headaches.

With a home system you will need a decent video card and a compatible sound card. Depending on your OS of choice, you will be limited to a select number of hardware vendors for sound support. Check the various sites listed below for hardware information. You can generally select a well known brand safely.

For a sound card, get a real Sound Blaster card from Creative Labs, not a cheap clone of the kind Compaq and other OEMs like to make. Creative Labs is pretty well supported. It’s like buying a 3Com ethernet card, but be sure you buy a compatible model. The newest models may not have driver support quite yet, so bring a printout listing the sound cards which your OS supports. If your card is listed, you can feel good knowing that you will be able to get it working right away.

Video cards are much simpler. A normal installation of your system will not need advanced graphics so any video card will work, but once you want to start using X Windows you will want a supported card. The number of supported cards is quite large, so odds are the card you choose will work, but be sure to check the supported list at the XFree86 web site. No matter what card you choose, be sure to get one with a decent amount of RAM. I have noticed they generally have well over 2 megs these days. Get a card with 8 megs if they have it. The amount of video RAM limits the resolution and color depth available under X Windows.
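The link between card memory and display mode is simple arithmetic: one screen of pixels needs roughly width × height × bits per pixel ÷ 8 bytes. The sketch below uses a hypothetical `framebuffer_kb` helper to work this out for a few common modes. It ignores any extra memory a card or driver may reserve, so treat the figures as lower bounds.

```shell
#!/bin/sh
# Hypothetical helper: kilobytes needed to hold one screen of pixels.
# $1 = width in pixels, $2 = height in pixels, $3 = color depth in bits per pixel
framebuffer_kb() {
    echo $(( $1 * $2 * $3 / 8 / 1024 ))
}

framebuffer_kb 1024 768 16    # 1536 KB: fits on a 2 meg card
framebuffer_kb 1280 1024 32   # 5120 KB: more than a 4 meg card can hold
framebuffer_kb 1600 1200 32   # 7500 KB: just fits in 8 megs
```

By this estimate a 2 meg card handles 1024x768 in 16-bit color, while the higher modes are what make an 8 meg card worthwhile.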

Once you have decided on all of your parts, shop around. Get quotes online from places like Micro Warehouse or Price Watch. You can even purchase the parts online if they have a reasonable price, but only buy with a warranty. Most manufacturers, like Seagate, have at least a one-year manufacturer’s warranty on their products.

I shopped around and found that Best Buy sells hard-drives and memory cheap while the local computer parts store sells them for much more. I purchased my motherboard and other necessary parts at the local computer parts store. They sold the processor and more of the other parts real cheap. If you are building a home system to play games, you may be able to afford a better video card if you save money on your processor, memory and hard-drives.

Then after you have planned your new box, bought your parts, assembled the machine and installed your glorious OS of choice you can give it a spin. After planning your new box down to the last device you should be happy with its performance.

Resources

Operating Systems

FreeBSD
NetBSD
OpenBSD
Linux

X Windows

The XFree86 Project, Inc.

Hardware

FreeBSD Hardware
NetBSD Hardware
OpenBSD Hardware
Linux Hardware


Dealing with Disconnection

by Chris Coleman

I work at home from a dialup Internet connection. I have a small home network of about 4 computers of various types. All of them route through my BSD box to the Internet. Things work really well, except I live out in the middle of nowhere and the telephone lines here won’t even let me connect at 28.8K. I think I get about 26.4K on a good day. Needless to say, it’s slow.

I use ssh from my BSD desktop to connect to the computers I work on. They are all remotely hosted and I have never actually seen any of the boxes. Usually, I am editing files using vi(1), ytalk(1)ing to people, or reading my e-mail with pine(1).

My biggest problem working remotely is not the lack of speed. I can deal with that - push a key, go eat lunch, no real problem. What I hate is getting disconnected. When I am in the middle of a project, the phone line goes dead and I lose all my work. vi(1) can sometimes be forgiving, and pine(1) is getting better, but it’s far from the ideal method. I can usually recover part of my work.

Often after I get disconnected and log back in, the program I was working in is still running. Using w(1), I can see that the server still thinks that I am logged in and hasn’t terminated the program. If only I could get control of that program again, I could keep working and wouldn’t lose any of my work.

I was ytalk(1)ing with a friend of mine the other day, and I got disconnected. Before I could rejoin the conversation, I had to kill my other ytalk(1) session and start over. When I explained that I had gotten disconnected, he asked me if I used screen(1). I immediately thought back to a utility that I had used when I first started using BSD. This utility would take a text console and split it in two, giving two windows without using X. I never did get the hang of this program, and found my screen space shrinking in halves every time I pressed the wrong button. I had abandoned it when I got X configured for the first time.

After some discussion, I realized that this was not what he was talking about. The program I was remembering was splitvt and not what I needed.

He ranted and raved about how wonderful screen(1) was and how the world needed to know about it. I told him I would look into it and if I liked it, I would write an article on it. Well, as you can tell, I liked it. It does exactly what I needed. Actually, it does quite a bit more than I needed.

The nice thing about screen(1) is that it gets out of your way. Once you run it, you can forget about it. It will be there when you get disconnected.

I found screen in the ports/packages collection under the misc section (/usr/ports/misc/screen) and installed it.

The install was simple and included a man(1) page, which I quickly read.

Screen is a full-screen window manager that multiplexes a physical terminal between several processes (typically interactive shells).

When I read the part about "full-screen window manager", I immediately thought of the X Window System and wondered if it somehow required X to be installed. As I read further, I realized that it was a window manager independent of X. It manages the current screen or window that you are using, allowing you to stack sessions on top of each other. Like cards in a stack, it allows you to flip through the active sessions you have open, viewing only one at a time.

This is very similar to the virtual console effect that BSD has when you are not using X. At the text console, pressing <ctrl> + <alt> + <f2> (or <alt> + <f2> on FreeBSD) will switch to another terminal where you can log in. In screen, you can press <ctrl> + <a> and then press ’n’ to switch to the next screen session. But, unlike the virtual consoles where you have a set number of them, screen sessions don’t exist until you start a new one.

Typing ’screen’ at the command line will start your first screen session. Every time you type ’screen’ again, it will spawn a new screen session in the same window. For instance, I start screen and get a new command line back. Then I type screen again. It gives me another command line, exactly like the first, but the screen is clean. Pressing <ctrl> + <a> then ’n’ will bring back the first screen session I was working on. (Hold down the ctrl key and press the ’a’ key one time. Then press the ’n’ key.)

Pressing <ctrl> + <a> and then ’?’ will bring up the help screen.

This lists all the key mappings. Each of the key mappings needs to be preceded by pressing <ctrl> + <a>; otherwise screen will pass it through as a shell command.

However, none of this is really what I was looking for. I was looking for a way to reconnect toprograms that I had been disconnected from.

In the man page I found the syntax:

screen -r [[pid.]tty[.host]]

I assumed that the ’-r’ option was for recover.

I started screen on my remote workstation, loaded a program, and went for lunch. When I returned, I found that the screen was frozen and I had been disconnected. So I logged back in and checked the running processes. I noticed that I was still logged in on ttypc. So I typed

screen -r ttypc

There is a screen on:
        79448.ttypc.vnode (Attached)
There is no screen to be resumed matching ttypc.

Hmm, I thought. Maybe I need to type the whole thing. So I cut and pasted the screen name into the command.

screen -r 79448.ttypc.vnode

There is a screen on:
        79448.ttypc.vnode (Attached)
There is no screen to be resumed matching 79448.ttypc.vnode.

At this point, I got disconnected again. When I finally got reconnected, I thought I would try it without telling it which screen to get back. I had hoped it would give me a list of the screens running.

screen -r

Boom! There was my session.

After a little playing around with screen, I realized that the screen session had still been attached to an active login session, since the dialup connection hadn’t closed it yet. When the attached shell died, I was able to grab the screen session and resume work. If I had wanted to detach the screen from wherever it was running and reattach it onto my new terminal, I could have used the detach command along with the reattach command:

screen -d -r 79448
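The choice between a plain reattach and a detach-then-reattach can be sketched as a small shell function. The helper name `resume_command` is made up for this example. It assumes it is handed one line of a session listing (for instance from `screen -ls`) in the form shown above, and that a detached session is marked "(Detached)". It only prints the command, so nothing is touched until you run that command yourself.

```shell
#!/bin/sh
# Hypothetical helper: given one line of a session listing such as
#   "79448.ttypc.vnode (Attached)"
# print the command that would bring that session to the current terminal.
resume_command() {
    line="$1"
    set -- $line              # split the line on whitespace
    session="$1"              # e.g. 79448.ttypc.vnode
    pid="${session%%.*}"      # keep only the part before the first dot
    case "$line" in
        *"(Attached)"*) echo "screen -d -r $pid" ;;  # detach it elsewhere first
        *"(Detached)"*) echo "screen -r $pid" ;;     # already free, just resume
        *) echo "unrecognized listing line: $line" >&2; return 1 ;;
    esac
}

resume_command "79448.ttypc.vnode (Attached)"
```

For the session above this prints the same detach-and-reattach command shown earlier; for a session already marked detached it prints a plain reattach.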

If you use the ’w’ command to see who is doing what, screen will show that you are logged into a screen session instead of displaying where you are logged in from.

3:58PM  up 113 days, 21:23, 25 users, load averages: 1.06, 1.04, 1.04
USER     TTY  FROM              LOGIN@  IDLE WHAT
chrisc   pm   modem10-04:S.0    9:07AM     - w

The "modem10-04:S.0" listed above shows that I am dialed up from host ’modem10-04’, using screen session 0. Normally, it would have appended the domain name from which I was connected. Instead, it appended the screen session name.

I have been using the same screen session for quite some time and really like how very easy it is to disconnect from and grab again when I come back.

-Chris Coleman

[Editor’s note: I am typing this note on a terminal inside a screen session that has been active since November 30, 1999; I simply detach and reattach the session whenever I switch workstations, and never have to close my work in progress. What a great tool! --gsutter]


Software RAID for BSD: Vinum

By Greg Lehey

This is an abridged version of the introduction to Vinum on the LEMIS web site. See that site for moredetails.

Many BSD systems have storage needs which current generation disks can’t fulfill by themselves: they may want more storage, more performance or more reliability than an individual disk can provide. There are several alternative solutions to these issues, generally known by the generic term RAID (Redundant Array of Inexpensive Disks).

One solution is to use a special disk controller, called a RAID controller. This controller creates an interface to a virtual disk which shows the desired characteristics.

The problem with RAID controllers is that they are not compatible with other controllers, so special drivers are needed. Currently, only FreeBSD and NetBSD support any RAID controllers, and in each case only the DPT SmartRAID and SmartCache III and IV are supported. These are old models which are no longer in production. Drivers for newer controllers from DPT, Mylex and AMD are on their way, though.

An alternative is a ‘‘SCSI-SCSI’’ RAID controller. This kind of controller doesn’t interface to the system directly; it interfaces to the SCSI bus. This means it doesn’t need a special driver, but it also limits performance somewhat.

Vinum

The third alternative is software RAID, which performs the necessary virtualization in software. FreeBSD offers this functionality in Vinum, a device driver which implements virtual disk drives. Vinum is a volume manager implemented under FreeBSD. It was inspired by the VERITAS® volume manager and maintains many of the concepts of VERITAS®. Its key features are:

Vinum implements RAID-0 (striping), RAID-1 (mirroring) and RAID-5 (rotated block-interleaved parity). In RAID-5, a group of disks is protected against the failure of any one disk by an additional disk with block checksums of the other disks.

Drive layouts can be combined to increase robustness, including striped mirrors (so-called ‘‘RAID-10’’).

Vinum implements only those features which appear useful. Some commercial volume managers appear to have been implemented with the goal of maximizing the size of the spec sheet. Vinum does not implement ‘‘ballast’’ features such as RAID-4. It would have been trivial to do so, but the only effect would have been to further confuse an already confusing topic.

Volume managers initially emphasized reliability and performance rather than ease of use. The result is frequently down time due to misconfiguration, with consequent reluctance on the part of operational personnel to attempt to use the more unusual features of the product. Vinum attempts to provide an easier-to-use non-GUI interface.

Let’s look again at the problems we’re trying to solve:

Disks are too small

The ufs file system can theoretically span more than a petabyte (2**50 or 1,125,899,906,842,624 bytes) of storage, but no current disk drive comes close to this size. Although this problem is not as acute as it was ten years ago, there is a simple solution: the disk driver can create an abstract device which stores its data on a number of disks.

Vinum solves this problem with virtual disks, which it calls volumes, a term borrowed from VERITAS. These disks have essentially the same properties as a UNIX disk drive, though there are some minor differences. Volumes have no size limitations.

Access bottlenecks

Modern systems frequently need to access data in a highly concurrent manner. For example, wcarchive.cdrom.com maintains up to 5000 concurrent FTP sessions and transfers about 1.4 TB of data a day.

Current disk drives can transfer data sequentially at up to 30 MB/s, but this value is of little importance in an environment where many independent processes access a drive, where they may achieve only a fraction of these values. In such cases it’s more interesting to view the problem from the viewpoint of the disk subsystem: the important parameter is the load that a transfer places on the subsystem, in other words the time for which a transfer occupies the drives involved in the transfer.

In any disk transfer, the drive must first position the heads, wait for the first sector to pass under the read head, and then perform the transfer. These actions can be considered to be atomic: it doesn’t make any sense to interrupt them.

Consider a typical transfer of about 10 kB: the current generation of high-performance disks can position the heads in an average of 6 ms. The fastest drives spin at 10,000 rpm, so the average rotational latency (half a revolution) is 3 ms. At 30 MB/s, the transfer itself takes about 350 µs, almost nothing compared to the positioning time. In such a case, the effective transfer rate drops to a little over 1 MB/s and is clearly highly dependent on the transfer size.
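The arithmetic behind that figure can be checked with a few lines of Python. This is only a sketch: the function name and its parameters are made up for illustration, and it assumes 1 MB = 1000 kB.

```python
def effective_rate_mb_s(size_kb, seek_ms, rpm, media_rate_mb_s):
    """Effective rate in MB/s of one isolated transfer (1 MB = 1000 kB assumed)."""
    latency_ms = 0.5 * 60_000 / rpm                   # half a revolution
    transfer_ms = size_kb / media_rate_mb_s           # kB / (MB/s) comes out in ms
    total_ms = seek_ms + latency_ms + transfer_ms     # time the drive is occupied
    return size_kb / total_ms                         # kB/ms is the same as MB/s

# The figures from the text: 10 kB, 6 ms positioning, 10,000 rpm, 30 MB/s
rate = effective_rate_mb_s(size_kb=10, seek_ms=6, rpm=10_000, media_rate_mb_s=30)
print(f"{rate:.2f} MB/s")   # prints "1.07 MB/s"
```

Trying larger values of `size_kb` shows how strongly the result depends on the transfer size, since the positioning time stays the same.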

The traditional and obvious solution to this bottleneck is ‘‘more spindles’’: rather than using one large disk, it uses several smaller disks with the same aggregate storage space. Each disk is capable of positioning and transferring independently, so the effective throughput increases by a factor close to the number of disks used.

The exact throughput improvement is, of course, smaller than the number of disks involved: although each drive is capable of transferring in parallel, there is no way to ensure that the requests are evenly distributed across the drives. Inevitably the load on one drive will be higher than on another.

The evenness of the load on the disks is strongly dependent on the way the data is shared across the drives. In the following discussion, it’s convenient to think of the disk storage as a large number of data sectors which are addressable by number, rather like the pages in a book. The most obvious method is to divide the virtual disk into groups of consecutive sectors the size of the individual physical disks and store them in this manner, rather like taking a large book and tearing it into smaller sections. This method is called concatenation and has the advantage that the disks do not need to have any specific size relationships. It works well when the access to the virtual disk is spread evenly about its address space. When access is concentrated on a smaller area, the improvement is less marked. The following figure illustrates the sequence in which storage units are allocated in a concatenated organization.

Concatenated organization
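As a rough illustration of the concatenated mapping, the following Python sketch turns a virtual sector number into a disk and a sector on that disk. The function name and the disk sizes are invented for the example; it is not taken from the Vinum sources.

```python
def concat_locate(sector, disk_sizes):
    """Map a virtual sector to (disk index, sector on that disk).

    disk_sizes lists the size of each disk in sectors; the disks
    may all have different sizes.
    """
    for disk, size in enumerate(disk_sizes):
        if sector < size:
            return disk, sector
        sector -= size          # skip over this disk's share of the address space
    raise ValueError("sector lies beyond the end of the virtual disk")

sizes = [1000, 500, 2000]            # hypothetical disk sizes, in sectors
print(concat_locate(999, sizes))     # (0, 999): still on the first disk
print(concat_locate(1000, sizes))    # (1, 0): first sector of the second disk
```

Neighbouring sectors almost always land on the same disk, which is why accesses concentrated in a small area gain little from this layout.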

An alternative mapping is to divide the address space into smaller, even-sized components and store them sequentially on different devices. For example, the first 256 sectors may be stored on the first disk, the next 256 sectors on the next disk and so on. After filling the last disk, the process repeats until the disks are full. This mapping is called striping or RAID-0, though the latter term is somewhat misleading: it provides no redundancy. Striping requires somewhat more effort to locate the data, and it can cause additional I/O load where a transfer is spread over multiple disks, but it can also provide a more constant load across the disks. The following figure illustrates the sequence in which storage units are allocated in a striped organization.

Striped organization
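The striped mapping can be sketched in the same way. The function below is illustrative only (its name and defaults are assumptions); it uses the 256-sector stripe from the example in the text.

```python
def stripe_locate(sector, n_disks, stripe_sectors=256):
    """Map a virtual sector to (disk index, sector on that disk)."""
    stripe_no, within = divmod(sector, stripe_sectors)  # which stripe, offset inside it
    row, disk = divmod(stripe_no, n_disks)              # stripes rotate across the disks
    return disk, row * stripe_sectors + within

print(stripe_locate(255, n_disks=4))    # (0, 255): end of the first stripe
print(stripe_locate(256, n_disks=4))    # (1, 0): the next stripe starts on the next disk
print(stripe_locate(1024, n_disks=4))   # (0, 256): back on the first disk, second row
```

Here neighbouring stripes fall on different disks, so even accesses concentrated in a small area tend to be spread over all the drives.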

Vinum implements both concatenation and striping. Since it exists within the UNIX disk storage framework, it would be possible to use UNIX partitions as the building block for multi-disk plexes, but in fact this turns out to be too inflexible: UNIX disks can have only a limited number of partitions. Instead, Vinum subdivides a single UNIX partition into contiguous areas called subdisks, which it uses as building blocks for plexes.

Data integrity

The final problem with current disks is that they are unreliable. Although disk drive reliability has increased tremendously over the last few years, disk drives are still the most likely core component of a server to fail. When they do, the results can be catastrophic: replacing a failed disk drive and restoring data to it can take days.

The traditional way to approach this problem has been mirroring, keeping two copies of the data on different physical hardware. Since the advent of the RAID levels, this technique has also been called RAID level 1 or RAID-1. Any write to the volume writes to both locations; a read can be satisfied from either, so if one drive fails, the data is still available on the other drive.

Mirroring has two problems:

The price. It requires twice as much disk storage as a non-redundant solution.

The performance impact. Writes must be performed to both drives, so they take up twice the bandwidth of a non-mirrored volume. Reads do not suffer from a performance penalty: it even looks as if they are faster.

An alternative solution is parity, implemented in the RAID levels 2, 3, 4 and 5. Of these, RAID-5 is the most interesting. As implemented in Vinum, a RAID-5 plex is a variant of a striped plex which dedicates one block of each stripe to the parity of the other blocks. As required by RAID-5, the location of this parity block changes from one stripe to the next. In the following figure, the numbers in the data blocks indicate the relative block numbers.

RAID-5 organization

Compared to mirroring, RAID-5 has the advantage of requiring significantly less storage space. Read access is similar to that of striped organizations, but write access is significantly slower, approximately 25% of the read performance. If one drive fails, the array can continue to operate in degraded mode: a read from one of the remaining accessible drives continues normally, but a read from the failed drive is recalculated from the corresponding block from all the remaining drives.
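The recalculation in degraded mode can be shown with a small Python sketch. It assumes the parity block is the bytewise XOR of the data blocks in a stripe, which is the usual RAID-5 definition; the article itself does not spell out the calculation, and the block contents here are made up.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe: three data blocks plus the parity block computed from them
data = [b"BSD!", b"RAID", b"disk"]
parity = xor_blocks(data)

# The drive holding data[1] fails: rebuild its block from all remaining blocks
recovered = xor_blocks([data[0], data[2], parity])
print(recovered)    # b'RAID'
```

The same calculation serves both purposes: it produces the parity block when writing, and it reproduces any single missing block when reading in degraded mode.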

Vinum implements both mirroring and RAID-5. It implements mirroring by providing objects called plexes, each of which is a representation of the data in a volume. A volume may contain between one and eight plexes.

From an implementation viewpoint, it is not practical to represent a RAID-5 organization as a collection of plexes. We’ll look at this issue below.

The big picture

As a result of these considerations, Vinum provides a total of four kinds of abstract storage structures:

At the lowest level is the UNIX disk partition, which Vinum calls a drive. With the exception of a small area at the beginning of the drive, which is used for storing configuration and state information, the entire drive is available for data storage.

Next come subdisks, which are part of a drive. They are used to build plexes.

A plex is a copy of the data of a volume. It is built out of subdisks, which may be organized in one of three manners:

A concatenated plex uses the address space of each subdisk in turn.

A striped plex stripes the data across each subdisk. The subdisks must all have the same size, and there must be at least two subdisks to distinguish it from a concatenated plex.

Like a striped plex, a RAID-5 plex stripes the data across each subdisk. The subdisks must all have the same size, and there must be at least three subdisks, since otherwise mirroring would be more efficient.

Although a plex represents the complete data of a volume, it is possible for parts of the representation to be physically missing, either by design (by not defining a subdisk for parts of the plex) or by accident (as a result of the failure of a drive).

A volume is a collection of between one and eight plexes. Each plex represents the data in the volume, so more than one plex provides mirroring. As long as at least one plex can provide the data for the complete address range of the volume, the volume is fully functional.
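The hierarchy and its rules can be summarised in a short Python model. This is purely a reading aid: the class and method names are invented, it does not reflect Vinum’s internal data structures, and the minimum of one subdisk for a concatenated plex is an assumption.

```python
from dataclasses import dataclass, field
from typing import List

MIN_SUBDISKS = {"concat": 1, "striped": 2, "raid5": 3}   # concat minimum is assumed
MAX_PLEXES = 8

@dataclass
class Subdisk:
    drive: str          # symbolic name of the Vinum drive that holds it
    length_mb: int

@dataclass
class Plex:
    org: str            # "concat", "striped" or "raid5"
    subdisks: List[Subdisk] = field(default_factory=list)

    def check(self):
        if self.org not in MIN_SUBDISKS:
            raise ValueError(f"unknown organization {self.org!r}")
        if len(self.subdisks) < MIN_SUBDISKS[self.org]:
            raise ValueError(f"{self.org} plex has too few subdisks")
        sizes = {sd.length_mb for sd in self.subdisks}
        if self.org != "concat" and len(sizes) != 1:
            raise ValueError(f"{self.org} plex needs equal-sized subdisks")

@dataclass
class Volume:
    name: str
    plexes: List[Plex] = field(default_factory=list)

    def check(self):
        if not 1 <= len(self.plexes) <= MAX_PLEXES:
            raise ValueError("a volume holds between one and eight plexes")
        for plex in self.plexes:
            plex.check()

    def is_mirrored(self):
        return len(self.plexes) > 1      # more than one plex provides mirroring

# Two single-subdisk plexes on different drives form a mirrored volume
mirror = Volume("mirror", [Plex("concat", [Subdisk("a", 512)]),
                           Plex("concat", [Subdisk("b", 512)])])
mirror.check()
print(mirror.is_mirrored())   # True
```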

RAID-5

Conceptually, RAID-5 is used for redundancy, but in fact the implementation is a kind of striping. This poses problems for the implementation of Vinum: should it be a kind of plex or a kind of volume? In the end, the implementation issues won, and RAID-5 is a plex type. This means that there are two different ways of ensuring data redundancy: either have more than one plex in a volume, or have a single RAID-5 plex. These methods can be combined.

Which plex organization?

Vinum implements only that subset of RAID organizations which make sense in the framework of the implementation. It would have been possible to implement all RAID levels, but there was no reason to do so. Each of the chosen organizations has unique advantages:

Concatenated plexes are the most flexible: they can contain any number of subdisks, and the subdisks may be of different length. The plex may be extended by adding additional subdisks. They require less CPU time than striped or RAID-5 plexes, though the difference in CPU overhead from striped plexes is not measurable. On the other hand, they are most susceptible to hot spots, where one disk is very active and others are idle.

The greatest advantage of striped (RAID-0) plexes is that they reduce hot spots: by choosing an optimum sized stripe (empirically determined to be in the order of 256 kB), the load on the component drives can be made more even. The disadvantages of this approach are (fractionally) more complex code and restrictions on subdisks: they must be all the same size, and extending a plex by adding new subdisks is so complicated that Vinum currently does not implement it. Vinum imposes an additional, trivial restriction: a striped plex must have at least two subdisks, since otherwise it is indistinguishable from a concatenated plex.

RAID-5 plexes are effectively an extension of striped plexes. Compared to striped plexes, they offer the advantage of fault tolerance, but the disadvantages of higher storage cost and significantly higher CPU overhead, particularly for writes. The code is an order of magnitude more complex than for concatenated and striped plexes. Like striped plexes, RAID-5 plexes must have equal-sized subdisks and cannot currently be extended. Vinum enforces a minimum of three subdisks for a RAID-5 plex, since any smaller number would not make any sense.

Some examples

Vinum maintains a configuration database which describes the objects known to an individual system. Initially, the user creates the configuration database from one or more configuration files with the aid of the vinum(8) utility program. Vinum stores a copy of its configuration database on each disk slice (which Vinum calls a device) under its control. This database is updated on each state change, so that a restart accurately restores the state of each Vinum object.

The configuration file

The configuration file describes individual Vinum objects. The definition of a simple volume might be:

drive a device /dev/da3h
volume myvol
  plex org concat
    sd length 512m drive a

This file describes four Vinum objects:

The drive line describes a disk partition (drive) and its location relative to the underlying hardware. It is given the symbolic name a. This separation of the symbolic names from the device names allows disks to be moved from one location to another without confusion.

The volume line describes a volume. The only required attribute is the name, in this case myvol.

The plex line defines a plex. The only required parameter is the organization, in this case concat. No name is necessary: the system automatically generates a name from the volume name by adding the suffix .px, where x is the number of the plex in the volume. Thus this plex will be called myvol.p0.

The sd line describes a subdisk. The minimum specifications are the name of a drive on which to store it, and the length of the subdisk. As with plexes, no name is necessary: the system automatically assigns names derived from the plex name by adding the suffix .sx, where x is the number of the subdisk in the plex. Thus Vinum gives this subdisk the name myvol.p0.s0.
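The default naming scheme described above is simple enough to express in two helper functions. They are hypothetical and only mirror the .px and .sx suffix rules from the text.

```python
def plex_name(volume, plex_index):
    """Default plex name: the volume name plus the suffix .pN."""
    return f"{volume}.p{plex_index}"

def subdisk_name(volume, plex_index, sd_index):
    """Default subdisk name: the plex name plus the suffix .sN."""
    return f"{plex_name(volume, plex_index)}.s{sd_index}"

print(plex_name("myvol", 0))          # myvol.p0
print(subdisk_name("myvol", 0, 0))    # myvol.p0.s0
```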

After processing this file, vinum(8) produces the following output:

vinum -> create config1
Configuration summary

Drives: 1
Volumes: 1
Plexes: 1
Subdisks: 1

D a State: up Device /dev/da3h Avail: 2061/2573 MB (80%)

V myvol State: up Plexes: 1 Size: 512 MB

P myvol.p0 C State: up Subdisks: 1 Size: 512 MB

S myvol.p0.s0 State: up PO: 0 B Size: 512 MB

This output shows the brief listing format of vinum(8). It is represented graphically in the following figure.

A simple Vinum volume

This figure, and the ones which follow, represent a volume, which contains the plexes, which in turn contain the subdisks. In this trivial example, the volume contains one plex, and the plex contains one subdisk.

This particular volume has no specific advantage over a conventional disk partition. It contains a single plex, so it is not redundant. The plex contains a single subdisk, so there is no difference in storage allocation from a conventional disk partition. The following sections illustrate various more interesting configuration methods.

Increased resilience: mirroring

The resilience of a volume can be increased either by mirroring or by using RAID-5 plexes. When laying out a mirrored volume, it is important to ensure that the subdisks of each plex are on different drives, so that a drive failure will not take down both plexes. The following configuration mirrors a volume:

drive b device /dev/da4h
volume mirror
  plex org concat
    sd length 512m drive a
  plex org concat
    sd length 512m drive b

In this example, it was not necessary to specify a definition of drive a again, since Vinum keeps track of all objects in its configuration database. After processing this definition, the configuration looks like:

Drives: 2
Volumes: 2
Plexes: 3
Subdisks: 3

D a             State: up            Device /dev/da3h   Avail: 1549/2573 MB (60%)
D b             State: up            Device /dev/da4h   Avail: 2061/2573 MB (80%)

V myvol         State: up            Plexes: 1     Size: 512 MB
V mirror        State: up            Plexes: 2     Size: 512 MB

P myvol.p0    C State: up            Subdisks: 1   Size: 512 MB
P mirror.p0   C State: up            Subdisks: 1   Size: 512 MB
P mirror.p1   C State: initializing  Subdisks: 1   Size: 512 MB

S myvol.p0.s0   State: up            PO: 0 B       Size: 512 MB
S mirror.p0.s0  State: up            PO: 0 B       Size: 512 MB
S mirror.p1.s0  State: empty         PO: 0 B       Size: 512 MB

The following figure shows the structure graphically.

A mirrored Vinum volume

In this example, each plex contains the full 512 MB of address space. As in the previous example, each plex contains only a single subdisk.

Optimizing performance

The mirrored volume in the previous example is more resistant to failure than an unmirrored volume, but its performance is less: each write to the volume requires a write to both drives, using up a greater proportion of the total disk bandwidth. Performance considerations demand a different approach: instead of mirroring, the data is striped across as many disk drives as possible. The following configuration shows a volume with a plex striped across four disk drives:

drive c device /dev/da5h
drive d device /dev/da6h
volume striped
  plex org striped 512k
    sd length 128m drive a
    sd length 128m drive b
    sd length 128m drive c
    sd length 128m drive d

As before, it is not necessary to define the drives which are already known to Vinum. After processing this definition, the configuration looks like:

Drives: 4
Volumes: 3
Plexes: 4
Subdisks: 7

D a              State: up            Device /dev/da3h   Avail: 1421/2573 MB (55%)
D b              State: up            Device /dev/da4h   Avail: 1933/2573 MB (75%)
D c              State: up            Device /dev/da5h   Avail: 2445/2573 MB (95%)
D d              State: up            Device /dev/da6h   Avail: 2445/2573 MB (95%)

V myvol          State: up            Plexes: 1     Size: 512 MB
V mirror         State: up            Plexes: 2     Size: 512 MB
V striped        State: up            Plexes: 1     Size: 512 MB

P myvol.p0     C State: up            Subdisks: 1   Size: 512 MB
P mirror.p0    C State: up            Subdisks: 1   Size: 512 MB
P mirror.p1    C State: initializing  Subdisks: 1   Size: 512 MB
P striped.p0   S State: up            Subdisks: 4   Size: 512 MB

S myvol.p0.s0    State: up            PO: 0 B       Size: 512 MB
S mirror.p0.s0   State: up            PO: 0 B       Size: 512 MB
S mirror.p1.s0   State: empty         PO: 0 B       Size: 512 MB
S striped.p0.s0  State: up            PO: 0 B       Size: 128 MB
S striped.p0.s1  State: up            PO: 512 kB    Size: 128 MB
S striped.p0.s2  State: up            PO: 1024 kB   Size: 128 MB
S striped.p0.s3  State: up            PO: 1536 kB   Size: 128 MB

This volume is represented in the following figure. The darkness of the stripes indicates the position within the plex address space: the lightest stripes come first, the darkest last.

A striped Vinum volume

Increased resilience: RAID-5

The alternative approach to resilience is RAID-5. A RAID-5 configuration might look like:

drive e device /dev/da6h
volume raid5
  plex org raid5 512k
    sd length 128m drive a
    sd length 128m drive b
    sd length 128m drive c
    sd length 128m drive d
    sd length 128m drive e

Although this plex has five subdisks, its size is the same as the plexes in the other examples, since the equivalent of one subdisk is used to store parity information. After processing the configuration, the system configuration is:

Drives: 5
Volumes: 4
Plexes: 5
Subdisks: 12

D a              State: up            Device /dev/da3h   Avail: 1293/2573 MB (50%)
D b              State: up            Device /dev/da4h   Avail: 1805/2573 MB (70%)
D c              State: up            Device /dev/da5h   Avail: 2317/2573 MB (90%)
D d              State: up            Device /dev/da6h   Avail: 2317/2573 MB (90%)
D e              State: up            Device /dev/da6h   Avail: 2445/2573 MB (95%)

V myvol          State: up            Plexes: 1     Size: 512 MB
V mirror         State: up            Plexes: 2     Size: 512 MB
V striped        State: up            Plexes: 1     Size: 512 MB
V raid5          State: up            Plexes: 1     Size: 512 MB

P myvol.p0     C State: up            Subdisks: 1   Size: 512 MB
P mirror.p0    C State: up            Subdisks: 1   Size: 512 MB
P mirror.p1    C State: initializing  Subdisks: 1   Size: 512 MB
P striped.p0   S State: up            Subdisks: 4   Size: 512 MB
P raid5.p0     R State: up            Subdisks: 5   Size: 512 MB

S myvol.p0.s0    State: up            PO: 0 B       Size: 512 MB
S mirror.p0.s0   State: up            PO: 0 B       Size: 512 MB
S mirror.p1.s0   State: empty         PO: 0 B       Size: 512 MB
S striped.p0.s0  State: up            PO: 0 B       Size: 128 MB
S striped.p0.s1  State: up            PO: 512 kB    Size: 128 MB
S striped.p0.s2  State: up            PO: 1024 kB   Size: 128 MB
S striped.p0.s3  State: up            PO: 1536 kB   Size: 128 MB
S raid5.p0.s0    State: init          PO: 0 B       Size: 128 MB
S raid5.p0.s1    State: init          PO: 512 kB    Size: 128 MB
S raid5.p0.s2    State: init          PO: 1024 kB   Size: 128 MB
S raid5.p0.s3    State: init          PO: 1536 kB   Size: 128 MB
S raid5.p0.s4    State: init          PO: 2048 kB   Size: 128 MB

The following figure represents this volume graphically.

A RAID-5 Vinum volume

As with striped plexes, the darkness of the stripes indicates the position within the plex address space: the lightest stripes come first, the darkest last. The completely black stripes are the parity stripes.

On creation, RAID-5 plexes are in the init state: before they can be used, the parity data must be created. Vinum currently initializes RAID-5 plexes by writing binary zeros to all subdisks, though a conceivable alternative would be to rebuild the parity blocks, which would allow better recovery of crashed plexes.
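The plex sizes in the examples so far follow directly from the organization. A small sketch of the arithmetic, using a made-up helper and assuming equal-sized subdisks for the RAID-5 case:

```python
def plex_size_mb(org, subdisk_sizes_mb):
    """Usable size of a plex in MB, given the sizes of its subdisks."""
    if org == "raid5":
        # the equivalent of one subdisk is used for parity
        return (len(subdisk_sizes_mb) - 1) * subdisk_sizes_mb[0]
    return sum(subdisk_sizes_mb)      # concat and striped use all the space

print(plex_size_mb("concat", [512]))        # 512
print(plex_size_mb("striped", [128] * 4))   # 512
print(plex_size_mb("raid5", [128] * 5))     # 512
```

This is why the five-subdisk RAID-5 plex ends up the same size as the four-subdisk striped plex.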

Resilience and performance

With sufficient hardware, it is possible to build volumes which show both increased resilience and increased performance compared to standard UNIX partitions. Mirrored disks will always give better performance than RAID-5, so a typical configuration file might be:

volume raid10
  plex org striped 512k
    sd length 102480k drive a
    sd length 102480k drive b
    sd length 102480k drive c
    sd length 102480k drive d
    sd length 102480k drive e
  plex org striped 512k
    sd length 102480k drive c
    sd length 102480k drive d
    sd length 102480k drive e
    sd length 102480k drive a
    sd length 102480k drive b

The subdisks of the second plex are offset by two drives from those of the first plex: this helps ensure that writes do not go to the same subdisks even if a transfer goes over two drives.

The following figure represents the structure of this volume.

A mirrored, striped Vinum volume
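Writing such a file by hand gets tedious as the number of drives grows. The sketch below generates the text of the raid10 example, including the two-drive offset of the second plex. The function and its parameters are invented for illustration, and the indentation is only there for readability.

```python
def mirrored_stripe_config(volume, drives, sd_length, stripe="512k", offset=2):
    """Return configuration text for two striped plexes over the same drives.

    The second plex starts `offset` drives further along, so that the two
    copies of any given stripe land on different drives.
    """
    lines = [f"volume {volume}"]
    for shift in (0, offset):
        order = drives[shift:] + drives[:shift]      # rotate the drive list
        lines.append(f"  plex org striped {stripe}")
        lines.extend(f"    sd length {sd_length} drive {d}" for d in order)
    return "\n".join(lines)

config = mirrored_stripe_config("raid10", ["a", "b", "c", "d", "e"], "102480k")
print(config)
```

The drive names a to e must already be known to Vinum, as in the earlier examples.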

Object naming

As described above, Vinum assigns default names to plexes and subdisks, although they may be overridden. Overriding the default names is not recommended: experience with the VERITAS volume manager, which allows arbitrary naming of objects, has shown that this flexibility does not bring a significant advantage, and it can cause confusion.

Names may contain any non-blank character, but it is recommended to restrict them to letters, digits and the underscore character. The names of volumes, plexes and subdisks may be up to 64 characters long, and the names of drives may be up to 32 characters long.

Vinum objects are assigned device nodes in the hierarchy /dev/vinum. The configuration shown above would cause Vinum to create the following device nodes:

The control devices /dev/vinum/control and /dev/vinum/controld, which are used by vinum(8) and the Vinum dæmon respectively.

Device entries for each volume. These are the main devices used by Vinum, and the name corresponds to the name of the volume. Thus the configuration above would include the devices /dev/vinum/myvol, /dev/vinum/mirror, /dev/vinum/striped, /dev/vinum/raid5 and /dev/vinum/raid10.

A directory /dev/vinum/drive with entries for each drive. These entries are in fact symbolic links to the corresponding disk nodes.

A directory /dev/vinum/volume with entries for each volume. It contains subdirectories for each plex, which in turn contain subdirectories for their component subdisks.

The directories /dev/vinum/plex and /dev/vinum/sd, which contain device nodes for each plex and subdisk.

Although it is recommended that plexes and subdisks should not be allocated specific names, Vinum drives must be named. This makes it possible to move a drive to a different location and still recognize it automatically.

Creating file systems

Volumes appear to the system to be identical to disks, with one exception. Unlike UNIX drives, Vinum does not partition volumes, which thus do not contain a partition table. This has required modification to some disk utilities, notably newfs, which previously tried to interpret the last letter of a Vinum volume name as a partition identifier. For example, a disk drive may have a name like /dev/wd0a or /dev/da2h. These names represent the first partition (a) on the first (0) IDE disk (wd) and the eighth partition (h) on the third (2) SCSI disk (da) respectively. By contrast, a Vinum volume may be called /dev/vinum/concat.
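To see why a name like /dev/vinum/concat doesn’t fit the traditional scheme, here is a rough Python sketch of the driver/unit/partition convention. It is not how newfs(8) itself parses names; the pattern, including the partition range a to h, is an assumption based on the examples above.

```python
import re

# driver name (letters), unit number (digits), partition letter (a-h assumed)
DISK_NAME = re.compile(r"^/dev/([a-z]+)(\d+)([a-h])$")

def parse_disk_name(path):
    """Return (driver, unit, partition), or None if the name doesn't fit."""
    match = DISK_NAME.match(path)
    if match is None:
        return None
    driver, unit, partition = match.groups()
    return driver, int(unit), partition

print(parse_disk_name("/dev/wd0a"))          # ('wd', 0, 'a')
print(parse_disk_name("/dev/da2h"))          # ('da', 2, 'h')
print(parse_disk_name("/dev/vinum/concat"))  # None
```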

Normally, newfs(8) interprets the name of the disk and complains if it cannot understand it. For example:

# newfs /dev/vinum/concat
newfs: /dev/vinum/concat: can’t figure out file system partition

In order to create a file system on this volume, use the -v option to newfs(8):

# newfs -v /dev/vinum/concat

Startup

Vinum stores configuration information on the disk slices in a form similar to that in the configuration files. Once you have configured Vinum, it remembers the configuration even after rebooting. You can even move the disks to other locations (giving them different IDs), and Vinum will still find the correct drives. This is the reason why drives must have a name separate from the device name. To start Vinum automatically, place this line in your /etc/rc.conf file:

start_vinum=YES

Where to get more information

This article just scratches the surface of using Vinum. There’s a lot more information in the man pages vinum(4) and vinum(8), and also on the web site. If you’re interested in having Vinum on NetBSD or OpenBSD, please contact me.

Author maintains all copyrights on this article. Images and layout Copyright © 1998-2000 Dæmon News. All Rights Reserved.

Help, I’ve Fallen

by Gary Kline and David Leonard

With this issue the Help, I’ve Fallen column enters its third year, and of course the column, like everything else, continues to evolve. This year we’ll probably devote two or three columns to single-topic questions --this month we’re discussing Time-and-the-single-computer. The remaining columns will be dedicated to random (actually, pseudo-random :-) and frequently-asked questions. The December issue will probably have both questions and links to this year’s bunch of Q&A’s.

Dirk Myers is departing from the column, hopefully temporarily. Dirk had covered the NetBSD side of things and his tenure for the year-plus was certainly sterling. For those who don’t know, Dirk is a real writer and self-taught compu-geek. David and Gary only pretend to be writers. Here’s wishing you the best, Dirk.

If there are any NetBSD-savvy volunteers out there who are willing to share their insights and knowledge on the NetBSD side and who have a few hours a month, this column could use your input! Drop a line.

And now, without further ado, we’re into some cosmic mysteries regarding

Time

How do I set my entire system’s time zone?

How do I change the timezone used by just one program (like xclock)?

What is the difference between GMT and UTC?

Why, when I boot into Windows (or some other OS), is its clock off by some integer number of hours?

How can I keep my LAN-attached system time automatically synchronised?

How can I keep my dialin system time automatically synchronised?

How can I keep my isolated system time automatically synchronised?

How do I run a program periodically?

How do I get my program to run exactly once, later after I have logged off and gone home?

Why isn’t make working? I modified some source files, but make just refuses to make! I’ve recently changed the clock but that shouldn’t matter.

I was told that Unix clocks would be fine for the year 2000, (and they were), but that Unix clocks will stop in 2038? Why is that?

Page 33: more storage, more performance or more reliability than angwdu111.gwdg.de/misc/dnews/dnews_0002.pdf · computer shop to carry a BSD-related item, to something as big as writing an

How do I set my entire system’s time zone?

The file /etc/localtime should be a symlink pointing at one of the files under /usr/share/zoneinfo. (See also: symlinks.)

The files under the /usr/share/zoneinfo directory tree contain information about various world time zones: when daylight savings begins and ends, how far the time zone is ahead of or behind UTC, and so forth.

But which file under /usr/share/zoneinfo? Simply choose the city name that is closest (or most relevant) to you. For example, if you live in El Paso, Texas, your timezone is locally known as Mountain Time, so the file you want is /usr/share/zoneinfo/America/Denver:

    # rm /etc/localtime
    # ln -s /usr/share/zoneinfo/America/Denver /etc/localtime

Similarly, if you live in London, England, use /usr/share/zoneinfo/Europe/London, or if you live out the back of Coonabarabran use /usr/share/zoneinfo/Australia/Sydney.

Tid bit: Timezone files used to be organised by country or state name, but are now organised by city or geographic feature names. This is because too many countries tend to change their names, while city names tend to be more long lived.

How do I change the timezone used by just one program (like xclock)?

The TZ environment variable overrides the /etc/localtime setting. For example, to show the current time in Tokyo:

    $ env TZ=Asia/Tokyo xclock -digital &
    Sat Jan 15 09:34:57 JST 2000

Look in /usr/share/zoneinfo for other timezones.
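The same override works for any command, not just xclock. One catch is that a misspelled zone name is, on many systems, quietly treated as UTC rather than rejected. As a small sketch (the function name time_in is invented here, and the zoneinfo path is the one given above), the following checks that a zone file exists before using it:

```sh
#!/bin/sh
# time_in ZONE: print the current time in ZONE, or fail if ZONE has no
# file under /usr/share/zoneinfo. (Hypothetical helper, not a system tool.)
ZONEDIR=/usr/share/zoneinfo

time_in() {
    if [ ! -f "$ZONEDIR/$1" ]; then
        echo "time_in: unknown zone: $1" >&2
        return 1
    fi
    TZ="$1" date
}
```

For example, "time_in Asia/Tokyo" prints the current Tokyo time, while "time_in Asia/Tokio" reports an error instead of silently printing UTC.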

What is the difference between GMT and UTC?

The old GMT (Greenwich Mean Time) has effectively been replaced with UTC (Coordinated Universal Time).

GMT was originally mean solar time as observed at Greenwich, England. UTC is based on another time standard called TAI, which is derived from hundreds of atomic clocks in the national standards laboratories of many countries. UTC differs from TAI by a whole number of seconds: roughly every eighteen months a ’leap second’ is inserted into UTC to keep it in step with the Earth’s slightly irregular rotation.

Still, GMT and UTC seem to be used interchangeably in the literature, except by hard-core time geeks.

Before the Unix zoneinfo files, people had to specify their zone as an offset to GMT, e.g. as "GMT-10" or "GMT+7". You can still do this, but because the zoneinfo files so comprehensively cover all the world’s inhabited timezones, there is little reason to.
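The sign in these offset-style names is easy to get backwards: in a POSIX-style TZ string the number is what you add to local time to reach GMT, so zones east of Greenwich take a minus sign. A quick check, assuming a date command that supports the common %z format (GNU and BSD date both do); the helper name utc_offset is made up for this sketch:

```sh
#!/bin/sh
# utc_offset TZSTRING: print the numeric UTC offset that the C library
# derives from a POSIX-style TZ string such as GMT-10.
utc_offset() {
    TZ="$1" date +%z
}

utc_offset GMT-10    # ten hours EAST of Greenwich: prints +1000
utc_offset GMT+7     # seven hours WEST of Greenwich: prints -0700
```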


Tid bit: Because of an oversight in the POSIX standard, POSIX-compliant systems (like BSD Unix tries to be) are not allowed to take leap seconds into account! POSIX time pretends that every day has exactly 86400 seconds, so its count of seconds since the epoch falls short of the true elapsed count by the number of leap seconds inserted so far (22 as of early 2000). This explains why on some systems there is a zoneinfo directory called ’posix’, and another directory called ’right’.

Why, when I boot into Windows (or some other OS), is its clock off by some integer number of hours?! It gets annoying having to set the clock every time I switch between the two.

This happens because BSD Unix stores the UTC time in the hardware clock chip, which is a chip with a battery that keeps time while your computer is switched off. The other operating system reads the time stored in this chip (known as the RTC, or ’real time clock’ chip) but interprets it as being the ’local’ time (also known as the ’wall time’).

There are a few ways that you can get Unix to compensate for this.

Kernel TIMEZONE

In all three BSDs a kernel setting called TIMEZONE can be used to correct for this. Simply re-configure your kernel, setting the TIMEZONE option to the number of minutes to adjust the hardware clock by. (See also: How do I rebuild a new kernel?)

For example, in Brisbane, at GMT-10, which is -600 minutes offset, the following would go into the KERNEL config file:

option TIMEZONE=-600

This matches the -10 hours in GMT-10. Thereafter MS Windows’ clock will agree with Unix (as long as I don’t leave Brisbane time and ignore Windows’ Queensland daylight savings wrongness).
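The arithmetic behind the option value is just the zone’s offset in minutes. Assuming, as the -600 above implies, that the value is counted as positive west of Greenwich, a throwaway helper (its name is invented for this example) makes the sign explicit:

```sh
#!/bin/sh
# minutes_west HOURS_EAST: convert a whole-hour offset east of UTC
# (Brisbane = 10, Denver in winter = -7) to minutes west of UTC.
minutes_west() {
    echo $(( -60 * $1 ))
}

minutes_west 10     # Brisbane: prints -600
minutes_west -7     # Denver (standard time): prints 420
```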

OpenBSD

In very recent versions of OpenBSD, you can avoid recompiling a kernel by running config to edit the kernel image. In the example below, I change my kernel’s offset for the RTC to -600 minutes:

    $ config -e -o /tmp/new-bsd /bsd
    OpenBSD 2.6-current (KENNY) #4: Sun Jan 9 09:01:20 EST 2000
        d@it2:/home/d/KENNY
    Enter 'help' for information
    ukc> timezone
    timezone = 0, dst = 0
    ukc> timezone -600
    timezone = -600, dst = 0
    ukc> quit

FreeBSD

In FreeBSD systems, if you set the CMOS time to your local time, you can use /stand/sysinstall to set your system time as appropriate. Then, as root, type

# tzsetup

and your time concerns are taken care of.

In /etc/crontab, the system utility adjkerntz will adjust the local time in the CMOS clock and maintain the timezone offset for the kernel. This will ensure that your DOS clock and files are correctly timestamped.

How can I keep my LAN-attached system time automatically synchronised?

If you’re permanently connected to the internet, or company LAN, you should consider using NTP (Network Time Protocol).

ntpd (FreeBSD, OpenBSD) or xntpd (NetBSD) is a process that periodically contacts an NTP server to accurately correct for local clock drifts. It is very wise to set this up for security reasons (it makes later analysis easier, and closes some security holes based on inter-host timing). ntpd (xntpd) may be installed as part of the base system (NetBSD, FreeBSD), or as an add-on package (OpenBSD).

First you’ll need to find out from your network people which NTP server they recommend. For me it’s wasabi.it.uq.edu.au, so my /etc/ntp.conf file looks like this:

    server wasabi.it.uq.edu.au
    disable auth

Once you have the above information and have created your /etc/ntp.conf, make sure that xntpd is installed and will run on restart, and reboot. The reboot is only necessary in OpenBSD to bring the kernel out of secure mode so that the sensitive clock adjustment hooks are exposed. /etc/rc.conf (NetBSD, OpenBSD) controls whether or not ntpd (xntpd) is started at boot time. In FreeBSD the default start ups are given in /etc/defaults/rc.conf (ntpd and xntpd are turned off by default), but you should put your local changes in /etc/rc.conf or /etc/rc.conf.local.

How can I keep my dialin system time automatically synchronised?

There are two relatively easy options for synchronising time for dialin systems: ntpdate and rdate.

ntpdate

NTP can be used here as well, but there is no need to have ntpd (xntpd) running all the time. To prevent ntpd (xntpd) from running at boot time, remove /etc/ntp.conf or turn off the service in /etc/rc.conf (FreeBSD, NetBSD, OpenBSD), or /etc/rc.conf.local (FreeBSD); note the default given in /etc/defaults/rc.conf is to have both services off. Instead of running the background daemon, you should run the ntpdate program whenever you dial in. It will set the clock just once.

If you are using kernel ppp, this can be automated for PPP connections by creating a shell script called /etc/ppp/ip-up and putting in lines similar to this:

    #! /bin/sh
    /usr/local/sbin/ntpdate ntphost.myisp.net &

where you replace ’ntphost.myisp.net’ with the name of your closest NTP server (ask your ISP for this information, or look for one on http://www.ntp.org/).

If you are using the user-side ppp, this script will fit the bill:

    #!/bin/sh
    if [ `id -u` != 0 ]
    then
        echo "must exec this script as root"
        echo "id is `id -u`"
        exit 1;
    fi

    if [ -f /var/spool/lock/LCK..cuaa1 ]
    then
        ps -gax | grep -v grep | grep ppp
        if test "$?" = "0"
        then
            /usr/local/sbin/ntpdate ntphost.myisp.net
            exit 0;
        else
            #
            # we've got a bogus LCK..cuaaN file.
            #
            exit 1;
        fi
    else
        exit 1;
    fi

Don’t forget to use chmod to make /etc/ppp/ip-up executable!
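For the kernel-ppp case, installing the hook might look like the following. To keep the sketch harmless it writes to a scratch directory (the DEST variable is an assumption of this example); on a real system DEST would be /etc/ppp and the commands would be run as root.

```sh
#!/bin/sh
# Create an ip-up hook and make it executable.
DEST="${DEST:-${TMPDIR:-/tmp}/ppp-example}"
mkdir -p "$DEST"

# Quoted 'EOF' keeps the shell from expanding anything inside the script.
cat > "$DEST/ip-up" <<'EOF'
#! /bin/sh
/usr/local/sbin/ntpdate ntphost.myisp.net &
EOF

chmod 755 "$DEST/ip-up"     # rwxr-xr-x: the script can now be executed
ls -l "$DEST/ip-up"
```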

Tid bit: Some architectures, like the Apple Macintosh, lose time very badly because the periodic (60Hz) interrupt that drives the kernel is hard-wired to a low priority, and gets lost when the computer is under moderate I/O load (like network or disk transfers). As a consequence, time gets skewed very badly on the mac68k - I have seen skews of about 2 hours over one day. The bad news is that ntp gradually becomes afraid to reset the clock to the right time because the deviation gets so large and it prefers to trust the local computer. This is a good security measure but can get annoying.

rdate

An alternative to NTP is to use rdate. It’s a lot simpler and is bundled with the OpenBSD operating system. For FreeBSD rdate is in the ports tree. You only need to know the name of a computer that your ISP has that runs the RFC868 protocol (a very common protocol with Unix computers). If you ask and they have no idea what you are talking about, just try using one of their main servers, like their web server or mail host, to see if it works.
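RFC 868 servers answer with a 32-bit count of seconds since 1 January 1900, whereas Unix counts from 1 January 1970. rdate does the conversion for you; the arithmetic is shown here only to make the protocol less mysterious. The function and variable names are invented, and the shell is assumed to have arithmetic wider than 32 bits:

```sh
#!/bin/sh
# Seconds between 1900-01-01 and 1970-01-01 (70 years, 17 of them leap years).
RFC868_OFFSET=2208988800

# rfc868_to_unix N: convert an RFC 868 timestamp to Unix time.
rfc868_to_unix() {
    echo $(( $1 - RFC868_OFFSET ))
}

rfc868_to_unix 2208988800    # the Unix epoch itself: prints 0
rfc868_to_unix 3156624000    # 12 Jan 2000, 00:00 UTC: prints 947635200
```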

Once you have found a computer you can run rdate against, you can automate it, too, by writing an /etc/ppp/ip-up script:

    #! /bin/sh
    ( /usr/sbin/rdate -sa some-computer.myisp.net; /usr/sbin/rdate -p some-computer.myisp.net >>/var/log/rdate ) &

For the sake of completeness, here is a script for rdate if you are using user-ppp.

    #!/bin/sh
    RDATE=/usr/local/sbin/rdate
    if [ `id -u` != 0 ]
    then
        echo "must exec this script as root"
        echo "id is `id -u`"
        exit 1;
    fi
    if [ -f /var/spool/lock/LCK..cuaa1 ]
    then
        ps -gax | grep -v grep | grep ppp
        if test "$?" = "0"
        then
            ( $RDATE -sa athena > /dev/null 2>&1;
              $RDATE -p athena >> /var/log/rdate.LOG )
            exit 0;
        else
            exit 1;
        fi
    else
        exit 1;
    fi

Be sure to read the short man page entry on rdate if you decide to use it.

How can I keep my isolated system time automatically synchronised?

In this situation, I’m assuming that you have more than one Unix computer on a small, private, isolated network.

If you have an atomic clock, or special radio clock hardware you could run your own NTP server!But that’s not very likely...

Instead, BSD Unix normally comes with the timed server. If you run a timed server on all of your Unix computers on the one network, they will talk amongst each other and come up with an "average network time".

That is, they all average the speed of their own clocks and use that to adjust for relative skew. It works reasonably well... on small networks.

In NetBSD and OpenBSD, timed is enabled by editing /etc/rc.conf (look for "timed_flags"). In FreeBSD, timed is enabled by modifying your /etc/rc.conf or /etc/rc.conf.local. The default behavior from /etc/defaults/rc.conf is off.

How do I run a program periodically?

Use cron(8). Cron is well documented in the manual pages.
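As a brief illustration (the script paths are placeholders, not real programs), a personal crontab is a text file in which each entry has five time fields followed by a command. The sketch below writes one to a scratch file; installing it with crontab(1) is left as the commented-out last line, so running the example changes nothing.

```sh
#!/bin/sh
CRONFILE="${TMPDIR:-/tmp}/crontab.example"

# Quoted 'EOF' keeps $HOME literal; it is expanded when cron runs the job.
cat > "$CRONFILE" <<'EOF'
# minute hour day-of-month month day-of-week command
0 3 * * * $HOME/bin/nightly-backup
0,30 9-17 * * 1-5 $HOME/bin/check-queue
EOF

# crontab "$CRONFILE"     # uncomment to REPLACE your current crontab
```

The first entry runs every day at 03:00; the second runs on the hour and half hour, 9am to 5pm, Monday to Friday.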

How do I get my program to run exactly once, later after I have logged off and gone home?

The "at" command will do what you’re looking for. at can run a list of commands either contained in a file or specified through standard input (interactively, through a pipe, or via redirection).

The at command is an excellent way to time-shift any number of tasks that you can automate. This was touched upon in an earlier "Help I’ve Fallen" column and is worth revisiting here.

What the man page may not make clear is that the interactive command-line syntax is:

% at [time-shift specifications]<return>

followed by the list of commands, one per line, ended with a control-D (^D).

Example 1:

    $ at 6am tomorrow
    mail -s "March, 2000 Plans" [email protected] < March00.Plans
    echo "Plans sent to entire work group" | ~jqs/records
    lpr -PMyDesk summary
    ^D
    Job 1 will be executed using /bin/sh
    $

Example 2:

    $ at now + 25 minutes
    mail -s "uu.save tarball" [email protected] < /tmp/uu.save
    echo "uu.save sent to fubarr on `date`" | [email protected]
    ^D
    Job 2 will be executed using /bin/sh

You can also specify a file of commands with the -f flag as well as the time-shift spec and at will execute the commands consecutively for you at that time:

% at -f FileList [time-shift specification]<return>

or, it can accept commands from standard input given by the following syntax:

% echo command | at [time-shift specification]

Example 3:

echo lpr /tmp/list* | at 10pm

In all cases, the results of your run will be neatly mailed to you by the "Atrun Service" rather than sent to your screen.

NOTE

Necessary at.allow or at.deny files:

As root you must create the file /var/at/at.allow and enter your login name (and the login of anyone who wishes to use the at command). Alternatively, you can create /var/at/at.deny and enter the names of anyone who shouldn’t be allowed to use this utility.

For the fine-grain details, see the manual page:

% man at

Why isn’t make working? I modified some source files, but make just refuses to make! I’ve recently changed the clock but that shouldn’t matter.

Of course it matters. Make decides what to do by looking at the timestamps on files and the current time.

Perhaps the clock has gone backwards in time since the last time you typed ’make’? This is likely if you have been playing with the clock or compiling things in single-user mode when the clock could have been wrongly set!

Carefully examine all the files to see if they have strange dates. Look at the current time with the date command.
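One way to hunt for such files, sketched here with an invented function name: create a scratch file (which is stamped with the current time) and ask find for anything newer than it. mktemp(1) is assumed to be available.

```sh
#!/bin/sh
# future_files [DIR]: list regular files under DIR (default: .) whose
# modification time is later than "now" according to the system clock.
future_files() {
    ref=$(mktemp "${TMPDIR:-/tmp}/now.XXXXXX") || return 1
    find "${1:-.}" -type f -newer "$ref" -print
    rm -f "$ref"
}
```

Running "future_files ." in the source directory lists the culprits; once the clock is right, running touch on each of them resets its timestamp to the present.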

Tid bit: When viewing a listing with Amiga DOS, files with times ahead of the system time were printed as FUTURE!. This made it easy to see clock glitches.

I was told that Unix clocks are fine for the year 2000 (and they were), but the clocks will stop in 2038? Why is that?

Well they won’t exactly ’stop’, but dates after Jan 19 03:14:07 2038 UTC are not representable in some variants of Unix, including BSD.

That’s because, in those Unixen, time is stored as the number of seconds since the ’epoch’ (Jan 1 00:00:00 1970 GMT). If you use a signed 32-bit number (called time_t) you can express approximately 68 years’ worth of seconds on either side of 1970. Hence the year 2038, and hence the secret to Unix’s resilience to the Y2K "millennium bug".
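The numbers are easy to check with shell arithmetic. The year length used below (365.2425 days) is an approximation chosen for this sketch, and the variable names are made up:

```sh
#!/bin/sh
TIME_T_MAX=2147483647            # 2^31 - 1, the largest signed 32-bit value
SECS_PER_YEAR=31556952           # 365.2425 days * 86400 seconds

echo "whole years after 1970: $(( TIME_T_MAX / SECS_PER_YEAR ))"   # 68
echo "whole days after 1970:  $(( TIME_T_MAX / 86400 ))"           # 24855
echo "seconds into last day:  $(( TIME_T_MAX % 86400 ))"           # 11647
```

The 11647 left-over seconds are 3 hours, 14 minutes and 7 seconds, and day 24855 after 1 January 1970 is 19 January 2038.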

Other variants of Unix, including DEC Unix and some versions of Linux, have migrated from 32-bit time representations to 64-bit. This allows future time to be expressed up to about the year 292271023045, which is well after the sun has exploded, consumed the earth, and removed all traces of MS-DOS.

There are some major reasons why switching to a 64-bit time_t is hard and isn’t lightly approached. These mainly involve having to repair all the places where the size of the 32-bit time_t has been used in persistent/standard data structures such as filesystems, network protocols and so on.

By the time this article goes to publication, the Year 2000 will have come and gone, and February 28 will be imminent. Based on your observations and media hype, what do you think you’ll be doing in the year 2038?

Hey man, chill out, we’ve got 38 years!

:)

About the Authors

Gary Kline has been porting code since the late 1970’s when he helped port several V6 utilities to V7 at Cal Berkeley. When he isn’t hacking code, he’s hacking prose, or listening to jazz radio and drinking espresso.

For more than three years he has been writing the software equivalent of a mind-machine and threatening to release an alpha port to FreeBSD--RSN.


David Leonard is a PhD student in the Department of Computer Science and Electrical Engineering at the University of Queensland, Brisbane, Australia.

His area of research is QoS-adaptive component software architectures, and in his spare time he is a developer for the OpenBSD project. That said, David enjoys living the quiet life with his wife, Kylie, and cat, Mu. He especially enjoys frequenting Moreton Bay’s many fabulous places to eat. Mmmmm!


Author maintains all copyrights on this article. Images and layout Copyright © 1998-2000 Dæmon News. All Rights Reserved.

Darby Daemon

By Susannah Coleman and Seth Claybrook

Commercial BSD support

by Greg Lehey

One of the most unusual things about Open Source software is that it is free. You can legally pick up the software off the net, or pay a small charge for the convenience of having it on CD-ROM. You can take a CD-ROM and use it to install hundreds of machines. Do either of these things with commercial software (and get caught), and you’ll be in serious trouble.

This doesn’t mean that Open Source software is completely non-commercial, of course. The only way to ensure that it is completely non-commercial would be for the licence to restrict it to non-commercial use. Some commercial companies offer free copies of their software with this kind of restriction, so it’s obviously possible (at least in theory). Increasingly, UNIX vendors are offering more and more of the source code. The latest is Sun Microsystems, who are about to release the source code for Solaris 8, with strings attached, of course: you won’t be able to redistribute the code, with or without your fixes.

That’s not what we want. Not only do we allow the commercial use of Open Source software, we actively encourage it. You just need to hear the crowing of the "Yahoo! runs on our software" crowd to realise how proud we are that some of the world’s biggest Internet companies run BSD. I think we have every right to be proud of this achievement.

But how does Yahoo! support its machines? It’s one thing to get free software, but free support is a different matter altogether. You can get very good, professional quality support for free on mailing lists such as FreeBSD-questions, NetBSD-users or [email protected], but you can’t rely on getting a timely answer or even an answer at all. You can’t run a big organization with that kind of support.

Yahoo!’s answer is simple: it has its own support staff, some of whom are very knowledgeable people; they even have a FreeBSD core team member working for them. That works fine for as large an operation as Yahoo!, but clearly it isn’t the solution for most people. How does a mid-sized company handle support?

I don’t know a good general answer to that question. I suspect that mid-sized companies fall into one of the following categories:

They support themselves in the same way Yahoo! does, but on a smaller scale: they hire one or more good BSD or other UNIX people to support their systems. There’s a lower size limit for this approach, of course: when the support requirements aren’t enough to keep a single person busy, it becomes impracticable. Even with only two or three people, though, problems arise: what happens if somebody is sick or on vacation? What do they do if they come across a problem which isn’t in their area of expertise?

They use a contract UNIX support service which can also support BSD. This is quite a reasonable solution for companies which have a number of different systems. We’ll look at this more further down.

They do most of the work themselves, but they have a contract with an independent consultant, such as myself, who looks after problems they can’t solve.

It’s difficult, but important, to make a distinction between a commercial support organization (the second group above) and a consultant (the third group). In many cases, a first-rate BSD man will get together with some other people and form a company, but they’re usually small. You’ll find lists of consultants for FreeBSD, NetBSD and OpenBSD. Kudos to OpenBSD, by the way, for including links to the FreeBSD and NetBSD pages. As you can see from these pages, a number of names crop up in more than one of the pages. You’ll also note that most companies are attached to the name of a specific person.

On the one hand it’s a good idea for prospective customers to know who they’re dealing with on a personal level. UNIX, and BSD in particular, is very much people-based, and the names can give more of an idea of what kind of service you will get than a flashy company name. People who mention their names are placing their reputation on the line; other things being equal, they will try harder if their name is known.

The down side of this personal approach is what worries most mid-sized companies: what happens if the person is sick or overloaded, or decides to go on vacation? Not surprisingly, these are the same issues that concern a smaller company which has its own support staff. The small companies tend to have only one expert in a specific area, and if they’re not available, it’s possible that nobody else can help. In addition, there may be areas that they don’t cover at all. That’s not the kind of company that a company would want to trust to support a mission-critical application.

Meanwhile, in the Linux camp...

As in many other areas, the Linux world is ahead of us: the commercial spirit is alive and kicking. Most distributions offer support for their particular flavour of Linux, and possibly for others as well. For example, Red Hat has offered support for their products for some time, and they recently acquired Cygnus Solutions, a well-known Open Source support organization. Reading the hype about the merger, I have an uneasy feeling that something of the spirit of Open Source has been lost in the deal. That may be the way we’re going, of course, but I can’t say it makes me feel good. I’m not alone in this respect; Ian Darwin states in his template:

N Notes about your experience, with UNIX, with BSD, and specifically with OpenBSD. Notes continued. Not more than three or four lines: NO MARKETING BAFFLEGAB, like "Our highly-trained staff ensure the realization of ..." :-( Can contain simple HTML tags; don’t overdo it!

Fortunately, most other Linux support vendors are less blatantly commercial.

Let’s look at Yahoo! again. They don’t just run (Free)BSD, they also run Apache. Apache is not part of FreeBSD, it’s a separate package. Many people use it in conjunction with Linux instead of FreeBSD. There are a large number of other important open source packages that run on multiple platforms; this synergy is part of what the Open Source movement is about. Samba and XFree86 are some of the more prominent ones, but when you look at the over 3,000 ports in the BSD Ports Collection, you realise that this is just the tip of the iceberg. The Gartner group described this well in a recent article entitled "Debunking Open-Source Myths: Origins and Players". One "myth" it mentions is the idea that Open Source and Linux are the same thing. It estimates that only about 2% of a typical Linux distribution is really Linux; the rest consists of other Open Source products. Similar considerations apply to BSD, though the percentage might be as high as 10% for BSD, since the "standard" userland is also included.

The Gartner group publish such web pages for a short time only; after that, they charge real money for them. For this reason, I’m not quoting the original text. In another article, I referred to a different report by Gartner, but it quickly went away. If you’re quick, though, you might find "Debunking Open-Source Myths: Origins and Players" here.

A complete operating system?

If you ask a BSD person what the real difference is between BSD and Linux, you’ll get a lot of answers. One of the more popular ones is "<my>BSD is a complete Operating System; Linux is just a kernel." The same people will frequently disparage Linux and its various distributions for this reason. In fact, I don’t know of any company which uses the operating system alone without any third party software. It’s good to have a unified approach to an operating system, but it’s not the whole story. I prefer to think of the main conceptual difference between Linux and BSD as the fact that Linux has only one kernel and different implementations of userland; BSD has different implementations of both userland and kernel. That doesn’t interest the average user very much: for him, the real issue is "will my programs work?"

This has obvious implications for support organizations: it’s not enough just to support the kernel or "operating system". They must support common applications as well. Given the relative size of operating system and applications, it’s not surprising that application support is frequently more intensive than supporting the base operating system.

This fact has an obvious implication for the operating system suppliers: to a certain extent the operating system is irrelevant. It’s nice to have a choice, especially if you have problems with the application that appear to be OS related, but for the user the real issue is the application, not the OS. So why offer support for one OS only?

For these reasons I don’t think it makes much sense to support only BSD, or only a single version of BSD. A complete support operation should be able to support all the major packages, most of which run both on Linux and BSD.

Imminent demise of BSD? (take 2)

This sounds like sacrilege: I’m advocating nothing less than giving up our identity and becoming just another of a group of operating systems? Well, in a word, yes.

This is a column for advocating BSD, and I intend to carry on doing just that. But we shouldn’t forget that most people who use BSD do it because of some very specific edge that it has over other operating systems, for example, performance, portability, security or total cost. They’re not interested in intellectual things like the difference between Linux and BSD, let alone the difference between OpenBSD and BSD/OS. If they have more choice and more professional support, they’re more likely to use an appropriate BSD. And if they find that Linux is more appropriate, good luck to them. One way or another, this kind of support is going to result in all the BSDs being better known.

Author maintains all copyrights on this article. Images and layout Copyright © 1998-2000 Dæmon News. All Rights Reserved.