Unix-based operating systems, like Linux, are just the greatest thing out there. The creators of Unix knew full well that they could never guess every need their users would have, or everything that Unix would be used for. So they designed it for total flexibility, with a clear methodology for extending and changing the system. When you think about it, Unix is very much like the Constitution of the United States, in its adaptability to change.
Unix is entirely flexible. It comes with many basic useful commands, many of which are small components that can be tied together with other components, to solve interesting problems. For example, there's a command "wc", which stands for Word Count. It can count characters, words, or lines in a data file or data stream. By itself, not too useful. But used with other commands, really valuable!
Small But Powerful Example
Here's how you could count the number of files in your current directory:
ls | wc -l
This works because the "ls" command, when output is sent thru the pipe ("|"), prints one file per line of output; then "wc" simply counts the number of lines (-l) and prints the answer, which might be:
14
Notice something about this output - there were no extraneous words or characters, just pure data - a number. This is so important. It gives you the ability to pipe THAT result into other programs, or store the number in a data file, or perform math on it (summing it up with other numbers), or a hundred other things you may think of.
The creators of Unix have no idea what you're going to do - and the way they designed it, they didn't have to.
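To make that concrete, here is a small sketch of doing math on the number that "wc" prints. The scratch directory and the variable names are made up for illustration; only "ls", "wc" and the shell's built-in arithmetic are assumed.

```shell
#!/bin/sh
# Work in a scratch directory so the count is predictable (illustrative only).
demo_dir=$(mktemp -d)
touch "$demo_dir/a" "$demo_dir/b" "$demo_dir/c"

# Capture the bare number that wc prints. Wrapping it in $(( )) also trims
# the padding spaces that some versions of wc put in front of the number.
file_count=$(( $(ls "$demo_dir" | wc -l) ))

# Because it is just a number, the shell can do math on it directly.
doubled=$((file_count * 2))
echo "files: $file_count, doubled: $doubled"

rm -r "$demo_dir"
```

If "wc" had printed a sentence instead of a number, none of this would work without extra parsing.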
Overriding Built-in Commands
Unix has the ability for anyone to develop more commands of their own. But that's not all - it's even possible for anyone to override built-in commands with commands of their own, should they ever want to!
In this Unix Trick I'm going to show you how to write your own commands, put them in your own "bin" directory, and even override built-in commands.
Then we're going to look at how to make your commands available to other people on the system, and even other Unix systems at your site or company.
When I say "Unix", I'm really referring to Linux, BSD, SunOS, HP-UX, MacOS (which is based on Unix these days), and every other flavor of Unix out there. I know that not all of them can officially call themselves a "Unix" operating system, since that term is a registered trademark (held by The Open Group these days). But you know what I mean. The concepts in this Unix Trick apply to all of them, even if some of the specific details vary a little.
Let's Write a Command
It's very easy to write a Unix command. The easiest way is to write a script, which is just a text file written in a "shell" language (kind of like a batch file in DOS). You simply create a text file in your favorite editor containing the commands.
Here's an interesting problem we can write a command script for. Do you know about the "head" and "tail" commands in Unix? Head is used to obtain the top N lines of a text file. Tail is used to obtain the bottom N lines of a text file. A few years ago I wished I had a command that would display the middle N lines of a text file! No problem, let's create it.
Designing the "mid" Command
Let's name our program "mid", since it extracts the middle of a file or data stream. To do this, we need two pieces of data: the line number to start at, and the number of lines to display.
We want the user to pass these two numbers to us on the command-line.
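Basic idea: "mid" takes a starting line number, a number of lines, and an optional file name. If no file name is given, it reads from Standard Input, just like "head" and "tail" do. Data goes to Standard Output, error messages go to Standard Error, and the exit status reports success or failure.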
These requirements are important, because we don't know how our "users" (us, in the future) are going to try to use the command. We want it to feel like other Unix commands, so we wrote that into our requirements.
Writing the "mid" Command
Now we're ready to create our script. Go to your home directory (type "cd"). Now type "vi mid". This should put us in the VI editor. You can use Emacs, or any other text editor you are familiar with. You could even write it in Notepad, I suppose, then upload it to your Unix server (but that sounds like a lot of work).
Into this file, add the following lines:
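#!/bin/sh
# mid: print numlines lines of a file or data stream, starting at line startlinenum
if test $# -lt 2; then
    echo "$0: insufficient arguments on the command line." >&2
    echo "usage: $0 startlinenum numlines [filename]" >&2
    exit 1
fi
# The -n forms are used because some newer systems reject "tail +N" and "head -N".
# $3 is left unquoted on purpose: with no file name it vanishes and tail reads stdin.
tail -n +$1 $3 | head -n $2
exit $?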
Believe it or not, that's all there is to it! I put a lot of sneaky things in this brief script. If you don't care about the inner workings, you can skip to the next section below. Go ahead, I won't mind.
The first line, #!/bin/sh, should always be there. It tells the system what language this script is written in. That matters, because there are dozens of languages available on Unix systems today! You can write scripts in Bourne Shell, C Shell, Perl, Python, Ruby, and many others. You can even write your own scripting language if you want to, and let people write scripts in that language! Oh yeah - this is Unix, baby.
So that first line tells Unix to launch the "/bin/sh" interpreter to run this script.
The next thing we do is test the arguments passed in by the user. If they typed fewer than 2 arguments, we reject them immediately - but we tell them how to use this script. This is so important! Don't leave your users hanging. If they screwed up, it's no big deal. Gently help them figure out what they did wrong.
The $0 variable stores our command name (and sometimes the path to it). This way we can rename "mid" later to a different name, without having to edit the script.
The >&2 part redirects the output of the echo commands to the Standard Error channel. True output (data) should go to Standard Output only. Errors and Warnings should go to Standard Error only. This is important to understand, because think about when you're piping the output into a file or another command: you want the data going thru the pipe, but you want to see any error messages on your screen! The worst thing in the world is to have a command fail, and have the error message sent thru the pipe as data. You're corrupting the data stream if you do that, and confusing the user, because they will discover the problem somewhere downstream - and may not have any idea where to start looking for the problem!
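Here is a short sketch of the two channels in action. The "report" function is a made-up helper for this example, not part of "mid".

```shell
#!/bin/sh
# report: hypothetical helper that emits one line of data and one warning.
report() {
    echo "42"                            # data goes to Standard Output
    echo "report: low disk space" >&2    # diagnostics go to Standard Error
}

# Command substitution (like a pipe) only picks up Standard Output,
# so the warning still lands on the screen and the data stays clean.
data=$(report)

# Swapping the channels shows that the warning really travels separately.
warning=$(report 2>&1 >/dev/null)

echo "data=$data"
echo "warning=$warning"
```

If the warning had been written with a plain echo, it would have ended up inside "data" and corrupted it.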
The "exit 1" part makes the script stop running of course, but also returns an error number, called the Exit Status. This is important too. Future scripts that use your script as a building block will want to know if you generated an error or not, by checking the Exit Status. In Unix, an exit status of 0 means success, anything else is failure. Always return a sensible exit status.
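As a sketch of how a calling script might check the exit status, consider the following. The "is_positive" function is a hypothetical building block invented for this example.

```shell
#!/bin/sh
# is_positive: hypothetical helper that succeeds (status 0) only for numbers above zero.
is_positive() {
    [ "$1" -gt 0 ]    # a function returns the status of its last command
}

# "if" runs the command and branches on its exit status, not on its output.
if is_positive 5; then
    first="success"
else
    first="failure"
fi

# The status can also be saved for later; $? holds the status of the command just run.
second_status=0
is_positive -3 || second_status=$?

echo "first=$first second_status=$second_status"
```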
The tail and head line is the meat of the program. We were able to do all the work in 1 line of code!
The Tail command can work 2 ways, here - either the filename was specified as the 3rd argument to the script (in which case it will see it as $3), or there was no 3rd argument, so $3 will be empty and simply disappear from the command line - causing Tail to read from Standard Input (read the man page: "man tail"). It's important to know all the ways your building-blocks like Head and Tail can be used. This gives you power, without making you waste your time writing a lot of code.
The Head command is pretty standard; whatever data we send to it, it will only display the first so-many lines of text, and then exit. We don't redirect the output of head - it will go to standard output, allowing the user of our script to redirect it wherever they want to. If they don't redirect it or pipe it, it's going onto the screen.
When both Tail and Head are finished (exit), our script continues on.
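To see the tail-then-head idea on its own, here is a small sketch that runs it on ten made-up lines. It uses the "-n" form of the options, and the sample data is purely illustrative.

```shell
#!/bin/sh
# Ten numbered sample lines stand in for a real file (illustrative data only).
sample=$(printf 'line%s\n' 1 2 3 4 5 6 7 8 9 10)

# tail -n +4 starts its output AT line 4; head -n 3 keeps the first 3 of those.
middle=$(printf '%s\n' "$sample" | tail -n +4 | head -n 3)

printf '%s\n' "$middle"
```

This prints line4, line5 and line6: three lines, starting at the fourth.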
The "exit $?" on the last line is powerful, too. We could have just done "exit 0", to mean Success. But how do we know that we were successful? We would have to check the exit status of the tail/head pipeline, to see. (The exit status of a pipeline is the exit status of its last command, which is head here.) If it exited with a status of 0, then we can exit with 0. If it exited with non-0 to mean error, then we should exit non-0 to mean error. So, why not just pass that exit status along to whoever ran us? It's sheer genius, I tell you. Too bad I didn't invent it myself. :) Shell scripts have been using this trick for over a decade.
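One detail worth seeing for yourself is which command in a pipeline sets $?. This sketch assumes a plain Bourne-style shell with default settings (no "pipefail" option turned on); the variable names are only for illustration.

```shell
#!/bin/sh
# In a pipeline, $? reflects the LAST command only.

# "false" fails, but "true" is last, so the pipeline reports success.
status_last_ok=0
false | true || status_last_ok=$?

# "true" succeeds, but "false" is last, so the pipeline reports failure.
status_last_bad=0
true | false || status_last_bad=$?

echo "false|true gives $status_last_ok, true|false gives $status_last_bad"
```

So in "mid", the status we pass along is really the one from head.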
Testing Our "mid" Command
Can we try out our script now? No, because it's not executable yet. You have to turn on the "Execute" permission setting on the file, so that the operating system will allow you to run it as a program. That's because data files should never be executed as programs! Data should only be read and written, by real programs. Running something as a program that's not meant to be a program can be very dangerous. That's why the execute permission bit is turned off, by default, on newly created files.
It's easy to turn on the Execute permission bit:
chmod +x mid
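If you want to watch the permission bit make a difference, here is a sketch using a throwaway script. The scratch directory and the "hello" script are invented for the example.

```shell
#!/bin/sh
# Create a throwaway script in a scratch directory (paths are illustrative).
work_dir=$(mktemp -d)
printf '#!/bin/sh\necho hello\n' > "$work_dir/hello"

# A freshly created file does not have the execute bit turned on.
if [ -x "$work_dir/hello" ]; then
    before="executable"
else
    before="not executable"
fi

chmod +x "$work_dir/hello"       # turn on the execute permission
after=$("$work_dir/hello")       # now the system will run it as a command

echo "$before -> $after"
rm -r "$work_dir"
```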
One other thing - we probably cannot type "mid" to run the mid command, because it's not in our Command Path yet. We'll worry about that later. For testing purposes, we'll specify the exact path to this script. Since it's in the current directory, we'll add "./" to the front of the command when we run it. Try it now:
./mid
It should give you the warning message about not-enough-arguments on the command line. It's important to test the failure modes, as well as the success of any program you write! Now try it with proper arguments:
./mid 5 5 somefile
You have to use a real file name in place of "somefile". One suggestion - for a brand new script you haven't tested before, there's always the chance that it will destroy the file it operates on! So that you don't suffer a loss, I recommend copying a sample file and running "mid" on that copy:
cp /etc/hosts testfile
(you can pick any file you want; this file happens to have over 30 lines of data on my system, so I'm using it.)
./mid 5 5 testfile
And voila! We see 5 lines of data, that begin with the 5th line of the file. But remember, there's 1 other syntax to check - the data-piped-in-on-stdin syntax. Try that now:
./mid 5 5 < testfile
Another syntax could be:
cat testfile | ./mid 5 5
(however, from your script's viewpoint, those are both reading from Standard Input; so if one works, they both will work).
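If you would rather have the computer compare the two syntaxes for you, a check along these lines would do it. The "mid_fn" function is only a stand-in for the "mid" script so that the sketch is self-contained, and the test file is generated on the spot.

```shell
#!/bin/sh
# mid_fn: a function stand-in for the "mid" script (same tail | head idea).
mid_fn() {
    tail -n +"$1" $3 | head -n "$2"    # $3 unquoted: no file name means stdin
}

# Eight one-letter lines make the expected answer easy to see.
test_file=$(mktemp)
printf '%s\n' a b c d e f g h > "$test_file"

from_file=$(mid_fn 3 2 "$test_file")       # file name as the third argument
from_stdin=$(mid_fn 3 2 < "$test_file")    # data arriving on Standard Input
rm "$test_file"

if [ "$from_file" = "$from_stdin" ]; then
    echo "both syntaxes agree"
fi
```

Both calls should return the lines c and d.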
Using the Command from Any Directory
The problem now is, this command only works when we're in the current directory. We need to put it in a well known "bin" directory that's in our path, so we can run it from anywhere by just typing "mid".
Why don't we create our own "bin" directory to house commands we write in the future? Type this:
mkdir -p $HOME/bin
That created a directory just off of your home directory. (The -p option keeps mkdir from complaining if the directory already existed.) Now move the script into it:
mv mid $HOME/bin
Now we have to tell your PATH environment variable to look in this new directory when searching for Unix commands, from now on. The syntax for it differs based on what shell you're using in Unix. You can tell which shell you're using by running "echo $SHELL".
For C Shell users (csh, tcsh):
set path=($HOME/bin $path)
For Bourne Shell and related users (sh, bash, ksh):
PATH="$HOME/bin:$PATH"
export PATH
Note that this is only good for your current shell session (your current terminal window), and it will go away when you log out of Unix! To make this change permanent, you'll need to edit one of your hidden dot-files in your $HOME directory. They are loaded each time you log in.
For C Shell users, edit (or create) your .cshrc or .tcshrc file. For Bourne Shell users, edit (or create) your .profile or .bash_profile file or .bashrc.
These file names have a dot on the front, making them hidden (you can't see them with regular "ls" command). You have to use "ls -a" to see hidden files. But you don't have to see them to use them - just try editing one with Vi.
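For Bourne Shell users, one way to write the permanent version is sketched below. The "case" guard is my own addition, not something the shell requires: it keeps $HOME/bin from being added a second time if the file is read more than once.

```shell
# Sketch for ~/.profile (Bourne-style shells): add $HOME/bin to PATH only once.
case ":$PATH:" in
    *":$HOME/bin:"*)
        ;;                            # already in the path, do nothing
    *)
        PATH="$HOME/bin:$PATH"        # otherwise put it at the front
        export PATH
        ;;
esac
```

Putting your directory at the front is what later allows your commands to override system commands of the same name.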
Testing Your Personal Command Directory
Now that your path is set up, try running the command this way:
mid
You should see the Usage error message.
If it says "command not found" and you're a csh/tcsh user, run the "rehash" command and try again (bash users can run "hash -r"). If it still doesn't work, check your work on the steps above, and make sure "mid" is actually sitting in your $HOME/bin directory.
Now you're all set up to put more commands into your personal command "bin" directory.
Overriding Existing Unix Commands
What would happen if you wrote a script and named it, oh, say, "cat"? There's already a command named "cat" that is used all the time in Unix. If you wrote something called "cat" and put it in your bin directory, you would have overridden the operating system "cat" command with your own! Now, this can be very powerful, but also quite dangerous. You can "break" access to system commands this way. Well, yes, you can always get to the system commands if you know where they are. For example, for the "cat" command, you can always just type the full path to it:
/bin/cat
How did I know this? I used the "which" command:
which cat
The Which command looks through your path one directory at a time, and returns the first occurrence of the command you specified, since that's the one your shell will run if you type the command.
Most system commands live in /bin or /usr/bin, or sometimes /usr/ucb. Not all of them, but most of them. Good to remember if you ever get stuck like this.
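Inside a Bourne Shell script, "command -v" is a portable alternative to "which" for asking what a name resolves to. A minimal sketch follows; the variable names are just for illustration.

```shell
#!/bin/sh
# command -v asks the shell itself what it would run for a given name.
cat_path=$(command -v cat)
echo "cat resolves to: $cat_path"

# Running a command by its full path bypasses the path search entirely,
# so a same-named script earlier in your path cannot get in the way.
greeting=$(echo "hello" | "$cat_path")
echo "$greeting"
```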
A Warning About "test"
One mistake people make frequently is to write a program, and name it "test". Don't do that. Name it something else, "t", for example (like I do). Why? Because there's already a system command named "test", which is a vital part of the Bourne Shell! Nearly every Bourne Shell script uses "test" - and will fail dramatically if your "test" doesn't support every single option that the system one has! (In most modern shells "test" is also built into the shell itself, so typing "test" quietly runs the built-in one instead of your program, which is confusing in a different way.)
You don't want to break the "test" command by writing your own and putting it in your path, believe me! (Do I sound like I might be speaking from experience? Naw... :)
Improving the "cat" Command
Let's say you wanted a "cat" command that could also display the output in the color of your choice. You want to add a "--color" option, with possible arguments "red" and "black", with a default of black. The thing is, you want your command to support all the current options that "cat" supports - but you don't want to have to completely rewrite all those options! Someone else wrote cat, and debugged it. You might as well harness all that work.
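The basic steps your "cat" script would take are: look through the command-line arguments for a "--color" option and remember the color; remove that option from the argument list; switch the terminal to that color; run the real "cat" (by its full path) with all the remaining arguments; and finally set the color back to normal.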
Do you see what we've done here? We isolated just the parts that we want to add, and added them. We leave all the programming genius of the "cat" command to the author of "cat". Why should we waste our time reinventing the wheel? ...and re-debugging it, too? It's not worth it.
We can now put our "cat" command in our bin directory, and test it out. If we screwed something up, we can always move it back out of there. You may have to type the "rehash" command for the shell to notice that you added it (or removed it).
We'll leave the programming details of this assignment for another time.
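Still, to make the idea concrete, here is a rough sketch of what such a wrapper could look like. Several things here are my assumptions and not a finished design: the option is written as "--color=red" or "--color=black", the colors are produced with ANSI escape codes, the real command is assumed to live at /bin/cat, and it is written as a function named "color_cat" so you can try it without overriding anything.

```shell
#!/bin/sh
# color_cat: hypothetical sketch of the wrapper; the real cat does all the work.
real_cat=/bin/cat    # assumed location; check yours with "which cat"

color_cat() {
    color=black                                   # default color
    for arg in "$@"; do
        shift                                     # drop this argument from the front
        case "$arg" in
            --color=*) color=${arg#--color=} ;;   # our own option: remember it
            *) set -- "$@" "$arg" ;;              # anything else: put it back
        esac
    done

    case "$color" in
        red)   start=$(printf '\033[31m') ;;      # ANSI escape for red text
        black) start=$(printf '\033[30m') ;;      # ANSI escape for black text
        *) echo "color_cat: unknown color: $color" >&2; return 1 ;;
    esac

    printf '%s' "$start"
    "$real_cat" "$@"                              # every other option goes to the real cat
    status=$?
    printf '\033[0m'                              # set the terminal color back to normal
    return $status
}
```

You could try it with something like "color_cat --color=red -n /etc/hosts". Note that the "-n" is never examined by our code; it goes straight through to the real "cat".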
And now, Congratulations - you just improved your Unix world!
Sharing Your Command with Everyone on the System
After you've tested and debugged your script as much as possible, you can unleash it on your computer system as a whole. Hopefully your system administrator has set up the system to work as follows:
There is a central "bin" directory that is in everyone's path, such as
/usr/local/bin
Any programs placed in there will become immediately available to all users of the Unix computer.
If this is not the case, and you're the system administrator, I highly recommend setting up everyone's accounts this way. You can edit the "skeleton" startup files, too, so new user accounts will automatically have /usr/local/bin in their path environment variable. The skeleton directory is /etc/skel, on many Linux systems.
/usr/local/bin is often not world-writable, and for good reason. You have to talk your system administrator into putting your script in there. If you have data files that go along with it, often those go into /usr/local/lib. If you have written man-pages, they probably go into /usr/local/man. Misc documentation, /usr/local/doc or docs. Source code too, sometimes, /usr/local/src. Look in your /usr/local directory for more detail, or ask your sysadmin.
Sharing Your Command with Other Systems
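Some Unix systems are set up to have a common software area, across many similar systems within the department or company. There are two common ways of doing this: NFS sharing, where every system mounts the shared directory from a central file server over the network, and syncing, where the files are copied regularly from a master computer to a local directory on each system.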
Each method has its good and bad points. NFS sharing is a little old fashioned nowadays, since it's slower to read files over the network than from a local directory, and since it causes difficulties for every other computer when the NFS server goes down.
The Syncing method is better in a number of ways. You use "rsync" or "unison" to copy the files regularly from a master computer to all other computers. Syncing is better than copying, because only the files that have changed are copied. If nothing changed, nothing is copied - syncing is very quick and efficient. And, if the master host goes down, it doesn't affect the other systems. They just don't get sync'd until the master is back up again.
With syncing, everything still operates at full speed - it's just a regular directory full of files, on each local system - no network delays.
Syncing is sometimes called mirroring.
Open Source Contributions
When you write useful utilities like this, why not share them with the world? There are many reasons to open your useful commands up to the public, as Open Source.
If you developed your script at work, on work computers, you should get your boss's permission first before making any of it available to the public, outside your company. Some companies are very strict about this; you don't want to get into trouble.
A very large tech company I worked for a while back had a specific policy on this. You had to get permission from a central department, which could take weeks. Then, if approved, you could only host it on the company's open-source external web server computer.
In Open Source, there is a saying: "Given enough eyeballs, all bugs are shallow". In other words, if only you and a few other people use your program, it can be real hard to find and fix the last bug or two. But with enough people on the Internet looking at your work, anything can be solved.
Open Source is a great way to contribute to the world, and make a name for yourself on the Internet. I've written many utilities myself and contributed them, such as Dlint and Domtools. Some can be found in major Linux distributions like Debian, today.
Unix is Great
Unix allows you to change the very world around you to suit your needs, tastes and desires. Given a rich set of commands, you can build newer and greater commands using the old ones as building blocks, enabling you to accomplish more work with less effort.
This is one of the things that makes Unix so great.
Copyright © 2006 Fastech Learning LLC, all rights reserved.