Piping hot commands
Submitted for translation by MaxPv 27.10.2009
Pipes let programs work together by connecting the output of one program to the input of another. You build a pipe with the vertical bar (|), which serves as the pipe symbol.
Say you help your eccentric Aunt Hortense manage her private book collection. You have a file named books containing a list of her holdings, one per line, in the format 'author:title', something like this:
$ cat books
Carroll, Lewis:Through the Looking-Glass
Shakespeare, William:Hamlet
Bartlett, John:Familiar Quotations
Mill, John Stuart:On Nature
London, Jack:John Barleycorn
Bunyan, John:Pilgrim's Progress, The
Defoe, Daniel:Robinson Crusoe
Mill, John Stuart:System of Logic, A
Milton, John:Paradise Lost
Johnson, Samuel:Lives of the Poets
Shakespeare, William:Julius Caesar
Mill, John Stuart:On Liberty
Bunyan, John:Saved by Grace
This is somewhat untidy, as the entries are in no particular order. But we can use the sort command to straighten that out:
$ sort books
Bartlett, John:Familiar Quotations
Bunyan, John:Pilgrim's Progress, The
Bunyan, John:Saved by Grace
Carroll, Lewis:Through the Looking-Glass
Defoe, Daniel:Robinson Crusoe
Johnson, Samuel:Lives of the Poets
London, Jack:John Barleycorn
Mill, John Stuart:On Liberty
Mill, John Stuart:On Nature
Mill, John Stuart:System of Logic, A
Milton, John:Paradise Lost
Shakespeare, William:Hamlet
Shakespeare, William:Julius Caesar
Ah, now you have a list nicely sorted by author. How about getting a list of just the authors, without titles? You can do that with the cut command:
$ cut -d: -f1 books
Carroll, Lewis
Shakespeare, William
Bartlett, John
Mill, John Stuart
London, Jack
Bunyan, John
Defoe, Daniel
Mill, John Stuart
Milton, John
Johnson, Samuel
Shakespeare, William
Mill, John Stuart
Bunyan, John
A little explanation here. The -d option sets the colon as the delimiter (separator). This tells cut to break up each line wherever the delimiter appears; each separate part of the line is called a field. In our format, the author's name is the first field, so we pass 1 to the -f option to tell cut that we want to see just that field.
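To make the field numbering concrete, here is a minimal sketch that runs cut on a single record. The scratch file and the variable names sample, author and title are hypothetical, chosen only for this illustration.

```shell
# Hypothetical scratch file holding one record in the 'author:title' format.
sample=$(mktemp)
printf '%s\n' 'Mill, John Stuart:On Liberty' > "$sample"

# Field 1 is the text before the colon, field 2 is the text after it.
author=$(cut -d: -f1 "$sample")
title=$(cut -d: -f2 "$sample")

echo "author: $author"
echo "title:  $title"

# Clean up the scratch file.
rm -f "$sample"
```

Note that the comma inside the author's name is left alone, because only the colon was named as the delimiter.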
But you'll notice the list is unsorted again. Pipelines to the rescue!
$ sort books | cut -d: -f1
Bartlett, John
Bunyan, John
Bunyan, John
Carroll, Lewis
Defoe, Daniel
Johnson, Samuel
London, Jack
Mill, John Stuart
Mill, John Stuart
Mill, John Stuart
Milton, John
Shakespeare, William
Shakespeare, William
Voilà! You've taken the alphabetized list, which is the output of the sort command, and fed it as input to the cut command. Notice that cut is not given a filename here: with no filename, it reads the text piped out of the sort command.
Pipes are just that simple: text flows down the pipe from one command to the next.
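One way to picture what the pipe does is to compare it with the same two steps joined by an intermediate file. The sketch below assumes a shortened, hypothetical copy of the books file in a scratch directory; the names workdir, via_file and via_pipe are made up for the example.

```shell
# Scratch directory with a shortened, hypothetical copy of the books file.
workdir=$(mktemp -d)
cat > "$workdir/books" <<'EOF'
Shakespeare, William:Hamlet
Bartlett, John:Familiar Quotations
Milton, John:Paradise Lost
EOF

# Two separate steps, with the sorted list saved to a file in between.
sort "$workdir/books" > "$workdir/sorted"
via_file=$(cut -d: -f1 "$workdir/sorted")

# The same work as one pipeline, with no file in between.
via_pipe=$(sort "$workdir/books" | cut -d: -f1)

echo "$via_pipe"

# Clean up the scratch directory.
rm -rf "$workdir"
```

Both variables should end up holding the same three author names; the pipeline simply skips the temporary file.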
What if you wanted a sorted list of titles instead? Since the title is the second field, let's try using -f2 with the cut command instead of -f1:
$ sort books | cut -d: -f2
Familiar Quotations
Pilgrim's Progress, The
Saved by Grace
Through the Looking-Glass
Robinson Crusoe
Lives of the Poets
John Barleycorn
On Liberty
On Nature
System of Logic, A
Paradise Lost
Hamlet
Julius Caesar
Oops. What happened? A pipeline runs left to right, so that is how you need to read it. In this case, we sorted the file before extracting the titles, so sort dutifully ordered the whole lines, each of which begins with the author. To get the titles in the proper order, you need to do the sort after extracting them:
$ cut -d: -f2 books | sort
Familiar Quotations
Hamlet
John Barleycorn
Julius Caesar
Lives of the Poets
On Liberty
On Nature
Paradise Lost
Pilgrim's Progress, The
Robinson Crusoe
Saved by Grace
System of Logic, A
Through the Looking-Glass
Much better. Now this is all very nice, but you may be thinking you could have done these things with a spreadsheet. For simpler tasks, this is probably true. But suppose that Aunt Hortense is in the habit of asking odd questions about her collection. For example, she wants to know how many books she has by each author named John. A spreadsheet or other graphical program may have difficulty handling a request that its authors did not anticipate. But the shell offers us many small, simple commands that can be combined in unforeseen ways to accomplish a complex task.
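As a small sketch of combining simple commands, the pipeline below lists each author only once. It relies on the standard uniq command, which the text above does not introduce, and on a shortened, hypothetical copy of the books file; workdir and authors are names invented for the example.

```shell
# Scratch directory with a shortened, hypothetical copy of the books file.
workdir=$(mktemp -d)
cat > "$workdir/books" <<'EOF'
Mill, John Stuart:On Nature
Bunyan, John:Pilgrim's Progress, The
Mill, John Stuart:On Liberty
Bunyan, John:Saved by Grace
EOF

# uniq only collapses duplicate lines that are adjacent,
# so sort has to come before it in the pipeline.
authors=$(cut -d: -f1 "$workdir/books" | sort | uniq)

echo "$authors"

# Clean up the scratch directory.
rm -rf "$workdir"
```

Here the order of the stages matters again: without the sort, the repeated names would not sit next to each other and uniq would leave them in place.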
To find a particular string in a line of text, use the grep command. Now remember that when you combine commands, they need to go in the proper order. You can't run grep against the file first, because it will match the title 'John Barleycorn' in addition to authors named John. So add it to the end of the pipeline:
$ cut -d: -f1 books | sort | grep "John"
Bartlett, John
Bunyan, John
Bunyan, John
