The Subtle Substitute Command

By Walter Alan Zintz.
[
Editor's Note:
We'll indicate a line-mode command in
the paragraph text by prefixing it with a colon
(
After the
Making Changes Within Lines

Most of you already know the
    s/previous/former/
    %s/Smith/Lee and Smith/

to make some change within the line you are on, in the first case, or change every instance in the file in the second. If you use both forms you are already ahead of the game. Too many class instructors and textbook writers try to tell you that the way to change some phrase throughout the file is to type something like:

    global/Smith/s//Lee and Smith/

This is wasteful nonsense. Both forms accomplish exactly the
same thing, but the second version involves extra typing for you
and an extra run through the file for your computer. It does not
matter that not every line in your file will contain a ``Smith''
to be changed -- the
But neither form as it stands is sure
to change every ``Smith'' in the file. The
    inure to Smith's benefit only if Smith shall

will be changed by either version of the command to read:

    inure to Lee and Smith's benefit only if Smith shall

Line mode has a built-in solution for this problem: place a lower-case letter ``g'' at the very end of the command, immediately after the last ``/'' mark, in order to make the change on every such string in each line. So typing this:

    % substitute /Smith/Lee and Smith/g

will make that text line come out as:

    inure to Lee and Smith's benefit only if Lee and Smith shall

Finer tuning of the instances can be done by a little trickery. Suppose you are working on tables, and want to change only the very last ``k37'' on each line to ``q53''. This command will do it:

    % substitute /\(..*\)k37/\1q53

If this seems surprising, remember that in a search pattern with a wild card, the editor always extends the match to the greatest length it can. In this case that means the string starting at the beginning of the line and ending with the last ``k37'' in the line.

Now you should be able to extend this example. What command would change only the second-to-last ``k37'' on each line? This requires a shrewd guess from you, so I've written a solution you can compare to your own.

A Few More Metacharacters

You probably already know that you don't always have to type the search pattern that indicates the text to be replaced by a substitution command. If you want to reuse your very last search pattern, whether that was in a substitution command or not, you can use an empty search pattern string to stand for the last search pattern, so the two commands below are actually identical.

    /Murphy/ substitute /Murphy/Thatcher/
    /Murphy/ substitute //Thatcher/

Either command will go to the next line containing ``Murphy'' and there replace the first ``Murphy'' with ``Thatcher''.

Within a substitution command's search pattern to find the text to be removed, all the normal search-pattern metacharacters apply.
So do two more that are reserved only for substitution commands: the ``\('' and ``\)'' metacharacters. These two metacharacters don't match anything themselves, so:

    substitute /^The cat and small dog show/
    substitute /^The \(cat\) and \(small dog\) show/

are exactly the same command as far as they go. But the substitution command remembers what it finds to match the text between a pair of ``\('' and ``\)'' metacharacters, for use in the replacement text. Whenever your replacement pattern contains ``\1'' the editor replaces that metacharacter with whatever matched the characters that were between the first pair of ``\('' and ``\)'' metacharacters. A ``\2'' in the replacement pattern is removed and replaced by whatever matched the characters between the second pair. And so on -- you can have up to nine pairs in one substitution command. These metacharacter pairs can even be nested in the to-be-replaced text; the one that starts first will be represented by ``\1'' and so on. So if you extend that second substitution command above to read:

    substitute /^The \(cat\) and \(small dog\) show/My \2-\1 fair

the substitution command will produce a line that begins:

    My small dog-cat fair

Or if you type:

    substitute :up \(and \)\(over \)\(the sky\):\2\1\2\1\2\3

then your command will change the first line below to read as the second line, just beneath it:

    up and over the sky
    over and over and over the sky

(I used the colon ``:'' character to separate the parts of the command, instead of the slash ``/'' character, solely to make it more readable for you. There is no danger of the editor confusing ``/'' with ``\'' or ``l'' (el) with ``1'' (one) etcetera.)

As the preceding examples show, the ``\('' and ``\)'' are not too useful with plain-text search patterns; about their only real value there is when you are searching for something that's difficult to spell correctly, and don't want to type it into the replacement pattern with possible spelling errors.
(Spelling errors aren't so dangerous in the to-be-replaced text, because they only cause the pattern match to fail.)

These metacharacters save the day, though, when you are dealing with other search metacharacters in searching for text that you will want to put back in. (Often the only way to specify the exact spot you want the replacement done is to include some neighboring text in the search pattern, and tell the editor that after the neighboring text has been taken out it is to be put back in right where it was.) Here are three examples of this kind of substitution:

    % substitute :\([Ss]ection\) \([0-9][0-9]*\):\1 No. \2:g
    /\([Ss]ection\) \([0-9][0-9]*\)/ substitute ::\1 No. \2
    % substitute ,[Aa]nswer: \([TtFf] \),ANSWER: \1,g

The first of these simply inserts ``No.'' in the middle of phrases that are section numbers, throughout the document. But the ``\('' and ``\)'' notation is essential to preserve the section number in each case, and also to leave unchanged the capitalization or noncapitalization of the first letter of ``section''.

The second command does the same thing, but only on the very next line that has a section number to change. The surprise here is that I put the ``\('' and ``\)'' in the address pattern to find the correct line. A line address doesn't use these metacharacters, of course, but it does not object to them, either. It just ignores them in its own line search, but does pass them along when a following substitution command reuses the last search pattern, as happens in this example.

The third example is useful in editing a list of answers to exercises. It stops at each answer to a true-or-false question and capitalizes the entire word ``answer''. The innovative aspect of this command is that it finds the letter ``T'' or ``t'' or ``F'' or ``f'' following the word ``answer'', so it will not change the capitalization where an answer is numerical rather than true or false.
And yet, the letter indicating whether ``true'' or ``false'' is the correct answer is not discarded as a side effect. This is primarily an example of a change that can be done more simply by using other metacharacters in the replacement pattern. Those other metacharacters are described below.

Replacement-Pattern Metacharacters

The string of characters you want to put in via a substitution command can use its own list of metacharacters. They're entirely different from the metacharacters used in searching for a pattern you want to take out of a line.
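The grouping examples discussed so far can also be tried outside the editor. What follows is only a sketch, and it swaps in a different tool: it assumes a POSIX-style `sed`, whose `s` command accepts the same ``\('', ``\)'', ``\1'' and ``\2'' notation in basic regular expressions. It is not the editor's own command line, and the three function names are hypothetical, chosen just for this illustration.

```shell
#!/bin/sh
# Assumption: sed's basic regular expressions behave like the editor's
# search patterns for the few metacharacters used here.

# Change only the LAST "k37" on each line. The greedy \(..*\) swallows
# everything up to the final "k37", and \1 puts that text back.
change_last_k37() {
  sed 's/\(..*\)k37/\1q53/'
}

# Insert "No." between "section" or "Section" and its number, keeping
# both the capitalization and the number by way of \1 and \2.
number_sections() {
  sed 's/\([Ss]ection\) \([0-9][0-9]*\)/\1 No. \2/g'
}

# Swap two remembered pieces, as in the cat-and-dog example.
swap_cat_dog() {
  sed 's/^The \(cat\) and \(small dog\) show/My \2-\1 fair/'
}

printf 'x k37 y k37 z\n' | change_last_k37   # prints: x k37 y q53 z
```

Because ``..*'' must match at least one character, a line whose only ``k37'' sits at the very start of the line is left alone by the first function, just as it would be by the editor command it imitates.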
One more thing that's important to know about reusing patterns in substitution commands. When all or part of a text-to-be-replaced pattern is going to be used as a replacement pattern, or vice versa, the command reuses the result of the original pattern, after all the metacharacters have been evaluated in the original situation. Since the metacharacters in either of those two types of patterns have no meaning in the other type, it could hardly be otherwise.

But when the reuse involves a text-to-be-replaced pattern being used a second time as a text-to-be-replaced pattern, or a replacement pattern being reused as a replacement pattern, the command brings along all the original metacharacters and evaluates them afresh in the new situation. Thus, in either of the cases in this paragraph, the second use is unlikely to produce exactly the same results as the first use did.

Now another exercise for you. Suppose that lines 237 through 289 of a file have unknown capitalization--any line could be all caps, all lower case, or some mixture. These lines are to be changed so that the first letter of every word is a capital and all other letters are lower case. To simplify this, words are separated by space characters. What is the easy way to do this with one line-mode substitution command? This exercise depends on something I did not state directly, so don't feel bad if my solution is a little simpler than yours.

Other Uses for Substitution

Despite the name, the
    537 , 542 substitute /^/WARNING: /

so that text which originally looked like this:

    The primary output line carries very high voltage,
    which does not immediately dissipate when power to
    the system is turned off. Therefore, after turning
    off the system and disconnecting the power cord,
    discharge the primary output line to ground before
    servicing the output section.

now looks like this:

    WARNING: The primary output line carries very high voltage,
    WARNING: which does not immediately dissipate when power to
    WARNING: the system is turned off. Therefore, after turning
    WARNING: off the system and disconnecting the power cord,
    WARNING: discharge the primary output line to ground before
    WARNING: servicing the output section.

It's just as practical to pull some text out of lines without putting anything back in its place. Here are two command lines that do just that:

    % substitute / uh,//g
    . , $ substitute / *$

The latter command removes superfluous spaces at the ends of lines. It doesn't need the final two slashes because there is no suffix to be distinguished from a replacement pattern.

At times you might use both the previous principles, to create
    % substitute /^$

Now here's a different kind of exercise for you. I've already given you the command, above. It obviously makes no change whatsoever in the file. So why do I run this command? You need a goodly dose of inspiration to answer this, so don't be embarrassed if you have to look at my answer to this one.

A Start on Script Writing

Already you know enough about the editor to be able to plan some fairly complex edits. Here's a short introduction to the art of writing editing scripts for this editor.

BOTTOM-UP PROGRAMMING. That's usually the best way to build a complex editor command or command script. That's a programmer's term that means putting all the little details in separately and then pulling them all together into a unified whole, rather than starting with a grand overall plan and forcing the details to fit.

For example, reader R.T. of San Francisco, California asks how to use the editor to automatically add HTML paragraph tags to each paragraph of manuscripts. This requires inserting the string ``<P>'' at the start of the first line of each paragraph, and the string ``</P>'' at the end of the last line. In these manuscripts, a completely empty line (not even a non-printing character on it) separates one paragraph from another.

This looks pretty easy. All that seems to be needed is to go to each empty line, then move up to the preceding line to insert the end-of-paragraph string and down to the following line to put in the start-of-paragraph string. But there are flaws in the obvious command to do this:

    global /^$/ - substitute :$:</P>: | ++ substitute /^/<P>/

The first problem is that when the editor goes to the empty first line that commonly begins a file, it will be unable to move up a line to do the first substitution.
No substitution is needed there, of course, but since the editor doesn't leave that empty first line, moving down two lines will put it on the second line of the following paragraph -- definitely the wrong place for a start-of-paragraph tag. There are several ways to fix this problem:
Problem number two is that there may be several empty lines
between two paragraphs, since HTML interpretation is not affected
by them. If the editor is on the first of two or more
consecutive empty lines, the command I first proposed above will
perform its second substitution on the second empty line just
below it. When it moves to the second previously-empty line, it
will run the first substitution command on the empty line it just
left. (Yes, the second line is no longer empty, but it has
already been marked by the
    at this meeting, so be sure to be there!


    At next month's meeting we'll hear from the new

and should have been edited to look like this:

    at this meeting, so be sure to be there!</P>


    <P>At next month's meeting we'll hear from the new

actually turns out like this:

    at this meeting, so be sure to be there!</P>
    </P>
    <P>
    <P>At next month's meeting we'll hear from the new

It may look as though this hazard can be defeated by modifying the number two solution to the first problem above. That is, the address for both substitutions will be a search pattern that looks for a line that already has some text on it. This works properly when the editor is on the first of two consecutive empty lines. From the second line, though, it runs its substitution commands on lines that have already been given their tags, so the sample text now looks like this:

    at this meeting, so be sure to be there!</P></P>


    <P><P>At next month's meeting we'll hear from the new

COMPLEX CONDITIONALS. What's really needed here is double-conditional execution. That is, substitution commands must run on a given line only if both of these conditions are true:
In this case, the editor can handle it. The
Either the first or third solution can be adapted to satisfy that second condition. I've used the third solution in the example commands below, because the technique is easier to follow than it would be with the first solution:

    global /^$/ + substitute /^./<P>&/
    global /^$/ - substitute :.$:&</P>:

Bottom-up techniques can be continued if there are yet other special needs to be accommodated. Reader R.T. may have headlines and subheads mixed in with the paragraphs, and may already have appropriate HTML tags at the beginnings and ends of those heads and subheads. As an exercise, how would you adapt the commands just above so they would not add a paragraph tag where any text already begins or ends with an HTML tag? Hint -- an HTML tag always begins with a ``<'' and ends with a ``>'' character. This is a very minor change, so you probably will not need to look at my solution except to confirm your own answer.

A LITTLE TRICKERY. At times a command needs to be supercharged by way of a far-out use of substitution--something perfectly legitimate, but never intended by the people who wrote this editor. Here are a few that you may find useful.

You can't make a substitution that extends over more than a
single line--not directly, that is. Any attempt to put a
``newline'' character in either the to-be-replaced pattern or the
replacement pattern of a substitution command will fail. But by
combining the
Let's suppose that you have to alter a long document so that all references to ``Acme Distributors'' are changed to ``Barrett and Sons''. A simple substitution command will make most of these changes, but it will miss those instances where ``Acme'' appears at the end of one line and the next line starts with ``Distributors''. A followup pair of substitutions, to replace ``Acme'' wherever it appears at the end of a line and to replace ``Distributors'' when it starts a line, would wreak havoc--this document also refers to ``Acme Supply Co.'' and to three other companies whose names end with ``Distributors''. But we can handle this problem nicely with the following two command strings:

    global /Acme$/ + substitute /^Distributors/and Sons
    global /^and Sons/ - substitute /Acme$/Barrett

The first command goes to every line that ends with ``Acme'' and then moves forward one line--if and only if that next line begins with ``Distributors'', it is changed to begin with ``and Sons''. The next command reverses the process to change ``Acme'' to ``Barrett'', but only in the right instances. (Note well that the second command searches for ``and Sons'', not ``Distributors'', because the first command has changed those line-split ``Acme Distributors'' to ``Acme and Sons''.)

Often it is a good strategy to start with a change you definitely don't want in order to wind up with what you do want. Suppose you are a technical writer who has just finished writing a number of lengthy photo captions full of phrases like ``the light spot in the upper righthand corner'' and ``dark areas near the lower lefthand edge''. Along comes the news that the Art Director has decided to flop all the photos: print them in mirror-image form. Suddenly, everything that was on the right is now on the left, and vice versa. Your captions will be accurate again if you change every ``lefthand'' to read ``righthand'' and vice versa.
But how to do that without wading through the whole text and making each change individually? The obvious pair of substitutions will not work:

    % substitute /lefthand/righthand/g
    % substitute /righthand/lefthand/g

The second command doesn't just change the original instances of ``righthand'' to ``lefthand''; it also reverses every change your first command made--now everything is described as being on the lefthand side. But the following three substitution commands will do the job nicely.

    % substitute /lefthand/QQQQ/g
    % substitute /righthand/lefthand/g
    % substitute /QQQQ/righthand/g

By making the first command change ``lefthand'' temporarily to ``QQQQ'' (or any other string you know will not be found in your document), you keep those changes safe from the effect of your second command. Then, after that second command has finished, the third command changes those Q strings to what you had wanted in the first place.

It can even make sense to type in things incorrectly, then change them to what you want via substitution. When I'm writing documents in plain ASCII, to be printed without any formatting, I often use a line across the page to separate major sections of the document. But where others are satisfied with just a string of hyphens, or another single character, I pretty things up with multicharacter dividers like:

    -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
    -+--+--+--+--+--+--+--+--+--+-
    *~*~*~*~*~*~*~*~*~*~*~*~*~*~*~
    [][][][][][][][][][][][][][][]

Not that I have the patience and concentration to type in page-wide lines of alternating characters, especially when I would have to constantly get on and off the shift key, too. No, I just use my repeat key to fill the line with whatever character will begin my eventual multicharacter pattern.
For those four patterns above, I would have repeat-keyed in these four lines, respectively:

    ------------------------------
    ------------------------------
    ******************************
    [[[[[[[[[[[[[[[[[[[[[[[[[[[[[[

Then I only have to run a simple repeated substitution to get the line I actually want. Here are the commands I would run on the four lines above, respectively:

    substitute /--/-=/g
    substitute /---/-+-/g
    substitute /\*\*/*\~/g
    substitute /\[\[/[]/g

SEMI-AUTOMATIC SUBSTITUTIONS. At times you'll have to make changes that are so dependent on human judgment that no substitution tricks could possibly do exactly what's wanted. In those cases there are two ways to have the editor partially automate those changes. The first is to use a variant form
of the
    % substitute /^something/something else/c
    % substitute /something/something else/gc

The editor will then display the lines where substitutions are to be made on your screen, one at a time. Each line will have ``^'' marks below the text to be removed, like this:

    something in the air. The consensus is that
    ^^^^^^^^^

and if there are two or more places on the line where the substitution could be performed, the line will be displayed on your screen two or more times, with a different potential substitution marked each time. After displaying a line on your screen, the editor will wait for you to type something ending with a carriage return. If whatever you type begins with a lower-case ``y'', the change will be made. If it begins with anything else, the text will be left as it is.

Even this substitution variant may not give you enough
control. You may need to see more than one line to verify your
judgment, or the text to be put in may vary from one place to
another. In those cases, you can use one of the qualities of the
If you are editing in screen mode, as usual, you must start by typing a capital ``Q'' to go into line mode. From line mode's colon prompt, give a command like the following (if you want to make the same substitution as in our previous examples):

    global /something/ visual

This command will bring you in turn to each line in the
file that contains the string ``something'' and put you in
screen-editing mode there. After you've looked around, and made a
substitution if you think it justified, typing a capital ``Q''
takes you out of screen-editing mode and returns you to the
There is an indirect hazard in leaving screen editing mode, though. And that brings us to the whole dismal subject of preventing loss of your changes, or of your entire file, while you are in the editor.

Don't Lose Your Files

The vi/ex editor is not strong on protecting you from the consequences of your own mistakes. In part that's just the natural result of giving you a lot of editing power. But when it comes to losing all the changes you've made in a file during a session, or even losing the original file you started with, the editor could be a lot more responsible without hamstringing your subtle edits. Still, there are ways you can comfortably protect yourself from those hazards, and many of those ways I explain below.

IN EMERGENCIES. Consider one of the editor's
safety features that can accidentally but quite easily turn into
a disaster. You may already know that when you edit with this
editor, you are working on a copy of the file, not the original.
Your changes do not affect the original unless you use the
That copy you are working on lives in a volatile place,
though, where it can easily be erased when the system crashes or
your link into the system goes down. That could cost you all the
additions and changes you'd made in that session with the editor.
Your first line of defense against this is to run the
And if you don't intend to change the original? Your edited
version is to be a new file, with the original left untouched?
Well, you can use a modified form of writing the file, by typing
That method of preserving the original file is dangerous,
though. If you forget even once to add the filename to the
write nufile write and then go back to editing and writing to the file as usual. The sane way to protect your original file from any changes is
to start your editing with a
CRASHES WILL HAPPEN. Still, a crash may catch
you by surprise, with a lot of additions and changes that you
have not written to any file. To protect against this, the
editor always attempts to save your current working copy of the
file when a crash is imminent. You can
even launch an emergency
save yourself when you face a sticky situation, such as being
unable to do a normal write because it would exceed your
filespace quota. Just type a
The preservation function puts the saved copy in a specific
directory, and it will fail if that directory does not exist or
is not writable. (The path name of that directory varies
between versions of the editor, although
    Can't open /var/preserve
    Preserve failed!

there is a problem you will have to take up with your system administrator. (To speed up that discussion, bring along the path name of the directory that couldn't be opened.) If the message reads like this:

    File preserved.

so far, so good. The next question is whether the editor has preserved an accurate copy or a pile of garbage--some editor implementations are broken in this area. To check this, recover the file you've just preserved.

RESCUING SAVED FILES. There are two ways to
recover a rescued working copy, whether it was saved in a crash
or because you used the
    vi -r novel.chap3
    vi -r

The first of these commands puts you in the editor, with the latest rescued copy of your file ``novel.chap3'' before you. The latter command doesn't put you in the editor at all; it displays a list of all files that have been emergency-saved and then returns you to the shell command line. This list is useful when, say, the system crashed while you were editing a file that you hadn't given a name. (Yes, you can enter the editor without giving a filename to edit; the editor will simply bring up an empty buffer and assume you will give it a name later.) In this case the preservation process will give the file a name, and you must know this name to recover it.

I said that the first command would bring up the
latest
rescued copy of the file you named. If the
system has been staggering for a while, there may be more than
one occasion when either you or the system caused the editor to
preserve the working copy of that file. If the latest version is
not the best copy, you can discard it and pull up the next most
recent version, without leaving the editor. Just give a
When you've recovered a file either way, look it over. If the editor version you're using has a broken preservation function, you'll only find garbage characters or a display like this:

    LOST LOST LOST LOST LOST

If that be the case, the file you preserved is hopelessly lost and you'd better have a talk with your system administrator about getting a better version of the editor. But if what you see looks like what you had, then all you have to do is write the copy you've recovered to a file somewhere--the preserved copy was erased when you used one of the recovery commands, so it can't be recovered that way again.

And that brings up the last gotcha. You may believe that any
of the three commands
The first two attempt some checking, but their checks are not
very complete. In particular, they and the
The gotcha in the case of a recovered file is that pulling a
new file into the buffer, whether normally or by recovering an
emergency-saved copy, is not an editing change. If your version
of the editor has a weak version of
A FEW MORE HAZARDS AND SOLUTIONS. Worse yet can befall you. You may accidentally lose both your own editing changes and the original file you were working from.

Suppose one of your global editing commands went astray and
trashed your working copy of the file, but didn't happen to
affect the part that is on your screen. If you then wrote the
working copy to the file, the garbage replaced your original file
contents. Oh, misery! And with any but the smallest file, it's
not practical to look over the working copy carefully before each
Or perhaps you did discover the disaster before you wrote the
working copy to the file. Seeing that undoing the errors was not
feasible, you decided either to run an
But since you were not creating an editor script here, you
probably typed the short form of your command, either
You are not lost yet, though, if you have been editing along in screen mode all the while. At any time you can type a short sequence to put the working copy back the way it was when you started this editing session. Then you only need to write the working copy to the file to expunge the trash there. Start by typing
So all the time you've been editing, the editor has been
holding a complete copy of the original file, in case you go back
to line mode and want to reverse the effect of that initial
    Qu w vi

One last hazard, which may seem childish to experienced Unix users but trips up many a refugee from single user systems. Unless you're on one of those rare Unix implementations that offers file locking, there is little to prevent another user on the system from editing the same file at the same time as you do.

You will each be editing on a separate working copy, so there will be nothing to tell you that someone else is also editing the same file. But each time you write your changed version to the file, you will wipe out whatever changes the other user has already written to file, and vice versa. The ultimate victor in this unknowing war will be the user who finishes editing last. The other user can come back an hour later and find no indication that he/she ever touched the file.

There's no real technical solution to this danger. You'll just have to coordinate carefully with other users on files that more than one of you may have occasion to edit.

Reader Feedback

One of our readers raised a significant point about this technique, important enough to deserve a reply published in this article.

Dear Walter...

In your tutorial you write that you can use the command global/XXX/visual to search for the pattern "XXX" and edit/move around (remember, Hal needed this command to edit the linted spaghetti-code...)

But there's one problem: suppose I found, after the 10th XXX of 100, that I do not want to view the remaining 90 occurrences. It works as long as I don't type 'Q'. But now I want to view/edit the code where my lint report is something like "illegal", I have to type Q and then global/illegal/visual. And now there's the problem: typing Q doesn't prompt for a new input, it moves to the 11th occurrence of "XXX".

Do you know my problem? Is there a way to stop vi moving on with the global command after typing Q?

Thanks a lot in advance!

Chris...

As Chris clearly realizes, ordinarily there is no problem with
omitting the remaining 90 stops. Each time this command puts you
into visual mode somewhere in the file, you are not restricted to
fixing one limited problem. You may move anywhere in the file,
edit whatever you like, and keep doing this as long as you
please. When you finally finish all the edits you've decided to
do, you can write the file and quit the editor in your usual
way--the suspended
But going into a second string of
The best way out of this predicament starts with writing your
changes to the file. Then, instead of typing
Now you can use the
In The Next Installment

In this tutorial to date, you've undoubtedly seen some aspects of the editor that you wish had been designed differently. The good news is that many of these features are yours to change at will--without hacking up the source code and recompiling. In Part 5 of this tutorial, I'll elucidate the editor's built-in facilities for setting up your own editing environment, and the many factors you can modify this way.