Replacing multiple strings with sed

I was recently chatting with a colleague about a data processing task he needed help with. After we had discussed it for a bit, it became clear that he simply needed to run a massive search-and-replace job. Of course, it would be easy to write a Perl or Python script to do this, but it would be a shame not to take advantage of the speed and convenience of the tools that already exist on the UNIX command line!

My colleague sent me a file that mapped the target strings to their replacements. Each line starts with a replacement, followed by the targets it should replace, in this format:

replacement1,target1a,target1b,target1c,...
replacement2,target2a,target2b,target2c,...
...

I saved this file as mapping.txt, and then created a simple sed script from it using the following Perl one-liner (I’ve added a line break for readability).

perl -ne 'chomp(); s/\s*$//; @v = split(/\s*,\s*/); if(@v > 0){ $k = shift(@v);
foreach $val(@v){printf("s/%s/%s/g\n", $val, $k)} }' < mapping.txt > replacements.sed

The sed script looks like this. It would be easy enough to create manually for a small number of replacements, but I definitely needed to script it for this task.

s/target1a/replacement1/g
s/target1b/replacement1/g
s/target1c/replacement1/g
...
s/target2a/replacement2/g
s/target2b/replacement2/g
s/target2c/replacement2/g
...

Finally, here is the command I used to do the replacement.

sed -f replacements.sed < data.txt > data-new.txt
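
It is worth keeping in mind that sed applies the commands in a script in order to each input line, so a later rule can rewrite text that an earlier rule produced. The toy example below illustrates the effect. The two rules and the data are hypothetical, not taken from my colleague's file.

```shell
# Work in a scratch directory so no real files are overwritten.
cd "$(mktemp -d)" || exit 1

# Hypothetical two-rule script: the replacement of the first rule
# is also the target of the second.
cat > replacements.sed <<'EOF'
s/cat/dog/g
s/dog/wolf/g
EOF

printf '%s\n' 'cat and dog' > data.txt
sed -f replacements.sed < data.txt > data-new.txt

# Prints "wolf and wolf": "cat" became "dog", then "dog" became "wolf".
cat data-new.txt
```

If a replacement in the mapping also appears as a target on another line, the order of the lines in mapping.txt therefore matters.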

That’s all!
