November 11th, 2006 by lucas
I often need to do list-like operations on files in the shell, for example:
- subtract the lines of one file from the lines of another file
- add the lines of two files, suppressing duplicates
- keep only lines which are not in both files
- keep only lines which are in both files
Such operations are easy to do with a combination of sort, uniq, cut, diff, etc. But they are such basic operations that it is a bit annoying to write a small shell script each time I need one of them.
Isn’t there a tool out there already providing all of them?
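To make that concrete, here is a rough sketch of the four operations using only sort and uniq. The function names are made up for this example (they do not come from any existing tool), and the whole thing assumes plain line-oriented text files where whole lines are compared:

```shell
# Hypothetical helpers: each takes two file names and prints the
# resulting lines sorted, one per line.

# Add lines from both files, suppressing duplicates (union).
set_union() {
    sort -u "$1" "$2"
}

# Keep only lines which are in both files (intersection).
# Each file is deduplicated first, so a line seen twice is in both.
set_intersect() {
    { sort -u "$1"; sort -u "$2"; } | sort | uniq -d
}

# Keep only lines which are not in both files (symmetric difference).
set_symdiff() {
    { sort -u "$1"; sort -u "$2"; } | sort | uniq -u
}

# Subtract the lines of $2 from the lines of $1.
# $2 is fed in twice, so none of its lines can survive uniq -u.
set_subtract() {
    { sort -u "$1"; sort -u "$2"; sort -u "$2"; } | sort | uniq -u
}
```

For example, `set_subtract a.txt b.txt` prints the lines of a.txt that do not appear in b.txt. Sorting and comparison depend on the current locale; exporting `LC_ALL=C` beforehand should give plain byte-wise behaviour.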
Also, it would be great if such operations could be achieved considering only the first n characters or words (a bit like uniq -w, or the removed uniq -W option). It would be an easy way to do :
Comments are open.
Update: many people pointed me to moreutils’ combine. It looks good, but it is not exactly what I need, so I filed wishlist bugs #398187 (combine: provide aliases for set theory operators) and #398193 (combine: allow to compare only on a subset of the lines). I won’t have time to provide patches, so if somebody wants to work on them … :-)
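In the meantime, comparing on only part of each line can be approximated with awk. This is a minimal sketch, assuming the key is the first whitespace-separated word of each line; the function name is hypothetical and this is not how combine works:

```shell
# Hypothetical helper: print the lines of $1 whose first word does not
# appear as the first word of any line in $2. Input order is preserved.
subtract_by_first_word() {
    # $2 is read first to collect the keys, then $1 is filtered.
    awk 'FILENAME == ARGV[1] { seen[$1] = 1; next }
         !($1 in seen)' "$2" "$1"
}
```

Something like `subtract_by_first_word a.txt b.txt` would then drop every line of a.txt whose first word also starts a line of b.txt, whatever follows it on the line.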