Delete lines containing repeated text

Question

I have a file containing two paths on each line. I want to remove the lines containing the same path twice.

I work on Linux and Solaris. I would like a one-liner in sed or awk or perl.

Example input file:

 /usr/lib/libgmp.so.3.3.3 /usr/lib/libgmp.so.3.3.3
 /usr/lib/libxslt.so.1.1.17 /usr/lib/libxslt.so.1.1.17
 /usr/lib/sse2/libgmp.so.3.3.3 /usr/lib/sse2/libgmp.so.3.3.3
 /usr/local/swp-tomcat-6.0/lib/commons-logging-1.1.1.jar /usr/local/swp-tomcat-6.0/lib/commons-logging-1.1.1.jar
 /usr/share/doc/libXrandr-1.1.1 /usr/share/doc/libXrandr-1.1.1
 /usr/share/doc/libxslt-1.1.17 /usr/share/doc/libxslt-1.1.17
 /etc/3.3.3.255 /etc/172.17.211.255
 /etc/1.1.1.255 /etc/172.17.213.255

Expected output:

 /etc/3.3.3.255 /etc/172.17.211.255
 /etc/1.1.1.255 /etc/172.17.213.255

Isn't grep good enough? It is the dedicated tool for printing the lines of a file that match a given condition: grep -vx '\s*\(\S\+\)\s\+\1\s*' file
– manatwork, Commented Jun 4, 2013 at 10:47

BitsOfNix · Accepted Answer · 2013-06-05 07:04:32Z

5

awk '{ if ($1 != $2 ) print $1" "$2; }' file

Replace file with the name of your file.

Or, as @manatwork pointed out in the comments, more simply:

awk '$1!=$2' file
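
A minimal sketch of the short form in action, assuming a POSIX shell; the variable names input and result are only for illustration, and the sample lines are a subset of the question's data. A bare awk pattern with no action prints every line for which the pattern is true.

```shell
# Sample data: two paths per line (subset of the question's input).
input='/usr/lib/libgmp.so.3.3.3 /usr/lib/libgmp.so.3.3.3
/usr/share/doc/libxslt-1.1.17 /usr/share/doc/libxslt-1.1.17
/etc/3.3.3.255 /etc/172.17.211.255
/etc/1.1.1.255 /etc/172.17.213.255'

# Keep only the lines whose first and second fields differ.
result=$(printf '%s\n' "$input" | awk '$1 != $2')
printf '%s\n' "$result"
```

Paths starting with / are compared as strings. If the fields could look like numbers (for example 1 and 1.0), awk would compare them numerically; appending an empty string to each side, as in $1"" != $2"", should force a string comparison.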

edited Jun 5, 2013 at 7:04

answered Jun 4, 2013 at 10:38


5

Why not just awk '$1!=$2' file?

– manatwork, Commented Jun 4, 2013 at 10:40

I was not aware it could be that simple. :/

– BitsOfNix, Commented Jun 4, 2013 at 10:43

Gilles 'SO- stop being evil' · Accepted Answer · 2013-06-05 00:25:49Z

2

You can express repeated text in grep's regexps (this is an extension to the mathematical notion of regular expression).

grep -v '^ *\([^ ][^ ]*\) *\1 *$'

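[^ ][^ ]* matches one or more non-space characters. The backslash-parentheses make this a group, and \1 means “the same text as the first group”. The ^ and $ anchors tie the pattern to the whole line, and -v inverts the match so that only the non-matching lines are printed.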
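
As a rough illustration of the backreference, here is a sketch that runs the command on a few lines from the question, assuming a POSIX shell. The variable names and the extra spacing on the second line are made up to show that the " *" parts absorb leading and repeated spaces.

```shell
# Lines with a leading space; the second one has several spaces between paths.
input=' /usr/lib/libxslt.so.1.1.17 /usr/lib/libxslt.so.1.1.17
 /usr/share/doc/libXrandr-1.1.1   /usr/share/doc/libXrandr-1.1.1
 /etc/3.3.3.255 /etc/172.17.211.255'

# -v drops every line that consists of one path followed by the same path.
result=$(printf '%s\n' "$input" | grep -v '^ *\([^ ][^ ]*\) *\1 *$')
printf '%s\n' "$result"
```

Only the /etc line should survive, with its leading space intact, since grep prints the selected lines unchanged.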

answered Jun 5, 2013 at 0:25


1

Apart from it being an alternative, can you give a reason to use grep over awk here?

– Bernhard, Commented Jun 5, 2013 at 8:35

2

@Bernhard Both are good methods. More people know grep basics than awk basics, but on the other hand the awk code is clearer, so I think there's no clear winner.

– Gilles 'SO- stop being evil', Commented Jun 5, 2013 at 8:55

potong · Accepted Answer · 2013-06-10 20:49:05Z

1

This might work for you (GNU sed):

sed -r '/(\S+)\s\1/d' file
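
One caveat worth hedging: the pattern above is not anchored, so it would also delete a line such as "/usr/lib /usr/lib64", where the second path merely starts with the first. The sketch below is an assumed variant, not potong's command: it anchors the match to the whole line and uses POSIX bracket expressions instead of the GNU-only \S and \s, which should make it usable with non-GNU sed as well. The sample paths and variable names are hypothetical.

```shell
# Hypothetical sample: only the first line repeats the same path exactly.
input='/usr/lib /usr/lib
/usr/lib /usr/lib64
/opt/a /opt/b'

# Delete lines made of one path, whitespace, and the very same path again.
result=$(printf '%s\n' "$input" |
  sed '/^[[:space:]]*\([^[:space:]]\{1,\}\)[[:space:]]\{1,\}\1[[:space:]]*$/d')
printf '%s\n' "$result"
```

With the anchors in place, the /usr/lib64 line is kept because text remains after the repeated part.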

answered Jun 10, 2013 at 20:49


Though untested, your (elegant!) solution should also work in non-GNU sed when written like this: sed '/\(\S\+\)\s\1/d' file. Just a little more escaping needed, and you'll be 100% compatible everywhere.

– syntaxerror, Commented Jun 19, 2015 at 11:18
