edited Oct 13, 2015 at 12:45

These answers don't require the input to be sorted:

Store the count and the last line seen for each key in arrays. This can require a lot of memory for large files, and it requires GNU awk (for PROCINFO["sorted_in"]).

gawk '
    {count[$1]++; line[$1]=$0}
    END {
        PROCINFO["sorted_in"]="@val_str_asc"
        for (key in line)
            if (count[key] == 1)
                print line[key]
    }
' file

Scan the file twice: first to count each key, then to print the lines whose key has a count of 1.

awk 'NR == FNR {count[$1]++; next} count[$1]==1' file file
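As a rough illustration of the two-pass idiom, here is a minimal sketch run against made-up data. The temporary file and its four lines are assumptions for the demo, not part of the original question. NR == FNR is true only while awk reads the first file argument, so the first rule does the counting and the second rule does the filtering on the second read.

```shell
# Hypothetical sample data: the first field is the key.
tmp=$(mktemp)
printf '%s\n' 'a 1' 'b 2' 'b 3' 'c 4' > "$tmp"

# Pass 1 (NR == FNR): count each key, then skip to the next record.
# Pass 2: print only the lines whose key was counted exactly once.
result=$(awk 'NR == FNR {count[$1]++; next} count[$1] == 1' "$tmp" "$tmp")
rm -f "$tmp"

printf '%s\n' "$result"
```

With this input the keys a and c occur once, so only their lines are printed. Note that the output keeps the input order, so no sorting step is needed.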

This will be the fastest and require the least memory, taking advantage of the sorted input:

awk '
    # NR > 1 rather than a truth test on prev_key, so a key of 0 is not skipped
    NR > 1 && prev_key != $1 {if (count==1) print prev_line; count=0}
    {prev_key=$1; prev_line=$0; count++}
    END {if (count==1) print prev_line}
' file
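To see how the single-pass approach behaves, here is a small sketch on made-up input that is already sorted on the first field. The sample lines are assumptions for the demo. This variant tests NR > 1 instead of the truthiness of the previous key, which is a deliberate deviation so that a key of 0 (included in the sample) is handled like any other.

```shell
# Hypothetical sample data, already sorted on the first field; includes a key of 0.
result=$(printf '%s\n' '0 x' '1 y' '1 z' '2 w' | awk '
    # On a key change, print the previous line if its key occurred exactly once.
    NR > 1 && $1 != prev_key { if (count == 1) print prev_line; count = 0 }
    # Remember the current key and line, and count this occurrence.
    { prev_key = $1; prev_line = $0; count++ }
    # Flush the final group.
    END { if (count == 1) print prev_line }
')

printf '%s\n' "$result"
```

Only one previous line and one counter are held at any time, which is why this version needs so little memory. It does rely on equal keys being adjacent in the input.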

answered Oct 12, 2015 at 15:57

glenn jackman
