glenn jackman

The first two answers don't require the input to be sorted:

Store the count and the last line seen for each key in arrays. This needs a lot of memory for large files, and ordering the output with PROCINFO["sorted_in"] requires GNU awk:

    gawk '
        {count[$1]++; line[$1] = $0}
        END {
            PROCINFO["sorted_in"] = "@val_str_asc"
            for (key in line)
                if (count[key] == 1)
                    print line[key]
        }
    ' file
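Where gawk is not available, the same array approach can be sketched in POSIX awk by dropping PROCINFO and piping the result through sort. This is a swapped-in variant, not part of the answer above: the sample input is made up for illustration, and sort's ordering (forced to byte order here with LC_ALL=C) may not match gawk's "@val_str_asc" exactly.

```shell
# Hypothetical input; the first field is the key.
portable=$(printf '%s\n' 'b 2' 'a 1' 'c 4' 'a 3' |
  awk '
    # Count each key and remember the last line seen for it.
    {count[$1]++; line[$1] = $0}
    # Print the stored line for every key seen exactly once (arbitrary order).
    END {for (key in line) if (count[key] == 1) print line[key]}
  ' | LC_ALL=C sort)   # sort externally instead of using PROCINFO["sorted_in"]
printf '%s\n' "$portable"
```

The memory cost is unchanged, since every distinct key and its line are still held in the arrays until END.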

Scan the file twice: the first pass counts each key, the second prints the lines whose key has a count of 1:

    awk 'NR == FNR {count[$1]++; next} count[$1] == 1' file file
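To make the two-pass idiom concrete, here is a small sketch on made-up data (the file name sample.txt and its contents are assumptions for illustration). NR == FNR holds only while awk reads the first file argument, so the first pass just counts; the second pass prints each line whose key was counted once, in input order.

```shell
# Hypothetical sample data: the first field is the key.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.txt" <<'EOF'
a 1
b 2
a 3
c 4
EOF

# Pass 1 (NR == FNR): count each key.
# Pass 2: print lines whose key occurred exactly once.
two_pass=$(awk 'NR == FNR {count[$1]++; next} count[$1] == 1' \
  "$tmpdir/sample.txt" "$tmpdir/sample.txt")
printf '%s\n' "$two_pass"
rm -rf "$tmpdir"
```

Because the file is named twice, this form needs a real file rather than a pipe, but only the per-key counts are kept in memory, not the lines themselves.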

If the input is sorted by key, this will be the fastest and require the least memory, since it only has to remember the previous line:

    awk '
        # Test NR > 1 rather than prev_key, so a key of 0 is still compared
        NR > 1 && prev_key != $1 {if (count == 1) print prev_line; count = 0}
        {prev_key = $1; prev_line = $0; count++}
        END {if (count == 1) print prev_line}
    ' file
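If the input is not already sorted, one option is to sort it on the key first and pipe it into the one-pass script. The sketch below assumes made-up input and that reordering the lines is acceptable; the NR > 1 guard is used in place of testing prev_key so that a key of 0 is not skipped.

```shell
# Hypothetical unsorted input; sort groups equal keys together first.
one_pass=$(printf '%s\n' 'b 2' 'a 1' 'c 4' 'a 3' |
  sort -k1,1 |
  awk '
    # Key changed: emit the previous group if it had exactly one line.
    NR > 1 && prev_key != $1 {if (count == 1) print prev_line; count = 0}
    # Remember the current line and extend the current group.
    {prev_key = $1; prev_line = $0; count++}
    # Flush the final group.
    END {if (count == 1) print prev_line}
  ')
printf '%s\n' "$one_pass"
```

The awk stage holds only one line and one counter at a time; the cost of ordering is moved to sort, which can spill to disk for large inputs.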
