
file1 contains a list of charges from my credit card:

A B
1/1/2020 $12.50
1/3/2020 $10.00
1/5/2020 $99.15
1/6/2020 $35.50
1/8/2020 $99.00

file2 contains a list of legitimate purchases. The dates don't necessarily match, but the amounts in column B should:

A B
12/31/2020 $12.50
1/4/2020 $99.15
1/6/2020 $99.00

Using column B to match, how do I find the records in file1 that don't have matching records in file2?

A B
1/3/2020 $10.00
1/6/2020 $35.50

thanks in advance!

4 Answers

$ awk 'NR==FNR{cnt[$2]++; next} (FNR==1) || (--cnt[$2] < 0)' file2 file1
A B
1/3/2020 $10.00
1/6/2020 $35.50
  • thanks! seems too easy when you write it like that! Commented May 16, 2020 at 14:32
  • Like the pre-decrement to the counter to avoid one test. Sweet. Commented May 16, 2020 at 15:33
awk -F'$' '
  FNR==NR{ if (FNR>1){ a[$2]++} next }
  $2 in a && a[$2]{ a[$2]--; next }
  1
' file2 file1

While reading file2, use each amount as an array key and increment its counter, skipping the header line. Then continue with the next line.

When file1 is processed, test if the corresponding value exists in the array and the counter is non-zero. If that's the case, decrement the counter and continue with the next line.

Otherwise, print the current line.
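
The three steps above can be written with one commented rule per step. This is only a minimal sketch, assuming the sample data from the question is recreated in a scratch directory. The names workdir and unmatched are made up for illustration, and the array is called remaining instead of a purely for readability.

```shell
#!/bin/sh
# Recreate the sample data from the question in a scratch directory
# (illustrative setup, not part of the original answer).
workdir=$(mktemp -d)
printf '%s\n' 'A B' '1/1/2020 $12.50' '1/3/2020 $10.00' '1/5/2020 $99.15' \
    '1/6/2020 $35.50' '1/8/2020 $99.00' > "$workdir/file1"
printf '%s\n' 'A B' '12/31/2020 $12.50' '1/4/2020 $99.15' '1/6/2020 $99.00' \
    > "$workdir/file2"

unmatched=$(awk -F'$' '
    # First file (file2): count each amount, skipping the header line.
    FNR == NR { if (FNR > 1) remaining[$2]++; next }
    # Second file (file1): an amount with a count left is matched, so use it up.
    ($2 in remaining) && remaining[$2] { remaining[$2]--; next }
    # Everything else (the header and the unmatched charges) is printed.
    { print }
' "$workdir/file2" "$workdir/file1")

printf '%s\n' "$unmatched"
rm -rf "$workdir"
```

With the sample data this prints the header line followed by the $10.00 and $35.50 charges.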

  • awesome thanks for the explanation and formatting made it super easy to understand! Commented May 16, 2020 at 14:28

OK, so this is not awk and is a little untidy, but it gives a little more information about the matches (legit and charged stand for file2 and file1):

join -a 2 -j 2 <(sort -k 2 legit) <(sort -k 2 charged) 

And an awk variant of the other answers:

awk 'NR==FNR{legit[$2]++; next}{legit[$2]--}legit[$2]<0{legit[$2]=0; print}' legit charged 
  • I didn't consider using join, use the simplest tool right? thanks! Commented May 16, 2020 at 14:28

Command:

awk 'NR==FNR{a[$2];next}!($2 in a){print $0}' file2 file1 

Output:

1/3/2020 $10.00
1/6/2020 $35.50
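
One case worth checking with this approach is repeated amounts. The test $2 in a only records whether an amount appears in file2 at all, so a single legitimate purchase hides every charge of that amount. Here is a small sketch of the difference, next to a counting version like the ones in the other answers. The data, the file names charges and legit, and the variable names are all hypothetical.

```shell
#!/bin/sh
workdir=$(mktemp -d)
# Hypothetical data: the same amount was charged twice but purchased only once.
printf '%s\n' 'A B' '2/1/2020 $20.00' '2/2/2020 $20.00' > "$workdir/charges"
printf '%s\n' 'A B' '2/1/2020 $20.00' > "$workdir/legit"

# Membership test: any amount present in legit hides every charge of that amount.
by_membership=$(awk 'NR==FNR{a[$2]; next} !($2 in a)' \
    "$workdir/legit" "$workdir/charges")

# Counting test: each legitimate purchase cancels exactly one charge.
by_count=$(awk 'NR==FNR{cnt[$2]++; next} --cnt[$2] < 0' \
    "$workdir/legit" "$workdir/charges")

printf 'membership: [%s]\n' "$by_membership"
printf 'counting:   [%s]\n' "$by_count"
rm -rf "$workdir"
```

On this data the membership version reports nothing, while the counting version reports the second $20.00 charge. If an amount can never repeat, the two behave the same.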
