don_crissti

With your approach you have to use join twice (or change your approach to do it with a single join invocation):

  • print the common lines and the unpairable lines from file1 with join -t'|' -e0 -a1 -o 1.2,1.3,1.5,2.5 <(<file1 awk -F'|' '{print $1"-"$2"|"$0}' | sort -t'|' -k1,1) <(<file2 awk -F'|' '{print $1"-"$2"|"$0}' | sort -t'|' -k1,1)
  • print the unpairable lines from file2 with join -t'|' -e0 -v2 -o 2.2,2.3,1.5,2.5 <(<file1 awk -F'|' '{print $1"-"$2"|"$0}' | sort -t'|' -k1,1) <(<file2 awk -F'|' '{print $1"-"$2"|"$0}' | sort -t'|' -k1,1)
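The two passes above can be tried end to end with a small sketch. The sample data is made up for illustration (four '|'-separated fields, key in fields 1 and 2, value in field 4), and `addkey` and `full_outer_join` are hypothetical helper names. Temporary files are used in place of process substitution so that it also runs under plain sh.

```shell
#!/bin/sh
# Hypothetical sample data: four '|'-separated fields, key = fields 1+2, value = field 4.
tmp=$(mktemp -d) || exit 1
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' 'a|x|foo|10' 'b|y|foo|20' > "$tmp/file1"
printf '%s\n' 'b|y|bar|25' 'c|z|bar|30' > "$tmp/file2"

# Prepend a combined key ($1-$2) and sort on it, since join needs sorted input.
addkey() { awk -F'|' '{print $1"-"$2"|"$0}' "$1" | LC_ALL=C sort -t'|' -k1,1; }
addkey "$tmp/file1" > "$tmp/k1"
addkey "$tmp/file2" > "$tmp/k2"

full_outer_join() {
  # Pass 1: paired lines plus the unpairable lines from file1 (-a1).
  LC_ALL=C join -t'|' -e0 -a1 -o 1.2,1.3,1.5,2.5 "$tmp/k1" "$tmp/k2"
  # Pass 2: only the unpairable lines from file2 (-v2).
  LC_ALL=C join -t'|' -e0 -v2 -o 2.2,2.3,1.5,2.5 "$tmp/k1" "$tmp/k2"
}

result=$(full_outer_join)
printf '%s\n' "$result"
```

With this input, the key `a-x` exists only in file1, `c-z` only in file2, and `b-y` in both, so each of the three cases is exercised once; `-e0` fills the missing value with 0.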

You can do the same with a single awk invocation, storing $4 in two arrays indexed by e.g. $1|$2 and then, in the END block, iterating over each array's indices, comparing them and printing accordingly:

awk -F'|' 'NR==FNR{z[$1"|"$2]=$4;next}{x[$1"|"$2]=$4} END{for (j in x){if (!(j in z)){print j, "0", x[j]}}; for (i in z){if (i in x){print i, z[i], x[i]} else {print i, z[i], "0"}} }' OFS="|" file1 file2 
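For readability, here is the same awk logic spread over several lines with comments, run against made-up sample data (the file contents and the `awk_join` wrapper name are assumptions for illustration). The order in which awk walks `for (k in array)` is unspecified, so this sketch pipes the result through sort to get stable output.

```shell
#!/bin/sh
# Hypothetical sample data: four '|'-separated fields, key = fields 1+2, value = field 4.
tmp=$(mktemp -d) || exit 1
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' 'a|x|foo|10' 'b|y|foo|20' > "$tmp/file1"
printf '%s\n' 'b|y|bar|25' 'c|z|bar|30' > "$tmp/file2"

awk_join() {
  awk -F'|' -v OFS='|' '
    # First file: remember field 4 under the key "field1|field2".
    NR == FNR { z[$1 "|" $2] = $4; next }
    # Second file: same key, separate array.
    { x[$1 "|" $2] = $4 }
    END {
      # Keys present only in the second file: print 0 for the first value.
      for (j in x) { if (!(j in z)) { print j, "0", x[j] } }
      # Keys present in the first file: paired value, or 0 when unpaired.
      for (i in z) {
        if (i in x) { print i, z[i], x[i] } else { print i, z[i], "0" }
      }
    }
  ' "$1" "$2" | LC_ALL=C sort
}

result=$(awk_join "$tmp/file1" "$tmp/file2")
printf '%s\n' "$result"
```

Because the array index already contains the '|' separator, printing it with OFS set to '|' reproduces the two key fields followed by the two values.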
