
Relational column processing using awk / sed

File 1 ---> Scrambled data in column (old stale files with intermediate test data)

    mL9A7hajHyuVIQr1HNP7ThYfj9yBUd
    Iq4iqnH4UftLgGUSobLeti0hkmdMn7
    BlzanDNcIsgru2wNYlO6kDjpuPvs82
    eqOZRXfdcxHqd26Raqd6ZOtPhoQp33
    CrWSI2eyZZkkYlEbOoHgu2o43tU3xa
    IUnZ8VPgw0FuqJmsY6FYfUpMdDNnk7
    YuuRhD5f3xju0RnUCjS66g3X2TNNIj
    MpJHtG8FjeErwsh6emcCu7B4bHwCnR
    aQ1DXaG7XSopgxBeBsRRZcCh2xRu5f
    nXgwlL0p8LEWNFGznIy2NUXBWHzZgS

File 2 ---> Same data but in final order (new files after the yearly audit, with the intermediate data changed to its final form after testing)

    BlzanDNcIsgru2wNYlO6kDjpuPvs82
    CrWSI2eyZZkkYlEbOoHgu2o43tU3xa
    IUnZ8VPgw0FuqJmsY6FYfUpMdDNnk7
    Iq4iqnH4UftLgGUSobLeti0hkmdMn7
    MpJHtG8FjeErwsh6emcCu7B4bHwCnR
    YuuRhD5f3xju0RnUCjS66g3X2TNNIj
    aQ1DXaG7XSopgxBeBsRRZcCh2xRu5f
    eqOZRXfdcxHqd26Raqd6ZOtPhoQp33
    mL9A7hajHyuVIQr1HNP7ThYfj9yBUd
    nXgwlL0p8LEWNFGznIy2NUXBWHzZgS

Trying to explain the problem: from now on, whatever processing we do on the Final_results file, we want the same change reflected in the intermediate test files, preserving their order.

paste <(cat temp_data | nl) <(cat Final_results) | column

    1  CrWSI2eyZZkkYlEbOoHgu2o43tU3xa  3  BlzanDNcIsgru2wNYlO6kDjpuPvs82
    2  BlzanDNcIsgru2wNYlO6kDjpuPvs82  5  CrWSI2eyZZkkYlEbOoHgu2o43tU3xa
    3  IUnZ8VPgw0FuqJmsY6FYfUpMdDNnk7  6  IUnZ8VPgw0FuqJmsY6FYfUpMdDNnk7
    4  Iq4iqnH4UftLgGUSobLeti0hkmdMn7  2  Iq4iqnH4UftLgGUSobLeti0hkmdMn7
    5  aQ1DXaG7XSopgxBeBsRRZcCh2xRu5f  8  MpJHtG8FjeErwsh6emcCu7B4bHwCnR
    6  YuuRhD5f3xju0RnUCjS66g3X2TNNIj  7  YuuRhD5f3xju0RnUCjS66g3X2TNNIj
    7  MpJHtG8FjeErwsh6emcCu7B4bHwCnR  9  aQ1DXaG7XSopgxBeBsRRZcCh2xRu5f
    8  eqOZRXfdcxHqd26Raqd6ZOtPhoQp33  4  eqOZRXfdcxHqd26Raqd6ZOtPhoQp33
    9  mL9A7hajHyuVIQr1HNP7ThYfj9yBUd  1  mL9A7hajHyuVIQr1HNP7ThYfj9yBUd
    10 nXgwlL0p8LEWNFGznIy2NUXBWHzZgS  10 nXgwlL0p8LEWNFGznIy2NUXBWHzZgS

Desired relational processing: if I change

    3 BlzanDNcIsgru2wNYlO6kDjpuPvs82--SUFFIX   ----> File 2 (column 2 in the command above)

then the change should be reflected in

    6 IUnZ8VPgw0FuqJmsY6FYfUpMdDNnk7-SUFFIX    ----> File 1 (column 1 in the command above)

E.g. what was IUnZ8VPgw0FuqJmsY6FYfUpMdDNnk7 before testing is now BlzanDNcIsgru2wNYlO6kDjpuPvs82 in Final_results. So whatever processing we now do on BlzanDNcIsgru2wNYlO6kDjpuPvs82 in Final_results should be reflected on IUnZ8VPgw0FuqJmsY6FYfUpMdDNnk7 in the previous files.
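The third column of the listing above appears to be the line number that each Final_results entry has in the old file. A minimal sketch of deriving that pairing with awk, assuming one entry per line and no duplicate entries; the function name `map_positions` is made up for illustration:

```shell
# Hypothetical helper: for each line of the final file, print the line number
# that the same entry has in the old (scrambled) file, followed by the entry.
map_positions() {
  # $1 = old file, $2 = final file (both assumed to hold one entry per line)
  awk '
    NR == FNR { pos[$1] = FNR; next }   # first file: remember each line number
    { print pos[$1], $1 }               # second file: look the entry up
  ' "$1" "$2"
}
```

Called as `map_positions temp_data Final_results`, this would print one `number entry` pair per line of Final_results. An entry that is missing from the old file gets an empty number.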

PROBLEM1 Desired Output 1

    1  CrWSI2eyZZkkYlEbOoHgu2o43tU3xa         3  BlzanDNcIsgru2wNYlO6kDjpuPvs82--SUFFIX
    2  BlzanDNcIsgru2wNYlO6kDjpuPvs82         5  CrWSI2eyZZkkYlEbOoHgu2o43tU3xa
    3  IUnZ8VPgw0FuqJmsY6FYfUpMdDNnk7-SUFFIX  6  IUnZ8VPgw0FuqJmsY6FYfUpMdDNnk7
    4  Iq4iqnH4UftLgGUSobLeti0hkmdMn7         2  Iq4iqnH4UftLgGUSobLeti0hkmdMn7
    5  aQ1DXaG7XSopgxBeBsRRZcCh2xRu5f         8  MpJHtG8FjeErwsh6emcCu7B4bHwCnR
    6  YuuRhD5f3xju0RnUCjS66g3X2TNNIj         7  YuuRhD5f3xju0RnUCjS66g3X2TNNIj
    7  MpJHtG8FjeErwsh6emcCu7B4bHwCnR         9  aQ1DXaG7XSopgxBeBsRRZcCh2xRu5f
    8  eqOZRXfdcxHqd26Raqd6ZOtPhoQp33         4  eqOZRXfdcxHqd26Raqd6ZOtPhoQp33
    9  mL9A7hajHyuVIQr1HNP7ThYfj9yBUd         1  mL9A7hajHyuVIQr1HNP7ThYfj9yBUd
    10 nXgwlL0p8LEWNFGznIy2NUXBWHzZgS         10 nXgwlL0p8LEWNFGznIy2NUXBWHzZgS

PROBLEM2 Desired Output 2: transpose the second file onto the first.

See the paste command output: 3 BlzanDNcIsgru2wNYlO6kDjpuPvs82--SUFFIX carries the number 3. So in the first column, the 3rd row needs to be replaced with the first row of column 4, which has 3 in the third column. Likewise, transpose the columns for the entire output, and finally output the changed data.

    1  mL9A7hajHyuVIQr1HNP7ThYfj9yBUd-Suffix
    2  Iq4iqnH4UftLgGUSobLeti0hkmdMn7-Suffix
    3  BlzanDNcIsgru2wNYlO6kDjpuPvs82-Suffix
    4  eqOZRXfdcxHqd26Raqd6ZOtPhoQp33-Suffix
    5  CrWSI2eyZZkkYlEbOoHgu2o43tU3xa-Suffix
    6  IUnZ8VPgw0FuqJmsY6FYfUpMdDNnk7-Suffix
    7  YuuRhD5f3xju0RnUCjS66g3X2TNNIj-Suffix
    8  MpJHtG8FjeErwsh6emcCu7B4bHwCnR-Suffix
    9  aQ1DXaG7XSopgxBeBsRRZcCh2xRu5f-Suffix
    10 nXgwlL0p8LEWNFGznIy2NUXBWHzZgS-Suffix

##Simplified data (for testing solutions)

    1 Mina_Warren 2 Ayden_Silva
    2 Jazlene_Gibbs 4 Quintin_Glover
    3 Kaleigh_Farley 1 Callum_Mckay
    4 Callum_Mckay 7 Jazlene_Gibbs
    5 Finn_Nelson 6 Mina_Warren
    6 Ayden_Silva 3 Kaleigh_Farley
    7 Quintin_Glover 5 Finn_Nelson

Output (ONLY FOR PROBLEM 1)

    1 Mina_Warren--Suffix3 2 Ayden_Silva---Suffix1
    2 Jazlene_Gibbs--Suffix1 4 Quintin_Glover--Suffix2
    3 Kaleigh_Farley--Suffix6 1 Callum_Mckay--Suffix3
    4 Callum_Mckay--Suffix2 7 Jazlene_Gibbs-Suffix4
    5 Finn_Nelson--Suffix7 6 Mina_Warren--Suffix5
    6 Ayden_Silva--Suffix5 3 Kaleigh_Farley--Suffix6
    7 Quintin_Glover--Suffix4 5 Finn_Nelson-Suffix7

(ONLY FOR PROBLEM 1) Here "Suffix" stands for any kind of processing: editing, renaming, replacing, etc. Any processing applied to the 4th column of a given line should be reflected in the 2nd column of the nth line, where n is taken from the third column of that given line.

Logic for Problem 1 for awk:
-- Read the first record.
-- Store column $4 in a variable.
-- Go to the line number shown in column $3.
-- Replace column $2 there with the variable stored from $4.

Consider

    1 Mina_Warren 2 Ayden_Silva
    2 Jazlene_Gibbs 4 Quintin_Glover

The third column holds the line number for the relational edit. Logic for problem 1: read row 1 ---> store Ayden_Silva in a variable ---> go to row 2, because $3 of row 1 is 2 ---> on row 2, apply the same processing to Jazlene_Gibbs.

Desired Output Problem 1

    1 Mina_Warren 2 Ayden_Silva--Suffix1
    2 Jazlene_Gibbs--Suffix1 4 Quintin_Glover

Logic for problem 2 (transposing): read row 1 ---> store Ayden_Silva in a variable ---> go to row 2, because $3 of row 1 is 2 ---> on row 2, replace Jazlene_Gibbs with the processed version of Ayden_Silva ---> do this for all lines in a loop ---> delete column 3 and column 4.

Desired output Problem 2

    1 Callum_Mckay--Suffix3
    2 Ayden_Silva--Suffix1
    3 Kaleigh_Farley--Suffix6
    4 Quintin_Glover--Suffix2
    5 Finn_Nelson--Suffix7
    6 Mina_Warren--Suffix5
    7 Jazlene_Gibbs--Suffix4

Tried

    in="$(awk 'END { print NR }' 1)" file
    awk -v ty=$in '{for (i=1;i<=ty;i++) NR==$i; var=$4; varb=$3; NR == varb; $2=var; print}' file

but it does not work as intended. The logic I tried: in = total number of records (here 7), then a for loop runs 7 times ---> i=1; NR==1 ## go to record 1 --> store $4 in var and $3 in varb --> NR==varb ## go to the record specified in varb, e.g. 2, then replace $2 with var. Loop again ---> i=2, go to record 2, and so on.
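A likely reason the attempt misbehaves: inside an awk action, `NR == varb` is only a comparison whose result is thrown away. awk reads records strictly in order and has no way to jump to another record. One common workaround is to buffer every row in arrays and do the cross-row assignment in END. A minimal sketch under that assumption (`copy_by_rownum` is a hypothetical name, and column 3 is assumed to hold a valid line number):

```shell
# Hypothetical helper: copy $4 of each row into $2 of the row named by $3.
copy_by_rownum() {
  awk '
    # Buffer all four columns of every row, keyed by record number.
    { c1[NR] = $1; c2[NR] = $2; c3[NR] = $3; c4[NR] = $4 }
    END {
      # Row i says: put my column 4 into column 2 of row c3[i].
      for (i = 1; i <= NR; i++)
        if (c3[i] in c2) c2[c3[i]] = c4[i]
      for (i = 1; i <= NR; i++)
        print c1[i], c2[i], c3[i], c4[i]
    }
  ' "$1"
}
```

With the simplified data saved as `file`, `copy_by_rownum file` would turn row 2 into `2 Ayden_Silva 4 Quintin_Glover`, because row 1 points at line 2 and carries Ayden_Silva in its fourth column.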

  • Is --suffix always separated by dashes or is the crumbled data string always of the same length (30)? This is important for splitting the string so it can be compared between left and right and the suffix applied to left correctly. Commented Jan 22, 2022 at 10:04
  • @Tathastu "any kind of processing" is too unspecific. How is "processing" visible, how does one know if it was processed or not? Does changing the first letter count as processing? If so how does one know? OR is it really just all at the end of the string and the original string always has the same length? Please be on-point or solutions will not work; e.g. people trusting that the dash exists, but you say it might not be there. Commented Jan 22, 2022 at 10:29
  • Re: multiple accounts: unix.stackexchange.com/users/508416/tathastu-pandya unix.stackexchange.com/users/509083/tathastu-pandya unix.stackexchange.com/users/509185/tathastu-pandya unix.stackexchange.com/users/414574/tathastu-pandya unix.stackexchange.com/users/510830/kung and more? Please talk to the moderators about how to merge all of your accounts into 1 so you can develop a reputation, gain privileges, edit your own questions, etc. Commented Jan 22, 2022 at 13:45
  • @EdMorton Thank you for helping the OP get things straightened out! There's actually a self-service form for merging accounts. It's at the bottom of each page under Contact -- "What can we help you with?" = "I want to merge user profiles". It seems like the OP is having trouble registering accounts because they're clearing their browser cookies. It's been a while since I've created an account, but I believe you can register via Google, Facebook, or a username & password. You can then log back in the same way. Hope this helps! Commented Jan 22, 2022 at 16:18
  • @Tathastu there's no place for rude or abusive language here on Stack Exchange. I've deleted your previous comment. Please abide by the Code of Conduct while you're here, guest or not. Thank you. Commented Feb 3, 2022 at 15:09

1 Answer

Using cppawk:

This is a new kid on the blawk, developed just over this past month.

cppawk is preprocessed with the C preprocessor, producing input for regular Awk, and comes with some library headers for Lisp-like list processing, fancy iteration, and other utilities.

Problem 1:

    #include <cons.h>

    BEGIN {
      bag = list_begin()
    }

    {
      left[$1] = $2
      right[$3] = $4
      leftn[$3] = $1
      bag = list_add(bag, $3)
    }

    END {
      finlist = list_end(bag)

      dolist (i, finlist) {
        left[i] = left[i] "--Suffix" ++suff
        right[i] = right[i] "--Suffix" suff
      }

      dolist (i, finlist) {
        print leftn[i], left[leftn[i]], i, right[i]
      }
    }

Output:

    cppawk -f prob-1.cwk prob-1-data
    1 Mina_Warren--Suffix3 2 Ayden_Silva--Suffix1
    2 Jazlene_Gibbs--Suffix1 4 Quintin_Glover--Suffix2
    3 Kaleigh_Farley--Suffix6 1 Callum_Mckay--Suffix3
    4 Callum_Mckay--Suffix2 7 Jazlene_Gibbs--Suffix4
    5 Finn_Nelson--Suffix7 6 Mina_Warren--Suffix5
    6 Ayden_Silva--Suffix5 3 Kaleigh_Farley--Suffix6
    7 Quintin_Glover--Suffix4 5 Finn_Nelson--Suffix7
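For readers without cppawk, the same two-pass idea can probably be expressed in plain awk. The following is only a sketch, not a translation of the cppawk library calls: it keys rows by record number rather than by column 1 (so it assumes column 1 equals the line number, as in the sample data), and the function name `prob1` is made up for illustration.

```shell
# Hypothetical plain-awk sketch of Problem 1: the suffix given to column 4 of
# row i is also appended to column 2 of the row whose number is in column 3.
prob1() {
  awk '
    { n1[NR] = $1; left[NR] = $2; n3[NR] = $3; right[NR] = $4 }
    END {
      for (i = 1; i <= NR; i++) {
        sfx = "--Suffix" i                # stand-in for "any processing"
        right[i] = right[i] sfx           # process column 4 of row i
        left[n3[i]] = left[n3[i]] sfx     # mirror it onto column 2 of row $3
      }
      for (i = 1; i <= NR; i++)
        print n1[i], left[i], n3[i], right[i]
    }
  ' "$1"
}
```

Run as `prob1 prob-1-data`. Since everything is buffered until END, memory use grows with the input size, which should be fine for files like the ones in the question.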

Problem 2:

    #include <cons.h>

    BEGIN {
      bag = list_begin()
    }

    {
      left[$1] = $2
      right[$3] = $4
      leftn[$3] = $1
      bag = list_add(bag, $3)
    }

    END {
      finlist = list_end(bag)

      dolist (i, finlist) {
        left[i] = right[i] "--Suffix" ++suff
      }

      dolist (i, finlist) {
        print leftn[i], left[leftn[i]]
      }
    }

Output:

    cppawk -f prob-2.cwk prob-1-data
    1 Callum_Mckay--Suffix3
    2 Ayden_Silva--Suffix1
    3 Kaleigh_Farley--Suffix6
    4 Quintin_Glover--Suffix2
    5 Finn_Nelson--Suffix7
    6 Mina_Warren--Suffix5
    7 Jazlene_Gibbs--Suffix4
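A plain-awk sketch of the transposition, for comparison. It is an assumption-laden shortcut rather than a port of the cppawk code: each row's column 4, tagged with that row's number, is stored under the line number found in column 3, and the result is printed in line order. The name `prob2` is hypothetical.

```shell
# Hypothetical plain-awk sketch of Problem 2: place the processed column 4 of
# each row at the line number given in column 3, then print lines 1..NR.
prob2() {
  awk '
    { out[$3] = $4 "--Suffix" NR }   # row NR supplies the entry for line $3
    END {
      for (n = 1; n <= NR; n++)
        if (n in out) print n, out[n]
    }
  ' "$1"
}
```

Run as `prob2 prob-1-data`. Line numbers that no row points at are skipped, and if two rows point at the same line the later one wins.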
