Hi,
Very good wishes to all!
Could you please assist me with the following DataStage question?
My question:
Source file:
Field1 Field2 Field3
abc 12f sLm
1234 hjd 12d
Hyd 34
Chn
My target file should contain the record count for each field, as below:
Field1 has 4 values, Field2 has 3 values, and Field3 has 2 values, so the target file should look like this:
Target File:
Field1 Field2 Field3
4 3 2
My source file can change to have more or fewer columns, so the target file should always have the same number of columns as the source, each holding that field's record count. Please help me resolve this.
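For illustration only, here is a rough sketch of the logic I am after, written outside DataStage in awk (assumptions: a tab-delimited file with a header row; source.txt and target.txt are placeholder names):

Code:

awk -F'\t' '
NR == 1 { for (i = 1; i <= NF; i++) hdr[i] = $i; cols = NF; next }   # remember the header names
{ for (i = 1; i <= cols && i <= NF; i++) if ($i != "") cnt[i]++ }    # count non-empty values per column
END {
    for (i = 1; i <= cols; i++) printf "%s%s", hdr[i], (i < cols ? FS : "\n")
    for (i = 1; i <= cols; i++) printf "%s%s", cnt[i] + 0, (i < cols ? FS : "\n")
}
' source.txt > target.txt

For the sample above this writes the header row followed by 4, 3 and 2; I need the equivalent done in a DataStage job.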
Thanks and Regards,
Hi Craig,
Thank you for your response.
The maximum number of columns is 150.
Thanks and Regards,
Cheru.
I'll have to ponder this to see if there is a more elegant solution out there, hopefully others will chime in as well.
Off the top of my head you could 'brute force' it by handling all 150 columns, it will just be a little tedious to set up but won't affect performance. And best to use a Server job IMHO or a Server Sequential File stage in your Parallel job to handle the 'missing' columns, something the PX version can't do. Either that or read it as one long varchar and then parse it yourself.
The kicker to me is constraining the output to match the input, field wise. I'm thinking you could capture the highest field # with a populated column name but you'd have to read that first record as data to do that. Seems to me you could then write those column headers back out as the first line of the file (don't let the stage create them for you) and then trim the count record down to an appropriate size based on the number of fields in it and write it out as a single string.
Something like that anywho... just pulling things out of thin air right now. Need to get on the road and spend an hour in snowbound traffic to get into work.
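As a rough sketch of the "read the first record as data" part (an assumption-laden awk example, with source.txt as a placeholder and a tab delimiter), finding the highest field number that has a populated column name could look like:

Code:

awk -F'\t' '
NR == 1 {
    # highest field number with a populated column name
    for (i = 1; i <= NF; i++) if ($i != "") last = i
    print "populated columns: " last
    exit
}' source.txt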
-craig
"You can never have too many knives" -- Logan Nine Fingers
Source --> Transformer1 --> Pivot Enterprise --> Transformer2 --> Aggregator --> Target
Transformer1: Create a derivation like this for each of your fields, hard-coding the field name in the Else part:
If NullToEmpty(Field1) = '' Then '' Else 'Field1'
Pivot Enterprise: Do horizontal pivoting with a derivation that lists all your fields. The output of the pivot will be a single field.
Transformer2: Remove the null/empty rows.
Aggregator: Perform a count on the field generated by the Pivot. (A rough command-line analogue of this flow is sketched below.)
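For illustration only, here is a rough awk analogue of that Transformer1 -> Pivot Enterprise -> Aggregator flow (assumptions: a tab-delimited file with a header row; source.txt is a placeholder name):

Code:

awk -F'\t' '
NR == 1 { for (i = 1; i <= NF; i++) name[i] = $i; next }   # remember the column names
{ for (i = 1; i <= NF; i++) if ($i != "") print name[i] }  # Transformer1 + Pivot: one name per non-empty value
' source.txt | sort | uniq -c                              # Aggregator: count per column name

This gives one count per field name, which is the shape of the Aggregator output; turning that back into a single row in the original column order would be a final formatting step.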
If your source is a sequential file, you will have to add field delimiters as needed so that every record has 150 fields. If you are on a Unix server, try something like this in the Filter option:
Code:

awk -F'\t' 'NF != 150 { for (i = NF; i < 150; i++) $0 = $0 FS } 1'
You are the creator of your destiny - Swami Vivekananda