Page 1 of 3

Packed column from DB2/400

Posted: Tue May 18, 2010 2:28 am
by antonyraj.deva
Hi All,

One of the data sources in my project is DB2 tables in AS/400 system.

The data structure of the table is as follows:

Col A,Col B,Col C,Col D
1234,ab2f,pazz,"&Col E=ZZ&Col F= &Col G=5478&Col H=100.00
2345,cd3f,ddzz,"&Col F=IN&Col G=9801&Col H=250.25&Col J=20060630

The Problems in reading Col D are
1. The record starts with the Quote Character but doesn't end with it.
2. The '&' seperated columns are not following a uniform pattern i.e., it varies from record to record (some start with Col E and some with Col X etc).
3. This field contains around 300 '&' seperated columns in all possible combinations and we have 1 million records per monthly extract. :(

Is there anyway to read this data using the available stages in Datastage? :?

It would be of great help if this issue is resolved.

Thanks in advance...

Posted: Tue May 18, 2010 3:06 am
by priyadarshikunal
have you tried reading it? and how having diff no of '&' in column D affects reading?

Posted: Tue May 18, 2010 3:51 am
by antonyraj.deva
Hi Priyadarshi,

I seperated the Col D into a sequential file and tried to read from that file.

Since the record starts with different columns the mapping with the target variable is not possible.

Also trying to fix a metadata to this long string fails because the columns doesn't follow any fixed pattern.

Posted: Tue May 18, 2010 4:20 am
by battaliou
Read the entire string as a single varchar(1000) in sequntial file mode

Map your fields in a transformer:
ColE : if index(input.col,'COL E=',1) > 1 then field(input.col[index(input.col,'COL E=',1)+7,99],'&',1) else ''

Posted: Tue May 18, 2010 5:39 am
by chulett
Explain to us how you need to 'read' this column. Do you actually mean parse?

Posted: Tue May 18, 2010 6:18 am
by antonyraj.deva
Hi Craig,

Yes I need to parse the data and map it to the target variables.

The target will contain around 300 columns which has to be populated by parsing Col D.

Posted: Tue May 18, 2010 6:27 am
by ray.wurlod
300 stage variables.

Enjoy.

At least you only have to do it once.

Posted: Tue May 18, 2010 7:32 am
by antonyraj.deva
Hi Ray,

Kindly requesting the logic I should be using in the stage variables to get the desired output?

Thanks in advance... :)

Posted: Tue May 18, 2010 8:00 am
by chulett
Field() should work, looks like a "space delimited string" to me.

Posted: Tue May 18, 2010 8:06 am
by antonyraj.deva
Thanks for the advise Craig.

The string is '&' delimited. I'd thought of using DCount and Field functions earlier. But the part which really confuses is the irregular pattern of columns in the data.

What should be the approach if I use stage variables and Field function?

Posted: Tue May 18, 2010 8:14 am
by chulett
OK, then use the "&" as the delimiter. You'll need to set up the maximum number of occurances as stage variables and see which ones get populated.

Posted: Tue May 18, 2010 9:27 am
by antonyraj.deva
Hi Craig,

I tried to sample with one stage variable which is Col_E.
The derivation for Col_E is "Col_E = Field("DSLink5.Col_D","&","2","2")".
The job ran fine but the output was filled only with zeros except for the first record which had the value 1. The actual value of Col_E in the sring is "zz".
Please correct me on what needs to be modified to get the desired output.

Thanks,
Tony

Posted: Tue May 18, 2010 10:09 am
by ray.wurlod
Lose the quotes from around the third and fourth arguments of the Field() functions - these must be integers.

Posted: Tue May 18, 2010 10:54 am
by chulett
And are you certain you need the second "2"? That tells it how many contiguous fields to pull.

Posted: Wed May 19, 2010 7:29 am
by antonyraj.deva
Hi Craig & Ray,

I removed the quote at the third argument and completely the fourth argument. Still the output is coming as "1" for the first record and zeros for all other records.

Kindly looking forward for your advise to resolve this issue.

Thanks,
Tony