Page 1 of 1

Posted: Wed Nov 20, 2002 7:49 am
by chulett
There's usually multiple ways to solve any problem, but the simplest thing for this situation would be to use FIELD. From memory:

FIELD(DATA,",",1) -> Col1
FIELD(DATA,",",2) -> Col2 etc

You tell it the string to parse, what the seperator is and which occurance you want.

-craig

Posted: Wed Nov 20, 2002 4:11 pm
by ray.wurlod
For very many rows it might be faster to write two columns into a text file with no quoting and "," as the separator, then to read six columns from the text file. (DataStage has some clever read-ahead capabilities within the Sequential File stage.)

Posted: Wed Nov 20, 2002 6:56 pm
by cwong
Another suggestion is to use the Pivot stage.
You may assume up to more than 6 columns just in case it will expand in the future.
Then for each column coming out, check if the column is not null then create an output record.

Posted: Wed Nov 20, 2002 10:12 pm
by ray.wurlod
Have you verified this? It's my understanding that the Pivot stage requires the pivoting elements to be in separate columns, not in a comma-delimited list in one column.

Posted: Wed Nov 20, 2002 11:48 pm
by Tony
There's usually multiple ways to solve any problem, but the simplest thing for this situation would be to use FIELD. From memory:

FIELD(DATA,",",1) -> Col1
FIELD(DATA,",",2) -> Col2 etc

You tell it the string to parse, what the seperator is and which occurance you want.

-craig

Hi all,

I got a similar problem like this,but i want to seperate values into different rows:
I got 1 row as input,and need 3 rows return:

col1 col2 col3
==== ====== =====
a b,c,d zzz

col1 col2 col3
===== ===== ======
a b zzz
a c zzz
a d zzz

How can i do this?
Thanks for your kindly response.

Posted: Thu Nov 21, 2002 2:03 am
by ray.wurlod
Split the three columns into five columns using a Transformer stage (derivation expressions using Field() as discussed above), then use a Pivot stage.
Manual for the Pivot stage is Pivot.pdf and is in the Packages folder of your DataStage client CD. It has good examples.

Welcome to the forum, Tony; I suspect you are the first from Taiwan (but would be happy to be proven wrong).

Edited by - ray.wurlod on 11/21/2002 02:05:05

Posted: Thu Nov 21, 2002 6:56 pm
by Tony
Split the three columns into five columns using a Transformer stage (derivation expressions using Field() as discussed above), then use a Pivot stage.
Manual for the Pivot stage is Pivot.pdf and is in the Packages folder of your DataStage client CD. It has good examples.

Welcome to the forum, Tony; I suspect you are the first from Taiwan (but would be happy to be proven wrong).

Edited by - ray.wurlod on 11/21/2002 02:05:05
Hi,Ray

Thanks for your reply,
That's a very helpful information for me.
and...i think you got the right answer :)

Posted: Thu Nov 21, 2002 7:04 pm
by cwong
Ray,

Use a sequencial file stage in front where the field delimiter will be specify accordingly; in this case will be the "," then inside the pivot stage, will list all the columns. Coming out from the pivot stage will be one column at a time. Use a transformer stage to check if the incoming data is not null, create an output record.

For another scenario, any column of data that needs to repeat in subsequent rows, save in a stage variable in the transformer stage and use it to build the output record.

Posted: Thu Nov 21, 2002 8:53 pm
by ray.wurlod
Nice thought, Cristal, but the source data come from two columns in a database table. The comma-delimited list is in one of the columns.

Posted: Fri Nov 22, 2002 10:24 pm
by cwong
I would suggest 2 options:
Option #1:
- bring in all needed columns from database table via the SELECT stmt
- use a transformer stage to concat all columns together with the @FM in between each
- translate delimiter in delimited list into @FM (field mark)
for example:
col1:@FM:col2:@FM:translate(col3,',',@FM)
note: pls check syntax of translate function.

Option #2:
- concatenate the columns into one big column in Select stmt
for example: SELECT col1||','||col2 FROM tab ...
assuming the ',' is the delimiter or use whatever used in the delimited list.
- send the big column into a transformer stage and translate the delimiter into @FM just before passing into the Pivot stage ...

Note: This option seems to be more flexible to handle changing number of incoming columns.



cwong

Posted: Sat Nov 23, 2002 3:39 pm
by ray.wurlod
In option #1 the function you require is Convert (there is no Translate function), for example:
Convert(",", @FM, col3)

Your option #2 in interesting in theory - have you tested it? As far as I know, the field mark character is not the internal delimited used by DataStage generally, but I'm prepared to learn new stuff; if the Pivot stage uses the field mark internally your option #2 would work.

The most flexible approach to handling a variable number of incoming columns is to use the multi-valued column capability of UniVerse tables; load the table un-normalized then select from it normalized on the association or non-associated multi-valued column. Of course this requires some knowledge to set up the UniVerse table with columns defined to be multi-valued and, perhaps, associated.

By and large, though, Cristal, I believe you are trying to add too much complexity to what is really a simple problem with a simple solution. In DataStage simplicity is the foundation of performance.



Edited by - ray.wurlod on 11/23/2002 15:40:52

Posted: Sun Nov 24, 2002 10:04 am
by cwong
Thanks for correcting the use of function ... the time I typed in my reply here, I just finished coding the transalte func in a SQL stmt to DB2 and got mixed up.

Yes, I am not a UV guru and have never use the multi-valued column capability of UniVerse tables. Your suggestion would be complicated to me.