Page 1 of 1

Splitting Concatenated Strings

Posted: Fri Nov 12, 2004 4:33 am
by JezT
I have a file consisting of multiple rows containing multiple Surrogate Keys concatenated together against a single Natural Key.

For example,

Code: Select all

 Surr Key     Nat Key
1:2:3:4       ABC
5:6:7:8       DEF
I am now wanting to split this concatenated string up so that for every instance of a surrogate key, a row is created, as below.

Code: Select all

 Surr Key     Nat Key
          1                ABC
          2                ABC
          3                ABC
          4                ABC
          5                DEF
          6                DEF
          7                DEF
          8                DEF
I am wanting to either action this in a transformer or within an INSERT statement on DB2 if possible as the data will be loaded into a DB2 table.

Any suggestions ?

Jez

Posted: Fri Nov 12, 2004 7:48 am
by T42
Pivot Stage.

Posted: Fri Nov 12, 2004 8:00 am
by sonia jacob
Hi,

you could try replacing the ":" with Nat Key : Char(013) : Char(010).

ereplace(Surr Key : ":" , ":", trim(Nat Key) : Char(013) : Char(010)).

This would work only if you are sure about the delimitor in the Surr Key.

If a output file required is a delimited one then following is to be done
1. Concatinate the delimitor required, in the ereplace statement itself. eg
ereplace(Surr Key : ":" , ":", '|' : trim(Nat Key) : Char(013) : Char(010)). If | (pipe symbol) is the delimitor
2. In the format of the output file (sequential stage) the delimitor is to be supressed (000)

If a output file required is a fixed width one then following is to be done
1. Put appropriate padding for the columns

Hope it helps.

Sonia

Posted: Fri Nov 12, 2004 8:50 am
by chucksmith
Read your file, and write to a hash file. NatKey will be the primary key in the hash file. SurrKey will be a data column in the hash file.

Set the derivation for SurrKey to

Code: Select all

Convert(":", @VM, SurrKey)
This converts the colons to value marks.

After the hash file hash been loaded, read it, but on the Output/General tab of the hash file stage, set the Normalize on drop box to SurrKey, your surrogate key column.

View data, and you will see the results you want. :wink:

better solution

Posted: Fri Nov 12, 2004 9:38 am
by 1stpoint
IF this is a flat file, it is usually better to pre-process the incoming flat file outside of datastage and create a new flat file. Yes, you can use DataStage by creating a Hash stage, etc. but the processing may be slower than a simple script (perl or python (I prefer python)) that will preprocess the file and create an input file for you.

You can then call the script from your DataStage Batch/Sequence.