Normalize a field on Parallel EXTENDER PX

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
androus17121
Participant
Posts: 5
Joined: Thu Jun 02, 2005 1:33 pm

Normalize a field on Parallel EXTENDER PX

Post by androus17121 »

Hello,
I was wondering if anyone know or perhaps provide some solution to the following?

I need normalize each row on a field. I know we can do it with hashed file bu i'm developping on parallel job.

source :
TABLE1 (ID_COUNTRY, LANGUAGE) --> the column LANGUAGE is very variable from NULL to 400 sites
row1 --> [1, "US, FR, IT, GR"]
row2 --> [2, "GR, BG"]


my objectif :
TABLE2 is (ID_COUNTRY, LANGUAGE)

[1, "US"]
[1, "FR"]
[1, "IT"]
[1, "GR"]

[2, "GR"]
[124, "BG"]

I work with DSv7.5.1 EE, AIX 5.2 and ORACLE 921

Thanks in advance
pnchowdary
Participant
Posts: 232
Joined: Sat May 07, 2005 2:49 pm
Location: USA

Post by pnchowdary »

Hi,
I know we can do it with hashed file bu i'm developping on parallel job.
Why don't you try to use Lookup,Join or Merge stage to achieve this, as these are all parallel job stages, equivalent to a hashed file?
Thanks,
Naveen
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Hi And welcome aboard :),
You could read the data and in a transformer build a row foreach element (use Convert to change the comma to LF) to a seq file;
So you should get something like:
1
US
FR
IT
GR
2
GR
BG
After you have this simply read the file and store the number in a SV (stage Variable) building the new rows by concatenation of SV and values and constrainting the numeric values not to output in a constraint.
After this you should have:
1,US
1,FR
1,IT
1,GR
2,GR
2,BG

IHTH,
Roy R.
Time is money but when you don't have money time is all you can afford.

Search before posting:)

Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
Image
androus17121
Participant
Posts: 5
Joined: Thu Jun 02, 2005 1:33 pm

Post by androus17121 »

Thank you Roy your solution,

i succeeded to create a sequential flat like that :

<LINE>
1
US
FR
IT
GR
2
GR
BG

but i didn't find how i can store my number in a stage variable for having in output :

<ID_COUNTRY>,<LANGUAGE>
1,US
1,FR
1,IT
1,GR
2,GR
2,BG

Thanks in advance,
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Hi,
The logic is quite simple:
Sv = Stage variable

No EE avail now so can't recall specific routine for testing numeric value!,
So find it and replace that in the "code" below


The code Should be something like this (might need debugging no DS avail):

Code: Select all

SvIsNewNumber  <<
 If <TestNumericRoutine>(Link.col) And Link.col <> SvPrevNumber Then @TRUE Else @FALSE
SvNewNumber     <<
 If SvIsNewNumber Then Link.col Else SvOldNumber
SvOldNumber <<
 If IsNull(SvOldNumber) Or SvIsNewNumber Then Link.col Else SvOldNumber
IHTH,
Roy R.
Time is money but when you don't have money time is all you can afford.

Search before posting:)

Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
Image
Post Reply