picking six records from six or more and finally getting one

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
ranga1970
Participant
Posts: 141
Joined: Thu Nov 04, 2004 3:29 pm
Location: Hyderabad

picking six records from six or more and finally getting one

Post by ranga1970 »

I have source as

Code:

Regionkey clientid plancode
2 1962 5ab
2 1962 5bc
2 1962 4ac
2 1963 5af
2 1963 4ac
2 1963 5ab
2 1963 5bc
2 1963 4bd
2 1963 5fg
2 1963 4df




regoinkey is same for all, some clients have more than 6 plan codes, some clients have less the 6 plancodes, when a client have more 6 plan codes only min 6 should be picked, if client less than than three i should get others has null, that is i should have exactly 6 reords

that mens above source should become
Code:

Regionkey clientid plancode
2 1962 5ab
2 1962 5bc
2 1962 4ac
2 1962 null
2 1962 null
2 1962 null
2 1963 5af
2 1963 4ac
2 1963 5ab
2 1963 5bc
2 1963 4bd
2 1963 4df




three records with null values are added for client 1962 and 2 2 1963 5fg is deleted for 1963

after this happend all hease records should become one record

Code:

Regionkey clientid code1 code2 code3 code4 code5 code6
2 1962 5ab 5bc 4ac null null null
2 1963 5af 5ac 5ab 5bc 4bd 4df


any suggestion would be appricia
RRCHINTALA
ogmios
Participant
Posts: 659
Joined: Tue Mar 11, 2003 3:40 pm

Re: picking six records from six or more and finally getting

Post by ogmios »

You could do it with some advanced analytic functions in a database but I would start with a BASIC program getting a requirement like that.

Use a query selecting the data in the right order and then you can do the filling of missing plancodes with null or the chopping of more than 6 plancodes and pivoting all at the same time.

Ogmios
In theory there's no difference between theory and practice. In practice there is.
ranga1970
Participant
Posts: 141
Joined: Thu Nov 04, 2004 3:29 pm
Location: Hyderabad

Post by ranga1970 »

Actually it was the reverse case, we develpoed the wharehouse and need populate data from here to SalesForce.com using vitria, for this
I had develpoed a function and stored proceadure to populate data from wharehouse to Vitrai Tables and its taking 48 minutes to populate about 2000 records, since we have hundreds of thousands records for initial load we are trying figure a way out, so we thought of using DS for that agian and we held up datastage logic creation as you said we could use advance functions for that could you please advise me what are those functions so that I could do some R&D
RRCHINTALA
ogmios
Participant
Posts: 659
Joined: Tue Mar 11, 2003 3:40 pm

Post by ogmios »

ranga1970 wrote:... and we held up datastage logic creation as you said we could use advance functions for that could you please advise me what are those functions so that I could do some R&D
With analytic functions I mean SQL advanced magic (I was thinking of Oracle but any database will do).

1) select all distinct regionkeys/clientid combinations and cartesian join them with a dummy (inline SQL) table containing 6 rows with 1 numeric column: 1, 2, 3, 4, 5, 6. These will be the base rows for all regions/clients
2) select all the data from the original table using partitioning over regionkey/clientid and ordering by regionkey/clientid/plancode (Oracle and DB2 for sure have this statement). This will give you the original table with an additional numeric number that breaks over regionkey/clientid (a number that will be reset whenever there's a change of regionkey/clientid). THIS IS THE ANALYTICAL PART.
3) Left outerjoin 1) with 2) over the regionkey/clientid and numeric number. When there are less than 6 entries for a regionkey/clientid you will still get rows with NULL being filled in (because of the left outer join). If there are more than 6 entries per regionkey/clientid you only have the first 6 (in item 1 only rows from 1 to 6 were generated).

Now you can push this data through a DataStage pivot stage, but I would do the pivot also in the database. This would be something as:

Code: Select all

SELECT regionkey, clientid,
  MAX(CASE WHEN nr = 1 THEN plancode else NULL END) AS code1,
  MAX(CASE WHEN nr = 2 THEN plancode else NULL END) AS code2,
  MAX(CASE WHEN nr = 3 THEN plancode else NULL END) AS code3,
  MAX(CASE WHEN nr = 4 THEN plancode else NULL END) AS code4,
  MAX(CASE WHEN nr = 5 THEN plancode else NULL END) AS code5,
  MAX(CASE WHEN nr = 6 THEN plancode else NULL END) AS code6
FROM table_done_in_3
GROUP BY regionkey, clientid
Ogmios

Alternatively:
- Skip 1)
- Do 2) but built in a where clause "where nr <= 6"
- Skip 3 and do the SQL pivot (the pivot will still work with "missing" rows)
In theory there's no difference between theory and practice. In practice there is.
ranga1970
Participant
Posts: 141
Joined: Thu Nov 04, 2004 3:29 pm
Location: Hyderabad

Post by ranga1970 »

Thanks I got till filtering six records with your suggestion, but now coming to pivot stage, I never used and When i was going through help it says use derivation

Say for example now i have

Code: Select all

Regionkey clientid Plancode
2 1963 5af 
2 1963 4ac 
2 1963 5ab 
2 1963 5bc 
2 1963 4bd 
2 1963 4df 
2 1962 5ab 
2 1962 5bc 
2 1962 4ac 
2 1962 null 
2 1962 null 
2 1962 null 

i have to get this as 
2 1962 5ab 5bc 4ac null null null 
2 1963 5af 5ac 5ab 5bc 4bd 4df 
in the pivot
i get the input, and at the out putside now i have columns
Regionkey clientid code1 code2 code3 code4 code5 code6 

what derivation you would sugest for code1 code2 code3 code4 code5 code6 , since client id and regionkey can be directly sent in, does pivot help in this regard

RRCHINTALA
ogmios
Participant
Posts: 659
Joined: Tue Mar 11, 2003 3:40 pm

Post by ogmios »

Don't use the pivot stage, first try the pivot in SQL as shown above (it's faster).

You can do it with pivot stage as well, but the last time I used that one is about 2 years ago :wink: so here you would have to look at the documentation of the stage. (you can find the documentation on your installation cd).

Ogmios
In theory there's no difference between theory and practice. In practice there is.
Post Reply