Search found 34 matches
- Thu Aug 03, 2006 8:31 pm
- Forum: IBM<sup>®</sup> DataStage Enterprise Edition (Formerly Parallel Extender/PX)
- Topic: How to prevent the join stage from sorting the records
- Replies: 18
- Views: 7847
- Wed Aug 02, 2006 5:16 pm
- Forum: IBM<sup>®</sup> DataStage Enterprise Edition (Formerly Parallel Extender/PX)
- Topic: Lookup Error Failed to match node
- Replies: 5
- Views: 3092
- Tue Aug 01, 2006 7:34 pm
- Forum: IBM<sup>®</sup> Infosphere DataStage Server Edition
- Topic: Error while compiling a job
- Replies: 3
- Views: 1845
- Sat Jul 29, 2006 11:45 pm
- Forum: IBM<sup>®</sup> Infosphere DataStage Server Edition
- Topic: Duplicate Records
- Replies: 30
- Views: 15727
Alternatively, you can do it in a single job using the SQL below. The SQL is written for Oracle and should also work for DB2; I am not sure about other databases. src-------xfm------seq1 | | seq2 with duplicate_rows as (select cola, colb, count(1) as cnt from srcTable group by cola, colb having count(1)>1) select sr...
- Sat Jul 29, 2006 3:01 pm
- Forum: IBM<sup>®</sup> Infosphere DataStage Server Edition
- Topic: Duplicate Records
- Replies: 30
- Views: 15727
For example, suppose you have the rows a,b a,b x,y y,x c,d c,d. Run select col1, col2, count(1) from table group by col1, col2 having count(1) > 1. Load these into a hash file by specifying col1 and col2 as keys. This will yield a, b, 2 and c, d, 2. You will load this into a hash file, specifying col1 and col2 as the keys. In step 2 yo...
- Sat Jul 29, 2006 2:37 pm
- Forum: IBM<sup>®</sup> Infosphere DataStage Server Edition
- Topic: Catch Duplicate rows
- Replies: 12
- Views: 7011
Did you try both steps? Step 1 identifies the values of col1 and col2 that are duplicates: select col1, col2, count(1) from table group by col1, col2 having count(1) > 1. Load these into a hash file by specifying col1 and col2 as keys. For the example you have given, a,b a,b x,y y,x c,d c,d, this ...
- Fri Jul 28, 2006 5:40 pm
- Forum: IBM<sup>®</sup> Infosphere DataStage Server Edition
- Topic: Duplicate Records
- Replies: 30
- Views: 15727
- Fri Jul 28, 2006 4:33 pm
- Forum: IBM<sup>®</sup> Infosphere DataStage Server Edition
- Topic: Catch Duplicate rows
- Replies: 12
- Views: 7011
Step 1. Identify the duplicate rows using the following SQL: select col1, col2, count(1) from table group by col1, col2 having count(1) > 1. Load these into a hash file by specifying col1 and col2 as keys. Step 2. Read the source again. Use the hash file created above as a lookup. For each row see if ...
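The two steps in the snippet above can be sketched outside DataStage in plain Python. This is only an illustration under stated assumptions: the function name `split_duplicates` is hypothetical, a dictionary stands in for the hash file, and since the snippet is cut off before it says what happens to each row, routing matched rows to a "duplicates" list is a guess at the intent.

```python
from collections import Counter


def split_duplicates(rows, key_cols=(0, 1)):
    """Two-pass duplicate detection (hypothetical helper, not DataStage code)."""
    # Step 1: count rows per key and keep keys seen more than once.
    # This plays the role of GROUP BY col1, col2 HAVING count(1) > 1.
    counts = Counter(tuple(row[i] for i in key_cols) for row in rows)
    dup_keys = {key: n for key, n in counts.items() if n > 1}  # stand-in for the hash file

    # Step 2: read the source again and look each row up by its key.
    # Assumption: rows whose key is found are routed to the duplicates output.
    duplicates, uniques = [], []
    for row in rows:
        key = tuple(row[i] for i in key_cols)
        (duplicates if key in dup_keys else uniques).append(row)
    return dup_keys, duplicates, uniques


# Sample rows taken from the thread's example.
rows = [("a", "b"), ("a", "b"), ("x", "y"), ("y", "x"), ("c", "d"), ("c", "d")]
dup_keys, duplicates, uniques = split_duplicates(rows)
```

With the sample rows, `dup_keys` holds the keys `a,b` and `c,d`, each with a count of 2, which matches the result quoted in the thread.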
- Thu Jul 27, 2006 10:00 pm
- Forum: IBM<sup>®</sup> DataStage Enterprise Edition (Formerly Parallel Extender/PX)
- Topic: Unable to convert integer date (YYYYMMDD) to date
- Replies: 14
- Views: 19449
Re: Unable to convert integer date (YYYYMMDD) to date
Transformer: StringToDate(lkup_cust_out.OHEDAT,"%yyyy-%mm-%dd") -- Output is *********. Since you said your date is in the format YYYYMMDD, you will need to specify StringToDate(lkup_cust_out.OHEDAT,"%yyyy%mm%dd") in your transformer. I also have a Julian date YYYYDDD (2006207) need to...
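The point of the reply above is that the format string has to match the input exactly, separators included. The same idea can be shown with Python's `datetime.strptime`, which is not the DataStage function but behaves analogously here. The helper names and the sample value `20060727` are hypothetical; `2006207` is the Julian value quoted in the thread.

```python
from datetime import datetime


def parse_yyyymmdd(value):
    """Parse an integer or string such as 20060727 (no separators)."""
    return datetime.strptime(str(value), "%Y%m%d").date()


def parse_julian(value):
    """Parse a YYYYDDD value, where DDD is the day of the year."""
    return datetime.strptime(str(value), "%Y%j").date()


def format_matches(value, fmt):
    """Return True if the value parses with the given format string."""
    try:
        datetime.strptime(str(value), fmt)
        return True
    except ValueError:
        return False


plain = parse_yyyymmdd(20060727)                   # format without dashes matches
julian = parse_julian(2006207)                     # day 207 of 2006
mismatch = format_matches(20060727, "%Y-%m-%d")    # dashes in format, none in data
```

A format with dashes fails against undashed input, which is the Python counterpart of the `*********` output reported in the thread.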
- Thu Jul 20, 2006 8:15 pm
- Forum: IBM<sup>®</sup> DataStage Enterprise Edition (Formerly Parallel Extender/PX)
- Topic: DB2 equivalent of substr()
- Replies: 9
- Views: 4960
- Thu Jun 29, 2006 9:28 am
- Forum: IBM<sup>®</sup> DataStage Enterprise Edition (Formerly Parallel Extender/PX)
- Topic: Splitting a file into multiple files based on first column
- Replies: 5
- Views: 5752
My 2 cents: I would not do it using DataStage, simply because it would be too tedious and the job very cumbersome. I would rather use a shell script (or a Perl script or a Java program) to achieve what you want. A simple algorithm would be as follows. Preferably (but not necessarily) sort the data based on...
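The post's algorithm is cut off, so the following is not the poster's method, just one minimal way a script could split a file on its first column. Assumptions: the input is comma-delimited, each output file is named after the key with a `.txt` suffix, and key values are safe to use as file names. The function name `split_by_first_column` is hypothetical.

```python
import csv
import os


def split_by_first_column(src_path, out_dir, delimiter=","):
    """Write each row to out_dir/<first column>.txt; return the keys seen."""
    handles = {}  # key -> (file handle, csv writer), opened on first use
    try:
        with open(src_path, newline="") as src:
            for row in csv.reader(src, delimiter=delimiter):
                if not row:
                    continue  # skip blank lines
                key = row[0]
                if key not in handles:
                    # Assumption: the key is a valid file name as-is.
                    fh = open(os.path.join(out_dir, f"{key}.txt"), "w", newline="")
                    handles[key] = (fh, csv.writer(fh, delimiter=delimiter))
                handles[key][1].writerow(row)
    finally:
        # Close every output file even if reading fails part-way.
        for fh, _ in handles.values():
            fh.close()
    return sorted(handles)
```

Because one handle stays open per distinct key, no sort is required. With a very large number of distinct keys, sorting the input first would let the script keep only one output file open at a time.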
- Thu Jun 29, 2006 8:13 am
- Forum: IBM<sup>®</sup> DataStage Enterprise Edition (Formerly Parallel Extender/PX)
- Topic: Difference between Explicit sort and Sort on partition
- Replies: 3
- Views: 2052
Difference between Explicit sort and Sort on partition
This is a question that has been lingering in my mind for a long time, and I thought it best to ask the forum for opinions. It relates to stages like Join/Aggregator, where the data needs to be partitioned and sorted. In the chapter pertaining to the Join stage for the parallel job develop...
- Wed Jun 07, 2006 4:41 pm
- Forum: IBM<sup>®</sup> Infosphere DataStage Server Edition
- Topic: Appending values from two columns into one
- Replies: 10
- Views: 3739
- Wed Jun 07, 2006 3:49 pm
- Forum: IBM<sup>®</sup> Infosphere DataStage Server Edition
- Topic: Appending values from two columns into one
- Replies: 10
- Views: 3739
- Fri Jun 10, 2005 12:38 pm
- Forum: IBM<sup>®</sup> DataStage Enterprise Edition (Formerly Parallel Extender/PX)
- Topic: ABS() function
- Replies: 14
- Views: 14590