Page 1 of 1

Posted: Tue Jan 21, 2003 12:41 am
by raviyn
U can split on the basis of the @INROWNUM System variable.
In the transformer put constraints on the above variable for the two links.

Posted: Tue Jan 21, 2003 7:46 am
by chulett
That assumes, of course, that you know the number of rows coming in. If you don't, you are going to need to do something to capture that number (assuming that it is a variable) and send it in to your job. You could write a job to count the rows and pass that information, suitably divided, to the splitting job. On Unix, you can take advantage of commands launched by Job Control to do "wc -l" on the file to get the number of records. Depending on what needs to happen with the two files, you could script something using both "wc" and "split" and then process the split files with DataStage.

As they say, more than one way to skin a cat.

-craig

Posted: Tue Jan 21, 2003 3:56 pm
by ray.wurlod
FWIW, it is also possible, using DataStage BASIC code, to obtain the number of lines in a file opened for sequential access. This might be relevant, for example, on a Windows platform where you don't have access to the wc command.

Posted: Tue Jan 21, 2003 11:50 pm
by ray.wurlod
Aggregate (rule "Last") with output into a hashed file record with key of "X".
Refer to this record in subsequent Transformer stage, in which its output link contstraints are drive by "< (value/2)" and ">= (value/2)"


Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518

Posted: Tue Jan 21, 2003 11:52 pm
by ray.wurlod
Does it HAVE to be first half and second half?
Would odd rows and even rows be an acceptable split?
In this case, you don't need to pre-count the rows, and can use the Mod(@INROWNUM,2) and Not(Mod(@INROWNUM,2)) constraint expressions.


Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518

Posted: Wed Jan 22, 2003 3:00 am
by raviyn
I think ray has explained the first part of it
What i meant by splitting with the help of @INROWNUM is something like in one link in transformer the constraint would be @INROWNUM < 500 and in another @INROWNUM >=500....Now if plan to split it exactly into two with the no. of records unkown first u need to get the count of records which if u are not comfortable with routines might look in two jobs :
1) in this job u will count the no. of records with the help of aggregator possibly then pass a parameter(VALUE) to next job with this count
2) in the other job ur constraints might be like < Value/2 and >=Value/2 which is as told by ray earlier only.

But if u are just trying to split based on the way the way the i/p sequential file is coming then whatz the second part of the question u are asking?[?]

Posted: Wed Jan 22, 2003 4:29 am
by ray.wurlod
It CAN be done all in one job, because DataStage cannot open the output link from the Hashed File stage until its input link is closed (by the fact of receiving an "end of data" token).


Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518