File Stage

Post questions here relating to DataStage Server Edition, covering such areas as Server job design, DS Basic, routines, job sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
vzmz
Participant
Posts: 36
Joined: Sun Nov 23, 2003 12:10 pm
Location: Dallas

File Stage

Post by vzmz »

Hi folks,
I have been playing with the file stage for a couple of days.
I have a fixed-length file that may contain records with extra characters.
In the columns of the stage I have defined all the fields as char with their respective sizes, since all of my records are fixed width.
And I have a reject link.
Let's say the total record size is 12 characters:

field f1 of 4 chars
field f2 of 4 chars
field f3 of 4 chars
--------------------
Total 12

Here are some sample records:

1234ABCD1234 (size 12) fixed width
ABCD1234ABCD (size 12) fixed width
1234567890ab (size 12) fixed width
1234abcd1234a (size 13) one extra char
abcd1234abc (size 11) one char short

It's a Unix-based file.
I have to do some validation and transformation (an ETL process).

The first and foremost step is to set up a stage and define the schema with which I will be reading it.

I want to read only those records whose size is exactly 12 and have all the rest rejected (that is, pushed to the reject link).
My question is: how should my schema be defined?

Under the Properties tab
-----------------------
I specify the file name and set the reject mode to "output".


Under the Format tab
---------------------
This is where I am stuck.
We have record-level and field default properties. What I am doing is:

Under record level:
Record delimiter = "UNIX newline" (since it's a Unix-based file)
Record length = "fixed" (we also have "end" and "none" options)

Under field defaults:
Delimiter = "none"

Under the Columns tab
-----------------------
f1 char(4)
f2 char(4)
f3 char(4)


I want to run this job without any warnings; what more do I have to change, given that I also want to reject all records whose size is more or less than 12 characters?

When I click View Data, any records whose size is more or less than 12 are rejected.
When I run the job, I get warnings at the positions where the data is longer or shorter,

such as:

- did not consume entire input
- input buffer overrun

I don't want these warnings.
What changes do I have to make to get the rejected records, and no warnings, when I execute the job?
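For reference, the rule described above (accept only 12-character records split into three 4-character fields, reject everything else) can be sketched outside DataStage. This is illustrative Python, not DataStage code; the field names follow the post:

```python
# Sketch of the fixed-width validation described above (not DataStage code):
# accept only 12-character records, split them into f1/f2/f3,
# and send everything else to a reject stream.
FIELD_WIDTHS = [("f1", 4), ("f2", 4), ("f3", 4)]
RECORD_LEN = sum(w for _, w in FIELD_WIDTHS)  # 12

def split_records(lines):
    good, rejected = [], []
    for line in lines:
        if len(line) != RECORD_LEN:
            rejected.append(line)      # wrong width -> reject stream
            continue
        record, pos = {}, 0
        for name, width in FIELD_WIDTHS:
            record[name] = line[pos:pos + width]
            pos += width
        good.append(record)
    return good, rejected

# Using the sample records from the post:
samples = ["1234ABCD1234", "ABCD1234ABCD", "1234567890ab",
           "1234abcd1234a", "abcd1234abc"]
good, rejected = split_records(samples)
# The three 12-character records parse; the 13- and 11-character ones are rejected.
```
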

Thanks in advance
Teej
Participant
Posts: 677
Joined: Fri Aug 08, 2003 9:26 am
Location: USA

Re: File Stage

Post by Teej »

vzmz wrote:I want to read only those records whose size is exactly 12 and have all the rest rejected (that is, pushed to the reject link).
My question is: how should my schema be defined?
You are asking how to handle dynamic lengths for a flat file input (i.e. bad fields).

1. Use QualityStage.

OR

1. Set everything to varchar of an undefined length.

2. Establish a transform stage after the input. Within this transform stage, put a constraint (using the Len() function). Send the rejected records wherever you want. Experiment a bit with this concept so you can fully understand how it works.
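The constraint Teej describes (read each line as one varchar, then route on its length in the transform stage) boils down to a single length test. Sketched here in plain Python rather than Transformer constraint syntax, as an illustration only:

```python
# Illustrative stand-in for a Transformer constraint like "Len(in.s) = 12":
# rows satisfying the constraint continue down the main link,
# the rest go down a second link used as a home-made reject link.
def route(rows, expected_len=12):
    main = [r for r in rows if len(r) == expected_len]
    reject = [r for r in rows if len(r) != expected_len]
    return main, reject

main, reject = route(["1234ABCD1234", "abcd1234abc"])
```
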

-T.J.
Developer of DataStage Parallel Engine (Orchestrate).
vzmz
Participant
Posts: 36
Joined: Sun Nov 23, 2003 12:10 pm
Location: Dallas

Re: File Stage

Post by vzmz »

Teej wrote:
vzmz wrote:I want to read only those records whose size is exactly 12 and have all the rest rejected (that is, pushed to the reject link).
My question is: how should my schema be defined?
You are asking how to handle dynamic lengths for a flat file input (i.e. bad fields).

1. Use QualityStage.

OR

1. Set everything to varchar of an undefined length.

2. Establish a transform stage after the input. Within this transform stage, put a constraint (using the Len() function). Send the rejected records wherever you want. Experiment a bit with this concept so you can fully understand how it works.

-T.J.
But that way I won't be able to make use of the reject link. I am doing something else in the transformations.
If I add this to the transformation, it gets more complex.

If I make it a varchar then no rows will be rejected, so I end up doing that in a transformer stage anyway.

Apart from this, how can I make use of the reject link in a file stage without warnings?
vzoubov
Participant
Posts: 28
Joined: Tue Feb 05, 2002 12:30 pm
Location: Boston, MA

Re: File Stage

Post by vzoubov »

vzmz wrote:
Teej wrote:
vzmz wrote:I want to read only those records whose size is exactly 12 and have all the rest rejected (that is, pushed to the reject link).
My question is: how should my schema be defined?
You are asking how to handle dynamic lengths for a flat file input (i.e. bad fields).

1. Use QualityStage.

OR

1. Set everything to varchar of an undefined length.

2. Establish a transform stage after the input. Within this transform stage, put a constraint (using the Len() function). Send the rejected records wherever you want. Experiment a bit with this concept so you can fully understand how it works.

-T.J.
But that way I won't be able to make use of the reject link. I am doing something else in the transformations.
If I add this to the transformation, it gets more complex.

If I make it a varchar then no rows will be rejected, so I end up doing that in a transformer stage anyway.

Apart from this, how can I make use of the reject link in a file stage without warnings?
Instead of a transformer you can use modify and switch:

osh "
  import
    -schema record
      (
        s: string;
      )
    -file 'var_length.dat'
    -rejects continue
    -reportProgress yes
  |
  modify
    'l = string_length(s); s = s;'
  |
  switch
    -case 12=0
    -ifnotfound allow
    -key l

  0> good_record.v
  1> bad_record.v;

  peek
    -name
    < good_record.v;

  peek
    -name
    < bad_record.v;
"
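In plain terms, the modify operator above adds a computed column l = string_length(s), and switch routes l = 12 to output 0 (good) while -ifnotfound allow sends every other length to output 1 (bad). A rough, illustrative Python equivalent of that routing:

```python
# Rough equivalent of the osh pipeline above (illustrative only):
# modify:  l = string_length(s)
# switch:  -case 12=0 routes length 12 to output 0;
#          -ifnotfound allow routes any other length to output 1.
def modify_and_switch(records):
    outputs = {0: [], 1: []}
    for s in records:
        l = len(s)                      # modify: l = string_length(s)
        case = 0 if l == 12 else 1      # switch on the computed key l
        outputs[case].append(s)
    return outputs

outs = modify_and_switch(["1234ABCD1234", "1234abcd1234a"])
```
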
Vitali.
Teej
Participant
Posts: 677
Joined: Fri Aug 08, 2003 9:26 am
Location: USA

Re: File Stage

Post by Teej »

vzmz wrote:But that way i won't be able to make use of the reject link. I am doing something else in the transformations.
If i add this into transofrmation its getting more complex
This is the type of philosophy that you need to let go of. This is the ETL world, where things are far easier to implement and try. Why are you so concerned with using a 'reject' link, when you can handle the 'rejected' records yourself with only 10 minutes of work?

And why do you have to do everything within one transform stage, when you can use two?

I have struggled with this mindset at my job so much that it is driving me to the point where I am ready to quit. It is bad enough to have to explain why using parameters is _GOOD_, not _BAD_, as one person seems to have a fixation on.

Instead of painting yourself into a corner by requiring that you MUST use a reject link, question what your requirements really are. I bet they do not even care whether there's a broken-line link out of your Oracle stage, as long as something gets reported somewhere.

-T.J.
Developer of DataStage Parallel Engine (Orchestrate).
vzmz
Participant
Posts: 36
Joined: Sun Nov 23, 2003 12:10 pm
Location: Dallas

Re: File Stage

Post by vzmz »

Teej wrote:
vzmz wrote:But that way i won't be able to make use of the reject link. I am doing something else in the transformations.
If i add this into transofrmation its getting more complex
This is the type of philosophy that you need to let go of. This is the ETL world, where things are far easier to implement and try. Why are you so concerned with using a 'reject' link, when you can handle the 'rejected' records yourself with only 10 minutes of work?

And why do you have to do everything within one transform stage, when you can use two?

I have struggled with this mindset at my job so much that it is driving me to the point where I am ready to quit. It is bad enough to have to explain why using parameters is _GOOD_, not _BAD_, as one person seems to have a fixation on.

Instead of painting yourself into a corner by requiring that you MUST use a reject link, question what your requirements really are. I bet they do not even care whether there's a broken-line link out of your Oracle stage, as long as something gets reported somewhere.

-T.J.
Thanks, Teej.

As you suggested, I did the validation using a transformer stage instead of the reject link. Basically I had to make two jobs, one for the length validation and the other for nulls and duplicates, and then I sequenced them with a job sequencer.
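The two-job split described here (job 1: length validation; job 2: null and duplicate checks, run in order by a sequencer) can be sketched as two independent passes. This is an illustration only; the field widths follow the post, and the "blank field means null" convention is an assumption:

```python
# Sketch of the two validation jobs run in sequence (illustrative only;
# the "blank field means null" convention is an assumption).
def job1_length(lines, record_len=12):
    """Job 1: keep only records of the expected fixed width."""
    return [l for l in lines if len(l) == record_len]

def job2_nulls_and_dups(lines):
    """Job 2: drop records with blank (null) fields, then drop duplicates."""
    seen, out = set(), []
    for l in lines:
        f1, f2, f3 = l[0:4], l[4:8], l[8:12]
        if not all(f.strip() for f in (f1, f2, f3)):
            continue                      # blank field -> treat as null
        if l in seen:
            continue                      # duplicate record
        seen.add(l)
        out.append(l)
    return out

# Sequenced like the job sequencer: job 1 feeds job 2.
clean = job2_nulls_and_dups(job1_length(
    ["1234ABCD1234", "1234ABCD1234", "1234    1234", "abcd1234abc"]))
```
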
Teej
Participant
Posts: 677
Joined: Fri Aug 08, 2003 9:26 am
Location: USA

Re: File Stage

Post by Teej »

vzmz wrote:As you suggested, I did the validation using a transformer stage instead of the reject link. Basically I had to make two jobs, one for the length validation and the other for nulls and duplicates, and then I sequenced them with a job sequencer.
Hey, that's a pretty good idea. I would have done it in one job, but then if one part failed, the other would not run its necessary task.

My problem is breaking a job up into manageable pieces. Hehe. You should see some of the jobs I've made, especially after quite a few extra requirements came down the pipeline after the job was designed.

-T.J.
Developer of DataStage Parallel Engine (Orchestrate).
Mike3000
Participant
Posts: 24
Joined: Mon Mar 26, 2007 9:16 am

Post by Mike3000 »

Thanks, Teej, for the excellent lines that you wrote!

Teej wrote:This is the type of philosophy that you need to let go of. This is the ETL world, where things are far easier to implement and try. Why are you so concerned with using a 'reject' link, when you can handle the 'rejected' records yourself with only 10 minutes of work?

And why do you have to do everything within one transform stage, when you can use two?

I have struggled with this mindset at my job so much that it is driving me to the point where I am ready to quit. It is bad enough to have to explain why using parameters is _GOOD_, not _BAD_, as one person seems to have a fixation on.

Instead of painting yourself into a corner by requiring that you MUST use a reject link, question what your requirements really are. I bet they do not even care whether there's a broken-line link out of your Oracle stage, as long as something gets reported somewhere.

-T.J.


Many programmers have fallen into that trap and continue to do so. I'm not sure why; maybe because it is human nature to try to build the perfect thing that everybody has in their mind, even if it doesn't match reality (the BRD), as long as it satisfies their vision of the problem, and they are ready to defend it by any means...
Edited by Mike3000
Post Reply