Transformer output rows reduced!

dsproj2003
Participant
Posts: 21
Joined: Wed Oct 01, 2003 11:53 am

Post by dsproj2003 »

Hi,

I have a Transformer stage with 'x' rows in its input stream.
There is no constraint on the Transformer.

Even so, the output is 'y' rows, where y < x.

There are no corresponding warnings/messages in the log file.

Any pointers to this?

Thanks in advance.

Regards,
Nitin
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

Nitin

If you are talking about the counts in the DataStage Monitor, then I have never seen that before. If you are talking about doing counts on the tables before and after a job runs, then you could have two records with the same key that both update the same record in the target table.

If it is the first case, then what OS, what version of DS, and what database?
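
The same-key case can be sketched in a few lines of Python. This is only an illustration of the idea, not DataStage code; the function name, the "id" key column, and the sample rows are all made up for the example.

```python
def apply_upserts(rows, key):
    """Insert each row, or overwrite the row already stored under the same key."""
    target = {}
    for row in rows:
        target[row[key]] = row  # a repeated key replaces the earlier row
    return target


# Hypothetical source rows; two of them share the key value 2.
source_rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 2, "name": "c"},
]

target = apply_upserts(source_rows, "id")
print(len(source_rows), len(target))
```

Three rows are read but only two distinct keys exist, so a count on the target comes up one short without any row being rejected.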

Kim.

Kim Duke
DwNav - ETL Navigator
www.Duke-Consulting.com
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Please supply your source and target stage types. Where are you getting your counts? From the Monitor or are you line counting the source and targets?

Kenneth Bland
badhri
Participant
Posts: 42
Joined: Tue Mar 19, 2002 8:15 pm

Post by badhri »

Maybe we should check the update strategy in the target stage. This can happen if it uses an Insert/Update or Update/Insert strategy.

Badhri ...

Badhrinath Krishnamoorthy
www.cognizant.com
dsproj2003
Participant
Posts: 21
Joined: Wed Oct 01, 2003 11:53 am

Post by dsproj2003 »

Okay, further details are below:

Source Stage: Dataset (or an output stream from a previous lookup stage)

Target stage: DataSet

I am viewing the number of records flowing through each link using either of the following:
-'Show performance statistics' Option
-(job) Monitor

Both essentially give the same data.

Could this dropping of records be related to NULLs in some field values?

Please suggest.
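
One way to test the NULL suspicion outside DataStage, assuming the input and output of the Transformer can both be dumped to delimited text, is to compare the two dumps and look at which rows went missing. The sketch below is Python with made-up sample data; it assumes a NULL shows up as an empty field in the dump, which may not match how a given export writes NULLs.

```python
import csv
import io
from collections import Counter

# Hypothetical sample dumps; an empty field stands in for NULL.
INPUT_TEXT = "1,a,x\n2,,x\n3,c,\n4,d,x\n"
OUTPUT_TEXT = "1,a,x\n4,d,x\n"


def read_rows(text):
    """Parse delimited text into a list of row tuples."""
    return [tuple(row) for row in csv.reader(io.StringIO(text))]


def missing_rows(input_rows, output_rows):
    """Return input rows, with multiplicity, that never reached the output."""
    remaining = Counter(input_rows) - Counter(output_rows)
    return list(remaining.elements())


dropped = missing_rows(read_rows(INPUT_TEXT), read_rows(OUTPUT_TEXT))
dropped_with_empty = [row for row in dropped if "" in row]
print(len(dropped), len(dropped_with_empty))
```

If every dropped row has at least one empty field, that points toward NULL handling; if not, the cause is probably elsewhere.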

Regards,
Nitin
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

Nitin

If that were true, you would get a warning in the log. Also, what type are your source and target: ODBC, OCI, or something else?

Kim.

Kim Duke
DwNav - ETL Navigator
www.Duke-Consulting.com
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Okay folks, he's on Parallel Extender, and he's talking about a ".ds" dataset file. We're all firing off Server answers and he's got a parallel job.

Kenneth Bland
dsproj2003
Participant
Posts: 21
Joined: Wed Oct 01, 2003 11:53 am

Post by dsproj2003 »

Kim,

My source and target stages are simply Data Sets (or a Lookup stage).

I am not using any ODBC, or any database calls for the jobs in question.

As I said, I am not getting any warning messages corresponding to the Transformer in the log file.

DataStage is PX 6.0.

Regards,
Nitin
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Nitin, you're going to have to be VERY explicit with your posts. There are over 2000 installations of DataStage Server out there, and only tens of PX, so everyone assumes Server-based questions unless PX is stated.

What is your partitioning scheme? Have you specified unique? Did you switch node pools in between?

Kenneth Bland
dsproj2003
Participant
Posts: 21
Joined: Wed Oct 01, 2003 11:53 am

Post by dsproj2003 »

I understand, Kenneth.

I will try to be more explicit now.

-partitioning scheme: It is set to 'Auto' in all prior stages in that job. Data sets being used are set with Preserve partitioning = 'Default (Propagate)'.

-Job Nodes Pools: 4 nodes i.e. 4x4

No, I am not switching node pools in between. I am keeping 4x4 nodes for all jobs.

Btw I did not understand your following question:
Have you specified unique?

Regards,
Nitin
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Just a thought. It might be worth checking whether a DataSet can have duplicate rows or whether, like hashed files in server jobs, a duplicate (key) performs a destructive overwrite (with no warnings).

Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

There is a checkbox somewhere when specifying the partitioning to eliminate duplicate rows (UNIQUE option?). I'm speaking from memory, as I haven't worked with PX in a year. Check your documentation.

Kenneth Bland
dsproj2003
Participant
Posts: 21
Joined: Wed Oct 01, 2003 11:53 am

Post by dsproj2003 »

Based on the ongoing discussion, I have certain related questions:

-When we talk about a Transformer, there is no such thing as a key, is there?
So where does the question of two records being the same come in?

Is it something related to the dataset?

I even tried using a sequential file as the output from the Transformer, and that has the same problem (receiving fewer records than in the input, with no constraint).

What is the solution? I mean my requirement is to have all the records in input stream (unique or non unique) sent to the output stream of the transformer.

Could you please clarify the concept if I am missing out on something.

Regards,
Nitin
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

I would refer you to www.datastagexchange.com, where there's a PX-specific forum moderated by bigpoppa. You'll probably get the best answers there.

Kenneth Bland
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

It's really difficult to diagnose this without an explicit explanation of the job design. For example, you did not mention whether there is any constraint on the Transformer stage's output link. If there were, it would be expected to limit the number of rows output.
Were I consulting to solve this, I would need to look at the job in detail, either on site or by having an export of the job plus sample data mailed to me.