Equivalent to Incomplete Column in Parallel Sequential File

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

thiagol
Participant
Posts: 3
Joined: Wed Feb 16, 2005 8:08 am

Equivalent to Incomplete Column in Parallel Sequential File

Post by thiagol »

I have a sequential file that has different record types.

F1,RT45,TRADED,20050215
F2,RT45,TRADED,459999959,5389522.660000,L,USD
F3,20040926,1

I am using a Sequential File stage to read the file, and then I plan to break up the stream with a Switch stage or a Transformer stage.

My metadata is very generic where every column is a varchar.

The documentation says that I can set the Incomplete Column property to fill the missing columns with null values, but I can only find that property in a Server job.

Question is: what's the equivalent of the Incomplete Column property in a Parallel job? Is there another way to achieve the same goal?
richdhan
Premium Member
Posts: 364
Joined: Thu Feb 12, 2004 12:24 am

Post by richdhan »

Hi Thiagol,

How about using runtime column propagation (RCP)? RCP can be enabled through the DataStage Administrator, and since you are using a Sequential File stage you will also need a schema file.
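A schema file is just an Orchestrate record schema saved in a plain text file. A rough sketch for a comma-delimited source follows; the column names, lengths and field properties here are made up, so adjust them to your own data and widen the record to the 7 columns of your F2 records.

Code: Select all

// Illustrative schema file -- field names and lengths are invented
record
  {final_delim=end, delim=',', quote=double}
(
  rectype: string[max=10];
  col2:    nullable string[max=50] {null_field=''};
  col3:    nullable string[max=50] {null_field=''};
  col4:    nullable string[max=50] {null_field=''};
)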

HTH
Rich
thiagol
Participant
Posts: 3
Joined: Wed Feb 16, 2005 8:08 am

Post by thiagol »

Rich, thank you for your prompt response, but I am not sure that runtime column propagation will solve my problem.

The documentation states this about runtime column propagation: "You can define part of your schema and specify that, if your job encounters extra columns that are not defined in the meta data when it actually runs, it will adopt these extra columns and propagate them through the rest of the job."

I have defined generic metadata where the number of columns is the same as the number of columns in the largest record, which are my F2 records. What I am trying to achieve is to fill the columns that are not present in the other record types with null values.
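In other words, with 7 generic varchar columns I would want the shorter records to be read as if they had been padded out like this (missing columns left empty / null):

Code: Select all

F1,RT45,TRADED,20050215,,,
F2,RT45,TRADED,459999959,5389522.660000,L,USD
F3,20040926,1,,,,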
battaliou
Participant
Posts: 155
Joined: Mon Feb 24, 2003 7:28 am
Location: London

Post by battaliou »

Hello thiagol

If you've established your number of columns, you could always "doctor" your incoming data in a Transformer which adds the extra columns onto your source. To do this, you simply read your input data as a single string (i.e. only one column in the source stage), use DCount to establish the number of columns, and append any extra commas as required.

Alternatively, you could concatenate the maximum number of column delimiters (commas) onto every string you read and simply ignore any columns outside of the required range.

Use the Field function to extract the individual columns inside the Transformer, along the lines of the sketch below.
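Roughly, the Transformer derivations could look like this; the link, stage variable and column names are invented for illustration, and 7 is the width of the F2 records:

Code: Select all

* Stage variables (names are made up)
svNumCols : DCount(lnk_in.RawLine, ",")
svPadded  : lnk_in.RawLine : Str(",", 7 - svNumCols)

* Output column derivations, e.g. the fourth field of the padded line
Field(svPadded, ",", 4)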

Regards
Martin Battaliou
3NF: Every non-key attribute must provide a fact about the key, the whole key, and nothing but the key. So help me Codd.
dsxuserrio
Participant
Posts: 82
Joined: Thu Dec 02, 2004 10:27 pm
Location: INDIA

Post by dsxuserrio »

Thiagol
I don't think there is an equivalent to the Incomplete Column action in the parallel version. If you take a closer look, other things are missing too, such as the Data Element drop-down.

The basic reason is that the underlying infrastructure for all Parallel Extender stages is the Orchestrate library. In PX, a sequential file read is a simple import with a given schema, nothing more. Here is the osh generated for a Sequential File stage:

Code: Select all

#### STAGE: Sequential_File_1
## Operator
import
## Operator options
-schema record
  {final_delim=end, delim=',', quote=double}
(
  a:string[];
  b:ustring;
)
-file  '/tmp/test.dat'
-rejects continue
-reportProgress yes

Whereas Server jobs have a different underlying engine altogether, which explains some of the differences such as hashed files and BASIC.
dsxuserrio

Kannan.N
Bangalore,INDIA