I have a sequential file that contains several different record types:
F1,RT45,TRADED,20050215
F2,RT45,TRADED,459999959,5389522.660000,L,USD
F3,20040926,1
I am using a Sequential File stage to read the file, and then I plan to split the records with a Switch stage or a Transformer stage.
My metadata is very generic: every column is a varchar.
The documentation says that I can set the Incomplete Column property to fill the missing columns with null values, but I can only find that property in a Server job.
My question is: what is the equivalent of the Incomplete Column property in a Parallel job? Is there another way to achieve the same goal?
Equivalent to Incomplete Column in Parallel Sequential File
Rich, thank you for your prompt response, but I am not sure that using runtime column propagation will solve my problem.
The documentation says this about runtime column propagation:
You can define part of your schema and specify that, if your job encounters extra columns that are not defined in the metadata when it actually runs, it will adopt these extra columns and propagate them through the rest of the job.
I have defined generic metadata where the number of columns is the same as the number of columns in the largest record type, which is my F2 records. What I am trying to achieve is to fill the columns that are not present in the other record types with null values.
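To make the goal concrete, here is a small sketch in plain Python rather than DataStage (the column count of 7 and the function names are just for illustration, matching the widest F2 record above):

```python
# Sketch of the desired behaviour (plain Python, not DataStage):
# pad every record out to the column count of the widest record
# type (the F2 rows have 7 fields), leaving the missing trailing
# fields empty so downstream stages see a uniform layout.

MAX_COLS = 7  # field count of an F2 record

def pad_record(line, max_cols=MAX_COLS, delim=","):
    fields = line.rstrip("\n").split(delim)
    # append empty (null-like) fields until we reach max_cols
    fields += [""] * (max_cols - len(fields))
    return delim.join(fields)

for rec in ["F1,RT45,TRADED,20050215",
            "F2,RT45,TRADED,459999959,5389522.660000,L,USD",
            "F3,20040926,1"]:
    print(pad_record(rec))
```

After padding, every record has seven columns and the generic varchar metadata parses all three record types cleanly.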
Hello thiagol
If you've established your number of columns, you could always "doctor" your incoming data in a Transformer which adds the extra columns onto your source. To do this, you simply read your input data as a single string (i.e. only one column in the source stage), use dcount() to establish the number of columns, and append any extra ','s as required.
Alternatively, you could concatenate the maximum number of ','s onto every string you read and simply ignore any columns outside the required range.
Use the field() function to extract individual columns inside the Transformer.
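As a rough illustration, in plain Python rather than DataStage BASIC (dcount and field here are just stand-ins mimicking the functions named above, and the column count of 7 is assumed from the F2 records):

```python
# Plain-Python stand-ins for the BASIC functions mentioned above:
# dcount() counts delimited fields, field() extracts the n-th one.

def dcount(s, delim=","):
    return s.count(delim) + 1 if s else 0

def field(s, delim, n):
    parts = s.split(delim)
    return parts[n - 1] if 1 <= n <= len(parts) else ""

MAX_COLS = 7                            # width of the largest record type
row = "F3,20040926,1"                   # a short F3 record
row += "," * (MAX_COLS - dcount(row))   # append the missing ','s

print(dcount(row))         # now 7 columns
print(field(row, ",", 2))  # 20040926
print(field(row, ",", 5))  # '' (an empty, null-like column)
```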
Regards
Martin Battaliou
3NF: Every non-key attribute must provide a fact about the key, the whole key, and nothing but the key. So help me Codd.
Thiagol
I don't think there is an equivalent to the Incomplete Column action in the Parallel version. If you take a closer look, other things are missing too, such as the Data Element drop-down.
The basic reason is that the underlying infrastructure for all Parallel Extender stages is the Orchestrate library. In PX, a Sequential File stage is a simple import with a given schema, nothing more.
Server jobs, on the other hand, have a different underlying model, which explains some of the other differences as well, like hashed files and BASIC.
Code:
#### STAGE: Sequential_File_1
## Operator
import
## Operator options
-schema record
{final_delim=end, delim=',', quote=double}
(
a:string[];
b:ustring;
)
-file '/tmp/test.dat'
-rejects continue
-reportProgress yes
dsxuserrio
Kannan.N
Bangalore,INDIA