Working with variable length record

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
vzmz
Participant
Posts: 36
Joined: Sun Nov 23, 2003 12:10 pm
Location: Dallas

Working with variable length record

Post by vzmz »

I have a file of variable length records out of which i have to take out only 2 types of records of size 263 and 265, means all the record whose size are of either 263 or 265 should be filtered out.
How can i do it ?
Thanks
girishoak
Participant
Posts: 65
Joined: Wed Oct 29, 2003 3:54 am

Post by girishoak »

In transformation stage put a constraint as follows

Code: Select all

 LEN(record string) =! 263 and len(record string) !=265
If you have multiple columns, you need to concatinate them to get a single record.

All the best

Girish Oak
Amos.Rosmarin
Premium Member
Premium Member
Posts: 385
Joined: Tue Oct 07, 2003 4:55 am

Post by Amos.Rosmarin »

The is no stage that can handle variable record format in an open environment but you can do the following :

You can read the file as a single column of 265.
then split the row according to the record id (using substring)
and write it a a single field into a sequential stage.
Now you can read each file with it's own table definition using the original delimiter.


HTH,
Amos
vzmz
Participant
Posts: 36
Joined: Sun Nov 23, 2003 12:10 pm
Location: Dallas

Not reading any record of size 263

Post by vzmz »

Folks the thing is when i have to a squential stage for the file with max len of 265 define for columns.
For LEN(record string) =! 263 and len(record string) !=265 to work
The stage has to out put both the record size to the transform stage, whereas it just gives you only those records that are 265 to the transform stage leaving 263 behind so u end up having only those records of 265.

Even if i tried 263 it will give all 263 and 265 removing 2 char from the end of 265 record size.

What i have figure out is if i am able to read both the records from the sequential stage i can easzly segregate the 2.
Thanks
clshore
Charter Member
Charter Member
Posts: 115
Joined: Tue Oct 21, 2003 11:45 am

Post by clshore »

In a PX job you can use the 'Column Import' stage to 'parse apart a single binary or string column into multiple columns'.
Use a Transformer stage, with Constraints for the input record length to route the record to a CI stage that will convert it (or maybe a Filter stage, though I have not yet used one myself). You can define the input record to the Transformer as a single column varchar or varbinary to handle the variable length without truncating. You create a CI stage for each of the record formats that you will process.

A CI stage can also use an Orch schema file to define the target at runtime, which can be supplied as a job parm. Be sure to enable Runtime Column Propagation if you do this.

Carter
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Can't you just use a contraint in the first transform and have two output links - one for 263 and one for 265?

Depending on your data and how you setup the sequential stage, you may need to Trim the record string to tell the difference between the two.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Teej
Participant
Posts: 677
Joined: Fri Aug 08, 2003 9:26 am
Location: USA

Post by Teej »

This is a Parallel Extender Question and Answer

What is asked:

How can I drop records from a sequential file that is of a specific length? How can I drop records that have a multiple set of specified lengths?

Answer:

Create a simple and independent job. For this job, you will have a Sequential Stage outputting to a Transform. You can have either one or two output link, depending on whether you wish to preserve those dropped records or not. If you do not need to preserve those dropped records, you only need one output link. If you do, you will need to create a reject LINK (not reject constraints, which is different).

On your sequential stage, have a single VARCHAR field of an undefined length. You do not wish to have any field defaults, so set both delimiter and quote to none. On the record level property, you need Record Delimiter to be an UNIX newline (which is what you noted as your operating system of choice).

On the transform stage, output the field, and have a constraint defined:

Code: Select all

len(inputlink.fieldname) <> [length you wish to filter out]
If you have more than one lengths, add an "or" after the above statement, and repeat.

Any fields that fail the above constraint are dropped, not sent to the rejected constraint. This may be a 6.x issue, and not present in 7.x. I would advise that you run a small test to verify this behavior.

On your output link, I would advise preserving it to another sequential file, however, it is possible to split up the fields into defined format that you may wish to use for this set of data.

-T.J.
Developer of DataStage Parallel Engine (Orchestrate).
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Teej wrote:This is a Parallel Extender Question and Answer
Hey T.J. - What in your answer is exclusive to PE? Isn't this a generic solution that would apply equally well to PE and Server jobs? :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
Teej
Participant
Posts: 677
Joined: Fri Aug 08, 2003 9:26 am
Location: USA

Post by Teej »

chulett wrote:Hey T.J. - What in your answer is exclusive to PE? Isn't this a generic solution that would apply equally well to PE and Server jobs? :?
I qualified it as a PX solution because I only tested that particular solution on PX. I suspect it would work on Server too, but I am not that confident, especially with how PX drop instead of reject the constrainted records.

-T.J.
Developer of DataStage Parallel Engine (Orchestrate).
vzoubov
Participant
Posts: 28
Joined: Tue Feb 05, 2002 12:30 pm
Location: Boston, MA

Re: Working with variable length record

Post by vzoubov »

[quote="vzmz"]I have a file of variable length records out of which i have to take out only 2 types of records of size 263 and 265, means all the record whose size are of either 263 or 265 should be filtered out.
How can i do it ?
Thanks[/quote]

A variable length record file can be imported by a PX External Source stage. You can easily filter out the relevant records in the program called by the stage on file import.

Vitali.
Post Reply