Working with variable length record

vzmz · Post by **vzmz** » Thu Dec 04, 2003 2:23 am

I have a file of variable length records out of which i have to take out only 2 types of records of size 263 and 265, means all the record whose size are of either 263 or 265 should be filtered out.
How can i do it ?
Thanks

girishoak · Post by **girishoak** » Thu Dec 04, 2003 5:39 am

In transformation stage put a constraint as follows

Code: Select all

 LEN(record string) =! 263 and len(record string) !=265

If you have multiple columns, you need to concatinate them to get a single record.

All the best

Girish Oak

Amos.Rosmarin · Post by **Amos.Rosmarin** » Thu Dec 04, 2003 7:14 am

The is no stage that can handle variable record format in an open environment but you can do the following :

You can read the file as a single column of 265.
then split the row according to the record id (using substring)
and write it a a single field into a sequential stage.
Now you can read each file with it's own table definition using the original delimiter.

HTH,
Amos

vzmz · Post by **vzmz** » Thu Dec 04, 2003 8:17 am

Folks the thing is when i have to a squential stage for the file with max len of 265 define for columns.
For LEN(record string) =! 263 and len(record string) !=265 to work
The stage has to out put both the record size to the transform stage, whereas it just gives you only those records that are 265 to the transform stage leaving 263 behind so u end up having only those records of 265.

Even if i tried 263 it will give all 263 and 265 removing 2 char from the end of 265 record size.

What i have figure out is if i am able to read both the records from the sequential stage i can easzly segregate the 2.
Thanks

clshore · Post by **clshore** » Thu Dec 04, 2003 8:43 am

In a PX job you can use the 'Column Import' stage to 'parse apart a single binary or string column into multiple columns'.
Use a Transformer stage, with Constraints for the input record length to route the record to a CI stage that will convert it (or maybe a Filter stage, though I have not yet used one myself). You can define the input record to the Transformer as a single column varchar or varbinary to handle the variable length without truncating. You create a CI stage for each of the record formats that you will process.

A CI stage can also use an Orch schema file to define the target at runtime, which can be supplied as a job parm. Be sure to enable Runtime Column Propagation if you do this.

Carter

chulett · Post by **chulett** » Thu Dec 04, 2003 9:25 am

Can't you just use a contraint in the first transform and have two output links - one for 263 and one for 265?

Depending on your data and how you setup the sequential stage, you may need to Trim the record string to tell the difference between the two.

Teej · Post by **Teej** » Thu Dec 04, 2003 10:16 am

This is a Parallel Extender Question and Answer

What is asked:

How can I drop records from a sequential file that is of a specific length? How can I drop records that have a multiple set of specified lengths?

Answer:

Create a simple and independent job. For this job, you will have a Sequential Stage outputting to a Transform. You can have either one or two output link, depending on whether you wish to preserve those dropped records or not. If you do not need to preserve those dropped records, you only need one output link. If you do, you will need to create a reject LINK (not reject constraints, which is different).

On your sequential stage, have a single VARCHAR field of an undefined length. You do not wish to have any field defaults, so set both delimiter and quote to none. On the record level property, you need Record Delimiter to be an UNIX newline (which is what you noted as your operating system of choice).

On the transform stage, output the field, and have a constraint defined:

Code: Select all

len(inputlink.fieldname) <> [length you wish to filter out]

If you have more than one lengths, add an "or" after the above statement, and repeat.

Any fields that fail the above constraint are dropped, not sent to the rejected constraint. This may be a 6.x issue, and not present in 7.x. I would advise that you run a small test to verify this behavior.

On your output link, I would advise preserving it to another sequential file, however, it is possible to split up the fields into defined format that you may wish to use for this set of data.

-T.J.

chulett · Post by **chulett** » Thu Dec 04, 2003 11:28 am

Teej wrote:This is a Parallel Extender Question and Answer

Hey T.J. - What in your answer is exclusive to PE? Isn't this a generic solution that would apply equally well to PE and Server jobs?

Teej · Post by **Teej** » Thu Dec 04, 2003 3:20 pm

chulett wrote:Hey T.J. - What in your answer is exclusive to PE? Isn't this a generic solution that would apply equally well to PE and Server jobs?

I qualified it as a PX solution because I only tested that particular solution on PX. I suspect it would work on Server too, but I am not that confident, especially with how PX drop instead of reject the constrainted records.

-T.J.

vzoubov · Post by **vzoubov** » Thu Dec 04, 2003 8:09 pm

[quote="vzmz"]I have a file of variable length records out of which i have to take out only 2 types of records of size 263 and 265, means all the record whose size are of either 263 or 265 should be filtered out.
How can i do it ?
Thanks[/quote]

A variable length record file can be imported by a PX External Source stage. You can easily filter out the relevant records in the program called by the stage on file import.

Vitali.

DSXchange

Working with variable length record

Working with variable length record

Not reading any record of size 263

Re: Working with variable length record