load multiformat seq file

srividhya · Post by **srividhya** » Mon Jul 18, 2005 5:08 pm

Hi all,
iam getting a sourefile with header,detail and trailer.its a fixed widthcolumn file.
Input format
FHDR 453 ssb ---------header recid(4) fileid(4) filename(3)
3452671 boston invest--Detail cusip(7)place(10)type(15)
FTLR 6028 0.989899---Trailer trrecid(4)totalfile(5)totalamt(10)
inheader always recid is FHDR
in trailer always recid is FTLR
output :oracle in one table
table format recid,fileid,filename,cusip,place,type,trrecid,totalfile,totalamt
Thanks

sri

aditya · Post by **aditya** » Mon Jul 18, 2005 8:13 pm

Sri,

One way of solving the problem is to replace the spaces using pipe delimiter '|' this can be achived using the replace() function and then having to manipulate using the functions such as ltrim() and rtrim() with the pipe delimiters specified to actually achieve the values into the different columns.

Probably the job that needs to be designed is

OCI Stage -->Transformer --> Sequential File --> Sequential File --> OCI Stage

Good Luck
Aditya.

ray.wurlod · Post by **ray.wurlod** » Mon Jul 18, 2005 8:59 pm

That's a server job solution that's not really relevant in the parallel job environment unless you want to use a UNIX shell script to modify the text file in advance.
In the parallel job the easiest way is to read the file with a Sequential File stage whose schema matches the detail record. The header and trailer will be transmitted as a raw record onto the rejects link, which can feed into other stage types to recover the information from those records. Check out the "datatype_from_raw" possibilities in the Modify stage, then maybe a Switch stage to isolate header from footer, and so on.

aditya · Post by **aditya** » Mon Jul 18, 2005 9:18 pm

Thank you Ray.

- Aditya.

srividhya · Post by **srividhya** » Mon Jul 18, 2005 9:27 pm

Hi Ray,
please can u explain me in more detail , i have to use the shell script in before routine, and reject header,footer in reject link then use modify stage?
Thanks
Sri

ray.wurlod · Post by **ray.wurlod** » Tue Jul 19, 2005 12:52 am

You don't need a shell script - that was pointing out that Aditya's solution was not appropriate in a parallel job.

If you follow the remainder of my advice, you read the file using a Sequential File stage. In parallel jobs a Sequential File stage can have two output links, the one containing valid lines from the file (that match the schema), the other containing raw records that don't match the schema. In your case, the first link contains the detail records, the rejects link gets the header and trailer.

The rejects link ("rejects" is just a name) can nonethelessbe directed into other stage types for further processing. You have the header and trailer; you can process them however you like, simply by using the appropriate stage types.

However, the record is a raw record, so you will need to parse it into its constituent fields. There are quite a few techniques that you could use for this. My earlier post suggested a couple. Research the Parallel Job Developer's Guide and this forum for others.

srividhya · Post by **srividhya** » Tue Jul 19, 2005 2:01 pm

Hi
I did the same as ray said, apply the schema and i got the header and trailer inthe reject liknk, here the data is storting in the raw format(varbinary). now iam trying to modify the datatype from raw to string using the modify stage, so i can seperate the header and trailer.
There are no delimiters, its a fixed format header is having one format and trailer is having its own format .
the input format is
fhdr1.1 fhdr1.2
det1.1a det1.2a det1.3a
det1.1b.det1.2b det1.3b
trlr1.1 trlr1.2
fhdr2.1 fhdr2.2
det2.1a det2.2a det2.3a
det2.1b det2.2b det2.3b
trlr2.1 trlr2.2
the output is
fhdr1.1 fhdr1.2 det1.1a det1.2a det1.3a trlr1.1 trlr1.2
fhdr1.1 fhdr1.2 det1.1b.det1.2b det1.3b trlr1.1 trlr1.2
fhdr2.1 fhdr2.2 det2.1a det2.2a det2.3a trlr2.1 trlr2.2
fhdr2.1 fhdr2.2 det2.1b det2.2b det2.3b trlr2.1 trlr2.2

can u guys please help in this
thanks

Sri

ray.wurlod · Post by **ray.wurlod** » Tue Jul 19, 2005 3:12 pm

This suggests you're not getting detail lines on the other output link. This, in turn, suggests that your schema for the other output link is not correct for the detail lines.
Get that right first.
Then use a Modify stage on the rejects link to convert the raw row to a string, and decompose that string appropriately using substring techniques, perhaps in a second Modify stage.

srividhya · Post by **srividhya** » Tue Jul 19, 2005 3:25 pm

Hi Ray,
I got the detail record perfect,
I am trying in the modify stage and the error is

main_program: Error parsing modify adapter: Error in binding: Error in destination field selector: Expected integer, got: ":"; input:
recid[:string]=rejected

thanks a lot

ray.wurlod · Post by **ray.wurlod** » Tue Jul 19, 2005 3:29 pm

You will need to post the exact derivations you used in the Modify stage. Without these there's not really enough information for accurate diagnosis.

srividhya · Post by **srividhya** » Tue Jul 19, 2005 3:38 pm

Hi Ray
iam trying to change the datatype from raw to string
but according to the paralleljob developer guide , there is no way to convert varbinary to string
1)convert to int32 or int 8
2)from this convert to string I Guess
Thanks
Sri

lshort · Post by **lshort** » Thu Jul 21, 2005 7:57 am

i have a similar job. i took data from sequential reject link and passed it through a "Column Import Stage" with a Schema file that define the output columns all as varchar.
Works great. No Modify stage required.

btw: i got this idea from somewhere else on this forum...dont remember who.....you know who you are...Thx.

Abu@0403 · Post by **Abu@0403** » Mon Jun 15, 2009 10:57 am

You people are unbelievable. Thanks a lot lshort.

I was stuck up with the same problem. When I used Column Import stage it worked perfectly.

Thanks once again.