Page 1 of 1

load multiformat seq file

Posted: Mon Jul 18, 2005 5:08 pm
by srividhya
Hi all,
iam getting a sourefile with header,detail and trailer.its a fixed widthcolumn file.
Input format
FHDR 453 ssb ---------header recid(4) fileid(4) filename(3)
3452671 boston invest--Detail cusip(7)place(10)type(15)
FTLR 6028 0.989899---Trailer trrecid(4)totalfile(5)totalamt(10)
inheader always recid is FHDR
in trailer always recid is FTLR
output :oracle in one table
table format recid,fileid,filename,cusip,place,type,trrecid,totalfile,totalamt
Thanks
:x sri

Posted: Mon Jul 18, 2005 8:13 pm
by aditya
Sri,

One way of solving the problem is to replace the spaces using pipe delimiter '|' this can be achived using the replace() function and then having to manipulate using the functions such as ltrim() and rtrim() with the pipe delimiters specified to actually achieve the values into the different columns.

Probably the job that needs to be designed is

OCI Stage -->Transformer --> Sequential File --> Sequential File --> OCI Stage

Good Luck
Aditya.

Posted: Mon Jul 18, 2005 8:59 pm
by ray.wurlod
That's a server job solution that's not really relevant in the parallel job environment unless you want to use a UNIX shell script to modify the text file in advance.
In the parallel job the easiest way is to read the file with a Sequential File stage whose schema matches the detail record. The header and trailer will be transmitted as a raw record onto the rejects link, which can feed into other stage types to recover the information from those records. Check out the "datatype_from_raw" possibilities in the Modify stage, then maybe a Switch stage to isolate header from footer, and so on.

Posted: Mon Jul 18, 2005 9:18 pm
by aditya
Thank you Ray.

- Aditya.

Posted: Mon Jul 18, 2005 9:27 pm
by srividhya
Hi Ray,
please can u explain me in more detail , i have to use the shell script in before routine, and reject header,footer in reject link then use modify stage?
Thanks
Sri

Posted: Tue Jul 19, 2005 12:52 am
by ray.wurlod
You don't need a shell script - that was pointing out that Aditya's solution was not appropriate in a parallel job.

If you follow the remainder of my advice, you read the file using a Sequential File stage. In parallel jobs a Sequential File stage can have two output links, the one containing valid lines from the file (that match the schema), the other containing raw records that don't match the schema. In your case, the first link contains the detail records, the rejects link gets the header and trailer.

The rejects link ("rejects" is just a name) can nonethelessbe directed into other stage types for further processing. You have the header and trailer; you can process them however you like, simply by using the appropriate stage types.

However, the record is a raw record, so you will need to parse it into its constituent fields. There are quite a few techniques that you could use for this. My earlier post suggested a couple. Research the Parallel Job Developer's Guide and this forum for others.

Posted: Tue Jul 19, 2005 2:01 pm
by srividhya
Hi
I did the same as ray said, apply the schema and i got the header and trailer inthe reject liknk, here the data is storting in the raw format(varbinary). now iam trying to modify the datatype from raw to string using the modify stage, so i can seperate the header and trailer.
There are no delimiters, its a fixed format header is having one format and trailer is having its own format .
the input format is
fhdr1.1 fhdr1.2
det1.1a det1.2a det1.3a
det1.1b.det1.2b det1.3b
trlr1.1 trlr1.2
fhdr2.1 fhdr2.2
det2.1a det2.2a det2.3a
det2.1b det2.2b det2.3b
trlr2.1 trlr2.2
the output is
fhdr1.1 fhdr1.2 det1.1a det1.2a det1.3a trlr1.1 trlr1.2
fhdr1.1 fhdr1.2 det1.1b.det1.2b det1.3b trlr1.1 trlr1.2
fhdr2.1 fhdr2.2 det2.1a det2.2a det2.3a trlr2.1 trlr2.2
fhdr2.1 fhdr2.2 det2.1b det2.2b det2.3b trlr2.1 trlr2.2

can u guys please help in this
thanks
:cry: Sri

Posted: Tue Jul 19, 2005 3:12 pm
by ray.wurlod
This suggests you're not getting detail lines on the other output link. This, in turn, suggests that your schema for the other output link is not correct for the detail lines.
Get that right first.
Then use a Modify stage on the rejects link to convert the raw row to a string, and decompose that string appropriately using substring techniques, perhaps in a second Modify stage.

Posted: Tue Jul 19, 2005 3:25 pm
by srividhya
Hi Ray,
I got the detail record perfect,
I am trying in the modify stage and the error is

main_program: Error parsing modify adapter: Error in binding: Error in destination field selector: Expected integer, got: ":"; input:
recid[:string]=rejected

thanks a lot

Posted: Tue Jul 19, 2005 3:29 pm
by ray.wurlod
You will need to post the exact derivations you used in the Modify stage. Without these there's not really enough information for accurate diagnosis.

Posted: Tue Jul 19, 2005 3:38 pm
by srividhya
Hi Ray
iam trying to change the datatype from raw to string
but according to the paralleljob developer guide , there is no way to convert varbinary to string
1)convert to int32 or int 8
2)from this convert to string I Guess
Thanks
Sri

Posted: Thu Jul 21, 2005 7:57 am
by lshort
i have a similar job. i took data from sequential reject link and passed it through a "Column Import Stage" with a Schema file that define the output columns all as varchar.
Works great. No Modify stage required.

btw: i got this idea from somewhere else on this forum...dont remember who.....you know who you are...Thx.

Posted: Mon Jun 15, 2009 10:57 am
by Abu@0403
You people are unbelievable. Thanks a lot lshort.

I was stuck up with the same problem. When I used Column Import stage it worked perfectly.

Thanks once again.