load multiformat seq file

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
srividhya
Participant
Posts: 42
Joined: Tue May 31, 2005 10:11 am

load multiformat seq file

Post by srividhya »

Hi all,
iam getting a sourefile with header,detail and trailer.its a fixed widthcolumn file.
Input format
FHDR 453 ssb ---------header recid(4) fileid(4) filename(3)
3452671 boston invest--Detail cusip(7)place(10)type(15)
FTLR 6028 0.989899---Trailer trrecid(4)totalfile(5)totalamt(10)
inheader always recid is FHDR
in trailer always recid is FTLR
output :oracle in one table
table format recid,fileid,filename,cusip,place,type,trrecid,totalfile,totalamt
Thanks
:x sri
aditya
Charter Member
Charter Member
Posts: 41
Joined: Sat May 28, 2005 7:32 am

Post by aditya »

Sri,

One way of solving the problem is to replace the spaces using pipe delimiter '|' this can be achived using the replace() function and then having to manipulate using the functions such as ltrim() and rtrim() with the pipe delimiters specified to actually achieve the values into the different columns.

Probably the job that needs to be designed is

OCI Stage -->Transformer --> Sequential File --> Sequential File --> OCI Stage

Good Luck
Aditya.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

That's a server job solution that's not really relevant in the parallel job environment unless you want to use a UNIX shell script to modify the text file in advance.
In the parallel job the easiest way is to read the file with a Sequential File stage whose schema matches the detail record. The header and trailer will be transmitted as a raw record onto the rejects link, which can feed into other stage types to recover the information from those records. Check out the "datatype_from_raw" possibilities in the Modify stage, then maybe a Switch stage to isolate header from footer, and so on.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
aditya
Charter Member
Charter Member
Posts: 41
Joined: Sat May 28, 2005 7:32 am

Post by aditya »

Thank you Ray.

- Aditya.
srividhya
Participant
Posts: 42
Joined: Tue May 31, 2005 10:11 am

Post by srividhya »

Hi Ray,
please can u explain me in more detail , i have to use the shell script in before routine, and reject header,footer in reject link then use modify stage?
Thanks
Sri
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

You don't need a shell script - that was pointing out that Aditya's solution was not appropriate in a parallel job.

If you follow the remainder of my advice, you read the file using a Sequential File stage. In parallel jobs a Sequential File stage can have two output links, the one containing valid lines from the file (that match the schema), the other containing raw records that don't match the schema. In your case, the first link contains the detail records, the rejects link gets the header and trailer.

The rejects link ("rejects" is just a name) can nonethelessbe directed into other stage types for further processing. You have the header and trailer; you can process them however you like, simply by using the appropriate stage types.

However, the record is a raw record, so you will need to parse it into its constituent fields. There are quite a few techniques that you could use for this. My earlier post suggested a couple. Research the Parallel Job Developer's Guide and this forum for others.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
srividhya
Participant
Posts: 42
Joined: Tue May 31, 2005 10:11 am

Post by srividhya »

Hi
I did the same as ray said, apply the schema and i got the header and trailer inthe reject liknk, here the data is storting in the raw format(varbinary). now iam trying to modify the datatype from raw to string using the modify stage, so i can seperate the header and trailer.
There are no delimiters, its a fixed format header is having one format and trailer is having its own format .
the input format is
fhdr1.1 fhdr1.2
det1.1a det1.2a det1.3a
det1.1b.det1.2b det1.3b
trlr1.1 trlr1.2
fhdr2.1 fhdr2.2
det2.1a det2.2a det2.3a
det2.1b det2.2b det2.3b
trlr2.1 trlr2.2
the output is
fhdr1.1 fhdr1.2 det1.1a det1.2a det1.3a trlr1.1 trlr1.2
fhdr1.1 fhdr1.2 det1.1b.det1.2b det1.3b trlr1.1 trlr1.2
fhdr2.1 fhdr2.2 det2.1a det2.2a det2.3a trlr2.1 trlr2.2
fhdr2.1 fhdr2.2 det2.1b det2.2b det2.3b trlr2.1 trlr2.2

can u guys please help in this
thanks
:cry: Sri
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

This suggests you're not getting detail lines on the other output link. This, in turn, suggests that your schema for the other output link is not correct for the detail lines.
Get that right first.
Then use a Modify stage on the rejects link to convert the raw row to a string, and decompose that string appropriately using substring techniques, perhaps in a second Modify stage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
srividhya
Participant
Posts: 42
Joined: Tue May 31, 2005 10:11 am

Post by srividhya »

Hi Ray,
I got the detail record perfect,
I am trying in the modify stage and the error is

main_program: Error parsing modify adapter: Error in binding: Error in destination field selector: Expected integer, got: ":"; input:
recid[:string]=rejected

thanks a lot
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

You will need to post the exact derivations you used in the Modify stage. Without these there's not really enough information for accurate diagnosis.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
srividhya
Participant
Posts: 42
Joined: Tue May 31, 2005 10:11 am

Post by srividhya »

Hi Ray
iam trying to change the datatype from raw to string
but according to the paralleljob developer guide , there is no way to convert varbinary to string
1)convert to int32 or int 8
2)from this convert to string I Guess
Thanks
Sri
lshort
Premium Member
Premium Member
Posts: 139
Joined: Tue Oct 29, 2002 11:40 am
Location: Toronto

Post by lshort »

i have a similar job. i took data from sequential reject link and passed it through a "Column Import Stage" with a Schema file that define the output columns all as varchar.
Works great. No Modify stage required.

btw: i got this idea from somewhere else on this forum...dont remember who.....you know who you are...Thx.
Lance Short
"infinite diversity in infinite combinations"
***
"The absence of evidence is not evidence of absence."
Abu@0403
Participant
Posts: 32
Joined: Wed Aug 08, 2007 11:21 pm

Post by Abu@0403 »

You people are unbelievable. Thanks a lot lshort.

I was stuck up with the same problem. When I used Column Import stage it worked perfectly.

Thanks once again.
----------------
Abu
Post Reply