
Problem in implementing the logic using BuildOp

Posted: Wed Sep 28, 2005 10:10 pm
by DEVESHASTHANA
Hi,

I am facing a problem in implementing this logic. We are building a BuildOp, and our requirement is:

We want to change the table structure of the input interface at runtime, while my output interface schema is fixed. Is this possible in the BuildOp stage, or is there another way to solve this problem? In short, I want to change the input schema at run time.

PROBLEM:

In input I have various schema files:
file1 columns are: A B C D E F
file2 columns are: A B D F C
file3 columns are: A B C D F
.....
.....


In output I have a fixed schema file:

columns are: A B K F

Here K = C + E; if C is not available then only E, and vice versa; if both are available then C + E.
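
To make the rule concrete, here it is as a minimal standalone C++ sketch (the function name and the use of std::optional are purely illustrative, not any DataStage API):

    #include <optional>

    // K rule: C + E when both are present; otherwise whichever one
    // of C and E is available (the rule assumes at least one is).
    double deriveK(std::optional<double> c, std::optional<double> e) {
        if (c && e) return *c + *e;
        return c ? *c : e.value();
    }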

Please help in solving this problem,

Regards,

Devesh Asthana

Posted: Fri Sep 30, 2005 2:16 am
by cyh
In our project, we force the upstream developer to restructure the file before passing it to the BuildOp (or Transformer).

For example :
file1 columns are: A B C D E F -> no change
file2 columns are: A B D F C -> A B C D *E F
file3 columns are: A B C D F -> A B C D *E F

As you know, column order does not matter to the actual processing, and a Column Generator can help to add a NULL field (e.g. E).

Therefore, there will be only a single format for the input file of your BuildOp.
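
As a purely conceptual illustration of that restructure step (plain C++, not DataStage API; the map-based row and the canonical column list are assumptions):

    #include <map>
    #include <optional>
    #include <string>
    #include <vector>

    // What the restructure step does, conceptually: whatever columns a
    // retailer's file happens to have, emit them in one canonical order,
    // padding any missing column (e.g. E) with null, which is the job
    // the Column Generator does inside DataStage.
    std::vector<std::optional<std::string>>
    normalize(const std::map<std::string, std::string>& row) {
        static const std::vector<std::string> canonical =
            {"A", "B", "C", "D", "E", "F"};
        std::vector<std::optional<std::string>> out;
        for (const auto& col : canonical) {
            auto it = row.find(col);
            out.push_back(it != row.end()
                              ? std::optional<std::string>(it->second)
                              : std::nullopt);
        }
        return out;
    }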

HTH.

Posted: Fri Sep 30, 2005 5:48 am
by DEVESHASTHANA
I think it will not solve my problem, as I would need to add columns for some of the retailers' files. Moreover, I cannot change the columns, since these are the retailers' own files and we will be using the files as the source.

columns are: A B K F

Here K = C + E; if C is not available then only E, and vice versa; if both are available then C + E.


Can this be done? If yes, please share your experience and knowledge.

Regards,

Devesh

Posted: Fri Sep 30, 2005 2:08 pm
by bcarlson
I think what cyh is recommending is to force your different input files to look the same. You may not be able to have the retailers reformat the files, but once you receive them, your DataStage program can reformat them any way you want.

Create one job for each type of file. You can call the same buildop from each of them as long as the input stream has the same schema.

import -> "restructure" -> buildop -> output

The "restructure" stage (or stages) is where you take your input schema and reformat it to look the same as what the buildop needs. Make sure all fields are accounted for (rename with Modify, or add field(s) with the Column Generator). Then call the buildop - it will be the same for each job.

Brad.

Posted: Fri Sep 30, 2005 11:06 pm
by DEVESHASTHANA
Thanks for your input,

but it is not possible for me to write a job for every retailer, as there can be 800+ retailers. So what I want to achieve with this one job is:
1: parameterising the input with the retailer's schema file, and
2: mapping it to the fixed output layout in the BuildOp stage (is there any way to parameterise the "Interface > Input" in the BuildOp stage, so that we can supply a different retailer's schema file at runtime?)


PROBLEM:

In input I have various schema files:
file1 columns are: A B C D E F
file2 columns are: A B D F C
file3 columns are: A B C D F
.....
.....
....

In output I have a fixed schema file:

columns are: A B K F

Here K = C + E; if C is not available then only E, and vice versa; if both are available then C + E.


This is the problem description. :cry:

Regards,

Devesh

Posted: Sat Oct 01, 2005 1:07 am
by ray.wurlod
DEVESHASTHANA wrote: is there any way to parameterise the "Interface > Input" in the BuildOp stage, so that we can supply a different retailer's schema file at runtime?
Alas, no, which is going to make your task rather difficult.

You will need to design an approach that uses a standard schema, but which allows for columns to be missing (null?). And you will need to handle the transition from individual retailers' record layouts to your standard layout. Perhaps a Switch stage on retailer type (there must be SOME overlaps!) feeding different Copy stages for the different types.

Without seeing/knowing your full requirement it is difficult to provide focussed suggestions, but if you think along these lines I think there is a chance that you will solve your problem, indeed without needing recourse to writing a BuildOp.

Posted: Sat Oct 01, 2005 1:50 am
by DEVESHASTHANA
Ray,

I will be more than happy :lol: if this can be done without the BuildOp stage.

For now, my problem is to generalise the job for all the retailers (parameterising the job); if there are other approaches to solving this problem, do suggest them.

Again, let me explain my requirements:

We have to design a DataStage utility (I say utility because we don't want to make a job specific to each retailer) which will work for all the retailers (800+). There is no business logic or transformation required: we have each retailer's input file with some columns, and in output our layout is a standard, fixed number of columns.

So what we want to achieve through this utility is that it takes the retailer name as a parameter, fetches that retailer's file, and maps it to the output columns.

E.g.
In input I have various files:
file1 columns are: A, B, C, D, E, F
file2 columns are: A, B, D, F, C
file3 columns are: A, B, C, D, F
file4 columns are: A, B, C, D, F
.....
.....
....




In output I have a fixed schema file:
output columns are: A B K F
Here K = C + E; if E is not available then only C, and if both are available then C + E (this is the only logic that needs to be applied, and only where required for a particular retailer).

This is the exact description of our requirement. :cry:


Is there any way we can do transformations on the columns defined in a schema file, which is read via a Sequential File stage, without listing those columns in the Sequential File stage's column definitions in the PX job?



Regards,

Devesh

Posted: Sat Oct 01, 2005 5:31 pm
by ray.wurlod
Without having given it a lot of (unpaid) thought, I would tend to plan along the following lines. There is a small number of different input file layouts. I would design a separate job for each of these and then, in a job sequence, make the decision about which of them ought to be used based upon the particular retailer.

It's not that much additional development work, since most of it is copy and paste.

Posted: Sun Oct 02, 2005 11:47 pm
by DEVESHASTHANA
Thanks for the inputs, everyone (Ray, bcarlson, cyh).
Ray,
I don't think that will work for us, as we have more than a hundred different layouts in input.

Is there any way we can do transformations on the columns defined in a schema file, which is read via a Sequential File stage, without mentioning those columns in the Sequential File stage's column definitions in the PX job?
That is, can I see the columns of the schema file and use them for transformation in the PX job?

I know we can do a one-to-one data transfer from a schema file to an output file, but I want to pass only N columns to the output while more than N columns are coming from the source files. Is this mapping of columns possible without mentioning the column names (in the column definitions) of the Sequential File stage that reads the schema file and a data file with the same structure as the schema file? :cry:



Regards,
Devesh

Posted: Tue Oct 04, 2005 9:30 am
by bcarlson
Okay, here's kind of a weird and probably radical idea.

1. If possible, create a list of fields that could be missing. Sum the lengths of all of them. For the sake of conversation, let's say there is a total of 100 bytes.
2. Add a 150-byte field to EVERY input; call the field 'MISSING_FIELDS' (use the Column Generator and set it to spaces, NULL, whatever). The extra 50 bytes would allow for expansion later. Use the datatype 'UNKNOWN'.
3. You mentioned getting a schema file from the retailers for their files. Create a program (DataStage, Unix script, C program, whatever) that determines what fields could be missing from the input file. The default schema would just be the 150-byte FILLER.
4. Dynamically build a schema file (just a text file), with a max 'record' length of 150 bytes (see step 2), that incorporates those missing fields (with proper datatypes, lengths, nullability, etc.) and a final FILLER field; see the sketch after this list.
5. Use the Column Import stage (from the Restructure group), and set the Column Method option to 'Schema File'. This can be parameterized, so your job can pass the name of the schema file you created in step 4.
6. After the Column Import, your input stream should have all the necessary fields. Use a Modify stage to KEEP all required fields for the buildop.
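
Here is a rough C++ sketch of steps 3 and 4 (the field names, types, and lengths are hypothetical, and the record syntax should be checked against your real Orchestrate schema files):

    #include <fstream>
    #include <set>
    #include <string>
    #include <vector>

    // Steps 3-4: given the set of optional fields actually present in a
    // retailer's file, write a schema file that names those fields and
    // pads the rest of the 150-byte area (see step 2) with FILLER.
    struct OptField { std::string name; std::string type; int bytes; };

    void writeSchema(const std::set<std::string>& present,
                     const std::string& path) {
        static const std::vector<OptField> optional = {
            {"C", "decimal[10,2]", 10},   // hypothetical type/length
            {"E", "decimal[10,2]", 10},   // hypothetical type/length
        };
        const int area = 150;             // total MISSING_FIELDS length
        int used = 0;

        std::ofstream out(path);
        out << "record (\n";
        for (const auto& f : optional) {
            if (present.count(f.name)) {
                out << "  " << f.name << ": " << f.type << ";\n";
                used += f.bytes;
            }
        }
        out << "  FILLER: string[" << (area - used) << "];\n";
        out << ")\n";
    }

    int main() {
        writeSchema({"C"}, "retailer1.schema");  // file has C but not E
    }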

I hope this makes sense. It makes sense in my mind, but I'm not sure how well I am communicating it in writing. It is kind of thinking outside the box, but then again, when has programming ever really been straightforward?

HTH,

Brad.

Posted: Tue Oct 04, 2005 9:53 pm
by DEVESHASTHANA
Thanks everyone,

My problem is solved by using the Transformation operator in a Generic stage.

Regards,

Devesh :)