
Dynamic metadata

Posted: Wed May 21, 2008 1:26 am
by tsn
Hi,

We have a requirement: is it possible to change the metadata dynamically based on the source file we receive? For example:

File 1
-------
header - 0,UK,30112007
detail - 1,customername,accountno,balance amount
tail - 2,recordcount,hashamount

File 2
-------
header - 0,30112007,US,filename
detail - 1,accountno,balance amount,info
tail - 2,hashamount,recordcount

We have to develop a DataStage job to load the source file into the table. When we receive file1 the job should load file1, and the same job should load file2 when it receives file2.

We also need to validate the header and trailer records of the source files.

tks.

Posted: Wed May 21, 2008 1:30 am
by ray.wurlod
For source files that are text files it does not matter what the column names are. The data types have to be correct.
Other than that, I don't believe you can do what this requirement specifies.

Resist stupid requirements.

Posted: Wed May 21, 2008 1:36 am
by tsn
The data types vary from file1 to file2. The plan is to write one DataStage job to handle loading both file1 and file2 into the table. Before loading, we need to validate the header (country, date), and for the trailer we need to check whether the record count matches.

We need to know whether we can change the metadata of the sequential file and the table dynamically, while the job is running, based on the filename.

tks.

Posted: Wed May 21, 2008 1:43 am
by ray.wurlod
You cannot.

Posted: Wed May 21, 2008 1:55 am
by tsn
Is there any workaround to achieve this?

tks.

Posted: Wed May 21, 2008 2:15 am
by ray.wurlod
Yes. Use proper metadata. This will require two separate jobs in your case, the decision about which one to use being made in a job sequence that tests the file name.

Posted: Wed May 21, 2008 2:41 am
by tsn
tks. This is the normal process. Is there any other workaround to achieve a dynamic solution? I am trying on my side; let me check and try that out.

tks.

Posted: Wed May 21, 2008 2:41 am
by Cr.Cezon
If you are using text files, you can try a Sequential File stage with only one column defined, then use substring functions to load the specific column values into stage variables, and then work with them depending on which file you are treating.
Example:

file
column varchar(255)

In the Transformer you can use stage variables:
for file1
type = substrings(column,1,1)
customername = substrings(column,2,y)
...

for file2
type = substrings(column,1,1)
accountno= substrings(column,2,y)
...

Then in your Transformer you can identify the type of file and work with whichever variables are required.

...or something like that.

Also, in parallel jobs, you can use a schema file to get "dynamic metadata".
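The single-column-then-substring idea above can be sketched outside DataStage. Below is a minimal Python illustration of the same technique (in a real job this logic would live in Transformer stage variables); the field positions are taken from the file layouts quoted earlier in the thread, and the sample values are hypothetical:

```python
# Sketch of the "one varchar column, then split" technique: read each record
# as a single string, look at the record-type field, and map the remaining
# fields according to which source file the record came from.

def parse_line(line, file_kind):
    """Split one raw delimited record according to its source file layout."""
    fields = line.rstrip("\n").split(",")
    rec_type = fields[0]                        # first field is the record type
    if file_kind == "file1" and rec_type == "1":
        # file1 detail: 1,customername,accountno,balance amount
        return {"type": rec_type,
                "customername": fields[1],
                "accountno": fields[2],
                "balance": fields[3]}
    if file_kind == "file2" and rec_type == "1":
        # file2 detail: 1,accountno,balance amount,info
        return {"type": rec_type,
                "accountno": fields[1],
                "balance": fields[2],
                "info": fields[3]}
    # header (0) and trailer (2) records would be validated elsewhere
    return {"type": rec_type, "raw": fields}

row = parse_line("1,John Smith,1234567,100.50", "file1")
```

The point is that the stage metadata stays fixed (one long string column), and the "dynamic" part happens in expression logic, so one job can accept both layouts.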

Posted: Wed May 21, 2008 2:47 am
by tsn
How do I use the schema file? I'm working with parallel jobs only. Can you please elaborate?

tks.

Posted: Wed May 21, 2008 3:47 am
by Cr.Cezon
In the parallel Sequential File stage there is a property called Schema File:

Name of a file containing a schema for the data. Setting this property will override any settings on the Format & Columns tabs.

An example of such a file is:

record
{final_delim=end, delim=',', quote=double}
(
INICIAL:string[28] {delim=none};
RESTO:string[max=500] {prefix=2};
)
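To make the example schema concrete, here is a rough Python sketch of what it describes: a fixed 28-character field with no delimiter (`INICIAL`), followed by a variable-length field whose length is carried in a 2-byte prefix (`prefix=2`). The byte order of the prefix is an assumption here; the engine's actual on-disk convention may differ:

```python
import struct

def parse_record(buf):
    """Parse one record laid out as: 28 fixed bytes, then a 2-byte
    length prefix, then that many bytes of variable-length data."""
    inicial = buf[:28].decode("ascii")           # INICIAL: string[28], delim=none
    (length,) = struct.unpack(">H", buf[28:30])  # RESTO: prefix=2 (big-endian assumed)
    resto = buf[30:30 + length].decode("ascii")  # RESTO: string[max=500]
    return inicial, resto

# Build a sample record and parse it back
rec = b"A" * 28 + struct.pack(">H", 5) + b"hello"
inicial, resto = parse_record(rec)
```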

Posted: Mon May 26, 2008 1:14 am
by tsn
I tried using a schema.

The schema is:
record
{record_delim='\n', record_length=fixed, delim=none}
(
name:nullable string[13] {width=13, null_field=' '};
account:nullable int32 {width=7, null_field=' '};
amount:nullable decimal[23,3] {width=23, null_field=' '};
status:nullable string[1] {width=1, null_field=' '};
)

When I tried using the Column Import stage, I got the following error:
Column_Import_3,1: Input buffer overrun at field "amount", at offset: 20

tks.
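For what it's worth, the field offsets implied by that schema can be checked with a quick sketch. The widths declared above (13, 7, 23, 1) put "amount" at byte offset 20, which is exactly where the error is reported; an overrun at the start of a field usually suggests the actual records are shorter than the declared fixed length of 44 bytes, or that the record delimiter does not match. This is a diagnostic sketch, not the fix the poster ultimately applied:

```python
# Compute the byte offset of each field from the widths declared in the schema.
widths = {"name": 13, "account": 7, "amount": 23, "status": 1}

offset = 0
offsets = {}
for field, w in widths.items():
    offsets[field] = offset   # field starts where the previous one ended
    offset += w               # after the loop, offset is the total record length
```

Comparing the total (44 bytes) against the real record length in the file, including or excluding the newline, is a quick way to spot the mismatch.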

Posted: Mon May 26, 2008 3:09 am
by tsn
tks guys.

It's working now.