
Node and partition problem

Posted: Sat Mar 20, 2010 1:26 am
by kaushal.kumar@igate.com
Hi,

I have to read files from a load directory. I am passing the file location and file name as parameters. My job is able to pick up the files from the specified location, but the records do not come through in the same sequence as in the input.
For example, my input contains these records:
HDR JD685675 05\03\2010
DTL 53864845 INDIA KARNATAKA BANGALORE 5749837687
DTL 53864848 INDIA KARNATAKA BANGALORE 5749837
DTL 53864851 INDIA KARNATAKA BANGALORE 574983
DTL 53864545 INDIA KARNATAKA BANGALORE 574983768
DTL 53865846 INDIA KARNATAKA BANGALORE 5749
DTL 53854840 INDIA KARNATAKA BANGALORE 574983
DTL 53564844 INDIA KARNATAKA BANGALORE 57498376
DTL 58564843 INDIA KARNATAKA BANGALORE 574983708
DTL 59864848 INDIA KARNATAKA BANGALORE 57498777
TRL 09.

My input Sequential File stage reads these records in the same sequence. The next stage is a Transformer. When debugging the Transformer stage, I get the records like this:
DTL 53865846 INDIA KARNATAKA BANGALORE 5749
DTL 53854840 INDIA KARNATAKA BANGALORE 574983
DTL 53564844 INDIA KARNATAKA BANGALORE 57498376
DTL 58564843 INDIA KARNATAKA BANGALORE 574983708
DTL 59864848 INDIA KARNATAKA BANGALORE 57498777

HDR JD685675 05\03\2010
DTL 53864845 INDIA KARNATAKA BANGALORE 5749837687
DTL 53864848 INDIA KARNATAKA BANGALORE 5749837
DTL 53864851 INDIA KARNATAKA BANGALORE 574983
DTL 53864545 INDIA KARNATAKA BANGALORE 574983768
TRL 09

I have a two-node configuration for the project.
Please advise how I can get my output in the same sequence :(

Posted: Sat Mar 20, 2010 2:31 am
by ray.wurlod
Force the Transformer stage to execute in Sequential mode.
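
To illustrate why this helps, here is a rough sketch in Python (an analogy only, not DataStage code). It assumes round-robin partitioning and a collector that simply concatenates the partitions; the function names and the shortened record labels are hypothetical, and the real engine may partition and collect differently.

```python
def partition_round_robin(records, nodes):
    """Deal records out to `nodes` partitions in turn (assumed partitioning method)."""
    parts = [[] for _ in range(nodes)]
    for i, rec in enumerate(records):
        parts[i % nodes].append(rec)
    return parts


def collect(parts):
    """Concatenate partitions one after another (a simplified collector)."""
    out = []
    for part in parts:
        out.extend(part)
    return out


# Shortened stand-ins for the HDR/DTL/TRL records in the question.
records = ["HDR", "DTL1", "DTL2", "DTL3", "TRL"]

# Two partitions: every record survives, but the original order is lost.
parallel = collect(partition_round_robin(records, 2))

# One partition (sequential mode): the order is unchanged.
sequential = collect(partition_round_robin(records, 1))

print(parallel)
print(sequential)
```

With a single partition there is nothing to interleave, so the header, detail and trailer rows stay in file order.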

Posted: Sat Mar 20, 2010 2:38 am
by kaushal.kumar@igate.com
ray.wurlod wrote:Force the Transformer stage to execute in Sequential mode. ...
Will it cause a performance problem?

Posted: Sat Mar 20, 2010 2:40 pm
by ray.wurlod
Not for me.

Define "performance" in an ETL context.

What's more important - speed or correct results?

Posted: Sat Mar 20, 2010 10:03 pm
by kaushal.kumar@igate.com
ray.wurlod wrote:Not for me.

Define "performance" in an ETL context.

What's more important - speed or correct results?
Correct results are more important, but I have to read more than 20 huge files, so speed will also matter.

Posted: Sun Mar 21, 2010 4:29 pm
by ray.wurlod
Speed - as in elapsed time? Well, all else being equal, more partitions (up to a sane amount) should finish faster than fewer. Or you may prefer fewer partitions but more than one job simultaneously. Or you may prefer to schedule jobs when not much else is happening on the machine. Or...
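
The "more than one job simultaneously" idea can be sketched in Python (again only an analogy, not DataStage). It assumes each file is handled start to finish by a single worker, so record order inside a file is kept while several files run at once. The file names, the contents and `process_file` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def process_file(lines):
    """Handle one file sequentially, so its records keep their order."""
    return [line.upper() for line in lines]


# In-memory stand-ins for input files (hypothetical names and contents).
files = {
    "a.txt": ["hdr a", "dtl 1", "dtl 2", "trl 2"],
    "b.txt": ["hdr b", "dtl 1", "trl 1"],
}

# Several files are processed at the same time, one worker per file.
# pool.map returns results in the same order as its inputs.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = dict(zip(files, pool.map(process_file, files.values())))

for name, lines in results.items():
    print(name, lines)
```

The parallelism here is across files rather than within one file, which is the trade-off being described: ordering stays correct per file, and elapsed time depends on how many files run together.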

Posted: Sun Mar 21, 2010 10:17 pm
by kaushal.kumar@igate.com
ray.wurlod wrote:Speed - as in elapsed time? Well, all else being equal, more partitions (up to a sane amount) should finish faster than fewer. Or you may prefer fewer partitions but more than one job simultaneously ...
Thanks a lot :)