First and last record

Posted: Wed Jul 29, 2015 11:59 pm
by vamsi_4a6
I have to fetch the last record from a file, but I do not want to execute the stage in sequential mode. I want to execute the stage in parallel mode only.


The current logic I am following:

Sequential File --> Tail stage

or

Sequential File --> Transformer stage (LastRow() function as the constraint).



I am running both the Tail stage and the Transformer stage in sequential mode, but I am wondering whether we can do the same thing by executing the stage in parallel mode.

Posted: Thu Jul 30, 2015 12:26 am
by ray.wurlod
Sure, if what you want is the last record FROM EVERY NODE.

Posted: Thu Jul 30, 2015 12:33 am
by vamsi_4a6
Thanks for the reply, but is it possible to get the last record by executing the stage in parallel mode, and after that use some transformation to pick out the single last record of the file?

Posted: Thu Jul 30, 2015 1:26 am
by ray.wurlod
Sure, if what you want is the last record FROM EVERY NODE.

Posted: Thu Jul 30, 2015 6:34 am
by chulett
In other words, no.

Posted: Fri Jul 31, 2015 9:56 am
by ShaneMuir
Is the purpose of the job to only get the last row of an input file?

If so, just use the UNIX command tail -1 as the filter in the Sequential File stage.
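
For anyone unsure what that filter hands to the rest of the job, here is a minimal Python sketch of the same effect: only the final line of the input survives. This is not DataStage code; the function name last_record and the sample data are made up purely for illustration.

```python
from collections import deque
import io


def last_record(lines):
    """Return the final line of an iterable of lines, or None if it is empty."""
    # A deque with maxlen=1 keeps only the most recent item it has seen,
    # so the whole input is streamed without being held in memory.
    tail = deque(lines, maxlen=1)
    return tail[0].rstrip("\n") if tail else None


# Hypothetical three-record file standing in for the real input.
sample = io.StringIO("id=1\nid=2\nid=3\n")
print(last_record(sample))
```

Because the filter runs before the stage sees any rows, only one record enters the job, so the execution mode of the downstream stages no longer matters for correctness.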

Posted: Fri Jul 31, 2015 4:00 pm
by ray.wurlod
You do not know, in a parallel execution environment, which node is processing the last row of the file, unless there is some particular characteristic of the data from that line that allows it to be thus identified.
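
A rough Python sketch of why that is, assuming round-robin partitioning across four nodes (the actual partitioning depends on how the job is configured, and the names round_robin and tail_per_node are hypothetical):

```python
def round_robin(records, nodes):
    """Deal records out to `nodes` partitions in turn, like a round-robin partitioner."""
    parts = [[] for _ in range(nodes)]
    for i, rec in enumerate(records):
        parts[i % nodes].append(rec)
    return parts


def tail_per_node(parts):
    """Take the last record of each non-empty partition, as a per-node tail would."""
    return [p[-1] for p in parts if p]


# Ten hypothetical rows spread over four nodes.
records = [f"row{n}" for n in range(1, 11)]
parts = round_robin(records, 4)
print(tail_per_node(parts))  # ['row9', 'row10', 'row7', 'row8']
```

Each node reports its own last row, so four rows come back instead of one. Nothing in the rows themselves marks row10 as the true end of the file, unless the data carries something like a sequence number that can be compared afterwards.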

Posted: Fri Jul 31, 2015 4:37 pm
by chulett
Been wondering the same thing, all of this conversating about "last record alone" - if that's all you need just use tail -1 as noted and never mind about all this parallel mode stuff.

Oh, and... whatever happened to the "first record" you mentioned in your subject?