Row count generation

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
Vino_joe84
Participant
Posts: 28
Joined: Mon Nov 06, 2006 5:44 am
Location: chennai

Row count generation

Post by Vino_joe84 »

Hi,

We have a requirement in which we need to seperate the 10th record from a file and process it separately. Is there any way to filter the 10th record alone from a file using any stages by checking the rowcount . Any help is appreciated. Thanks
J.Ithayavino
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Use "@INROWNUM" in a transform stage. Easiest if you change your job to use just one node or work sequentially in that stage.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

And on the off chance you mean every 10th record, then a MOD() could be used... assuming that is available in PX.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ag_ram
Premium Member
Premium Member
Posts: 524
Joined: Wed Feb 28, 2007 3:51 am

Post by ag_ram »

This may be what you want in the contraint as both said:

Code: Select all

Not(Mod(@INROWNUM,10))
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Keep in mind (and in your design) the consequences of parallel execution. Is "the tenth record on each node" what you require, or "the tenth line from a sequential file (being processed sequentially)" what you require?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
throbinson
Charter Member
Charter Member
Posts: 299
Joined: Wed Nov 13, 2002 5:38 pm
Location: USA

Post by throbinson »

Use the filter property in the Sequential Stage
awk 'NR == 10'

although I like the @INROWNUM better.

Hey, this is fun. I'll bet there aren't more then 10 more different ways to do it.
OddJob
Participant
Posts: 163
Joined: Tue Feb 28, 2006 5:00 am
Location: Sheffield, UK

Post by OddJob »

Use an External Source stage with the following command:
head -10 <file name> | tail -1
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Use a Copy stage with a Sample stage on the second output.

(Come on, keep 'em coming - the target is ten ways!)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
OddJob
Participant
Posts: 163
Joined: Tue Feb 28, 2006 5:00 am
Location: Sheffield, UK

Post by OddJob »

Using a Sequential File stage, add option 'Row Number Column' and follow up with a Filter stage only allowing through Row Number = 10.
Azzuri
Premium Member
Premium Member
Posts: 122
Joined: Tue May 13, 2008 11:42 am

Post by Azzuri »

Use transformer stage(Run sequentially) with a stage variable counting the number of rows. In contraints put where svCount <> 10 on output.
throbinson
Charter Member
Charter Member
Posts: 299
Joined: Wed Nov 13, 2002 5:38 pm
Location: USA

Post by throbinson »

1. Use row Number Option in Sequential Stage.
2. Define a lookup to Oracle with SELECT 10 As rowNumber from Dual as User Defined SQL.
3. Join to Row Number from Sequential Stage. Condition Not Met - Drop
4. Update your resume.
tscala
Premium Member
Premium Member
Posts: 7
Joined: Fri Jul 23, 2004 8:06 am

Post by tscala »

1-Open file in Notepad.
2-Count down to 10th record.
3-Highlight record using mouse.
4-Ctl+C
5-Open blank document using Notepad.
6-Ctl+V
7-Save file and note location.
8-Run job using file as input.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Notepad would be a good trick on UNIX. However, in vi you can use :set nu to make line numbers visible, and yank the appropriate row (line) from the file.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
tsn
Participant
Posts: 51
Joined: Wed Jan 10, 2007 1:32 am

Post by tsn »

There are many ways,

If you use transformer stage then,
you either use @INROWNUM or OUTROWNUM, based on the requirement, if you want to take out 10th record which got processed to the target the use @OUTROWNUM or if you want to take out 10th record from the file then use @INROWNUM
Use @OUTROWNUM=10 in the constraint of the transformer stage.

you can you use head -11 <filename> also. Use this if your source file has header record init.

tks.
with regards,
tsn
Post Reply