Problem with sequential file and file name column

bart12872
Participant
Posts: 82
Joined: Fri Jan 19, 2007 5:38 pm

Problem with sequential file and file name column

Post by bart12872 »

I have a strange problem when reading files with the Sequential File stage.

I read 3 files, 090309.sdi, 090310.sdi and 090311.sdi, using the file pattern option.

I use the file name column option to associate each line with its source file.
But some lines in 090309.sdi are affected to the 090311.sdi file.

Do you have any idea what is happening?

thanks,
Martin.
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Does the total row count differ?

What if you write to a sequential file and then match them back to source?
keshav0307
Premium Member
Posts: 783
Joined: Mon Jan 16, 2006 10:17 pm
Location: Sydney, Australia

Post by keshav0307 »

Can you give more details?
How are some lines in 090309.sdi 'affected to' the 090311.sdi file?
bart12872
Participant
Posts: 82
Joined: Fri Jan 19, 2007 5:38 pm

Post by bart12872 »

Well, my logic is the following:
I take one or more files as input and sort them by date. I remove duplicates by keeping the most recent line.

So, I have 2 lines with the same key, one in 090309 and one in 090311, but the line from 090309 is affected to 090311, so my remove-duplicates result is wrong.

To Sainath.Srinivasan: yes, the row counts do differ. I get this:

File ---- rows in the file ---- rows in DataStage
090309.sdi ---- 2 320 822 ---- 2 320 815 (-7)
090310.sdi ---- 2 287 527 ---- 2 287 527 (ok)
090311.sdi ---- 2 213 218 ---- 2 213 325 (+7)

So, I identified the lines: it is the last 7 lines of 090309.sdi that are affected to the 090311.sdi file.
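
A comparison like the one above can be reproduced outside DataStage. The following is only a minimal sketch in Python, assuming the job output has been landed as delimited text with the file name column as the last field. The `|` delimiter, the column position and the function names are all assumptions for illustration, not something taken from the job itself.

```python
from collections import Counter


def count_source_rows(paths):
    """Return {path: number of lines} for each source file."""
    counts = {}
    for path in paths:
        # Binary mode avoids any decoding issues; we only need line counts.
        with open(path, "rb") as handle:
            counts[str(path)] = sum(1 for _ in handle)
    return counts


def count_output_rows(output_path, delimiter="|", filename_index=-1):
    """Count landed rows per value of the file name column.

    Assumption: the output is delimited text and the file name column
    is the last field (filename_index=-1).
    """
    counts = Counter()
    with open(output_path, "r", encoding="utf-8", newline="") as handle:
        for line in handle:
            fields = line.rstrip("\r\n").split(delimiter)
            counts[fields[filename_index]] += 1
    return counts


def compare_counts(source_counts, output_counts):
    """Return {file: (rows in file, rows in job output, difference)}."""
    report = {}
    for name in sorted(set(source_counts) | set(output_counts)):
        in_file = source_counts.get(name, 0)
        in_job = output_counts.get(name, 0)
        report[name] = (in_file, in_job, in_job - in_file)
    return report
```

A non-zero difference for a file means rows were tagged with a different file name than the file they were read from, or were lost or duplicated along the way.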
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Affected to? To quote The Princess Bride: You keep using that word. I do not think it means what you think it means. Could you please clarify what exactly you mean by 'affected to' in this context? :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
bart12872
Participant
Posts: 82
Joined: Fri Jan 19, 2007 5:38 pm

Post by bart12872 »

chulett wrote:Affected to? To quote The Princess Bride: You keep using that word. I do not think it means what you think it means. Could you please clarify what exactly you mean by 'affected to' in this context? :?
Sorry, my mother tongue is French, not English.

Well, I mean: in the Sequential File stage, I set the option 'file name column' = FILENAME,
so all the lines of 090309.sdi should have the value '/pathfile/090309.sdi'
and all the lines of 090310.sdi should have the value '/pathfile/090310.sdi', and so on...

But for 7 lines of the 090309.sdi file, the FILENAME value is '/pathfile/090311.sdi'.

I hope it is clearer this time.
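
To pin down exactly which rows carry the wrong FILENAME value, one option is to check each landed row against the contents of the source files. This is a rough sketch in Python, under the assumptions that the output keeps each original record unchanged, followed by a `|` delimiter and the FILENAME value, and that the files fit in memory. The helper names are hypothetical.

```python
from collections import defaultdict


def index_source_lines(paths):
    """Map each distinct record text to the set of files containing it."""
    index = defaultdict(set)
    for path in paths:
        with open(path, "r", encoding="utf-8", newline="") as handle:
            for line in handle:
                index[line.rstrip("\r\n")].add(str(path))
    return index


def find_misattributed(output_path, index, delimiter="|"):
    """List rows whose FILENAME value names a file that lacks the record.

    Assumption: each output row is '<original record><delimiter><FILENAME>'.
    Records not found in any source file are skipped, since nothing can
    be concluded about them.
    """
    suspects = []
    with open(output_path, "r", encoding="utf-8", newline="") as handle:
        for row_number, line in enumerate(handle, start=1):
            record, _, claimed = line.rstrip("\r\n").rpartition(delimiter)
            origins = index.get(record, set())
            if origins and claimed not in origins:
                suspects.append((row_number, claimed, sorted(origins)))
    return suspects
```

Each entry returned gives the output row number, the FILENAME value the row carries, and the file or files that actually contain the record. A record that appears identically in several files is not flagged as long as the claimed file is one of them.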
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

No worries, you wouldn't want to see my French. :wink:

The example helps a lot. So... does this problem seem to be coming directly from the Sequential File stage itself, or is it being introduced in later processing, say the 'remove duplicates' step you mentioned? I've seen posts about odd problems with that 'filename' option; perhaps the environment variable mentioned in this post might help:

viewtopic.php?t=121263
bart12872
Participant
Posts: 82
Joined: Fri Jan 19, 2007 5:38 pm

Post by bart12872 »

chulett wrote:No worries, you wouldn't want to see my French. :wink:

The example helps a lot. So... does this problem seem to be coming directly from the Sequential File stage itself, or is it being introduced in later processing, say the 'remove duplicates' step you mentioned? I've seen posts about odd problems with that 'filename' option; perhaps the environment variable mentioned in this post might help:

viewtopic.php?t=121263
Well, this environment variable is already set to TRUE. And for my test, I removed the Remove Duplicates operator. The problem seems to come directly from the Sequential File stage itself.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Then this seems to be something you should report to your official support provider, see if it is a known issue.
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

What if you provide the files in reverse order?

It will be useful to isolate the input seq file (with pattern) and an output (peek or seq file).

Do you perform any activity other than reading with the file pattern?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I'd also be curious what exact 7.x version you have and what UNIX platform you are on.