APT_IMPORT_PATTERN_USES_FILESET issue

peddidsx
Premium Member
Posts: 55
Joined: Wed Dec 26, 2007 8:20 am


Post by peddidsx »

Hi All,
I am facing a strange issue while using the $APT_IMPORT_PATTERN_USES_FILESET parameter. Here is the situation.

I have a folder containing the following files:

DUMMY_20090727_FY2008_104367100_R01_IN
DUMMY_IN
DUMMY_20090727_FY2008_104367100_R01_IN_YYYYNNNY
DUMMY_20090727_FY2008_104367100_R01_IN_YYYYNNNN
DUMMY_20090727_FY2008_104367100_R01_IN_YYYYNYYY

I read them through a Sequential File stage with the pattern *IN* in file_pattern mode, with $APT_IMPORT_PATTERN_USES_FILESET set to TRUE so that I get the file names. Here are the record counts for the files; the total across all files in Unix is 103024:

20828 DUMMY_20090727_FY2008_104367100_R01_IN
19712 DUMMY_IN
20828 DUMMY_20090727_FY2008_104367100_R01_IN_YYYYNNNY
20828 DUMMY_20090727_FY2008_104367100_R01_IN_YYYYNNNN
20828 DUMMY_20090727_FY2008_104367100_R01_IN_YYYYNYYY
103024 total
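
For anyone who wants to reproduce the per-file counts above outside the job, here is a minimal sketch in Python. The function name `count_lines_by_pattern` is made up for illustration, and it assumes one record per newline-terminated line, so it counts newline characters.

```python
import glob
import os


def count_lines_by_pattern(directory, pattern="*IN*"):
    """Return {file name: newline count} for regular files matching pattern.

    Hypothetical helper: assumes each record ends with a newline character.
    """
    counts = {}
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        if not os.path.isfile(path):
            continue  # skip directories that happen to match the pattern
        newlines = 0
        with open(path, "rb") as handle:
            # Read in 1 MiB chunks so large files are not loaded at once.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                newlines += chunk.count(b"\n")
        counts[os.path.basename(path)] = newlines
    return counts
```

Calling `count_lines_by_pattern("/path/to/folder")` and summing the values gives the per-file and total figures to compare with what the job loads.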

Once the data was loaded, I checked the counts in the table by grouping on the file name. The per-file counts in the table differ from the counts in the files, whereas the total record count matches.

DB counts from the table with a GROUP BY:
15665 DUMMY_20090727_FY2008_104367100_R01_IN
29153 DUMMY_20090727_FY2008_104367100_R01_IN_YYYYNNNN
20828 DUMMY_20090727_FY2008_104367100_R01_IN_YYYYNNNY
25991 DUMMY_20090727_FY2008_104367100_R01_IN_YYYYNYYY
11387 DUMMY_IN
103024
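
To make the mismatch easier to see, the two sets of counts can be put side by side. This is only a sketch of the arithmetic using the figures posted above; the helper name `compare_counts` is made up for illustration.

```python
# Record counts per file as reported from Unix.
file_counts = {
    "DUMMY_20090727_FY2008_104367100_R01_IN": 20828,
    "DUMMY_IN": 19712,
    "DUMMY_20090727_FY2008_104367100_R01_IN_YYYYNNNY": 20828,
    "DUMMY_20090727_FY2008_104367100_R01_IN_YYYYNNNN": 20828,
    "DUMMY_20090727_FY2008_104367100_R01_IN_YYYYNYYY": 20828,
}

# Row counts per file name as reported by the GROUP BY on the table.
db_counts = {
    "DUMMY_20090727_FY2008_104367100_R01_IN": 15665,
    "DUMMY_20090727_FY2008_104367100_R01_IN_YYYYNNNN": 29153,
    "DUMMY_20090727_FY2008_104367100_R01_IN_YYYYNNNY": 20828,
    "DUMMY_20090727_FY2008_104367100_R01_IN_YYYYNYYY": 25991,
    "DUMMY_IN": 11387,
}


def compare_counts(expected, actual):
    """Return {name: actual - expected} for every file name in expected."""
    return {name: actual.get(name, 0) - expected[name] for name in expected}


diffs = compare_counts(file_counts, db_counts)
for name, diff in diffs.items():
    print(f"{diff:+7d} {name}")
print(f"{sum(diffs.values()):+7d} net difference")
```

The differences cancel in pairs: DUMMY_IN is short by 8325 rows and the _YYYYNNNN file is over by 8325, while the _R01_IN file is short by 5163 and the _YYYYNYYY file is over by 5163. That would be consistent with rows being loaded under the wrong file name rather than being dropped, although the numbers alone do not show why.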


Kindly suggest whether I am missing anything here.

Thanks,
Rajesh
Rajesh Peddi
peddidsx
Premium Member
Posts: 55
Joined: Wed Dec 26, 2007 8:20 am

Post by peddidsx »

As a follow-up to my earlier post: I have done further research and found that this issue occurs only when more than 2 files match the specified pattern.

Thanks,
Rajesh
Rajesh Peddi
dsedi
Participant
Posts: 220
Joined: Wed Jun 02, 2004 12:38 am

Post by dsedi »

In the job design, which stages do you have between the file read and the DB write? Are you partitioning on any column?
Accept that some days you're the pigeon and some days you're the statue.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Also curious whether you are running the job on multiple nodes or a single node.
-craig

"You can never have too many knives" -- Logan Nine Fingers
peddidsx
Premium Member
Posts: 55
Joined: Wed Dec 26, 2007 8:20 am

Post by peddidsx »

Hi dsedi/chulett - Sorry for responding late on this. I have been away from my computer since I posted this issue.

The answers to your questions are:

1) There are Lookup and Transformer stages between the file and the DB.
2) We are running on multiple nodes.

Let me know if I can provide any additional information.

Regards,
Rajesh
Rajesh Peddi