Problem with folder stage?

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

johnno
Participant
Posts: 50
Joined: Wed Mar 05, 2003 5:33 am

Problem with folder stage?

Post by johnno »

Not quite sure how to explain this but here goes:

I use a folder stage to identify all the files in a specific directory folder that are to be processed in the current run. I have used the same basic job for a number of other processes and it works fine. For this process the wildcard is *.DAT.

The problem I am having is that, in this instance, it doesn't pick up one large file (approx 340MB). If I have a much smaller file in the folder as well, the job will pick up that file to be processed, but not the large one.

I put two files in the directory: OD22003.DAT (340MB) and JOLTestFile.DAT (15KB).

If I run this job as a one-off and use the button to the right of the pathname parameter to build up the path, I get the following pathname: D:\Data\Download\ETLPreProcLandG\JOL\Ready\ and the large file is selected.

If I use the parameter file that we use for all the other jobs in this project, I get a pathname of \\Apps17\Data\Download\ETLPreProcLandG\JOL\Ready\ which does not select the large file (though it does select the small one), even though this is what we use everywhere else and it works.

The job doesn't fail or anything - it just doesn't seem to like the large file.

I have no idea what the problem may be so any suggestions greatly appreciated.

Cheers
Johnno
ANSHULA
Participant
Posts: 12
Joined: Thu Mar 27, 2003 1:35 pm

Post by ANSHULA »

What is the defined SQL type and length for the 'Record' column (i.e. the non-key column) on the Folder stage --> Outputs --> Columns tab?
johnno
Participant
Posts: 50
Joined: Wed Mar 05, 2003 5:33 am

Post by johnno »

In the Folder stage, the column on the Outputs/Columns tab is:

Column name: FileName
Derivation: (none)
Key: Yes
SQL Type: VarChar
Length: 255
Nullable: No

In the Sequential File stage that the data is being written to, the Inputs/Columns tab has:

Column name: FileName
Key: Yes
SQL Type: Char
Length: 64
Nullable: No

Thanks for the reply and I hope this helps

Johnno
ANSHULA
Participant
Posts: 12
Joined: Thu Mar 27, 2003 1:35 pm

Post by ANSHULA »

What about the details of the non-key column on the Outputs/Columns tab - the column which receives the contents of the file?
johnno
Participant
Posts: 50
Joined: Wed Mar 05, 2003 5:33 am

Post by johnno »

This is the only column we have, and we populate it with the filename. We are only interested in the filename, and as you have to have a key in this stage, we have just set this column as the key.

Cheers
Johnno
ANSHULA
Participant
Posts: 12
Joined: Thu Mar 27, 2003 1:35 pm

Post by ANSHULA »

Then why not use a UNIX command to list the filenames and redirect the output to a text file? You can use that text file for further processing.
johnno
Participant
Posts: 50
Joined: Wed Mar 05, 2003 5:33 am

Post by johnno »

Being on a Windows platform we could, as you say, use DOS commands to produce a file of filenames. My main concern, however, is that we have used this approach many times in the past and this is the first problem we've hit, so have we made a serious misjudgement in the design? And if not, why won't this approach work for this one file?
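
As a rough illustration of that DOS-listing idea, here is a minimal sketch of a DS BASIC before-job subroutine that shells out to dir /b and writes a bare list of file names. The routine name and the list file name (filelist.txt) are only illustrative, and the shell token ("NT" vs "DOS") can vary by release:

    Subroutine ListDATFiles(InputArg, ErrorCode)
    * Before-job sketch: InputArg carries the folder path (e.g. the job's pathname parameter).
    * Writes the bare *.DAT file names into filelist.txt inside that folder.
       ErrorCode = 0
       FolderPath = InputArg
       Command = 'dir /b "' : FolderPath : '*.DAT" > "' : FolderPath : 'filelist.txt"'
       Call DSExecute("NT", Command, Output, SystemReturnCode)
       If SystemReturnCode <> 0 Then
          Call DSLogWarn("dir command failed: " : Output, "ListDATFiles")
          ErrorCode = 1   ;* non-zero aborts the job
       End
    Return

The job itself would then read filelist.txt with a Sequential File stage instead of the Folder stage, so the size of the .DAT files never comes into it.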
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

There was a discussion on ADN about the folder stage when someone had a very similar issue. Here is a direct link to the discussion for any members. The gist of the problem is this:
Ernie Ostic wrote:It may be worth noting that the Folder Stage was officially designed for XML and had its original debut when DataStage first started supporting XML 5 years ago. Its intent is to read a set of (typically XML) files in a subdirectory, sending each one as a complete "chunk" (a single column for the ENTIRE contents) to the XMLInput Stage for parsing into individual columns for elements and attributes. I've used it for some reasonably large XML documents (50 to 60 meg), but as Ray noted, it certainly is going to blow all link/column memory availability if a file as large as 2G is trying to pass through....
Apparently, the file size limit is significantly lower than 2GB. It looks like you are going to need to rewrite this using either a Batch File approach or something written in Job Control for files of this size.
-craig

"You can never have too many knives" -- Logan Nine Fingers
johnno
Participant
Posts: 50
Joined: Wed Mar 05, 2003 5:33 am

Post by johnno »

Thanks very much. As much as I wish you could have told me differently, I'm glad I found out the reason behind it, as we will need to change a few of our jobs (and a whole 6 days before implementation!!!)

I will study the link you provided and look at it in some more detail tomorrow (train strike in London today so must nip off home!).

Thanks again.
Sreenivasulu
Premium Member
Posts: 892
Joined: Thu Oct 16, 2003 5:18 am

Post by Sreenivasulu »

Check whether the file name correctly has ".DAT" in it.
Rename the file to some other name with a .DAT extension and try again.

Hope this works

Regards
Sreenivasulu
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Why do you think this would make any difference? :?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Sreenivasulu
Premium Member
Posts: 892
Joined: Thu Oct 16, 2003 5:18 am

Post by Sreenivasulu »

Hi Ray,

Sometimes the file name has been appended with control characters that are not visible from the telnet window. Hence my suggestion to rename the file afresh.

Regards
Srini
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

That's cool. 8)

However, there's no real need for a ".DAT" extension. This is what confused me.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Hi,
I faced this a while back; there are even posts here regarding this issue.
The problem is a configuration setting that can differ between Windows servers.
I hit it with files over 200MB on one machine and over 300MB on another.
Since this stage has a size limit (I don't know, nor care, what that limit is or whether it is configurable), and we could someday face a file bigger than whatever it is set to, I recommend simply using other alternatives like DS BASIC, which is more manageable for a DS developer than running after the sysadmin guys whenever this needs reconfiguring for any reason.

IHTH,
Roy R.
Time is money but when you don't have money time is all you can afford.

Search before posting:)

Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
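
To illustrate the DS BASIC alternative roy describes, the same dir /b trick can be packaged as the body of a server transform routine (hypothetical name GetFolderFileList, with arguments FolderPath and WildCard), which keeps the whole thing in the developer's hands rather than in the server configuration:

    * Transform routine body sketch: return the matching file names as an @FM-delimited list.
    * FolderPath is assumed to end in a backslash; WildCard is e.g. "*.DAT".
    Call DSExecute("NT", 'dir /b "' : FolderPath : WildCard : '"', Output, RtnCode)
    If RtnCode = 0 Then
       Ans = Output   ;* one file name per field mark
    End Else
       Ans = ""       ;* dir failed or nothing matched
    End

A calling job or routine can then loop over the fields with Dcount and <> extraction, much as in the job-control sketch earlier in the thread.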
johnno
Participant
Posts: 50
Joined: Wed Mar 05, 2003 5:33 am

Post by johnno »

Thanks for all your replies. I have finally got access to the Ascential developer net to check out the posts relating to this issue (thanks again chulett).

I will try and have a play with this and see how it works and what it can/can't really do (if I ever get any time!). For now I just use the DOS DIR command and read the file it produces.

Just a quick question though, as I am still digesting all this info: from what I've read on the Ascential developer net above (I think it was a comment from Ray), I get the impression that it is possible to pick up the file names only, and therefore file size should not be an issue. Is this correct? If so, I would have thought the way I was doing it would be OK, as when I check the output from this job (it just writes each row to a sequential file) all I see is the filename itself - no additional data!

Anyway, possibly not the most important thing now as we have the DOS command, but any suggestions/explanation would be welcome.

Cheers and thanks again.
Johnno