
Posted: Mon Feb 09, 2004 10:11 am
by essaion
Wow, what furious excitement around this matter. Please, no, no, it's too much, don't get so involved :)

... or :( , I don't know.

Posted: Mon Feb 09, 2004 10:15 am
by chulett
I don't know.

Never been saddled with running DataStage on Windows. :wink:

Re: RAM / swap usage

Posted: Mon Feb 09, 2004 10:30 am
by shawn_ramsey
It is my understanding that a single process in Windows cannot use more than 2 GB of memory, so it will never reach the 3.5 GB with a single thread anyway. The other issue is that if you have only 1.5 GB of RAM and 1 GB is taken up by uvsh, you are not leaving much for the OS and the other items running on the server.

Posted: Mon Feb 09, 2004 10:40 am
by essaion
Thanks for your answers!
(... Well, not to Craig, who laughs at those using DataStage on Windows :lol: )

The facts are:
- The OS takes about 100-150 MB of RAM (seen at server startup)
- SQL Server can take up to 1 GB of RAM and 100% of CPU resources (which is, by the way, really annoying when it happens). Let's say it takes an average of 500 MB of RAM/swap.
- I never saw the ever-dying process take more than 700 MB (it howls an "ARGGGHHH" before that)... The 1 GB mentioned is what the "Performance" tab shows; the "Processes" tab shows uvsh.exe using at most 280-750 MB.

The OS never complains about a low memory issue...

Again, thanks for your answers!

Posted: Mon Feb 09, 2004 10:56 am
by shawn_ramsey
essaion wrote: Thanks for your answers!
(... Well, not to Craig, who laughs at those using DataStage on Windows :lol: )

The facts are:
- The OS takes about 100-150 MB of RAM (seen at server startup)
- SQL Server can take up to 1 GB of RAM and 100% of CPU resources (which is, by the way, really annoying when it happens). Let's say it takes an average of 500 MB of RAM/swap.
- I never saw the ever-dying process take more than 700 MB (it howls an "ARGGGHHH" before that)... The 1 GB mentioned is what the "Performance" tab shows; the "Processes" tab shows uvsh.exe using at most 280-750 MB.

The OS never complains about a low memory issue...

Again, thanks for your answers!
We also found a memory leak in the OLEDB stage that caused a failure when the memory usage hit the 2 GB limit. We have 16 GB in our server, so I am not sure whether it would be the same in your configuration.

Posted: Tue Feb 10, 2004 3:07 am
by roy
Hi,
don't use the Folder stage if you have big files.
I have seen some machines that can't handle 100 MB+ files, and some that only fail from 200 MB+ file sizes.
You're better off doing it via DSExecute at the command line level.
You can use wildcards, so you can run something like:
type "<path>\*.*" > "<target file full path>"
specifying the target file in a different directory than the one the source files are in.
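
For illustration, a minimal sketch of that approach (not Roy's exact code) as it might look in a job control routine or before-job subroutine on Windows; the directory and file names are hypothetical placeholders:

Code:

* Sketch only: concatenate every file in a directory into one target file via the
* Windows shell. SourceDir and TargetFile are hypothetical placeholders.
      SourceDir  = "C:\Data\Incoming"
      TargetFile = "C:\Data\Merged\all_files.txt"   ;* keep the target outside SourceDir
      Command = 'type "' : SourceDir : '\*.*" > "' : TargetFile : '"'
      Call DSExecute("NT", Command, Output, SystemReturnCode)
      If SystemReturnCode <> 0 Then
         Call DSLogWarn("Concatenation failed: " : Output, "ConcatFiles")
      End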

IHTH,

Posted: Tue Feb 10, 2004 7:05 am
by essaion
No OLEDB stage is involved in the crash, so it seems I won't spend time looking for an answer in that direction.

As Roy hinted (reading between the lines), there is probably a bug in the Folder stage (or in Routines) that forces the process to work entirely in physical RAM. I'll wait for support to call me back and see what they say.

As I mentioned:
essaion wrote: If you see another, less memory-hungry way to do this using DataStage (not a command)
So the only way I see to resolve this issue, for now, is adding RAM to the server (shrinking the source files is not possible in this case).

Thank you all for your posts...

Posted: Thu Nov 04, 2004 11:35 pm
by ray.wurlod
I've just received that same error on DataStage 7.1 on AIX 5.2.

Very simple job design: Folder stage retrieving file names only, Transformer stage, Sequential File output.

Transformer stage adds a second column which invokes a routine that calculates the file's date/time modified using OpenSeq and Status statements. There seems to be a limit on the size of a file that can be managed using this combination; the largest file in the directory is 179,179,200 bytes (nowhere near 2GB), and there are only text files and compressed text files in the directory.

Here's the function; it's not rocket science!

Code:

FUNCTION DTM(PathName)

* History (most recent first)
*    Date     Programmer        Version  Details of Modification
* ----------  ----------------  -------  ----------------------------------
* 04/11/2004  Ray Wurlod         2.0.0   Initial coding
*

$INCLUDE UNIVERSE.INCLUDE FILEINFO.H
      DEFFUN OpenTextFile(Filename, OpenMode, AppendMode, LoggingFlag) Calling "DSU.OpenTextFile"

      Ans = @NULL

      * Open the file in read mode via the SDK routine; returns a file variable.
      hFile = OpenTextFile(PathName, "R", "O", "N")

      * Proceed only if a valid file variable came back.
      If FileInfo(hFile, FINFO$IS.FILEVAR)
      Then

         * Fetch the file status dynamic array and format the date/time last modified.
         Status FileStuff From hFile
         Then
            TimeModified = FileStuff<15>
            DateModified = FileStuff<16>
            Ans = Oconv(DateModified, "D-YMD[4,2,2]") : " " : Oconv(TimeModified, "MTS:")
         End

      End

RETURN(Ans)
Would appreciate any ideas. I do have a workaround, but it's not a perfect one in that it will have problems on a year transition when yesterday's files are in the previous year.

Posted: Fri Nov 05, 2004 5:23 am
by kduke
Ray,

Why not use the ls command?

Posted: Fri Nov 05, 2004 5:46 pm
by ray.wurlod
That was my workaround (ls -l | awk '{print $6,$7,$8}').

The problem is on 01 Jan, when the time field ($8) suddenly contains the year number of the previous year, which is not desirable in the current context.

The problem still occurs if I don't invoke the function, so I think my initial diagnosis was wrong. It's either a bug in the Folder stage, or one of the ulimit settings is too small. I will investigate further on Monday and post results.
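
One hypothetical way around the year problem (a sketch, not the workaround actually used): in standard ls -l output the eighth field is either an HH:MM time (recent file, year not printed) or the year itself, so when a time is printed the year has to be inferred, and a file month later than the current month must belong to the previous year.

Code:

* Sketch only: derive a full date from one captured "ls -l" line (LsLine),
* assuming the usual layout: ... month($6) day($7) time-or-year($8) name($9).
      Trimmed    = Trim(LsLine)             ;* collapse runs of spaces
      MonthName  = Field(Trimmed, " ", 6)   ;* e.g. "Dec"
      DayOfMonth = Field(Trimmed, " ", 7)   ;* e.g. "31"
      TimeOrYear = Field(Trimmed, " ", 8)   ;* "23:59" for recent files, "2003" for older ones
      MonthNo = (Index("JANFEBMARAPRMAYJUNJULAUGSEPOCTNOVDEC", Upcase(MonthName[1,3]), 1) + 2) / 3
      If Index(TimeOrYear, ":", 1) Then
         * ls printed a time, so infer the year; "Dec" seen in January means last year.
         FileYear = Oconv(Date(), "DY4")
         If MonthNo > Oconv(Date(), "DM") Then FileYear -= 1
      End Else
         FileYear = TimeOrYear
      End
      FileDate = DayOfMonth : " " : MonthName : " " : FileYear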

Posted: Wed Nov 17, 2004 8:14 am
by denzilsyb
Hello all
ray.wurlod wrote: It's either a bug in the Folder stage, or one of the ulimit settings is too small. I will investigate further on Monday and post results.
I just ran into an error on the Folder stage as well. The job is:

Code:

FOLDER -> TFM -> SEQ
The error I get is:

Code:

DataStage Job 355 Phantom 17015
Program "JOB.2091227139.DT.1340829553.TRANS1": Line 37, 
Available memory exceeded. Unable to continue processing record.
DataStage Phantom Finished
The file JOB.2091227139.DT.1340829553.TRANS1 just shows:

Code:

* Tokens were replaced below as follows:
* JobParam%%1 <= DATAFILEPATH
* Pin%%V0S57P1.Column%%1 <= LN_DATAFILEPATH.COL_1
* GET.Pin%%V0S57P1 <= GET.LN_DATAFILEPATH
* Pin%%V0S57P1.REJECTEDCODE <= LN_DATAFILEPATH.REJECTEDCODE
* Pin%%V0S57P2.Column%%1 <= LN_TFM_01.COL_1
* PUT.Pin%%V0S57P2 <= PUT.LN_TFM_01
* Pin%%V0S57P2.REJECTED <= LN_TFM_01.REJECTED
* Pin%%V0S57P2.REJECTEDCODE <= LN_TFM_01.REJECTEDCODE
*
* Subroutine for active stage SIIPSTransMerge01DailyFilesListed.TFM_01 generated at 15:45:30 17 NOV 2004
*
SUBROUTINE DSTransformerStage(HANDLES,ERROR)

$INCLUDE DSINCLUDE DSD_RTCONFIG.H
$INCLUDE DSINCLUDE DSD_STAGE.H
$INCLUDE DSINCLUDE DSD_BCI.H

$DEFINE JobParam%%1 STAGECOM.JOB.STATUS<7,1>

$INCLUDE DSINCLUDE JOBCONTROL.H
DEFFUN DSRLoadString(Num,Text,Args) CALLING '*DataStage*DSR_LOADSTRING'
$DEFINE Pin%%V0S57P1.Column%%1 STAGECOM.ARR(1)
$DEFINE GET.Pin%%V0S57P1 CALL $DS.UVGETNEXT(1,Pin%%V0S57P1.REJECTEDCODE)
IF STAGECOM.TRACE.STATS THEN CALL $PERF.NAME(-2,'LN_TFM_01.Derivation')
$DEFINE Pin%%V0S57P2.Column%%1 STAGECOM.ARR(2)
$DEFINE PUT.Pin%%V0S57P2 CALL $DS.SEQPUT(2, Pin%%V0S57P2.REJECTEDCODE)

UPDATE.COUNT = STAGECOM.RATE



LOOP
        REJECTED = @TRUE
        * Get next row from primary input pin LN_DATAFILEPATH
        STAGECOM.PINNO = 1
        GET.Pin%%V0S57P1
        ERROR = Pin%%V0S57P1.REJECTEDCODE
WHILE NOT(ERROR)

        STAGECOM.PINNO = 2
            IF STAGECOM.TRACE.STATS THEN CALL $PERF.BEGIN(-2)
            IF @TRUE THEN
                * Column derivation code for pin LN_TFM_01
                Pin%%V0S57P2.Column%%1 = (JobParam%%1 : Pin%%V0S57P1.Column%%1)
                Pin%%V0S57P2.REJECTED = @FALSE
            IF STAGECOM.TRACE.STATS THEN CALL $PERF.END(-2)

                PUT.Pin%%V0S57P2
                IF NOT(Pin%%V0S57P2.REJECTEDCODE) THEN
                    REJECTED = @FALSE
                END ELSE
                    Pin%%V0S57P2.REJECTED = @TRUE
                END
            END
            ELSE
                Pin%%V0S57P2.REJECTED = @TRUE
                Pin%%V0S57P2.REJECTEDCODE = 0
            END


  UPDATE.COUNT -= 1
  IF UPDATE.COUNT LE 0 THEN CALL DSD.Update(HANDLES);UPDATE.COUNT = STAGECOM.RATE
REPEAT
RETURN
END



The files listed in the SEQ are as follows:

Code:

/udd003/Development/DataFiles/SEQ_SIIPS_MERGE_20041001.SEC
/udd003/Development/DataFiles/SEQ_SIIPS_MERGE_20041002.SEC
/udd003/Development/DataFiles/SEQ_SIIPS_MERGE_20041004.SEC
....
.....
/udd003/Development/DataFiles/SEQ_SIIPS_MERGE_20041110.SEC
/udd003/Development/DataFiles/SEQ_SIIPS_MERGE_20041111.SEC
/udd003/Development/DataFiles/SEQ_SIIPS_MERGE_20041112.SEC
/udd003/Development/DataFiles/SEQ_SIIPS_MERGE_20041113.SEC
/udd003/Development/DataFiles/SEQ_SIIPS_MERGE_20041115.SEC
/udd003/Development/DataFiles/SEQ_SIIPS_MERGE_20041116.SEC
The sizes of the listed files are all less than 263680700 bytes, which is the size of the last file, where I suspect the error occurs (this file is read by the Folder stage but not written to the SEQ stage):

Code:

/udd003/Development/DataFiles/SEQ_SIIPS_MERGE_OVERFLOW.SEC
Are file sizes an issue? The only strange thing I am doing in the Folder stage itself is sorting ascending. I did check the length of the file name against what I am expecting, so it is not a case of truncated data.

For now I think I will have to go the ls -l route.
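
For what it's worth, a rough sketch of that route from a job control routine (hypothetical code; only the directory path is taken from the post above): get the file names with ls through DSExecute, so nothing ever has to read the file contents.

Code:

* Sketch only: list the file names without the Folder stage touching file contents.
      DirPath = "/udd003/Development/DataFiles"
      Call DSExecute("UNIX", "ls " : DirPath, FileList, SystemReturnCode)
      If SystemReturnCode = 0 Then
         NumFiles = Dcount(FileList, @FM)    ;* output comes back one line per field
         For I = 1 To NumFiles
            FileName = FileList<I>
            * ... process FileName (skip if empty), e.g. pass it to a job as a parameter ...
         Next I
      End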

Posted: Wed Nov 17, 2004 8:23 am
by chulett
denzilsyb wrote:Are file sizes an issue?
Yes! There is definitely a limit on the file sizes the Folder stage can handle. Unfortunately, I don't believe there is a hard-and-fast number published by Ascential... it seems to vary by hardware and version. :?

Posted: Wed Nov 17, 2004 8:29 am
by denzilsyb
Seems I will have to go the ls -l route. Thanks.
We are on Solaris 9 and one mean array of disks. I wonder if this is fixed in 7.5?

Posted: Wed Nov 17, 2004 3:03 pm
by ray.wurlod
I was able to prove to Ascential's satisfaction that there is a bug in the Folder stage; even when you select only the file name, it seems to load the entire file. An ecase has been generated; I don't yet have the number.

On a lighter note, and in response to
OS takes about 100-150 MB of RAM (seen at server startup)
I remember that the official answer to this question in the Microsoft exam for Windows NT 4.0 Administrator was that the operating system requires 16 MB. :roll: