RAM / swap usage

Post questions here relating to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

essaion
Participant
Posts: 18
Joined: Tue Nov 04, 2003 8:55 am
Contact:

Post by essaion »

Wow, what a flurry of excitement around this matter. Please, no, no, it's too much, don't put yourselves out like that :)

... or :( , I don't know.
Aurelien
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I don't know.

Never been saddled with running DataStage on Windows. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
shawn_ramsey
Participant
Posts: 145
Joined: Fri May 02, 2003 9:59 am
Location: Seattle, Washington. USA

Re: RAM / swap usage

Post by shawn_ramsey »

It is my understanding that a single process on Windows cannot use more than 2 GB of memory, so it will never reach the 3.5 GB with a single thread anyway. The other issue is that if you have only 1.5 GB of RAM and 1 GB is taken up by uvsh, you are not leaving much for the OS and the other items running on the server.
Shawn Ramsey

"It is a mistake to think you can solve any major problems just with potatoes."
-- Douglas Adams
essaion
Participant
Posts: 18
Joined: Tue Nov 04, 2003 8:55 am
Contact:

Post by essaion »

Thanks for your answers!
(... well, not to Craig, who laughs at those using DataStage on Windows :lol: )

The facts are:
- The OS takes about 100-150 MB of RAM (seen at server startup).
- SQL Server can take up to 1 GB of RAM and 100% of the CPU resources (which is, by the way, really annoying when it occurs). Let's say it takes an average of 500 MB of RAM/swap.
- I never saw the ever-dying process take more than 700 MB (it howls an "ARGGGHHH" before that)... The 1 GB mentioned is seen on the "Performance" tab; the "Processes" tab shows a 280-750 MB maximum of memory used (by uvsh.exe).

The OS never complains about a low memory issue...

Again, thanks for your answers!
Aurelien
shawn_ramsey
Participant
Posts: 145
Joined: Fri May 02, 2003 9:59 am
Location: Seattle, Washington. USA

Post by shawn_ramsey »

essaion wrote:The OS never complains about a low memory issue...
We also found a memory leak in the OLEDB stage that caused a failure when the memory usage hit the 2 GB limit. We have 16 GB in our server, so I am not sure if it would be the same in your configuration.
Shawn Ramsey

"It is a mistake to think you can solve any major problems just with potatoes."
-- Douglas Adams
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Hi,
don't use the Folder stage if you have big files.
I've seen some machines that can't handle 100 MB+ files, and some that only fail from 200 MB+ file sizes.
You're better off doing it via DSExecute at the command-line level.
You can use wildcards, so you can run something like:
type "<path>\*.*" > "<target file full path>"
specifying the target file in a different directory than the one the source files are in.
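A minimal sketch of that, e.g. as a before-job subroutine (untested; the "NT" shell type and both paths are placeholders, not anything from a real job):

Code: Select all

      * Concatenate every source file into a target in a DIFFERENT
      * directory, so the wildcard cannot pick up the target itself.
      * Paths below are placeholders only.
      Cmd = 'type "C:\source\*.*" > "C:\target\merged.txt"'
      Call DSExecute("NT", Cmd, Output, SystemReturnCode)
      If SystemReturnCode <> 0 Then
         Call DSLogWarn("Concatenation failed: " : Output, "BeforeJob")
      End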

IHTH,
Roy R.
Time is money but when you don't have money time is all you can afford.

Search before posting:)

Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
essaion
Participant
Posts: 18
Joined: Tue Nov 04, 2003 8:55 am
Contact:

Post by essaion »

No OLEDB stage is involved in the crash, so it seems I won't spend time looking for an answer in that direction.

As Roy underlined (reading between the lines), there is probably a bug in the Folder stage (or in Routines) that forces the process to work in physical RAM. I'll wait for support to call me back and see what they say.

As I mentioned:
essaion wrote:If you see another, less memory-blasting way to do this using DataStage (not a command)
So the only way I see to resolve this issue, for now, is... adding RAM to the server (shrinking the source files is not possible in this case).

Thank you all for your posts...
Aurelien
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

I've just received that same error on DataStage 7.1 on AIX 5.2.

Very simple job design; Folder stage retrieving file names only, Transformer stage, sequential file output.

The Transformer stage adds a second column which invokes a routine that calculates the file's modification date/time using OpenSeq and Status statements. There seems to be a limit on the size of a file that can be managed using this combination; the largest file in the directory is 179,179,200 bytes (nowhere near 2 GB), and there are only text files and compressed text files in the directory.

Here's the function; it's not rocket science!

Code: Select all

FUNCTION DTM(PathName)

* History (most recent first)
*    Date     Programmer        Version  Details of Modification
* ----------  ----------------  -------  ----------------------------------
* 04/11/2004  Ray Wurlod         2.0.0   Initial coding
*

$INCLUDE UNIVERSE.INCLUDE FILEINFO.H
      DEFFUN OpenTextFile(Filename, OpenMode, AppendMode, LoggingFlag) Calling "DSU.OpenTextFile"

      Ans = @NULL

      * Open the file for reading via the SDK routine.
      hFile = OpenTextFile(PathName, "R", "O", "N")

      * Proceed only if OpenTextFile returned a valid file variable.
      If FileInfo(hFile, FINFO$IS.FILEVAR)
      Then

         * Field 15 of the status array holds the time, and field 16
         * the date, of last modification (internal formats).
         Status FileStuff From hFile
         Then
            TimeModified = FileStuff<15>
            DateModified = FileStuff<16>
            Ans = Oconv(DateModified, "D-YMD[4,2,2]") : " " : Oconv(TimeModified, "MTS:")
         End

         CloseSeq hFile   ;* release the handle so repeated calls don't accumulate open files

      End

RETURN(Ans)
Would appreciate any ideas. I do have a workaround, but it's not a perfect one in that it will have problems on a year transition when yesterday's files are in the previous year.
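For anyone who wants to try it, a minimal harness for calling the routine from BASIC (the path is a placeholder; in the job itself it is invoked from the Transformer derivation against the file name column):

Code: Select all

      * Hypothetical test call; the DSU. prefix is the usual
      * convention for invoking a DataStage routine from BASIC.
      DEFFUN DTM(PathName) Calling "DSU.DTM"
      Ans = DTM("/tmp/sample.txt")   ;* placeholder path
      Print Ans                      ;* e.g. 2004-11-17 15:45:30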
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

Ray

Why not use the ls command?
Mamu Kim
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

That was my workaround (ls -l | awk '{print $6,$7,$8}').

The problem is on 01 Jan, when the time column ($8) suddenly contains the year number of the previous year instead, which is not desirable in the current context.
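If I stay with ls, one possible patch (a sketch only; it assumes the month name from the ls output has already been converted to a month number in FileMonth):

Code: Select all

      * If the file's month is later than the current month, the file
      * must be from last year (e.g. a December file listed on 01 Jan).
      * FileMonth is assumed to hold the month as a number already.
      CurrentMonth = Oconv(Date(), "DM")
      FileYear = Oconv(Date(), "DY4")
      If FileMonth > CurrentMonth Then FileYear -= 1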

The problem still occurs if I don't invoke the function, so I think my initial diagnosis was wrong. It's either a bug in the Folder stage, or one of the ulimit settings is too small. I will investigate further on Monday and post results.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
denzilsyb
Participant
Posts: 186
Joined: Mon Sep 22, 2003 7:38 am
Location: South Africa
Contact:

Post by denzilsyb »

Hello all
ray.wurlod wrote:It's either a bug in the Folder stage, or that one of the ulimit settings is too small. Will investigate further Monday and post results.
I just ran into an error on the Folder stage as well. The job is:

Code: Select all

FOLDER -> TFM -> SEQ
The error I get is:

Code: Select all

DataStage Job 355 Phantom 17015
Program "JOB.2091227139.DT.1340829553.TRANS1": Line 37, 
Available memory exceeded. Unable to continue processing record.
DataStage Phantom Finished
The file JOB.2091227139.DT.1340829553.TRANS1 just shows:

Code: Select all

* Tokens were replaced below as follows:
* JobParam%%1 <= DATAFILEPATH
* Pin%%V0S57P1.Column%%1 <= LN_DATAFILEPATH.COL_1
* GET.Pin%%V0S57P1 <= GET.LN_DATAFILEPATH
* Pin%%V0S57P1.REJECTEDCODE <= LN_DATAFILEPATH.REJECTEDCODE
* Pin%%V0S57P2.Column%%1 <= LN_TFM_01.COL_1
* PUT.Pin%%V0S57P2 <= PUT.LN_TFM_01
* Pin%%V0S57P2.REJECTED <= LN_TFM_01.REJECTED
* Pin%%V0S57P2.REJECTEDCODE <= LN_TFM_01.REJECTEDCODE
*
* Subroutine for active stage SIIPSTransMerge01DailyFilesListed.TFM_01 generated at 15:45:30 17 NOV 2004
*
SUBROUTINE DSTransformerStage(HANDLES,ERROR)

$INCLUDE DSINCLUDE DSD_RTCONFIG.H
$INCLUDE DSINCLUDE DSD_STAGE.H
$INCLUDE DSINCLUDE DSD_BCI.H

$DEFINE JobParam%%1 STAGECOM.JOB.STATUS<7,1>

$INCLUDE DSINCLUDE JOBCONTROL.H
DEFFUN DSRLoadString(Num,Text,Args) CALLING '*DataStage*DSR_LOADSTRING'
$DEFINE Pin%%V0S57P1.Column%%1 STAGECOM.ARR(1)
$DEFINE GET.Pin%%V0S57P1 CALL $DS.UVGETNEXT(1,Pin%%V0S57P1.REJECTEDCODE)
IF STAGECOM.TRACE.STATS THEN CALL $PERF.NAME(-2,'LN_TFM_01.Derivation')
$DEFINE Pin%%V0S57P2.Column%%1 STAGECOM.ARR(2)
$DEFINE PUT.Pin%%V0S57P2 CALL $DS.SEQPUT(2, Pin%%V0S57P2.REJECTEDCODE)

UPDATE.COUNT = STAGECOM.RATE



LOOP
        REJECTED = @TRUE
        * Get next row from primary input pin LN_DATAFILEPATH
        STAGECOM.PINNO = 1
        GET.Pin%%V0S57P1
        ERROR = Pin%%V0S57P1.REJECTEDCODE
WHILE NOT(ERROR)

        STAGECOM.PINNO = 2
            IF STAGECOM.TRACE.STATS THEN CALL $PERF.BEGIN(-2)
            IF @TRUE THEN
                * Column derivation code for pin LN_TFM_01
                Pin%%V0S57P2.Column%%1 = (JobParam%%1 : Pin%%V0S57P1.Column%%1)
                Pin%%V0S57P2.REJECTED = @FALSE
            IF STAGECOM.TRACE.STATS THEN CALL $PERF.END(-2)

                PUT.Pin%%V0S57P2
                IF NOT(Pin%%V0S57P2.REJECTEDCODE) THEN
                    REJECTED = @FALSE
                END ELSE
                    Pin%%V0S57P2.REJECTED = @TRUE
                END
            END
            ELSE
                Pin%%V0S57P2.REJECTED = @TRUE
                Pin%%V0S57P2.REJECTEDCODE = 0
            END


  UPDATE.COUNT -= 1
  IF UPDATE.COUNT LE 0 THEN CALL DSD.Update(HANDLES);UPDATE.COUNT = STAGECOM.RATE
REPEAT
RETURN
END



The files listed in the SEQ are as follows:

Code: Select all

/udd003/Development/DataFiles/SEQ_SIIPS_MERGE_20041001.SEC
/udd003/Development/DataFiles/SEQ_SIIPS_MERGE_20041002.SEC
/udd003/Development/DataFiles/SEQ_SIIPS_MERGE_20041004.SEC
....
.....
/udd003/Development/DataFiles/SEQ_SIIPS_MERGE_20041110.SEC
/udd003/Development/DataFiles/SEQ_SIIPS_MERGE_20041111.SEC
/udd003/Development/DataFiles/SEQ_SIIPS_MERGE_20041112.SEC
/udd003/Development/DataFiles/SEQ_SIIPS_MERGE_20041113.SEC
/udd003/Development/DataFiles/SEQ_SIIPS_MERGE_20041115.SEC
/udd003/Development/DataFiles/SEQ_SIIPS_MERGE_20041116.SEC
The sizes of the files are all less than 263,680,700 bytes, which is the size of the last file, where I suspect the error occurs (this file is read by the Folder stage but not written to the SEQ stage):

Code: Select all

/udd003/Development/DataFiles/SEQ_SIIPS_MERGE_OVERFLOW.SEC
Are file sizes an issue? The only strange thing I am doing in the Folder stage itself is sorting ascending. I did check the length of the file names against what I am expecting, but truncated data is not a factor.

For now I think I will have to go the ls -l route.
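In case it helps, a rough sketch of that route via DSExecute in job control (the directory path is the one from my listing above; untested):

Code: Select all

      * List the directory with ls instead of the Folder stage,
      * e.g. from a before-job subroutine.
      Call DSExecute("UNIX", "ls /udd003/Development/DataFiles", Output, SystemReturnCode)
      If SystemReturnCode <> 0 Then
         Call DSLogFatal("ls failed: " : Output, "BeforeJob")
      End
      * Each line of ls output lands in one field of Output.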
dnzl
"what the thinker thinks, the prover proves" - Robert Anton Wilson
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

denzilsyb wrote:Are file sizes an issue?
Yes! There is definitely a limit on the file sizes the Folder stage can handle. Unfortunately, I don't believe there is a hard-and-fast number published by Ascential... it seems to vary by hardware and version. :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
denzilsyb
Participant
Posts: 186
Joined: Mon Sep 22, 2003 7:38 am
Location: South Africa
Contact:

Post by denzilsyb »

Seems I will have to go the ls -l route. Thanks.
We are on Solaris 9 with one mean array of disks. I wonder if this is fixed in 7.5?
dnzl
"what the thinker thinks, the prover proves" - Robert Anton Wilson
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

I was able to prove to Ascential's satisfaction that there is a bug in the Folder stage; even when you select only the file name, it seems to load the entire file. An ecase has been generated; I don't yet have the number.

On a lighter note, and in response to
essaion wrote:The OS takes about 100-150 MB of RAM (seen at server startup)
I remember that the official answer to this question in the Microsoft exam for Windows NT 4.0 Administrator was that the operating system requires 16 MB. :roll:
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.