Returning char* from external routine

Posted: Wed Jul 18, 2007 6:20 pm
by timsmith_s
I found a few postings regarding this topic, but their C++ syntax was incorrect. The real question is: who owns the memory once I return the char* from the external function? All of the samples and the advanced course material always show constant values being returned, but how do I deal with variable-length strings?

For example, the following code returns a buffer built from the string parameter, but since new allocated the buffer, who deletes (frees) it? DSEE? I believe it will create a memory leak.

Should I be using the String class provided in the APT framework API?

#include <string.h>

char* GenerateStringFromParameter (char* StringParameter)
{
char* buffer = 0;

// allocate a string long enough to hold the argument plus the
// appended text (11 characters and the terminating NUL)
buffer = new char[strlen(StringParameter) + 12];

// copy the contents of the argument
strcpy(buffer, StringParameter);

// tack on some random text to make string longer than parameter
strcat(buffer, "hello world");

// return the newly created string
return buffer;
}
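For comparison, here is a minimal sketch of how ownership works for this pattern in a plain C++ program, outside DataStage. `UseAndRelease` is a hypothetical caller invented for illustration; nothing here shows what DSEE itself does with the returned pointer.

```cpp
#include <cstring>
#include <string>

// Same allocation pattern as the routine in the post above.
char* GenerateStringFromParameter(const char* StringParameter)
{
    // 11 characters of "hello world" plus the terminating NUL = 12 extra bytes
    char* buffer = new char[std::strlen(StringParameter) + 12];
    std::strcpy(buffer, StringParameter);
    std::strcat(buffer, "hello world");
    return buffer;  // ownership passes to whoever receives the pointer
}

// Hypothetical caller: in plain C++ the receiver must release the block.
std::string UseAndRelease(const char* input)
{
    char* raw = GenerateStringFromParameter(input);
    std::string copy(raw);  // copy the text out first
    delete[] raw;           // must be delete[], matching new[]
    return copy;
}
```

In ordinary C++ the block lives until some caller runs `delete[]` on it, so the leak question comes down to whether the calling side ever does that.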

Re: Returning char* from external routine

Posted: Wed Jul 18, 2007 6:47 pm
by Yuan_Edward
I don't think this is the proper way to return a string from a function. The temporary buffer's address should not be referenced outside the routine. Did you try to compile and run the routine in a C++ program?

In my opinion, the address of the buffer that holds the changed value should be passed into the routine.

Posted: Wed Jul 18, 2007 7:32 pm
by timsmith_s
This is not temporary; the new operator allocates the memory, and I pass the address of that memory back via the char* return type. It's proper C++. The issue, and my question, is: who cleans up the memory created by the new operator?

Posted: Wed Jul 18, 2007 7:57 pm
by Yuan_Edward
I guess you should clean up the memory yourself by calling delete in your routine.

I have rewritten your code my way :) (I know it's not the perfect way):

#include <string.h> 

char* GenerateStringFromParameter (char* StringParameter, char* buffer)
{
// buffer is supplied by the caller and must hold at least
// strlen(StringParameter) + 12 bytes

// copy the contents of the argument
strcpy(buffer, StringParameter);

// tack on some random text to make string longer than parameter
strcat(buffer, "hello world");

// return the caller-supplied buffer, now holding the new string
return buffer;
}
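The two-argument version only works if the caller's buffer is large enough. A hedged sketch of a variant that also takes the buffer's capacity follows; `GenerateStringInto` and its `bufferSize` parameter are hypothetical names, and how a Transformer would supply such a buffer is not established in this thread.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical variant: the caller passes the buffer and its capacity.
// Returns buffer on success, or a null pointer if the result would not fit.
char* GenerateStringInto(const char* StringParameter, char* buffer, std::size_t bufferSize)
{
    static const char suffix[] = "hello world";

    // sizeof(suffix) counts the terminating NUL, so this is the full size needed
    const std::size_t needed = std::strlen(StringParameter) + sizeof(suffix);
    if (buffer == nullptr || bufferSize < needed)
        return nullptr;

    std::strcpy(buffer, StringParameter);
    std::strcat(buffer, suffix);
    return buffer;  // same pointer the caller passed in; nothing to free here
}
```

With this shape the routine allocates nothing, so whoever created the buffer is also the one who releases it.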

Posted: Wed Jul 18, 2007 8:11 pm
by timsmith_s
Where is the buffer coming from, DataStage? How would I call this from a Transformer?

Posted: Wed Jul 18, 2007 9:57 pm
by DSguru2B
In Yuan_Edward's code, StringParameter and buffer are input arguments to the function. You need to create an interlude of the function as a parallel routine. Specify the path of the object file and call it just like a routine. DataStage will handle the memory specification for the input arguments. Any variables you use inside are your responsibility, or there will be memory leaks.

Posted: Wed Jul 18, 2007 10:14 pm
by timsmith_s
Interesting - so if I wanted to write my own trim function the signature would read

Trim(instring, outstring) rather than Trim(instring)

Why do the ~standard~ routines not require this sort of hassle? I was told they were handled in much the same way.

Posted: Wed Jul 18, 2007 10:32 pm
by Yuan_Edward
Not sure what's inside DataStage. Maybe DataStage uses a global memory buffer... maybe DataStage cleans up the memory area used by the standard routines.

I don't know how or where I can find the C code DataStage generates for a job. Otherwise we could get the answer from there. :?

Posted: Thu Jul 19, 2007 12:54 am
by DSguru2B
If you write your own Trim() function, you would still use a single argument. The result variable does not need to be defined as an argument; the return statement will take care of it. The variable needs to be defined inside the code. On encountering the return statement, the DS Engine releases the memory.

Posted: Thu Jul 19, 2007 4:22 am
by timsmith_s
DSguru2B - I don't follow. To paraphrase: if I write my own Trim() and "define my variable" inside my function, how? This is the heart of the issue.

The following code is invalid: it returns the address of "LocalBuffer", which goes out of scope once the function call completes.

char* CopyString(char* SourceString)
{
char LocalBuffer[1024];

strcpy (LocalBuffer, SourceString);

return LocalBuffer;
}
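One common C/C++ workaround for the dangling local is to give the buffer static storage, so it outlives the call and is reused on every invocation. This is only a sketch: `CopyStringStatic` is a hypothetical name, and it assumes the caller copies the result before the next call and that the routine is not called from several threads at once. Neither assumption is confirmed for DataStage in this thread.

```cpp
#include <cstring>

// Hypothetical variant of CopyString that returns a pointer to static storage.
char* CopyStringStatic(const char* SourceString)
{
    // One buffer for the lifetime of the program, overwritten on each call.
    static char LocalBuffer[1024];

    // Truncate instead of overflowing when the input is too long.
    std::strncpy(LocalBuffer, SourceString, sizeof(LocalBuffer) - 1);
    LocalBuffer[sizeof(LocalBuffer) - 1] = '\0';

    return LocalBuffer;
}
```

The trade-off is that each call overwrites the previous result, so a pointer kept from an earlier call silently changes its contents.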

Posted: Thu Jul 19, 2007 4:20 pm
by DSguru2B
I would rather define LocalBuffer as a pointer, but anyhow. You are right when you say that the scope of this variable lasts until the call to the function finishes, but then again that's all we want. A px routine is just a C function. For every row the function is invoked, and once the function finishes, the memory is released. So if you have 100 rows in your source file and you are calling the function in the transformer, the function will be called 100 times.
The memory for the variable that holds the end result, i.e. the one used in the return statement, will be released when the function completes. Any other variable you use inside, make sure you free it within the code. I like to do that.
Check out this amateur code that I wrote.

Posted: Thu Jul 19, 2007 6:18 pm
by Yuan_Edward
DSguru2B, I pasted your code here. Will the memory buffer (finOut) be released by DataStage (I am quite sure it will not be released by the routine itself)? That's the OP's question. DataStage would need to call "free" or "delete" explicitly to release the memory. I can't find any documentation on that.

Maybe I can try allocating a huge amount of memory in my routine and running it again and again; then I can find out, if the server doesn't crash.

#include <stdio.h> 
#include <stdlib.h> 
#include <string.h> 

char* SybaseToOracleTp(char* InTp) 
{ 
  //Initialize variables 
  const int SIZE = 30; 
  char* month = (char *)malloc(SIZE); 
  char* day = (char *)malloc(SIZE); 
  char* year = (char *)malloc(SIZE); 
  char* hour = (char *)malloc(SIZE); 
  char* newHr = (char *)malloc(SIZE); 
  char* min = (char *)malloc(SIZE); 
  char* sec = (char *)malloc(SIZE); 
  char* msec = (char *)malloc(SIZE); 
  char* time = (char *)malloc(SIZE); 
  char* intMon = (char *)malloc(SIZE); 
  char* finOut = (char *)malloc(SIZE); 
  const char* calender[] = {"Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"}; 

  int hr = 0; 

  //AM or PM 
  char* p = strstr(InTp, "PM"); 

  //Dissect the Date
  strcpy(month, strtok(InTp, " ")); 
  strcpy(day, strtok(NULL, " ")); 
  strcpy(year, strtok(NULL, " ")); 
  strcpy(time, strtok(NULL," ")); 

 //Dissect Time
  strcpy(hour, strtok(time, ":")); 
  strcpy(min, strtok(NULL, ":")); 
  strcpy(sec, strtok(NULL, ":")); 
  strcpy(msec, strtok(NULL, ":")); 

  //get numeric representation of Month 
  for(int i = 0; i < 12; i++) 
  { 
   if (strcmp(month, calender[i]) == 0) 
     sprintf(intMon, "%02d", i + 1); 
  } 
  if ((p) && strcmp(hour, "12") != 0) 
    { 
     hr = atoi(hour); 
     hr+=12; 
     sprintf(hour, "%02d", hr); 
    } 
  
  if ((!p) && strcmp(hour, "12") == 0) 
  { 
     strcpy(hour, "00"); 
  } 
 //format string to YYYY-MM-DD HH:MM:SS.sss 
 sprintf(finOut, "%s-%2s-%s %s:%s:%s.%s", year, intMon, day, hour, min, sec, msec); 

 //free memory 
 free(month); 
 free(day); 
 free(year); 
 free(hour); 
 free(min); 
 free(sec); 
 free(msec); 
 free(time); 
 free(intMon); 

 return finOut; 

} 

Posted: Thu Jul 19, 2007 7:37 pm
by DSguru2B
Yuan_Edward wrote:DSguru2B, I pasted your code here. Will the memory buffer (finOut) be released by DataStage (I am quite sure it will not be released by the routine itself)? That's the OP's question. DataStage would need to call "free" or "delete" explicitly to release the memory. I can't find any documentation on that.
Yes. DataStage will.
You don't need it to be in writing to believe it. It makes sense.
You cannot free it before returning it, and you cannot have a free statement after the return statement, as it will never be executed. So the DSEngine takes care of it.
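Nothing in the thread settles whether the engine frees the returned pointer. What can be checked outside DataStage is the routine's own balance of allocations and frees. The sketch below uses hypothetical counting wrappers (`counted_alloc`, `counted_free`) and a reduced stand-in routine, not the posted SybaseToOracleTp and not any DataStage API.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical counters for illustration only.
static int g_allocs = 0;
static int g_frees = 0;

static char* counted_alloc(std::size_t n)
{
    ++g_allocs;
    return static_cast<char*>(std::malloc(n));
}

static void counted_free(char* p)
{
    ++g_frees;
    std::free(p);
}

// Reduced stand-in for the allocation pattern under discussion: scratch
// buffers are freed inside, the result buffer is handed back to the caller.
char* RoutineUnderTest(const char* in)
{
    char* scratch = counted_alloc(30);
    char* result = counted_alloc(30);
    std::strncpy(scratch, in, 29);
    scratch[29] = '\0';
    std::strcpy(result, scratch);
    counted_free(scratch);
    return result;
}

// Calls the routine once per "row" and reports how many blocks are still
// allocated afterwards. With callerFrees == false the blocks are leaked on
// purpose to show the effect.
int OutstandingAfter(int rows, bool callerFrees)
{
    g_allocs = 0;
    g_frees = 0;
    for (int i = 0; i < rows; ++i) {
        char* out = RoutineUnderTest("Jul 19 2007");
        if (callerFrees)
            counted_free(out);
    }
    return g_allocs - g_frees;
}
```

If the caller never frees the result, one block per row stays allocated; if it does, the count returns to zero. Counting by hand in the posted SybaseToOracleTp gives eleven malloc calls and nine free calls, so besides finOut the newHr buffer is also never released by the routine.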

Posted: Thu Jul 19, 2007 8:38 pm
by timsmith_s
Thank you all for the posts - great insight.

DSguru2B, your code worries me: it has memory leak written all over it. That said, I am willing to try your approach; it's simple enough, and I am running out of memory as it is.

I agree with the previous post that the DSEE documentation 1. sucks and 2. doesn't describe this behavior. Worth a shot.

I have to say that if this works, I will be very disappointed with Ascential/IBM and their total lack of good documentation.

Posted: Thu Jul 19, 2007 9:29 pm
by DSguru2B
Where do you think there will be memory leaks? If you are talking about SIZE, that's just something I kept large enough, and I free it before quitting the program anyway, so please explain.