Converting text to First Letter Capitalisation


jeawin
Participant
Posts: 18
Joined: Mon Oct 04, 2004 6:49 am
Location: Milton Keynes
Contact:

Converting text to First Letter Capitalisation

Post by jeawin »

In good old Server Edition, in the derivation of a transformer stage, you just did an OCONV using the MCT conversion To Produce Text With First Letter Capitalisation.

Anyone know of a nice simple way of doing this in Enterprise Edition?
_______________________________________
"If I had asked people what they wanted they would have said faster horses"
Henry Ford
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Apart from using a BASIC Transformer stage doing Oconv(), you mean?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
jeawin
Participant
Posts: 18
Joined: Mon Oct 04, 2004 6:49 am
Location: Milton Keynes
Contact:

Post by jeawin »

Correct, without using the Basic Transformer...
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

I don't know of one. You can, of course, write your own. If you do, why not publish it here so others can follow where you lead?
Nagaraj
Premium Member
Premium Member
Posts: 383
Joined: Thu Nov 08, 2007 12:32 am
Location: Bangalore

Post by Nagaraj »

This should give you the expected results.

UpCase(Column_Name)[1,1] : DownCase(Column_Name)[2, Len(Column_Name) - 1]
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

Slightly more efficient would be to substring first

UpCase(column[1,1])...
Nagaraj
Premium Member
Premium Member
Posts: 383
Joined: Thu Nov 08, 2007 12:32 am
Location: Bangalore

Post by Nagaraj »

[1,1] is nothing but a substring operation. There is no function by that name in parallel jobs, but server jobs do have a function called Substrings().
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

That's not the same as "MCT" conversion, which would convert "paddy o'brien" to "Paddy O'Brien". The suggested "solutions" would yield "Paddy o'brien".
priyadarshikunal
Premium Member
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

You can use the following parallel routine to do what you want.

#include<iostream.h>
#include<ctype.h>

char* ConvMCT(Char *str)
{

	int Space=1

	While(*str)
	{
		If(Space=1 and isLower(*str))
		{
			*str=toupper(*str)
		}

		If(int(*str)=32)
		{ 
			Space=1
		}
		else
		{
			Space =0
		}

		str++
	}
}
I don't have a compiler to hand, so this is not tested. It's just an idea of how to achieve what you want.
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Obviously you have not tested or checked the code, so this is not intended to point out mistakes.

A few changes are required in the code:
1.) Look for non-alpha characters as separators. Rather than checking only for a blank, test whether the character falls within 'A-Z' or 'a-z', and treat anything else as a separator so that the next letter is capitalised.
2.) The function needs to return the char pointer.
3.) The initial location needs to be stored so that it can be returned.
4.) Semicolons and letter case need fixing.
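
For illustration, here is a minimal sketch of the first routine with those four points applied. It has not been run inside DataStage; it assumes the engine passes a writable, NUL-terminated buffer, which the thread does not confirm, and it relies on the default C locale, so only ASCII letters are recognised.

```c
#include <ctype.h>

/* Sketch only: capitalise the first letter of every word, in place.
   Any non-alphabetic character is treated as a word separator. */
char *ConvMCT(char *str)
{
	char *start = str;   /* point 3: remember where the string begins */
	int Space = 1;       /* 1 when the next letter starts a new word */

	while (*str)
	{
		unsigned char c = (unsigned char)*str;

		if (Space == 1 && islower(c))
		{
			*str = (char)toupper(c);
		}

		/* point 1: any non-letter, not just a blank, ends the current word */
		Space = isalpha(c) ? 0 : 1;

		str++;
	}
	return start;        /* point 2: return the char pointer */
}
```

With this version "paddy o'brien" becomes "Paddy O'Brien". Like the routine it is based on, it only raises the first letter of each word; letters that are already uppercase elsewhere in a word are left alone.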
priyadarshikunal
Premium Member
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

Sainath.Srinivasan wrote:Obviously you have not tested or checked the code, so this is not intended to point out mistakes.

A few changes are required in the code:
1.) Look for non-alpha characters as separators. Rather than checking only for a blank, test whether the character falls within 'A-Z' or 'a-z', and treat anything else as a separator so that the next letter is capitalised.
2.) The function needs to return the char pointer.
3.) The initial location needs to be stored so that it can be returned.
4.) Semicolons and letter case need fixing.

I didn't have a compiler, so I was unable to validate anything; that was just an idea. Below is a modified version. I haven't tested this one either, but it incorporates the points you made.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>

char* ConvMCT(char *str)  // Function with string input and string output
{
	// Allocate one byte per input character plus the terminator. Nothing in this routine frees it.
	char *result = (char *)malloc(strlen(str) + 1);
	int x=0, Flag=1;  // Setting Flag to 1 to make the first letter capital.
	char CheckStr[] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

	if(result == NULL)  // Allocation failed, nothing to convert into.
	{
		return NULL;
	}

	while(*str)
	{
		if(Flag == 1)  // Check if the last character was not a letter.
		{
			if(isalpha((unsigned char)*str) && islower((unsigned char)*str)) // Convert to uppercase if it is a lowercase letter.
			{
				result[x] = toupper((unsigned char)*str);
			}
			else
			{
				result[x] = *str; // No change if it is already uppercase or not a letter.
			}
			
		}
		else
		{
			if(isalpha((unsigned char)*str) && isupper((unsigned char)*str))
			{
				result[x] = tolower((unsigned char)*str); // Convert to lowercase except the first character.
			}
			else
			{
				result[x] = *str;
			}
		}

		if(!strchr(CheckStr, *str))   // Check if the character is not in a-z or A-Z.
		{ 
			Flag=1;
		}
		else
		{
			Flag=0;
		}
		++x;
		++str;
	}
	result[x] = '\0'; // Terminate the string
	return result; // Return the converted string
}
Hope this one is correct. :oops:
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

You are doing a good job !!

Couple of minor points.
1.) If you are using malloc, you need malloc.h (in most compilers).
2.) Result is assigned to a random character pointer. Don't you think it would be better to do a direct "result = str" so that they point to the same place? This way you can use 'x' for both rather than using pointers.
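
A rough sketch of what point 2 could look like: result is simply pointed at str and the index x is used for both reading and writing, so no malloc is needed. The name ConvMCTInPlace is made up for this example, and it assumes the routine is allowed to overwrite its input buffer, which may not hold for every caller.

```c
#include <ctype.h>

/* Sketch only; ConvMCTInPlace is a hypothetical name, not from the thread. */
char *ConvMCTInPlace(char *str)
{
	char *result = str;  /* result and str point to the same place */
	int x = 0;
	int Flag = 1;        /* 1 when the next letter starts a new word */

	while (result[x] != '\0')
	{
		unsigned char c = (unsigned char)result[x];

		if (isalpha(c))
		{
			/* First letter of a word goes to uppercase, the rest to lowercase. */
			result[x] = (char)(Flag ? toupper(c) : tolower(c));
			Flag = 0;
		}
		else
		{
			Flag = 1;    /* any non-letter ends the current word */
		}
		++x;
	}
	return result;
}
```

Because nothing is allocated, there is nothing to free afterwards; the trade-off is that the original value of the input is lost.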
priyadarshikunal
Premium Member
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

Sainath.Srinivasan wrote:Couple of minor points.
1.) If you are using malloc, you need malloc.h (in most compilers).

malloc is already declared in stdlib.h, so there is no need to include malloc.h explicitly. Some compilers may not declare it there (I have never encountered one); in that case, include malloc.h.

Sainath.Srinivasan wrote:2.) Result is assigned to a random character pointer. Don't you think it would be better to do a direct "result = str" so that they point to the same place? This way you can use 'x' for both rather than using pointers.

It would be better to assign str to result and then update only the required letters. But since we have to check every character anyway, converting the first one to uppercase and the others to lowercase, the performance will be similar.

You can, however, get rid of the

else
{
result[x] = *str;
}

block, and the number of lines will reduce.

However, it's up to the person who wants to use it.

Anyway, thanks for your valuable suggestion. I will try to post a new version with this comment incorporated, but I don't want to post another one without compiling and testing the code. :wink:

Thanks again.
jeawin
Participant
Posts: 18
Joined: Mon Oct 04, 2004 6:49 am
Location: Milton Keynes
Contact:

Post by jeawin »

Hi all, just to say thanks for your input - much appreciated.

Jean
reachmexyz
Premium Member
Premium Member
Posts: 296
Joined: Sun Nov 16, 2008 7:41 pm

Post by reachmexyz »

Did the code that priyadarshikunal posted work? Since there is no free statement, the routine would blow up if there are many incoming records. Run vmstat and you should see free memory dropping drastically, depending on the number of records you are processing.
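
To make the concern concrete, here is a sketch of the general C pattern of pairing every malloc with a free. How the parallel engine handles memory returned by a routine is not stated in this thread, so this only shows plain C: conv_mct_copy and process_record are hypothetical names, and the caller-supplied output buffer is an assumption made for the example.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated capitalised copy, or NULL. */
char *conv_mct_copy(const char *str)
{
	size_t len = strlen(str);
	char *result = (char *)malloc(len + 1);  /* one byte per character plus '\0' */
	int Flag = 1;                            /* 1 when the next letter starts a word */
	size_t x;

	if (result == NULL)
		return NULL;

	for (x = 0; x < len; ++x)
	{
		unsigned char c = (unsigned char)str[x];

		if (isalpha(c))
		{
			result[x] = (char)(Flag ? toupper(c) : tolower(c));
			Flag = 0;
		}
		else
		{
			result[x] = (char)c;
			Flag = 1;
		}
	}
	result[len] = '\0';
	return result;
}

/* Hypothetical per-record caller: copies the converted text into a buffer it
   owns, then frees the temporary so memory use stays flat across records. */
int process_record(const char *input, char *out, size_t out_size)
{
	char *tmp;

	if (out_size == 0)
		return -1;

	tmp = conv_mct_copy(input);
	if (tmp == NULL)
		return -1;

	strncpy(out, tmp, out_size - 1);
	out[out_size - 1] = '\0';   /* strncpy does not terminate on truncation */
	free(tmp);                  /* without this, every record leaks len + 1 bytes */
	return 0;
}
```

If the free is left out, each call leaks one buffer, which is the steady drop in free memory that vmstat would show on a large run.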