split the names

A forum for discussing DataStage<sup>®</sup> basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

consulting
Participant
Posts: 50
Joined: Fri Dec 21, 2007 3:24 am

split the names

Post by consulting »

I have a text file with a name column. It contains 140,000 names starting with the letters A-Z, including duplicate entries.
I want to load them into separate files according to the name (i.e. names starting with A go into one file, names starting with B into another file, and so on).

In SQL we can achieve this with where names like 'A%'. How can I do the same in DataStage?
balaji
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

You can split the stream using a Switch, Transformer or Filter stage (I'm sure that there are others I've overlooked).
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Transformer or Filter stage with 26 outputs (one for each letter) and one rejects/otherwise link.

A Switch stage is not appropriate; it switches on specific values, so you might need 140000 separate switch conditions!
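The routing logic described here (26 letter outputs plus one rejects/otherwise output) can be sketched outside DataStage. The following is only an illustration in Python, not DataStage code; the function name `split_by_initial` and the sample names are made up for the example, and it assumes the split should ignore case and leading whitespace.

```python
import string

# The 26 letters that get their own output; anything else is rejected.
LETTERS = set(string.ascii_uppercase)

def split_by_initial(names):
    """Route each name to a bucket keyed by its upper-cased first letter."""
    buckets = {}   # one list per letter, playing the role of one output link each
    rejects = []   # the rejects/otherwise link
    for name in names:
        first = name.strip()[:1].upper()   # empty string if the name is blank
        if first in LETTERS:
            buckets.setdefault(first, []).append(name)
        else:
            rejects.append(name)
    return buckets, rejects

buckets, rejects = split_by_initial(["Adam", "alice", "Bob", "", "9lives", "Adam"])
```

Duplicates are passed through unchanged, as in the original question; each bucket could then be written to its own file.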
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
PhilHibbs
Premium Member
Posts: 1044
Joined: Wed Sep 29, 2004 3:30 am
Location: Nottingham, UK

Post by PhilHibbs »

ray.wurlod wrote: "A Switch stage is not appropriate; it switches on specific values, so you might need 140000 separate switch conditions!"
Could you pick out the first character in a prior stage, and split on that? (I don't know parallel stages)
Last edited by PhilHibbs on Wed Feb 27, 2008 6:22 am, edited 1 time in total.
Phil Hibbs | Capgemini
Technical Consultant
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Yes, but that stage would have to be a Modify or Transformer stage. If the latter, then you may as well take the outputs from it directly.
PhilHibbs
Premium Member
Posts: 1044
Joined: Wed Sep 29, 2004 3:30 am
Location: Nottingham, UK

Post by PhilHibbs »

Just as in C a switch statement is more efficient than nested if...elses, I thought a switch stage might be more efficient than a transformer with 26 conditional outputs.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

I was thinking of the Switch on the first letter, but as Ray has pointed out, you would need a transform/modify stage in the job, so you might as well use the transform to split the streams.
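The two-step alternative discussed in this thread (derive the first letter in one stage, then switch on that exact value in the next) might look like the sketch below. It is Python rather than DataStage, and the names `add_initial`, `switch_on_initial` and the `names_X.txt` file names are hypothetical, chosen only for illustration.

```python
import string

# Hypothetical target file per letter; these names are not from the thread.
TARGETS = {letter: f"names_{letter}.txt" for letter in string.ascii_uppercase}

def add_initial(name):
    """Step 1 (the Modify/Transformer role): derive the key to switch on."""
    return name, name.strip()[:1].upper()

def switch_on_initial(row, default="names_other.txt"):
    """Step 2 (the Switch role): exact-match lookup on the derived key."""
    _, initial = row
    return TARGETS.get(initial, default)
```

With the key derived up front, the switch needs only 26 cases instead of one per distinct name, but the extra derivation step is exactly the work a Transformer could do while splitting the stream itself.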