I am trying to decide if datastage is an efficient tool to parse the following text as opposed to doing it in Perl:
|sessionID=acnielsen3-716-1-834|.q211=1|.q354=0|.q175=0|.q176=0|.q124=0|
My file contains many of the above formatted rows. The number of pipes in one row is not fixed (dynamic). What I want to do is to loop through each key-value pair and dispatch them to different tables based on the keys (SessionID or .q).
If I use pipe as a delimiter in the sequential file, I will end up with more than 500 columns (have to take the max). In addition to that, each pipe has to be further divided by the "=". Also I can't find a way to loop through EACH key-value pair.
Does anyone have any idea how I can approach this problem?
Any suggestions are highly appreciated.
parse files and loop through each row
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 7
- Joined: Fri Jul 11, 2003 5:04 pm
-
- Premium Member
- Posts: 483
- Joined: Thu Jun 12, 2003 4:47 pm
- Location: St. Louis, Missouri USA
If you replace all the pipe chars with @VM, you can write them to a hash file and read them back out "normalized". On the output link you have to define the column that has this data in it as multi-value (type = 'MV') and change the "Normalize on" field at the top of the tab to the field name of this field.
When the rows get read out of the hash file you'll see:
sessionID=acnielsen3-716-1-834
.q211=1
.q354=0
.q175=0
.q176=0
.q124=0
You can separate the name,value pairs by using the Field function and specifying "=" as the field separator.
Name column derivation:
Field(InputColName, "=", 1)
Value column derivation:
Field(InputColName, "=", 2)
Hope this helps,
Tony
When the rows get read out of the hash file you'll see:
sessionID=acnielsen3-716-1-834
.q211=1
.q354=0
.q175=0
.q176=0
.q124=0
You can separate the name,value pairs by using the Field function and specifying "=" as the field separator.
Name column derivation:
Field(InputColName, "=", 1)
Value column derivation:
Field(InputColName, "=", 2)
Hope this helps,
Tony
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Replacing the pipe characters with @VM (a system variable) can be achieved with the Convert function. Process the whole line as a single column, apply Convert("|",@VM,InLink.ColName) as a transform function and write the result into a hashed file using sessionID as the key value.
Use "normalize on" in the Output link of the hashed file (or UV) stage.
Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518
Use "normalize on" in the Output link of the hashed file (or UV) stage.
Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518