Dynamically construct a string of all fields from a stage

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

ncsridhar
Participant
Posts: 11
Joined: Mon Aug 01, 2005 5:49 pm

Dynamically construct a string of all fields from a stage

Post by ncsridhar »

I am trying to construct a long delimited string which is basically a consolidation of all the fields in 1 row of a stage at a time.

I know that I could do something like:

InputLink.Field1 : '_' : InputLink.Field2 : '_' : ... InputLink.Fieldn

in the transformer, but I need a reusable object (such as a routine that takes the stage as input), because I plan to reuse it for error reporting on the rejects in many other similar jobs. In the error report I need to output the whole record of data along with some other information.

I don't know if the BASIC language provides an object variable of type 'record' (or something of that nature) that would let me traverse all of its fields to construct one consolidated string!

Can anyone please point me in the right direction or provide an alternative solution?
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

The row is unavailable as a variable, only as columns. You may consider writing your columns to a sequential text file with a delimiter of "_". You could then read the file back using single-column metadata and take it from there.

Or, write your columns to a hash file, and then use a function to "reference" (really just read) from the hash file. Since the read will return the entire row with columns delimited by @AM, a simple CHANGE(row, @AM, "_") will convert it to a text string.
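Ken's hash-file read can be sketched as a small server routine. This is a hedged illustration only: the routine name, the hash file name argument, and the key are all placeholders you would substitute for your own.

```basic
      FUNCTION RowAsString(HashFileName, KeyValue)
* Hedged sketch: open the hash file, read the whole record for the
* given key, and swap the attribute marks (@AM) for underscores.
      Ans = ""
      OPEN "", HashFileName TO FileVar THEN
         READ Row FROM FileVar, KeyValue THEN
            Ans = CHANGE(Row, @AM, "_")
         END
         CLOSE FileVar
      END
      RETURN(Ans)
```

Because the hash file stores the row as a single dynamic array, the routine never needs to know how many columns the job defined.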
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
ncsridhar
Participant
Posts: 11
Joined: Mon Aug 01, 2005 5:49 pm

Post by ncsridhar »

I actually have to write the row and other information to an MQ message, so that the error reporting job will pick the message up and email it to a destination email address. Writing to a file will impose file structure maintenance challenges (we have a lot of these jobs running and all of them need this).

I am just wondering if there is any way to construct a string out of all the fields of a row dynamically (irrespective of how many fields the table definition has). I am a novice with the tool, but it's kind of hard to believe that there is no way. Most other similar tools have it, and I know that for a fact!
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

There's no legal way.

If you inspect the code produced by compiling a Transformer stage you will discern that each column is assigned to an element of a dimensioned array called STAGECOM.ARR. Other elements in this array are used for other purposes.

With lots of knowledge you could use a MATBUILD statement to extract the appropriate elements of this array into a delimited string (the delimiter character can be specified in the MATBUILD statement).

This approach is not for the faint-hearted. You will need to research which constants and variables (such as STAGECOM.HIGHWATER; most of them are in the DSD_STAGE.H header file) you need in order to specify the appropriate range of elements of the dimensioned array.
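With those caveats, the shape of the trick is roughly a one-line MATBUILD. Treat this as a sketch only: STAGECOM.ARR and STAGECOM.HIGHWATER are internal, undocumented names, and the assumed element range (1 through the high-water mark) is exactly the part you would have to verify against DSD_STAGE.H before relying on it.

```basic
* Hedged sketch: build a "_"-delimited dynamic array from the
* dimensioned array the Transformer compiles its columns into.
* The range 1..STAGECOM.HIGHWATER is an unverified assumption.
      MATBUILD RowString FROM STAGECOM.ARR, 1, STAGECOM.HIGHWATER USING "_"
```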

It is likely that this information ceases to be valid with effect from the Hawk release.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

I concur with Ray's assessment that the STAGECOM.ARR approach should only be used for a one-off type of job as it is not documented and likely to change.

What you can do is write your data to a named pipe (using no column delimiters and no quotes) and then read that pipe back as a single VarChar column. You now have all of your columns put together into one string, as you want, and can process it. This is efficient because it doesn't write the data to disk, and the approach applies to Server and PX jobs alike _plus_ it is portable and uses supported technology.
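The pipe itself has to exist before the job opens it. One hedged way to arrange that (assuming a UNIX engine; the subroutine name and pipe path are illustrative, not anything DataStage supplies) is a small before-job subroutine that shells out to mkfifo:

```basic
      SUBROUTINE MakeRowPipe(InputArg, ErrorCode)
* Hedged sketch: create the named pipe before the job runs.
* The pipe path is a placeholder; pass or hard-code your own.
      ErrorCode = 0
      EXECUTE "SH -c 'mkfifo /tmp/myjob_row_pipe 2>/dev/null'"
      RETURN
```

The 2>/dev/null swallows the error if the pipe already exists from a previous run, so the job can be rerun without manual cleanup.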
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

I've been thinking of doing something similar in a parallel job: a shared container that receives a propagated input link with almost no columns defined and can derive the key and value fields for error reporting. It may be possible via a custom parallel stage. The best approach for a server job is to use a column merge stage and send the output to a shared container as a long delimited text string.

This is more robust than trying to concatenate in a transformer, as hand coding is likely to introduce errors and does not handle nulls.

The output of the merge column stage can then be sent to an error handling shared container. In your error container you would add the extra MQ fields and write it to an MQ plugin.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

ncsridhar wrote:Writing to a file will impose file structure maintenance challenges (we have a lot of these jobs running and all of them need this).
What are you talking about? I don't think you understand the trick. Every job will have a column structure specific to that job. Write it to a sequential file as N columns, then read it back from the sequential file as 1 column. Your task is to switch the metadata from a row of N columns to a row of 1 column. By writing the N-column row to a sequential file using _ as the delimiter, then reading it back with single-column metadata whose delimiter never occurs in the data (something like TAB), you'll get exactly what you requested.

Using a pipe instead of a file will look exactly the same. You'll use the sequential file stage but check Uses Named Pipe instead. You also have to create a pipe for every single job ahead of time.

As for other tools, well, you're using this one. The makers have decided that access to the whole row, which is really a ragged array of column values, is not in your interests to have. Sucks, but that's just the way it has always been.
Kenneth Bland

ncsridhar
Participant
Posts: 11
Joined: Mon Aug 01, 2005 5:49 pm

Post by ncsridhar »

Thanks for all the valuable inputs. I really appreciate it.

I think I will try the named pipe approach, and I would also like to try the column merge to see if it's going to do me any good. The problem with column merging is that I have to do a separate merge for every job. If I am prepared to do that, I could just merge the columns in code like someone suggested and somehow handle the NULLs.

Let me get back to you folks with what I found.

Thanks.
ncsridhar
Participant
Posts: 11
Joined: Mon Aug 01, 2005 5:49 pm

Post by ncsridhar »

We use message queues, so luckily I could write the information into a message queue and read it from there for my error reporting module. This is much more elegant than writing to files and dealing with their management. Also, corporate wants messaging to be used extensively...