Multi-value fields and strange output file format

ppalka · Post by **ppalka** » Mon Jul 18, 2005 3:38 am

Hello,
I have some source data ( records delimited by new line and fields by 'tab' ). In this data some fields are multi-valued ( sub-fields delimited by '#' ). The number of that subfields is unknown. How to easily extract all subfields and make some transformations on them?
The second case is the output file format:
Header file record
Account 1 Account Header
Account 1 History Header Record
Account 1 History Record (Month 1)
Account 1 History Record (Month 2)
Account 1 History Record (Month 3)
... ...
Account 1 Client Record (1)
Account 1 Client Record (2)
Account 2 Account Header
Account 2 History Header Record
Account 2 History Record (Month 1)
Account 2 History Record (Month 2)
Account 2 Client Record (1)
Account n Account Header
Account n History Header Record
Account n History Record (Month 1)
Account n History Record (Month 2)
Account n Client Record (1)
Account n Client Record (2)
Final Record
One input record have to be mapped to one account information. For each input record, output history and client records are made from multivalued fields. How to handle a such output format?
Thanks in advance.
Regards,
Piotrek

Sainath.Srinivasan · Post by **Sainath.Srinivasan** » Mon Jul 18, 2005 4:54 am

Need more info. What kind of transformations you want?

Because you can flatten the multivalued fields in DataStage. But that will be based on the actions you wish to do in the fields.

ppalka · Post by **ppalka** » Mon Jul 18, 2005 5:05 am

It is difficult to say about all transformations. But generally, I think, there will be no complex derivations. I select one input record from some table and make a join condition with some other tables. And I have to put that data in such output format as I wrote before. But some fields are multivalued (generally some dates). And by taking each date from that multivalued field I have to make a history (from an older date to younger) for every input record.
I found some checkbox in Tables definition: Meta data supports mulivalued fields, but in docs there is too little info about using that feature.

Regards,
Piotrek

ray.wurlod · Post by **ray.wurlod** » Mon Jul 18, 2005 2:57 pm

There are multivalued-handling equivalents of most intrinsic conversion functions and operators, usually constructed by adding "S" to the name.

For example, IF becomes IFS (and a function), FIELD becomes FIELDS, GE becomes GES, OCONV becomes OCONVS, and so on.
You may also need the REUSE function if substituting "constants", and SUM or SUMMATION to form totals of the values in a multivalued field.

Once you have your specification post again, and we can be more specific with our responses.

ppalka · Post by **ppalka** » Tue Jul 19, 2005 4:01 am

ray.wurlod wrote:There are multivalued-handling equivalents of most

Where can I find some information about that functions dedicated to multivalued fields? I cannot find it in DS help.

For example I have such source records :
First record:
MG0125600003 ANNUITY 20010913 PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN 0#3950.71#4015.01#4085.08#4147.03#4207.91#4270.13#4328.86#4389.03#4443.77#4498.51#4554.91#4607.63#4662.12#4712.9#4770.89#4817.26#4862.73#4910.4#4954.06#5000.00 20030513#20030413#20030313#20030213#20030113#20021213#20021113#20021013#20020913#20020813#20020713#20020613#20020513#20020413#20020313#20020213#20020113#20011213#20011113#20011013#20010913#20020902#20020214#20010913
Second record:
MG0125600004 MONTHLY 20020913 PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN#PLN 0#3950.71#4015.01#4085.08#4147.03#4207.91#4270.13#4328.86#4389.03#4443.77#4498.51#4554.91#4607.63#4662.12#4712.9#4770.89#4817.26#4862.73#4910.4#4954.06#5000.00 20030513#20030413#20030313#20030213#20030113#20021213#20021113#20021013#20020913#20020813#20020713#20020613#20020513#20020413#20020313#20020213#20020113#20011213#20011113#20011013#20010913#20020902#20020214#20010913

And I want to create in one output file a following structure:
File Header Record (using first field for example)
Account Record(second field)
History Header Record (number of dates in last field)
History Record 1 (for every date in last field)
History Record 2
...
History Record 3
Ending Record(some information)

How to solve that problem? Can I use hashed file to proceed only one record at time - proceed input data and just send it to output? Without collecting all input records?

Best regards,
Piotrek

ray.wurlod · Post by **ray.wurlod** » Tue Jul 19, 2005 5:50 am

Where can I find some information about that functions dedicated to multivalued fields?
DataStage BASIC manual, Chapter 2, especially the section on Dynamic Array Operations - multi-valued fields are a specific instance of dynamic arrays. There is a table of the vector functions in this section - multi-valued fields can also be thought of as vectors.

To convert your "#" delimited fields to dynamic arrays, you will need to convert the "#" characters to a dynamic array delimiter, for example @FM to @VM. The vector functions do not work with "#" as the delimiter. A suitable function is

Code: Select all

Convert("#", @VM, InLink.TheColumn)

As to solving your exact problem, I think that's a big ask. It's getting close to the fine line between how much help we can give without devoting too much unpaid time, and your hiring a competent consultant.

If I can find the time I'll get back to it. But I am very busy at the moment.

ppalka · Post by **ppalka** » Tue Jul 19, 2005 5:58 am

ray.wurlod wrote:
Code: Select all
Convert("#", @VM, InLink.TheColumn)

I have done it before.

If I can find the time I'll get back to it. But I am very busy at the moment.

Thanks a lot. I'm not expecting you to solve whole problem, I would be glad to see some useful advises.

Regards,
Piotrek

ppalka · Post by **ppalka** » Wed Jul 20, 2005 8:33 am

Maybe somebody had a similar problem and can give me some advise? I am thinking about that problem and I still haven't got any acceptable solution. I could write some basic functions to handle that multivalues fields, but I am looking for more universal solution.
Thanks in advance.

Regards,
Piotrek

ray.wurlod · Post by **ray.wurlod** » Wed Jul 20, 2005 9:07 pm

The output file needs to be constructed in three parts.
The header record. One job or routine. Create file.
The individual account records. One job sequence. Append to file.
The trailer record. One job or routine. Append to file.

The job sequence to process an account is similar. It could be an ordinary job if you can use before/after subroutines to append the account header and account trailer lines to the file. The body of the job then processes the account detail lines.

ppalka · Post by **ppalka** » Thu Jul 21, 2005 1:45 am

ray.wurlod wrote:The output file needs to be constructed in three parts.
The header record. One job or routine. Create file.
The individual account records. One job sequence. Append to file.
The trailer record. One job or routine. Append to file.

But what if I get on my input not only one record, but many more? And then each record should be processed as you wrote above. Can I transfer a record from one job to another in job sequence? So how to do that?
How to handle input multivalue fields to make history records?

The job sequence to process an account is similar. It could be an ordinary job if you can use before/after subroutines to append the account header and account trailer lines to the file. The body of the job then processes the account detail lines.

In my case I need to provide about 8-10 information from input to create header and trailer. Subroutines takes only one parameter. I could make a multivalue field to transfer needed information, but I think it is not a good solution.

Thanks.

Regards,
Piotrek