CFF Stage - Importing REDEFINES within the same record

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

vivekgadwal
Premium Member
Posts: 457
Joined: Tue Sep 25, 2007 4:05 pm

CFF Stage - Importing REDEFINES within the same record

Post by vivekgadwal »

Gurus,

I have a problem while importing the following CFD into DataStage:

Code:

01  PLP-PM-EXTRACT-RECORD.

    05  PLP-PM-SURROGATE-KEY.
        10  PLP-PM-KEY-EXCT               PIC 9(06).                1-6
        10  PLP-PM-UNIQUE-NOS.
            15  PLP-PM-KEY-DATE.
                20  PLP-PM-KEY-CC         PIC 9(02).                7-8
                20  PLP-PM-KEY-YY         PIC 9(02).                9-10
                20  PLP-PM-KEY-MM         PIC 9(02).               11-12
            15  PLP-PM-KEY-SERIAL-NOS     PIC 9(06).               13-18
        10  PLP-PM-UNIQUE-NUM REDEFINES   PLP-PM-UNIQUE-NOS.
            15  PLP-PM-KEY-DAT-INITIAL.
                20  PLP-PM-INITIAL-YY     PIC 9(02).                7-8
                20  PLP-PM-INITIAL-MM     PIC 9(02).                9-10
            15  PLP-PM-INITIAL-SER-NOS    PIC 9(08).               11-18
    05  PLP-PM-ACTION-CODE                PIC X(01).               19-19
    05  PLP-PM-SYMBOL                     PIC X(03).               20-22
    05  PLP-PM-COMPANY                    PIC X(02).               23-24
    05  PLP-PM-POLICY-NUMBER              PIC X(07).               25-31
    05  PLP-PM-MODULE                     PIC X(02).               32-33
    05  PLP-PM-LOB-TYPE                   PIC X(03).               34-36
    05  PLP-PM-DIRECT-BUS-FLG             PIC X(01).               37-37
.....................
.....................more fields are present
The problem is that DataStage treats the REDEFINES group as separate columns when I import it from the CFD. This results in no data being returned, or the data being scattered (only one row, all 9s, comes back in a weird way).

Other details:
> I have read the documentation for the CFF stage and it does not mention anything about reading groups with REDEFINES.
> I went into the Table Definition and did a "right-click > Edit Row". That took me to another window where the group field "PLP-PM-UNIQUE-NUM" is shown as redefining the column (the Redefined field box holds the field that this one redefines). But I am still unable to view data.
> I also manually deleted the redefined fields (the fields inside the group "PLP-PM-UNIQUE-NUM") from the Outputs > Selection tab of the CFF stage, in vain.

Please let me know what I should be doing differently.
Vivek Gadwal

Experience is what you get when you didn't get what you wanted
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

REDEFINES means "different columns" - you have a different structure (2,2,2,6 versus 2,2,8). The CFF stage can deal with the REDEFINES - you specify the particular output columns that you require.
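
To make that concrete, here is a minimal Python sketch (not DataStage code; the sample bytes are invented) of how the two layouts read the same 12 bytes of PLP-PM-UNIQUE-NOS:

Code:

# The 12 bytes at record positions 7-18, shared by both layouts (sample data).
raw = b"091003000123"

# Layout 1 - PLP-PM-UNIQUE-NOS: CC(2), YY(2), MM(2), SERIAL-NOS(6)
key_cc     = raw[0:2]    # b"09"
key_yy     = raw[2:4]    # b"10"
key_mm     = raw[4:6]    # b"03"
key_serial = raw[6:12]   # b"000123"

# Layout 2 - PLP-PM-UNIQUE-NUM REDEFINES the same storage: YY(2), MM(2), SER-NOS(8)
initial_yy      = raw[0:2]    # b"09"
initial_mm      = raw[2:4]    # b"10"
initial_ser_nos = raw[4:12]   # b"03000123"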
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vivekgadwal
Premium Member
Posts: 457
Joined: Tue Sep 25, 2007 4:05 pm

Post by vivekgadwal »

ray.wurlod wrote:REDEFINES means "different columns" - you have a different structure (2,2,2,6 versus 2,2,8). The CFF stage can deal with the REDEFINES - you specify the particular output columns that you require.
Ray.

This worked like a charm in the Sequential File stage, but not in the CFF stage. I am able to view the data properly in the former, but the latter still complains about it. (Also worth noting: this is a DOS-format file... I only came to know about it while playing with the Format tab inside the Sequential File stage.)

However, there is another issue.

ISSUE: Some premium fields are defined as "S9(06)V9(02)", with data like "0005610{". DataStage imports these fields (in the Table Definition) as Decimal. But, I guess because a Decimal field cannot contain "{" characters, all the premium rows are shown as "000000.00". If I change the datatype to Character (Char), I am able to see "0005610{". How can I get this to read as "000561.00"?

As far as my understanding goes, "{" denotes a positive zero and "}" a negative zero in the sign overpunch, correct? I have also tried changing the datatypes to Numeric, but it doesn't work! :(

Please help.
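
For reference, here is a minimal Python sketch of the usual ASCII sign-overpunch convention ("{" = +0, "A".."I" = +1..+9, "}" = -0, "J".."R" = -1..-9), which is what would turn "0005610{" into "000561.00". This illustrates the encoding itself, not any particular stage setting:

Code:

# The last byte of a zoned (overpunched) field carries both the final digit
# and the sign of the whole number.
POSITIVE = {c: d for d, c in enumerate("{ABCDEFGHI")}  # '{'=+0, 'A'=+1 ... 'I'=+9
NEGATIVE = {c: d for d, c in enumerate("}JKLMNOPQR")}  # '}'=-0, 'J'=-1 ... 'R'=-9

def decode_zoned(raw, scale=2):
    """Decode e.g. '0005610{' (PIC S9(6)V9(2)) into the string '000561.00'."""
    last = raw[-1]
    if last in POSITIVE:
        digits, sign = raw[:-1] + str(POSITIVE[last]), ""
    else:
        digits, sign = raw[:-1] + str(NEGATIVE[last]), "-"
    return sign + digits[:-scale] + "." + digits[-scale:]

print(decode_zoned("0005610{"))  # -> 000561.00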
Vivek Gadwal

Experience is what you get when you didn't get what you wanted
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I believe that would be a "zoned decimal" and from what little I recall you can tell it that in the CFF stage so it handles the overpunch correctly.
-craig

"You can never have too many knives" -- Logan Nine Fingers
vivekgadwal
Premium Member
Posts: 457
Joined: Tue Sep 25, 2007 4:05 pm

Post by vivekgadwal »

chulett wrote:I believe that would be a "zoned decimal" and from what little I recall you can tell it that in the CFF stage so it handles the overpunch correctly.
Thanks Craig. I am unable to use the CFF stage because the file has CR-LF as delimiters. I tried removing the CRs and reading it through CFF, but it still doesn't like it; the stage gives this error:

Code:

Field "PLP_PM_TRANS_WRITTEN_PREMIUM" has import error and no default value; data: {0 0 0 5 6}, at offset: 77
.....
The field that it doesn't like is the one with '}'. By contrast, the Sequential File stage accepts the same table definition (Record Delimiter Type: DOS Format) once the redefined fields are removed. But again the issue arises with those decimal values (0005610}... I am getting all zeroes in the Sequential File stage). I tried setting the Decimal type default "Packed" property to each of its options (Yes, No (zoned), No (separate), No (overpunch)), but none of them read this as a COMP-3 decimal. :(

Please advise...
Vivek Gadwal

Experience is what you get when you didn't get what you wanted
vivekgadwal
Premium Member
Posts: 457
Joined: Tue Sep 25, 2007 4:05 pm

Post by vivekgadwal »

An update on this...

I tried the Server job CFF stage, and it gives me more flexibility in defining COMP-3 decimals, including an option to check the sign bit for the record.

I am unable to achieve this using the Parallel CFF or Sequential File stage. I have no clue where I am going wrong. Does anyone have any suggestions?

Thanks in advance.
Vivek Gadwal

Experience is what you get when you didn't get what you wanted
Aruna Gutti
Premium Member
Posts: 145
Joined: Fri Sep 21, 2007 9:35 am
Location: Boston

Post by Aruna Gutti »

Vivek,

The only way I could get the CFF stage with REDEFINES to work is by changing the record definition.

If I want only one set of fields in the redefined group, I removed the other set of fields.

If the redefinition is based on a specific criterion, I created multiple record definitions based on that criterion.

It is a lot of work, but that is the only way I could get it to work. It may not be the correct way.

Good Luck,

Aruna.
vivekgadwal
Premium Member
Posts: 457
Joined: Tue Sep 25, 2007 4:05 pm

Post by vivekgadwal »

Aruna Gutti wrote:Vivek,

The only way I could get the CFF stage with REDEFINES to work is by changing the record definition.

If I want only one set of fields in the redefined group, I removed the other set of fields.

If the redefinition is based on a specific criterion, I created multiple record definitions based on that criterion.

It is a lot of work, but that is the only way I could get it to work. It may not be the correct way.

Good Luck,

Aruna.
Thanks Aruna, I will give this a shot. When I remove the redefined fields, I am able to get it working in the CFF stage. But there is still the issue of reading the signed (COMP-3) decimal fields [with data like '000561{'] that I am unable to resolve. I have posted my trials in my previous posts.

Have you come across this situation before? If so, how did you handle it?

Thanks for the suggestion again.
Vivek Gadwal

Experience is what you get when you didn't get what you wanted
Aruna Gutti
Premium Member
Posts: 145
Joined: Fri Sep 21, 2007 9:35 am
Location: Boston

Post by Aruna Gutti »

I used character set EBCDIC and data format binary. If I recall correctly, the imported data type comes in as Decimal.

I do not have the exact code with me, as I just switched jobs to another company.

I derived the proper value in a stage variable:

If infield < 0 Then infield * -1 Else infield

I will try to get you a proper answer shortly.
vivekgadwal
Premium Member
Posts: 457
Joined: Tue Sep 25, 2007 4:05 pm

Post by vivekgadwal »

Aruna Gutti wrote:I used character set EBCDIC and data format binary. If I recall correctly, the imported data type comes in as Decimal.

I do not have the exact code with me, as I just switched jobs to another company.

I derived the proper value in a stage variable:

If infield < 0 Then infield * -1 Else infield

I will try to get you a proper answer shortly.
Great...thanks for your suggestion. Please let me know how you handled this.

FYI: when imported, my file definition does have Decimal fields in it. The issue is with the data. As I noted in previous posts, the Decimal field in the CFF stage / Sequential File stage is unable to read COMP-3 decimal data with its default settings.

I think I am missing some setting that would read this type of data, since I am able to do so in the Server CFF stage :)

I have mentioned all my trials in the previous posts.
Vivek Gadwal

Experience is what you get when you didn't get what you wanted
vivekgadwal
Premium Member
Posts: 457
Joined: Tue Sep 25, 2007 4:05 pm

Post by vivekgadwal »

vivekgadwal wrote: I have mentioned all my trials in the previous posts.
I did some more trials on this. I changed the fields that do not hold COMP-3 values (data NOT like "000561{") to "DISPLAY_NUMERIC", kept the fields with COMP-3 values as "DECIMAL", and this is the error I am getting:

Code:

Field "PLP_PM_TRANS_WRITTEN_PREMIUM" has import error and no default value; data: {0 0 0 5 6}, at offset: 77
....
....
Here is how my Record Options tab values look:

Code:

Byte Order: Native-Endian
Character-set: ASCII
Data format: Text
Record Delimiter: UNIX Newline

Decimal Rounding: Nearest value
Separator: Project Default
NOTE: For this test, I removed the CR (carriage return) characters that accompany the file when it is FTP-ed. As noted in my first post, if I read the original file in the Sequential File stage and specify that it is DOS format, the stage parses the data perfectly. However, the issue remains with these COMP-3 decimal values :)

Any help is very much appreciated...

Thanks.
Vivek Gadwal

Experience is what you get when you didn't get what you wanted
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

As noted early on - unless you have an FD that says otherwise and then we have "issues" - those are not comp-3 fields aka 'packed decimal'. They are zoned decimal fields.
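
For contrast, here is a small Python sketch of the two byte layouts, built from the standard COBOL rules rather than from anything in this thread: zoned keeps one readable digit per byte with an overpunched last byte, while COMP-3 packs two digits per byte plus a sign nibble:

Code:

# The value +00056100 (PIC S9(6)V9(2)) in both representations.
zoned  = b"0005610{"                  # one digit per byte; '{' = +0 overpunch
packed = bytes.fromhex("000056100c")  # COMP-3: digit nibbles, sign nibble 0xC (+)

print(zoned)   # b'0005610{'          - readable text: what this file holds
print(packed)  # b'\x00\x00V\x10\x0c' - binary: what real COMP-3 would look like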
-craig

"You can never have too many knives" -- Logan Nine Fingers
vivekgadwal
Premium Member
Posts: 457
Joined: Tue Sep 25, 2007 4:05 pm

Post by vivekgadwal »

chulett wrote:As noted early on - unless you have an FD that says otherwise and then we have "issues" - those are not comp-3 fields aka 'packed decimal'. They are zoned decimal fields.
Craig,

My apologies for referring to those fields as COMP-3; I picked up the terminology from the Server CFF stage. The PX CFF stage doesn't have that option (I noticed that the Table Definition import itself has this set to zoned decimal). The PX Sequential File stage has the option in "Type Defaults", and here is the conundrum I am facing...

In the Sequential File stage:
The WRITTEN_PREMIUM field has the definition PIC S9(6)V9(2) and data like "0004800{".
There is another field, EARNED_PREMIUM (part of the OCCURS), whose definition is PIC S9(6)V9(2) and whose data looks like "0000407P".

When I view data on this, the tool is unable to decode the WRITTEN_PREMIUM field, but it gives me signed data for the EARNED_PREMIUM field (like "-000040.70").

Aruna,

I followed your suggestion and imported a new table definition with the REDEFINED fields removed, and I am still having this issue (as it is datatype related :) )
Vivek Gadwal

Experience is what you get when you didn't get what you wanted
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Worst case you could build a lookup with the 20 values to substitute for the last character and do all that while it's still a string. The link I posted showed you that a 'P' would be a -7 as the last digit. You'll probably want to handle the signage separately and multiply the result by -1 for the negative range.

Still pretty sure the stage should be able to handle them automagically, however.
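
If it comes to that, here is a sketch in Python (the file name is hypothetical) of the 20-row substitution table: the overpunch character, the digit to put back, and a sign multiplier to apply while the field is still a string:

Code:

# One row per overpunch character: CHAR,DIGIT,SIGN ('P' -> digit 7, sign -1).
rows  = [(c, d, +1) for d, c in enumerate("{ABCDEFGHI")]
rows += [(c, d, -1) for d, c in enumerate("}JKLMNOPQR")]

with open("overpunch_lookup.csv", "w") as f:  # hypothetical lookup file
    f.write("CHAR,DIGIT,SIGN\n")
    for char, digit, sign in rows:
        f.write("%s,%d,%d\n" % (char, digit, sign))

Landed once, a file like that can feed a Lookup stage keyed on the last character, with the Transformer recomposing the digits and multiplying by SIGN.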
-craig

"You can never have too many knives" -- Logan Nine Fingers
vivekgadwal
Premium Member
Posts: 457
Joined: Tue Sep 25, 2007 4:05 pm

Post by vivekgadwal »

chulett wrote:Worst case you could build a lookup with the 20 values to substitute for the last character and do all that while it's still a string. The link I posted showed you that a 'P' would be a -7 as the last digit. You'll probably want to handle the signage separately and multiply the result by -1 for the negative range.

Still pretty sure the stage should be able to handle them automagically, however.
Thanks Craig. The approach you propose can be done but, as you noted, only as a worst case. Right now, to beat the clock, I am reading the file in a Server job using its CFF stage, which reads the "zoned" values along with their signs (I see '+' and '-' signs before the numbers). The job design is as follows:

Code:

CFF ---> XFM ---> Seq File (comma delimited)
I am reading this Seq File (comma delimited) in my PX job (importing the definition, which was a pain in the neck as I cannot copy and paste the stage :) ).

However, one issue is arising now: after reading all the rows from the file (there are about 819,000 rows), the CFF stage reads one more row, which doesn't exist in the file. That extra row has no data, only commas.

Code:

,,,,,,....,+,+,+,,,...,+,...
When I view this in the vi editor, it shows some weird characters in places in the row (in fields defined as Character, so I believe the tool is probably inserting @FM or @VM marks in there).

Questions:
1) Why is it reading an additional row? I have looked at this file in a hex editor and all the rows have '0D' and '0A' as delimiters. However, even if I strip off these '0D's and run it, it still reads this extra line (a pre-processing sketch follows below).
2) If I check the "Omit last new line" option in the Server Sequential File stage, the PX job throws an error saying that it expects a newline (as EOR) and doesn't find one.
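
On those two questions, here is a pre-processing sketch in Python (file names are hypothetical), assuming the goal is a clean UNIX-newline file with no empty trailing record before the PX job reads it:

Code:

# Normalise a DOS (CR-LF) fixed-width extract for the PX job: convert the
# delimiters to UNIX newlines and drop any empty trailing record.
with open("plp_pm_extract.dat", "rb") as f:       # hypothetical input name
    data = f.read()

data = data.replace(b"\r\n", b"\n")               # strip the 0D of each 0D 0A pair
records = [r for r in data.split(b"\n") if r]     # discard empty (extra) records

with open("plp_pm_extract_unix.dat", "wb") as f:  # hypothetical output name
    f.write(b"\n".join(records) + b"\n")          # keep a final newline as EOR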

Please suggest...

Also, if anyone else has worked on PX CFF stage to get the Zoned decimal values, please help me in configuring the stage.

***UPDATE***
In the PX Sequential File stage, if I change the datatype of the zoned-decimal fields to "DISPLAY_NUMERIC" (previously it was DECIMAL, which shows as COMP-3) and read the fixed-width file directly (not via the Server job mentioned above), I can see some rows being read while others are still displayed as 000000.

Upon further research into the data, values like "0000274R" are being correctly read as "-000027.42", while values like "0000580}" still come out as "00000000"!! Is this because of the "{" or "}" characters in the data? If so, shouldn't the stage read all the zoned-decimal characters, which include "{" and "}"?? :shock:
Vivek Gadwal

Experience is what you get when you didn't get what you wanted