Complex file reading
Moderators: chulett, rschirm, roy
Complex file reading
I would like to read the following file format in a datastage job. Could you please let me know how can I do this? what are the stages I can use?
1 XXXXXXXXXXXXX
OPAL Detailed xxxxxxxxx xxxxx Listing Period: 20030228 To 20030331
0 Policy No. Table No. Name(s) R/I Claim
--------- -------- ---------------------------------------- ---------
60067322 35A MR BJ SWANSTON 26316.00
60349714 35A MRS CA JACOBS 30000.00
60381376 35A MR J MARTINSON 20000.00
60425069 35A MRS J CHAPMAN 20000.00
60450640 35A MRS BJ MCNAMARA 20000.00
60452691 35A MR KR NICHOLLS 49000.00
60474853 35A MR VG WILLIAMS 20000.00
60489440 35A MR R CHAPPELL 47000.00
60514587 35A MRS L SMITH 55556.00
60535076 35A MISS L LEBEN 10000.00
60637795 35A MR K LANG 20000.00
60697227 35A MRS H BLAND 50000.00
End of Report
------------
Page Total 367872.00
------------
0 Report Total 367872.00
------------
1 XXXXXXXXXXXXX
OPAL Detailed xxxxxxxxx xxxxx Listing Period: 20030228 To 20030331
0 Policy No. Table No. Name(s) R/I Claim
--------- -------- ---------------------------------------- ---------
60067322 35A MR BJ SWANSTON 26316.00
60349714 35A MRS CA JACOBS 30000.00
60381376 35A MR J MARTINSON 20000.00
60425069 35A MRS J CHAPMAN 20000.00
60450640 35A MRS BJ MCNAMARA 20000.00
60452691 35A MR KR NICHOLLS 49000.00
60474853 35A MR VG WILLIAMS 20000.00
60489440 35A MR R CHAPPELL 47000.00
60514587 35A MRS L SMITH 55556.00
60535076 35A MISS L LEBEN 10000.00
60637795 35A MR K LANG 20000.00
60697227 35A MRS H BLAND 50000.00
End of Report
------------
Page Total 367872.00
------------
0 Report Total 367872.00
------------
Ok, but not successful why? Are you able to identify the Policy rows?
Let's try this in a more "PX" fashion, shall we? Rather than the "one long string" approach, trying defining the sequential file metadata such that it matches only the Policy detail records. That should work because they look unique in the report - use the appropriate field delimiter, first field a six digit number, etc. When you read the file with that metadata, any rows that don't match that will automatically be discarded by the stage.
That or use a tool other than an ETL tool for this, one meant specifically to mine data from report files like you have: Monarch for example.
Let's try this in a more "PX" fashion, shall we? Rather than the "one long string" approach, trying defining the sequential file metadata such that it matches only the Policy detail records. That should work because they look unique in the report - use the appropriate field delimiter, first field a six digit number, etc. When you read the file with that metadata, any rows that don't match that will automatically be discarded by the stage.
That or use a tool other than an ETL tool for this, one meant specifically to mine data from report files like you have: Monarch for example.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Add a reject output link from the Sequential File stage; this will capture any row that does not match the record schema for a detail record, and downstream stages on this link can re-parse for example to retrieve the dates and totals.
Code: Select all
record schema
{
policy_no string[8];
filler1 string[1];
table_no string[3];
filler2 string[1];
remainder string[max=255];
}
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact: