Complex Flat File stage with Variable Length rows
Moderators: chulett, rschirm, roy
Complex Flat File stage with Variable Length rows
I have same scenario as Don did in this post.
My file looks something like this -
CUS_INFO
05 First_Name PIX(5).
05 Last_Name PIX(5).
05 Age PIX(5).
05 Phone_Grp Occurs 3 Times PIX(10).
Sample File -
SunilBassi031012345678901234567890123456789
SunilBassi03101234567890123456789
SunilBassi0310123456789
and record delimiter will be \n - next line character.
Sunil being First Name
Bassi being Last Name
031 being Age
0123456789 being phone number
Now I may have scenario - when one customer may have 3 phone number, one may have only two, one may have one and one may have none. and data may come as mentioned in sample file.
Max number of columns - 6
Min number of columns - 3
and number of columns may be 3,4,5 or 6 and this is no delimiter variable length text file and need to read is through CFF stage.
I have used,
used .cfd metadata
file types(s) in file option-
Record type as variable length
data type as text, ascii
record delimiter as unix new line.
Now I may read records which are having all 6 columns, but if records having less columns.. CFF not able to read those records -
Can anyone provide solution.
My file looks something like this -
CUS_INFO
05 First_Name PIX(5).
05 Last_Name PIX(5).
05 Age PIX(5).
05 Phone_Grp Occurs 3 Times PIX(10).
Sample File -
SunilBassi031012345678901234567890123456789
SunilBassi03101234567890123456789
SunilBassi0310123456789
and record delimiter will be \n - next line character.
Sunil being First Name
Bassi being Last Name
031 being Age
0123456789 being phone number
Now I may have scenario - when one customer may have 3 phone number, one may have only two, one may have one and one may have none. and data may come as mentioned in sample file.
Max number of columns - 6
Min number of columns - 3
and number of columns may be 3,4,5 or 6 and this is no delimiter variable length text file and need to read is through CFF stage.
I have used,
used .cfd metadata
file types(s) in file option-
Record type as variable length
data type as text, ascii
record delimiter as unix new line.
Now I may read records which are having all 6 columns, but if records having less columns.. CFF not able to read those records -
Can anyone provide solution.
Gaurav Martha
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
I assume that PIX(5) actually means PIC X(5).
Because there is no OCCURS DEPENDING ON clause Phone_Grp will always occur exactly three times. Presumably the unused fields are blank-filled. Since the data type is string, DataStage will happily read those strings; you can filter them out (replace with null) subsequently.
Because there is no OCCURS DEPENDING ON clause Phone_Grp will always occur exactly three times. Presumably the unused fields are blank-filled. Since the data type is string, DataStage will happily read those strings; you can filter them out (replace with null) subsequently.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Re: Complex Flat File stage with Variable Length rows
can be use Trim(),Ltrim() and Rtrim() fuctions within the transformer
for example
first count the total length of character is 18
source -> transformer -> target
SunilBassi03101234 Ltrim(18,13) Sunil
Here total length is 18 but we need first 5 character only use that function and if else statement also
for example
first count the total length of character is 18
source -> transformer -> target
SunilBassi03101234 Ltrim(18,13) Sunil
Here total length is 18 but we need first 5 character only use that function and if else statement also
mgplk
Re: Complex Flat File stage with Variable Length rows
Gopal,123gopal wrote:can be use Trim(),Ltrim() and Rtrim() fuctions within the transformer
for example
first count the total length of character is 18
source -> transformer -> target
SunilBassi03101234 Ltrim(18,13) Sunil
Here total length is 18 but we need first 5 character only use that function and if else statement also
This will work though, I can't use Trim or substring function in Transformer stage as this will work for less number of columns but I have a scenario where I have 900 columns, and this approach will cause maintainability issue, That's why I want to do that in CFF stage, in case you could help me.
Gaurav Martha
Yes your understanding is correct this is PIC X(5), Phone_Grp will occur 3 times and if field is unused source is not sending spaces for that, means less number of columns for that row, though I am not able to see the contents in here.ray.wurlod wrote:I assume that PIX(5) actually means PIC X(5).
Because there is no OCCURS DEPENDING ON clause Phone_Grp will always occur exactly three times. Presumably the unused fields are blank-filled. Since ...
Gaurav Martha
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
I tried to explore Multiple record type option, but as it requires Record ID as "indicator" field for ALL TYPES of records, I checked with Source team and they are not able to provide any indicator field in extracts.
The CFF stage does a wonderful job of separating out multiple record formats, but only when there is an "indicator" field in ALL the records (and in the same spot) to indicate record type.
Does anyone have idea - if Parsing of records is possible in CFF stage without having any record type indicator, that means only one output link can have all record types?
The CFF stage does a wonderful job of separating out multiple record formats, but only when there is an "indicator" field in ALL the records (and in the same spot) to indicate record type.
Does anyone have idea - if Parsing of records is possible in CFF stage without having any record type indicator, that means only one output link can have all record types?
Gaurav Martha
This is the critical obstacle for you. If your source team cannot provide you with consistently formatted data, you cannot use the convenient fucntionality provided by certain stages, particularly CFF.Gen1715 wrote:I tried to explore Multiple record type option, but as it requires Record ID as "indicator" field for ALL TYPES of records, I checked with Source team and they are not able to provide any indicator field in extracts.
The short answer for you here: You must analyze your data sources and come up with your own reliable method for identifying record types. Failing that, were I in your shoes, I'd go back to the source team and suggest that their design is defective (being polite about it, of course) and get management to spend the money to fix it.
If you find such a method, transformers with multiple links and a constraint on each link for the identified type is the simplest approach. I don't have CFF available (internal issue), and I use the record type to find the correct link.
In the meantime, you are faced with coding a job for each distinctive record type, and only reading those records in that job.
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson
Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson
Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Why are you even using CFF stage for this? A COBOL FD such as you have given specifies fixed-width so, unless there's a good way of handling missing columns, you have a problem.
I'd be using a server Sequential File stage (in a server Shared Container if it must be in a parallel job) which has the best facilities for handling missing columns. You can use the same table definition (imported from FD).
I'd be using a server Sequential File stage (in a server Shared Container if it must be in a parallel job) which has the best facilities for handling missing columns. You can use the same table definition (imported from FD).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Do not send private emails for responses to DSXchange questions unless you are proposing a commercial arrangement.
Especially don't send five identical emails.
Especially don't send five identical emails.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Thanks for your response FranklinE ,
Actually I do have a method to read this file. What I can do is - Use an UNIX script and convert this No delimiter variable length text file into a delimited file, and then can deal with delimited file in DS very easily. This approach is working beautifully.
But my architect is more interested in using DataStage tool only for reading this No delimiter variable length text file, I am open for any suggestion within scope of DataStage(Not resctricted to CFF stage).
Actually I do have a method to read this file. What I can do is - Use an UNIX script and convert this No delimiter variable length text file into a delimited file, and then can deal with delimited file in DS very easily. This approach is working beautifully.
But my architect is more interested in using DataStage tool only for reading this No delimiter variable length text file, I am open for any suggestion within scope of DataStage(Not resctricted to CFF stage).
Gaurav Martha
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Server Sequential File stage can do this easily. You can use this stage in a server Shared Container in a parallel job if that's what you really need to do, but maybe try it first in a server job. Scroll right in the Columns grid to find the "missing columns" rules.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.