Need help to generate schema files dynamically for txt files
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 6
- Joined: Sun May 31, 2015 12:04 am
Need help to generate schema files dynamically for txt files
Hi, I have a requirement like there will be set of input .txt files with same set of columns but columns order will be different in each input .txt file. I need to create one reusable job to convert the columns order into one particular column order for all the input .txt files.
For Ex:
I/P:
1st Input .txt file Columns order is A B C D E F G H
2nd Input .txt file Columns Order is A D C H G B F E
3rd Input .txt file Columns Order is B G E H A C F D
....
Nth Input .txt file Columns Order is E H D A C F B G (in some order)
O/P:
1st Input .txt file Columns order is A B C D E F G H
2nd Input .txt file Columns Order is A B C D E F G H
3rd Input .txt file Columns Order is A B C D E F G H
....
Nth Input .txt file Columns Order is A B C D E F G H
Can someone pls help me on this requirement?
Thanks in advance!
For Ex:
I/P:
1st Input .txt file Columns order is A B C D E F G H
2nd Input .txt file Columns Order is A D C H G B F E
3rd Input .txt file Columns Order is B G E H A C F D
....
Nth Input .txt file Columns Order is E H D A C F B G (in some order)
O/P:
1st Input .txt file Columns order is A B C D E F G H
2nd Input .txt file Columns Order is A B C D E F G H
3rd Input .txt file Columns Order is A B C D E F G H
....
Nth Input .txt file Columns Order is A B C D E F G H
Can someone pls help me on this requirement?
Thanks in advance!
So... I'm wondering how dynamic this all is. How will you know which files have what column layout? It is simply numbered as you've shown (the first is always X, the second always Y, etc. whatever first and second mean at any given time) or is there a naming standard that tells you? Do you need to discover it individually for each file? Something completely different?
Let's start there.
Let's start there.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Do the files have column headings? (If yes, that will make the task a lot easier).
That said, I don't believe there's going to be an easy "one job" solution. You will almost certainly be up for doing some coding, either a routine or a Build stage.
That said, I don't believe there's going to be an easy "one job" solution. You will almost certainly be up for doing some coding, either a routine or a Build stage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Premium Member
- Posts: 6
- Joined: Sun May 31, 2015 12:04 am
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
My solution would use two jobs - one to re-order the columns into the standard format, and the other to process that file.
The first job would read the file as a single VarChar column, including reading the heading row as data (do not check the "first line is column headings check box), and store the column headings and positions from row 0 into stage variables. Its output would use the positional information and the Field() function to direct columns appropriately onto the output.
The first job would read the file as a single VarChar column, including reading the heading row as data (do not check the "first line is column headings check box), and store the column headings and positions from row 0 into stage variables. Its output would use the positional information and the Field() function to direct columns appropriately onto the output.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 284
- Joined: Fri Oct 13, 2006 4:31 am
Read the file with a single column and then have a column import stage. In column import select option for schema file and have a parameter for the same.
In this way you will have a single multi instance job but you will have 'n' schema file for n files. When you invoke the job just pass the parameter for the specific schema file. You can re-use the job
In this way you will have a single multi instance job but you will have 'n' schema file for n files. When you invoke the job just pass the parameter for the specific schema file. You can re-use the job
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Premium Member
- Posts: 483
- Joined: Thu Jun 12, 2003 4:47 pm
- Location: St. Louis, Missouri USA
Another solution to try would be to use the TEXT file ODBC driver and the ODBC stage. As long as you have column headers that are always the same, even if the columns aren't in the same order it should work just by doing "select a,b,c,d,e,f,g,h from text file name". I'm not sure of the exact syntax of the select for text files, but some google research should give you what you need.
I hope this helps,
Tony
I hope this helps,
Tony
-
- Premium Member
- Posts: 6
- Joined: Sun May 31, 2015 12:04 am
Hi Abhilash, I tried your solution. If the input file column header order and the schema file column order are same then only its processing the input file and loading the data into output file.
Could you please help me on the details how i can reorder the columns with this approach?
Thanks in advance!
John
Could you please help me on the details how i can reorder the columns with this approach?
Thanks in advance!
John
-
- Premium Member
- Posts: 6
- Joined: Sun May 31, 2015 12:04 am
Hi Ray,
I tried your solution. i was able to read the input file as Single Varchar column and able to store the column heading and positions into stage variables.
After that i stuck to proceed further (on how to iterate through all the rows and direct column appropriately onto to the output using Field() function)....
could you please give some more insight on this?
Thanks in advance!
John
I tried your solution. i was able to read the input file as Single Varchar column and able to store the column heading and positions into stage variables.
After that i stuck to proceed further (on how to iterate through all the rows and direct column appropriately onto to the output using Field() function)....
could you please give some more insight on this?
Thanks in advance!
John
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Basically it's a whole heap of If..Then..Else expressions that test the field number associated with the specific column heading and, where appropriate, use the Field() function to extract that particular substring out of the single VarChar input column.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Premium Member
- Posts: 6
- Joined: Sun May 31, 2015 12:04 am
John, a couple of things.
Use the Reply to topic link rather than quoting everything from the post you are replying to using Reply with quote unless it makes sense to do so. Also it doesn't make a whole lot of sense to use BBCode tags like for quoting, bold, italics and other things and then check the option to "Disable BBCode in this post"... pretty much defeats the purpose.
Use the Reply to topic link rather than quoting everything from the post you are replying to using Reply with quote unless it makes sense to do so. Also it doesn't make a whole lot of sense to use BBCode tags like for quoting, bold, italics and other things and then check the option to "Disable BBCode in this post"... pretty much defeats the purpose.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers