Extract and Load many tables - Short and fast
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 9
- Joined: Wed Nov 14, 2007 11:00 am
- Location: Montreal
Extract and Load many tables - Short and fast
Hi everyone. Hope you are doing well.
I am on a migration project at the moment. I have to ETL legacy application data into the new applications' tables, but I also have to move some legacy tables with no modification, for backward compatibility.
There aren't that many tables, but I was wondering if there is a better way to do this. Right now, I have 3 simple jobs moving data from DB A to DB B. These jobs each contain something like 10 copies of ODBC stage (in) -> ODBC stage (out) with runtime column propagation.
It works, but even with parameter sets, it's tedious and ugly. Using SSIS, I know I can create a loop (reading a file for table names) and automatically pass the table names to the DB stages. It is also easy to create SQL scripts to do that kind of stuff... but in a perfect world, when using a centralized ETL tool, you'd like to use that centralized tool to control the transfers and have a centralized view of them as well.
What would you do in a situation like that? Any tricks? What do you think?
Thanks!
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
You can put the loop in a sequence, perhaps driving the loop from a query against the source system's system tables to derive a list of table names to be transferred.
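The control pattern Ray describes can be sketched outside DataStage as well. The sketch below (plain Python, not DataStage code) mimics it under stated assumptions: the catalog query, the row format, and the `transfer` step are all hypothetical stand-ins for the Execute Command activity and the parameterised job.

```python
# Sketch of the loop-driven transfer pattern (an illustration, not DataStage
# code). The query text, row format, and transfer() body are assumptions.

def tables_to_transfer(query_rows):
    """Mimic an Execute Command activity's output: one table name per row,
    with blank rows and surrounding whitespace discarded."""
    return [row.strip() for row in query_rows if row.strip()]

def transfer(table_name):
    # Placeholder for invoking the parameterised extract/load job.
    return f"transferred {table_name}"

# e.g. rows returned by a query against the source system's catalog, such as
# SELECT table_name FROM information_schema.tables WHERE ... (hypothetical)
rows = ["CUSTOMERS\n", "ORDERS\n", "\n"]
results = [transfer(t) for t in tables_to_transfer(rows)]
```

The point is that one generic, parameterised job plus a loop replaces the ten hand-wired stage pairs.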
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 9
- Joined: Wed Nov 14, 2007 11:00 am
- Location: Montreal
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Code: Select all
Execute Command ---> User Variables ---> Start Loop <------- End Loop
                                             |                  ^
                                             |                  |
                                             +------> Job ------+
In the User Variables activity convert any @FM (line terminators) in the command output into some useful delimiter character that does not occur in the data. Also remove any empty final line (for example by taking all but the final character from the string).
Use this delimiter in the Start Loop activity and pass the user variable as the "list of things".
Pass the Counter activity variable from the Start Loop activity to the Job activity.
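The User Variables transformation above can be illustrated outside DataStage. This is a minimal Python sketch (not BASIC), assuming the command output uses @FM (hex 0xFE) as its line terminator and that a comma never occurs in the table names:

```python
# Sketch of the User Variables step: convert field marks to a delimiter and
# drop the empty trailing entry left by a final terminator.
FM = "\xfe"  # DataStage field mark (@FM), hex 0xFE

def command_output_to_list(output, delimiter=","):
    joined = output.replace(FM, delimiter)
    if joined.endswith(delimiter):
        # "take all but the final character from the string"
        joined = joined[:-1]
    return joined.split(delimiter)

tables = command_output_to_list("TBL_A\xfeTBL_B\xfeTBL_C\xfe")
```

The resulting comma-delimited string is what the Start Loop activity would iterate over, one table name per pass.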
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 9
- Joined: Wed Nov 14, 2007 11:00 am
- Location: Montreal
ray.wurlod wrote:
Code: Select all
Execute Command ---> User Variables ---> Start Loop <------- End Loop ...
I assume you mean an Execute Command activity for the SQL query, with the resulting list filled into a user variable to drive the loop. That's what I've done now. I still have that annoying Controller problem: Error calling DSSetParam(currentLoopValue), code=-4 [ParamValue/Limitvalue is not appropriate]. But I feel I am getting near the answer. I am debugging... I'll keep you posted.
-
- Participant
- Posts: 9
- Joined: Wed Nov 14, 2007 11:00 am
- Location: Montreal
I am putting in as much detail as I can in case someone can use this eventually... Sorry for my English, it is not my first language. Feel free to contact me if you need more details. Anyway...
I had to parse the string to get rid of the code=-4 [ParamValue/Limitvalue is not appropriate] error. There were some garbage characters in the query result from the temp table containing the table names. I did that using
field(convert(@FM,',', Execute_Command_0.$CommandOutput),',',1)
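For readers unfamiliar with DataStage BASIC, here is a rough Python equivalent of that expression (an illustration only): Convert(@FM, ",", s) swaps field marks for commas, and Field(s, ",", 1) then takes the first comma-delimited field.

```python
# Python analogue of field(convert(@FM, ',', output), ',', 1) -- a sketch,
# assuming @FM (hex 0xFE) separates the values in the command output.
FM = "\xfe"

def first_field(command_output):
    # Convert field marks to commas, then take the first field.
    return command_output.replace(FM, ",").split(",")[0]
```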
But in the end, to achieve the purpose of that job, I used a sequential file, removing all delimiters except the end-of-record delimiter and setting that to a comma. That created a file with a single comma-delimited row. Then I used an Execute Command activity to cat C:\myfile.txt and shipped the output directly (no bad-character parsing needed) to the Start Loop activity. Then I used the counter variable to fill a job parameter of the job inside the loop.
Everything works fine... well, nearly. I had to change the APT_Transport_Block_size since the legacy app had crazy tables with columns like varchar(8000).
Thanks for your help everyone.
Ummm... not "garbage" characters but field marks, as in the @FM character you used Field() to strip from the strings. It's important to understand both what they are and why they are there: all of those @ "marks" are some of the fundamental underpinnings of DataStage.
-craig
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
When DataStage ingests a string containing line terminator characters it converts them into field marks (this allows DataStage to be operating system agnostic). The field mark character is available through the @FM system variable - its value in hex is 0xFE.
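That conversion can be pictured with a small sketch (plain Python, illustrative only): each operating-system line terminator in the ingested text becomes a field mark internally.

```python
# Illustration of the ingest behaviour described above (a sketch, not the
# actual DataStage implementation): line terminators become field marks.
FM = "\xfe"  # @FM, hex 0xFE

def ingest(text):
    # Normalise Windows (CRLF) and Unix (LF) line endings, then map each
    # line terminator to a field mark.
    return text.replace("\r\n", "\n").replace("\n", FM)

internal = ingest("line1\nline2\r\nline3")
```

This is why the command output in the sequence above arrives @FM-delimited rather than newline-delimited.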
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.