Hi Gurus,
I need some advice on performance tuning in datastage Parallel extender.
This is the scenario.
1. I am using teradata enterprise stage to read a table and do some processing using lookup stage, sort stage and aggregator stage and finally loading 30 million records into the teradata table using teradata multiload stage. (runtime is 40 min)
2. To reduce run time, I just tried to do the sorting and aggregation in teradata enterprise stage itself and then do the same lookups, then loading into teradata table using teradata multiload stage - 30 million records. (runtime is 39 min).
My assumption was the scenario 2 should run very fast when compared to scenario 1. But it was not. Why is that ? Can some one give an explanation about this ? Is my assumption wrong? ( actually I expected the scenario 2 will run in 20 or 25 min, since it avoids sort and aggregate stage in datastage)
Thanks in advance
Performance tuning
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Charter Member
- Posts: 560
- Joined: Wed Jul 13, 2005 5:36 am
- Location: Ohio
Thats time is for extraction only or for extrction + sorting + aggregations? If its just extractioin then there isnt much you can do. Execpt extract in chunks using the Multiple instances, load it to sepererate file, do a cat and sort and then in your second job, do the aggregations and load.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
Is there are a reason why you are using the multiload stage vs. the enterprise stage to load the target table? We typically have seen better load performance using enterprise.
Also, have your DBA check out the queries you are using to pull the data (both the straight read and the one where the sorting and aggregation are handled within the query). Make sure that is as tuned as possible.
How many Teradata nodes? How many DataStage nodes are defined in your config file? What values are you setting for the requestedsessions and sessionsperplayer options?
Brad.
Also, have your DBA check out the queries you are using to pull the data (both the straight read and the one where the sorting and aggregation are handled within the query). Make sure that is as tuned as possible.
How many Teradata nodes? How many DataStage nodes are defined in your config file? What values are you setting for the requestedsessions and sessionsperplayer options?
Brad.