Seq File Performance
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 258
- Joined: Tue Jul 04, 2006 10:35 pm
- Location: Toronto
Hi All,
The job reads a fixed-width sequential file of around 150 GB.
It runs for around 2 hours.
I have tried multiple nodes and multiple readers. It doesn't seem to help.
Is there anything else I can do to improve the performance?
Regards,
Samyam
-
- Premium Member
- Posts: 258
- Joined: Tue Jul 04, 2006 10:35 pm
- Location: Toronto
After the read, the job is just doing a column import and writing into a dataset.
In the log as well, the Sequential File stage takes about 2 hours to complete and read the whole file.
After another 10 minutes the column import finishes and the job completes.
I have also tried reading the same file with just a Peek stage after the Sequential File stage. It takes 2 hours.
Cheers,
Samyam
-
- Premium Member
- Posts: 258
- Joined: Tue Jul 04, 2006 10:35 pm
- Location: Toronto
Make a copy of your job that just reads the file and puts it into a PEEK stage and see what the speed is. That will help you narrow down the potential problem to the sequential read itself if the speed remains slow in this test job.
-
- Premium Member
- Posts: 258
- Joined: Tue Jul 04, 2006 10:35 pm
- Location: Toronto
Thanks for your suggestions.
qt_ky,
I have a local server admin. What should I be asking him to look for?
and ArndW,
The test job has only the Sequential File stage and a Peek stage. It starts very fast, but it slows down within 5 minutes and still takes 2 hours.
Is there anything else I can try out?
I am also planning to split the file into smaller chunks of 40 GB and read them in parallel.
Cheers,
Samyam
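For the plan to split the file into chunks, the one thing to get right with a fixed-width file is that every chunk starts and ends on a record boundary. The sketch below is only an illustration: `record_aligned_ranges`, `record_length`, and `parts` are hypothetical names, not DataStage settings, and `record_length` is assumed to include any line terminator the file carries.

```python
def record_aligned_ranges(file_size, record_length, parts):
    """Split [0, file_size) into byte ranges that each hold whole records.

    Returns a list of (start, end) byte offsets, end exclusive.
    All names here are illustrative assumptions, not DataStage options.
    """
    if record_length <= 0 or parts <= 0:
        raise ValueError("record_length and parts must be positive")

    total_records, remainder = divmod(file_size, record_length)
    if remainder:
        raise ValueError("file size is not a whole number of records")

    # Spread the records as evenly as possible over the requested parts.
    base, extra = divmod(total_records, parts)

    ranges = []
    start_record = 0
    for i in range(parts):
        count = base + (1 if i < extra else 0)
        if count == 0:
            continue  # more parts than records: skip empty ranges
        start = start_record * record_length
        end = (start_record + count) * record_length
        ranges.append((start, end))
        start_record += count
    return ranges
```

Splitting like this can only help if the storage is able to deliver more throughput to several readers than to one. If a plain read of the file is already as slow as the job, the chunks will most likely share the same limit.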
Does a "cat <file> > /dev/null" go any faster?
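If a number is more useful than a stopwatch on the `cat` test, the same raw read can be timed from a small script. This is a rough sketch, not part of DataStage: `read_throughput_mb_s` and the 8 MB `block_size` are assumptions chosen for illustration. It reads the file sequentially, throws the data away, and reports MB/s.

```python
import time


def read_throughput_mb_s(path, block_size=8 * 1024 * 1024):
    """Read a file sequentially, discard the data, and time it.

    Returns (total_bytes, elapsed_seconds, megabytes_per_second).
    The function name and default block size are illustrative assumptions.
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 gives an unbuffered binary file, so each read()
    # goes straight to the operating system.
    with open(path, "rb", buffering=0) as f:
        while True:
            block = f.read(block_size)
            if not block:
                break  # end of file
            total += len(block)
    elapsed = time.perf_counter() - start
    mb_per_s = (total / 1e6) / elapsed if elapsed > 0 else float("inf")
    return total, elapsed, mb_per_s
```

Call it with the path of the input file and compare the MB/s figure with what the job achieves. A second run over the same file may be served partly from the operating system's cache, so the first run is the one that reflects the storage.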
-
- Premium Member
- Posts: 258
- Joined: Tue Jul 04, 2006 10:35 pm
- Location: Toronto
I tried to read 4 files of 40 GB instead of one 160 GB file.
Same result.
Running cat on the 160 GB file gives the same result, maybe 10 minutes faster.
I am not sure what to do.
What is a good read time for a 160 GB file?
Last edited by samyamkrishna on Mon Dec 07, 2015 12:16 pm, edited 1 time in total.
Cheers,
Samyam
If the "cat" took almost as long as the DataStage read, then the problem isn't in DataStage and nothing you will do there will significantly increase your speed.
Is there SAN involved? What filesystem is used? Does the speed change if you copy the files to another partition (e.g. /tmp) and try to process them from there?
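To put the numbers from this thread in perspective, it helps to turn "160 GB in 2 hours" into a throughput figure that can be compared against whatever the storage is supposed to deliver. The sketch below is plain arithmetic using decimal units (1 GB = 1000 MB). The 100 MB/s rate in the second call is purely a hypothetical comparison value, not a measurement from this system.

```python
def implied_throughput_mb_s(size_gb, hours):
    """Average read rate in MB/s for a file of size_gb read in the given hours."""
    return size_gb * 1000 / (hours * 3600)


def hours_at_rate(size_gb, mb_per_s):
    """Hours needed to read size_gb at a sustained rate of mb_per_s."""
    return size_gb * 1000 / mb_per_s / 3600


# Figures reported in the thread: 160 GB read in about 2 hours.
observed = implied_throughput_mb_s(160, 2)   # about 22.2 MB/s

# Hypothetical comparison: the same file at a sustained 100 MB/s.
what_if = hours_at_rate(160, 100)            # about 0.44 hours
```

Whether roughly 22 MB/s is reasonable depends entirely on the disks, the SAN, and the filesystem behind the file, which is why the questions above matter.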
Ask your local admin to monitor server resources and performance during your tests, help identify the bottleneck, and see if anything may be changed to improve it.
Is it a delimited file or fixed width? How many columns? How many records?
Choose a job you love, and you will never have to work a day in your life. - Confucius