Hi All,
We are planning to use ODBC connector stage for extraction and loading of the data from vertica database. We are using DS version 8.7.
My concern is that we will be extracting huge volume of data ~270 Million rows with around 30-40 columns, so performance of jobs would take a lot of hit. Can anyone suggest, how can improve the performance of the jobs for such a volume of data?
Performance Impact on using ODBC to extract and load
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 50
- Joined: Tue Jan 19, 2010 4:14 am
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Never heard of Vertica database.
On average, ODBC is a turtle, but if you parallelize the load, you can have a hurd of turtles moving your data.
Some large volume loads could be sped up if you FTP a load ready file to the target system and use existing bulk load tools native to that DBMS platform. It will complicate your job, but "might" improve the speed.
On average, ODBC is a turtle, but if you parallelize the load, you can have a hurd of turtles moving your data.
Some large volume loads could be sped up if you FTP a load ready file to the target system and use existing bulk load tools native to that DBMS platform. It will complicate your job, but "might" improve the speed.
-
- Participant
- Posts: 437
- Joined: Fri Oct 21, 2005 10:00 pm
Vertica is a columnar database which lends itself well to performing analytics, but its strength is not in loading and extracting large volumes of data. The database is going to work hard at reconstructing the rows that it has stored in columnar format (each unique value only stored one time physically with pointers being utilized to represent rows). With that in mind, I believe that the database is going to be your bottleneck. If I were you I would parallelize my query by using the ODBC Connector's built in partition read method.
Have fun experimenting with speed. Try the extracts without partition reads and then with the partition reads in place using modulus and minimum/maximum range.
Have fun experimenting with speed. Try the extracts without partition reads and then with the partition reads in place using modulus and minimum/maximum range.
Keith Williams
keith@peacefieldinc.com
keith@peacefieldinc.com