DB2 Load against Partitioned Tables

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

dcguuenther
Participant
Posts: 13
Joined: Wed Feb 08, 2006 3:06 pm
Location: Chicago

DB2 Load against Partitioned Tables

Post by dcguuenther »

Hello,

I am so frustrated. How the heck does the DB2 load work against partitioned tables? Here is my deal.

When I load a non-partitioned table and specify non-recoverable, everything is fine and I can immediately select, update, insert, and delete from the table.

When I load a partitioned table and specify non-recoverable, the load succeeds, but the tablespace gets put in backup pending and I cannot update the table. It feels like non-recoverable is only being sent to one of the partitions. Any ideas?

What is the right way to use this load stage? I want to be able to do a bulk load against partitioned tables. I am fine losing any data that was in the process of being loaded, but I do not want to lose all the records that were in the table before the load started, which seems to be what is happening. I am hitting a DB2/6000 8.2.2 database.

I am using the Enterprise Edition DB2 UDB stage and specifying Load-append; Non-Recoverable is set to True.
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

What is the size of the data you are loading? Also, can you post the page size of the table?
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
dcguuenther
Participant
Posts: 13
Joined: Wed Feb 08, 2006 3:06 pm
Location: Chicago

Post by dcguuenther »

I am loading approx 15 - 25 million rows at a time.

Loading 18 columns: 9 integers, 2 CHAR(1)s, a timestamp, 5 DECIMAL(15,3)s, and a VARCHAR(20).
dcguuenther
Participant
Posts: 13
Joined: Wed Feb 08, 2006 3:06 pm
Location: Chicago

Post by dcguuenther »

All in all, I will be loading that volume at least 8 times.
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

You didn't post the page size of the tables.
As far as I remember, the command to see the tablespaces is

Code: Select all

LIST TABLESPACES SHOW DETAIL

This will give you the page size of each tablespace. Please post that.
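For example, from a shell session (just a sketch, assuming you can connect with the DB2 command line processor; replace mydb with your database name):

Code: Select all

db2 connect to mydb
db2 "LIST TABLESPACES SHOW DETAIL" | grep -i "page size"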
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
dcguuenther
Participant
Posts: 13
Joined: Wed Feb 08, 2006 3:06 pm
Location: Chicago

Post by dcguuenther »

Permission denied for me to do that, and I can't get hold of a DBA right now.
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

We ran into a similar issue. It is because your database has rollforward recovery turned on. We have that off in development and on in production.

Now, I am not a DBA, so I don't know if there are multiple ways of getting your tablespace out of 'backup pending' mode. However, the solution our DBAs recommended was doing a tablespace backup to /dev/null:

Code: Select all

db2 "BACKUP DATABASE ${mydbname} TABLESPACE ${mytbspacename} online to /dev/null"
Note: We are running DS 7.5.2 against DB2 V8.2.x 64-bit on AIX 5.2 64-bit. All code in this posting is based on Korn Shell.

You will need to run this on each DB2 partition. This can be accomplished pretty easily by looping through the entries in your db2nodes.cfg file (usually located in the database instance directory structure), repeating your DB2 command for each node:

The db2nodes.cfg file has the following format:

Code: Select all

db2partnum server serverpartition nodename
What you need are the first 2 fields, the db2 partition number and the server name. With this information, you can set the DB2NODE environment variable (with the partition number) and reference the necessary server name. If your system is only on one physical server, you just loop through the node numbers. If your system spans multiple physical servers, you'll need ssh or rsh to fire off db2 commands on each server/node combination:

Code: Select all

eval db2nodefile=~${DB2INSTANCE}/sqllib/db2nodes.cfg

cat ${db2nodefile} | while read node server filler
do
    # Note: mydbname could be replaced with DB2DBDFT if this is set
    db2cmd="BACKUP DATABASE ${mydbname} TABLESPACE ${mytbspacename} online to /dev/null"
    rsh ${server} -n env DB2INSTANCE=${DB2INSTANCE} \
                    DB2DBDFT=${mydbname} \
                    DB2NODE=${node} \
                    ~${DB2INSTANCE}/sqllib/bin/db2 "\"${db2cmd}\""
done
We created a function that contains the backup logic (called ld_backup) and another to wrap the remote DB2 command processing (called rdb2), and we call it after every DataStage load job (writes do not require this, if you didn't know that already). We currently have 48 nodes spanning 3 servers, so we set this up to do each node's backup in the background. The backup doesn't take long, especially writing to /dev/null, but the more nodes you have to back up, the longer it takes.

Here's a version of the previous script with background processing:

Code: Select all

function rdb2 {
    typeset host="$1"
    typeset node="$2"
    shift; shift

    rsh ${host} -n env DB2INSTANCE=${DB2INSTANCE} \
                    DB2DBDFT=${DB2DBDFT} \
                    DB2NODE=${node} \
                    ~${DB2INSTANCE}/sqllib/bin/db2 "\"$*\""

    return $?
}

function ld_backup {
    typeset Tablespace=$1
    typeset TempFile=/tmp/ld_backup.$$
    typeset command="BACKUP DATABASE ${DB2DBDFT} TABLESPACE ${Tablespace} online to /dev/null"

    (
        typeset PIDList=""
        eval db2nodefile=~${DB2INSTANCE}/sqllib/db2nodes.cfg

        cat ${db2nodefile} | while read node host filler
        do
            rdb2 ${host} ${node} ${command} > ${TempFile}.${host}.${node} 2>&1 &
            PIDList="${PIDList} $!"
        done

        # Wait for background processes to complete
        wait ${PIDList}
    ) 2>${TempFile}.err
    RC=$?
}

When the while loop completes, you can loop through the tempfiles to print error/status messages from the database.
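For example, a minimal sketch of that post-processing (assuming the per-node output files keep the ${TempFile}.host.node naming used in ld_backup above):

Code: Select all

# Print whatever DB2 wrote for each partition's backup
for outfile in ${TempFile}.*
do
    echo "---- ${outfile} ----"
    cat ${outfile}
done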

Another Note: The previous functions are adapted from our code and have not been tested as posted, but the basic functionality is there.

Since we don't have rollforward recovery set up on development, our version of this ld_backup function is set up not to fail there. DB2 prints the following error message to the log files (tempfile):
SQL2421N Table space level backup is not allowed because roll-forward recovery is not enabled.
So on development, if we get this message we count it as successful; that way our dev code and production code remain in sync and we don't have to maintain two separate versions of our code.
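A hypothetical sketch of that check (not our actual code; it greps each node's captured output for SQL2421N and ignores it, failing only on other SQL errors):

Code: Select all

# Treat SQL2421N (roll-forward recovery off) as success on development
status=0
for outfile in ${TempFile}.*
do
    if grep "SQL2421N" ${outfile} > /dev/null
    then
        continue          # expected on development - ignore
    fi
    if grep "SQL[0-9][0-9]*N" ${outfile} > /dev/null
    then
        echo "Backup failed on this partition, see ${outfile}"
        status=1
    fi
done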

Anyway, sorry this has gotten so long. Hope this helps.

Brad.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

I have caused the same problem before; when using normal inserts against a partitioned table everything went well, but I made the DBAs unhappy due to the backup pending state, where they had to restore a whole tablespace. I hate posting a "me too" reply because I cannot recall exactly what the cause was; I seem to remember it had to do with disabling constraints or indices and then having the process unable to re-enable them. There is another poster here who knows both the cause and the potential workaround(s), but I am afraid he won't be back until Tuesday, as Monday is a holiday here.

If you have access to support, they will be able to answer this for you, or perhaps someone else has a better memory on this subject than I do.