Skipping Rows in a SQL Server Bulk Load

Post questions here related to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

JDionne
Participant
Posts: 342
Joined: Wed Aug 27, 2003 1:06 pm

Skipping Rows in a SQL Server Bulk Load

Post by JDionne »

I am trying to load another department's table through a SQL Server bulk load stage. The department has added columns to the table that I don't need to populate, but there is one new column added at the end that I do need to load. If I don't add the department's columns to the transform stage, the job loads my column into the next available column in the table, not the last column where it belongs. If I add all the columns to the transform, the formulas in the table's columns cause my DS job to abort. I need to know why DS loads data into the wrong column even when I have mapped the columns correctly.
Thanx
Jim
Sure I need help....But who dosent?
shawn_ramsey
Participant
Posts: 145
Joined: Fri May 02, 2003 9:59 am
Location: Seattle, Washington. USA

Re: Skipping Rows in a SQL Server Bulk Load

Post by shawn_ramsey »

You have two options, one of which applies if it is a flat file that you are sourcing from.
:arrow: If you are pulling directly from a table, then don't include those columns in the SELECT in the source stage.
:arrow: The other option is to add the columns to the source side of the transform and not map them to anything on the output side.
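The first option boils down to never letting the unwanted columns into the job at all. A minimal sketch of the idea, using Python's sqlite3 as a stand-in for the real database (the table and column names here are invented for illustration, not taken from the job in question):

```python
import sqlite3

# Stand-in source database; names are illustrative only.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE src (keep1 TEXT, skip_me TEXT, keep2 TEXT)")
con.execute("INSERT INTO src VALUES ('a', 'x', 'b')")

# Option 1: exclude the unwanted column in the SELECT itself,
# so it never enters the job's data stream.
rows = con.execute("SELECT keep1, keep2 FROM src").fetchall()
print(rows)  # only the two needed columns come through
```

The second option achieves the same end one stage later: the column arrives on the input link but is simply never mapped to an output column.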
Shawn Ramsey

"It is a mistake to think you can solve any major problems just with potatoes."
-- Douglas Adams
JDionne
Participant
Posts: 342
Joined: Wed Aug 27, 2003 1:06 pm

Re: Skipping Rows in a SQL Server Bulk Load

Post by JDionne »

shawn_ramsey wrote:You have two options, one of which applies if it is a flat file that you are sourcing from.
:arrow: If you are pulling directly from a table, then don't include those columns in the SELECT in the source stage.
:arrow: The other option is to add the columns to the source side of the transform and not map them to anything on the output side.
I am pulling from a sequential file, but if I add the columns to the sequential file definition and they aren't actually in the file, won't that mess up the load on the other side?
Jim
Sure I need help....But who dosent?
shawn_ramsey
Participant
Posts: 145
Joined: Fri May 02, 2003 9:59 am
Location: Seattle, Washington. USA

Re: Skipping Rows in a SQL Server Bulk Load

Post by shawn_ramsey »

No, they won't. Just because a column comes into a transform does not mean that it has to go out of the transform. Do you even have a transform stage in the job? If you do not have a transform stage in the middle, then you should.
Shawn Ramsey

"It is a mistake to think you can solve any major problems just with potatoes."
-- Douglas Adams
JDionne
Participant
Posts: 342
Joined: Wed Aug 27, 2003 1:06 pm

Re: Skipping Rows in a SQL Server Bulk Load

Post by JDionne »

shawn_ramsey wrote:No, they won't. Just because a column comes into a transform does not mean that it has to go out of the transform. Do you even have a transform stage in the job? If you do not have a transform stage in the middle, then you should.
Yes, I have a transform stage; that wasn't my question. If I change the metadata on the flat file, won't that throw off the columns, claiming there are columns that aren't really there? In turn, won't that throw off the last column in the flat file and give me nulls?
Jim
Sure I need help....But who dosent?
mhester
Participant
Posts: 622
Joined: Tue Mar 04, 2003 5:26 am
Location: Phoenix, AZ
Contact:

Post by mhester »

Jim,

What Shawn is saying is that if all things are correct (metadata etc.) you do not need to pass the unnecessary columns to the output stage. The bottom line is that the metadata in the job needs to match the data in the source file. If these do not match, then you will have issues trying to stream that data. How you define the metadata on the outbound link is your business, and some, none, or all of the inbound columns can be moved to the outbound stage.

Also, I believe that if the columns in your SQL Server table are nullable then they need not be included in the SQL statement. I know it works this way with Oracle, but not sure about SQL Server - Shawn would know this better than I.
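To illustrate the nullable-column point (a minimal sketch using sqlite3; whether SQL Server behaves the same way is exactly the open question above, and the table and column names are invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# c4 is nullable (no NOT NULL constraint); names are illustrative only.
con.execute("CREATE TABLE tgt (c1 TEXT, c2 TEXT, c3 TEXT, c4 TEXT)")

# Insert by column name, omitting the nullable c4 entirely.
con.execute("INSERT INTO tgt (c1, c2, c3) VALUES ('a', 'b', 'c')")
row = con.execute("SELECT c1, c2, c3, c4 FROM tgt").fetchone()
print(row)  # the omitted c4 comes back as NULL (None in Python)
```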

Hope this helps,

Michael Hester
JDionne
Participant
Posts: 342
Joined: Wed Aug 27, 2003 1:06 pm

Post by JDionne »

mhester wrote:Jim,

What Shawn is saying is that if all things are correct (metadata etc...) you do not need to pass the unnecessary columns to the output stage. The bottom line is that the metadata in the job needs to match the data in the source file. If these do not match then you will have issues trying to stream that data. How you define the metadata on the outbound link is your business and some, none or all of the inbound columns can be moved to the outbound stage.

Also, I believe that if the columns in your SQL Server table are nullable then they need not be included in the SQL statement. I know it works this way with Oracle, but not sure about SQL Server - Shawn would know this better than I.

Hope this helps,

Michael Hester

OK, my source does not have the columns. Are you saying that I should add them to the metadata? And if so, won't that mess up the input of the file? Also, I think you are right about adding the columns and making them null, except that the department I'm feeding has added formulas to the columns that make a null value impossible. The only other thing I can think of is to fill in the formulas on my side.
Jim
Sure I need help....But who dosent?
mhester
Participant
Posts: 622
Joined: Tue Mar 04, 2003 5:26 am
Location: Phoenix, AZ
Contact:

Post by mhester »

OK, my source does not have the columns. Are you saying that I should add them to the metadata? And if so, won't that mess up the input of the file?
1) No. 2) Yes. If they are not in the source data, then adding columns to the stage will cause issues. If the source file is independent of the target table (not a 1-1 match), then you will need to derive (with formulas etc.) what these "extra" columns should be on the output link to the table.

If you have 3 columns in your source file -

c1, c2, c3

and you have a requirement for 4 columns in the target table

c1, c2, c3, c4

Then you must at some point derive what the value of c4 should be (if it is required).

Remember that relational tables do not care about positions, only the column name, and if you don't need a column and it is nullable, then you don't need to derive it or pass it to the output link.
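The c1-through-c4 scenario above can be sketched as follows (again in sqlite3 as a stand-in; the derivation rule for c4 is invented purely for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tgt (c1 TEXT, c2 TEXT, c3 TEXT, c4 TEXT)")

# Three columns come from the source file...
source_row = ("a", "b", "c")
# ...plus a derived value for the fourth target column.
c4 = "-".join(source_row)

# Columns are matched by name, not position, so the order listed in the
# INSERT does not have to match the table definition.
con.execute(
    "INSERT INTO tgt (c4, c1, c2, c3) VALUES (?, ?, ?, ?)",
    (c4, *source_row),
)
result = con.execute("SELECT c1, c2, c3, c4 FROM tgt").fetchone()
print(result)
```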

If the "department" has some business rules surrounding what are acceptable values for these columns then you could certainly implement these as part of your process.

Forgive me if I'm not on the right track here and have "confused" the issue more than required or have misunderstood your post.

Regards,

Michael Hester