Creating control files for SQL Server BCP utility

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
crouse
Charter Member
Charter Member
Posts: 204
Joined: Sun Oct 05, 2003 12:59 pm
Contact:

Creating control files for SQL Server BCP utility

Post by crouse »

Anyone out there have an "automatic" way of creating a control file for use by the BCP (bulk insert) utility in SQL Server? These files are called "format files", and the only way I can find to create them, so far, is via the interactive bcp command line. This is pretty labor intensive if I have several (or a hundred) tables that I need to define format files for. My next step is to create a PERL script, or the like, to read a CREATE statement or DESCRIBE output and create a format file.

I just don't want to invent the wheel again if I don't need to.

The "MS SQL Server Load" stage in DS isn't as "nice" as the older ORACLE stage that creates the format/control file on the fly...

Thanks for any input,
Craig
Craig Rouse
Griffin Resouces, Inc
www.griffinresources.com
dhiraj
Participant
Posts: 68
Joined: Sat Dec 06, 2003 7:03 am

Post by dhiraj »

craig,

1) If you are bulkloading in to all the columns of the table and your file contains only the columns which are present in the table and your file is a delimited file and your file is a character file ( no packed decimal or binary data) then you do not require a format to bulk load using the stored procedure BuLKInsert provided by sql server.Refer swl server help for details

2) if you are using datastage to bulk insert, i think format file is not required. Haven't tried this though


HTH

Dhiraj
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

If you saved these into DS_METADATA then I would modify my routine KgdCreateDdlMetadata. It will create the create table sql for records in the DataStage metadata hash file DS_METADATA. It should be easy to change this to build what you want. If you post an example of the control file then I might do it for you.

KgdCreateDdlMetadata is posted on my web site and ADN. If you cannot find it then let me know.
Mamu Kim
crouse
Charter Member
Charter Member
Posts: 204
Joined: Sun Oct 05, 2003 12:59 pm
Contact:

Post by crouse »

Dhiraj,

Thanks, the helps some. And yes, the DS stage doesn't need the format file, but it's not very performant.

I have several jobs that I need to "retrofit" their transactional stages into a flat file to be used for bulk loading.

The positions of the columns in the resultant flat file may not match the ordinal position of the columns in the table, and/or may not contain all the columns in the table, so I'll need a format file. :cry:

I think I have a plan in mind to combine the columns from a table definition and the column names from the load file and create the format file on the fly.

Thanks for the help,
Craig
Craig Rouse
Griffin Resouces, Inc
www.griffinresources.com
crouse
Charter Member
Charter Member
Posts: 204
Joined: Sun Oct 05, 2003 12:59 pm
Contact:

Post by crouse »

Kim, here's a link for a good desciption of what's needed in the format file.

http://msdn.microsoft.com/library/defau ... p_9yat.asp

There are several rules to follow (i.e. prefix length).

If you come up with something, let me know.

Thanks,
Craig
Craig Rouse
Griffin Resouces, Inc
www.griffinresources.com
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

If you have one working please post it with code blocks around it so it does not lose anything. I do not mind helping you but you need to do the research. Post one and I can get that routine to do the rest. It should only take a few minutes.
Mamu Kim
crouse
Charter Member
Charter Member
Posts: 204
Joined: Sun Oct 05, 2003 12:59 pm
Contact:

Post by crouse »

Kim,

Here's an example:

8.0
3
1 SQLINT 0 4 "" 1 calendar_key ""
2 SQLDATETIME 1 8 "" 2 calendar_date ""
3 SQLCHAR 0 20 "" 3 month_name SQL_Latin1_General_CP1_CI_AS

And it should be tab seperated.

The referenced URL is applicable, since there are many rules to comply with in creating the format file. :(

-Craig
Craig Rouse
Griffin Resouces, Inc
www.griffinresources.com
shawn_ramsey
Participant
Posts: 145
Joined: Fri May 02, 2003 9:59 am
Location: Seattle, Washington. USA

Post by shawn_ramsey »

crouse wrote: Thanks, the helps some. And yes, the DS stage doesn't need the format file, but it's not very performant.
Craig,

I am assuming that you have done some performance testing with BCP vs. DS bulk load. Are you using the SQL Bulk Load stage or the BCP stage? We have had some issues with performance of the bulk loads but have never gone to the point of spinning off a flat file and BCPing it. We have been able to get reasonable performance with the SQL Bulk load stage by tweaking the parameters. One other suggestion that I have is to insert a IPC stage before the bulk loader in the job stream and test your load performance.
Shawn Ramsey

"It is a mistake to think you can solve any major problems just with potatoes."
-- Douglas Adams
crouse
Charter Member
Charter Member
Posts: 204
Joined: Sun Oct 05, 2003 12:59 pm
Contact:

Post by crouse »

Shawn,

I don't see a BCP stage in 7.1r1 for Windows (on the distribution CD). Only SQL Server bulk stage is "Microsoft SQL Server Bulk Loader".

Do you have both? On what platform and release?

-Craig
Craig Rouse
Griffin Resouces, Inc
www.griffinresources.com
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

Code: Select all

* -----------------------------------------------------------------
* KgdCreateBcpMetadata(DsMetadataId)
* Decription: Create Bcp file from records in DS_METADATA.
* Written by: Kim Duke
* Notes:
* -----------------------------------------------------------------
* $INCLUDE DSINCLUDE JOBCONTROL.H
      Ans = ''
      RoutineName = "KgdCreateBcpMetadata"
      Usage = RoutineName:'(DsMetadataId)'
* -----------------------------------------------------------------
* initialize variables
* -----------------------------------------------------------------
      True = 1
      False = 0
      Other = True
      Forever = True
      Today = oconv(@DATE, "D4-")
      ErrMsg = ''
      NullRec = ''
      CrLf = char(13):char(10)
      DsMetadataSep = '\'
      DebugMode = False
* -----------------------------------------------------------------
      FileName = 'DS_METADATA'
      open FileName to FilePtr else
         ErrMsg = 'Can not open ':FileName:' file.'
         goto ErrRtn
      end
      DsMetadataPtr = FilePtr
* -----------------------------------------------------------------
      GenRec = ''
      PKeys = ''
* -----------------------------------------------------------------
      read DsMetadataRec from DsMetadataPtr, DsMetadataId then
         Occurrence = 1
* -----------------------------------------------------------------
FindColumns:
         FindStr "CMetaColumn" In DsMetadataRec, Occurrence Setting AttrNo, MvNo, SvNo Then
* -----------------------------------------------------------------
* number of column definitions in collection
* -----------------------------------------------------------------
            NoOfFields = Field(DsMetadataRec<AttrNo>, "/", 2, 1)
            If NoOfFields = 0 Then
               * Stored procedures use the second columns collection in the record.
               * Go get the next one.
               Occurrence += 1
               GoTo FindColumns
            End
            TableName = field(DsMetadataId, DsMetadataSep, dcount(DsMetadataId, DsMetadataSep))
            GenRec<-1> = '8.0'
            GenRec<-1> = NoOfFields
            Indent = spaces(3*1)
            * NoOfFields = dcount(DsMetadataRec<AttrNo+1>,
            for MvNo=1 to NoOfFields
               ColumnName = DsMetadataRec<AttrNo+1, MvNo>
               if DebugMode then print 'ColumnName=':ColumnName
               ColumnDescription = DsMetadataRec<AttrNo+2, MvNo>
               DataElement = DsMetadataRec<AttrNo+3, MvNo>
               if DebugMode then print 'DataElement=':DataElement
               SQLDataTypeCode = DsMetadataRec<AttrNo+4, MvNo>
               if DebugMode then print 'SQLDataTypeCode=':SQLDataTypeCode
               Precision = DsMetadataRec<AttrNo+5, MvNo>
               Scale = DsMetadataRec<AttrNo+6, MvNo>
               IsNullable = DsMetadataRec<AttrNo+7, MvNo>
               if DebugMode then print 'IsNullable=':IsNullable
               KeyPosition = DsMetadataRec<AttrNo+8, MvNo>
               DisplayWidth = DsMetadataRec<AttrNo+9, MvNo>
               Begin case
                  case SQLDataTypeCode = -7
                     SQLType = "BIT"
                  case SQLDataTypeCode = -6
                     SQLType = "TINYINT"
                  case SQLDataTypeCode = -5
                     SQLType = "BIGINT"
                  case SQLDataTypeCode = -4
                     SQLType = "LONGVARBINARY"
                  case SQLDataTypeCode = -3
                     SQLType = "VARBINARY"
                  case SQLDataTypeCode = -2
                     SQLType = "BINARY"
                  case SQLDataTypeCode = -1
                     SQLType = "LONGVARCHAR"
                  case SQLDataTypeCode = 0
                     SQLType = "VARCHAR"
                  case SQLDataTypeCode = 1
                     SQLType = "CHAR"
                  case SQLDataTypeCode = 2
                     SQLType = "NUMERIC"
                  case SQLDataTypeCode = 3
                     SQLType = "NUMERIC"
                  case SQLDataTypeCode = 4
                     SQLType = "INTEGER"
                  case SQLDataTypeCode = 5
                     SQLType = "SMALLINT"
                  case SQLDataTypeCode = 6
                     SQLType = "FLOAT"
                  case SQLDataTypeCode = 7
                     SQLType = "REAL"
                  case SQLDataTypeCode = 8
                     SQLType = "DOUBLE"
                  case SQLDataTypeCode = 9
                     SQLType = "DATE"
                  case SQLDataTypeCode = 10
                     SQLType = "TIME"
                  case SQLDataTypeCode = 11
                     SQLType = "TIMESTAMP"
                  case SQLDataTypeCode = 12
                     SQLType = "VARCHAR"
                  case True
                     SQLType = "VARCHAR"
               end case
               If SQLType = "NUMERIC" Then
                  If Precision = @NULL Then
                     DataType = SQLType
                     GenRec<-1> = MvNo:' SQL':SQLType:' "" 22 "" ':MvNo:' ':ColumnName:' ""' 
                  End Else
                     If Scale = @NULL Then
                        DataType = SQLType : "(" : Precision : ")"
                        GenRec<-1> = MvNo:' SQL':SQLType:' "" ':Precision:' "" ':MvNo:' ':ColumnName:' ""' 
                     End Else
                        DataType = SQLType : "(" : Precision : "," : Scale : ")"
                        GenRec<-1> = MvNo:' SQL':SQLType:' "" ':Precision:' "" ':MvNo:' ':ColumnName:' ""' 
                     End
                  End
               End Else
                  if SQLType = "DATE" or SQLType = "TIME" or SQLType = "TIMESTAMP" then
                     DataType = SQLType
                     GenRec<-1> = MvNo:' SQL':SQLType:' 1 8 "" ':MvNo:' ':ColumnName:' ""' 
                  end else
                     DataType = SQLType : "(" : Precision : ")"
                     GenRec<-1> = MvNo:' SQL':SQLType:' 0 ':Precision:' "" ':MvNo:' ':ColumnName:' SQL_Latin1_General_CP1_CI_AS' 
                  end
               End
               if DebugMode then print 'DataType=':DataType
            next MvNo

            if DebugMode then print
            NoOfLines = dcount(GenRec, @FM)
            for i=1 to NoOfLines
               if DebugMode then print GenRec<i>
            next i

            Ans = GenRec
         end else
            ErrMsg = 'Could not find CMetaColumn in ':DsMetadataId
            goto ErrRtn
         end
      end else
         ErrMsg = 'Could not find ':DsMetadataId:' in DS_METADATA.'
         goto ErrRtn
      end

      goto TheEnd
* -----------------------------------------------------------------
ErrRtn:
      if DebugMode then print
      if DebugMode then print ErrMsg
      if DebugMode then print
      Ans = ErrMsg
* -----------------------------------------------------------------
TheEnd:
      if DebugMode then print
      if DebugMode then print
      if DebugMode then print Ans
      if DebugMode then print
Mamu Kim
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

This is in the same order as your metadata in a link if you save a link or if you import. I suggest you create a type 19 hash file. You save all your metadata in a specific category. Select this category call this routine and write the results into this hash file. A type 19 hash file is a directory at the OS level. You can just copy these to where you need them and shell out and execute bcp in the same directory you write your sequential files.
Mamu Kim
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

This is how you should use metadata to help automate simple tasks. Now this creates an accurate control file. It is simple to use too. Metadata if used properly can help design the best solutions. I think something like MetaStage is overkill for most of us. Metadata should drive your designs from start to finish. Metadata should be proactive not a passive list of what fields exist where.

There is a difference in project tracking and project management. Sometimes tracking is all you need. In metadata what are you trying to do manage it or use it to help create your ETL. Most developers do not use metadata effectively or they waste time managing the descriptions.

On one of my recent projects we spent more time writing QA documents than writing the ETL. This is not a well managed Rapid Application Development project. Naturally it was by an ex-big accounting firm manager. RAD ETL tools need to managed differently than old COBOL shops where something took 6 months to write one small change and a whole group of developers.

I hope more people start to develop these metadata routines to automate simple admin functions like create tables or estimate table space. A simple routine can save a lot of guess work.
Mamu Kim
shawn_ramsey
Participant
Posts: 145
Joined: Fri May 02, 2003 9:59 am
Location: Seattle, Washington. USA

Post by shawn_ramsey »

crouse wrote:Shawn,

I don't see a BCP stage in 7.1r1 for Windows (on the distribution CD). Only SQL Server bulk stage is "Microsoft SQL Server Bulk Loader".

Do you have both? On what platform and release?

-Craig
We are running 7.5 under Windows and are using the SQL Bulk loader so I never noticed when it went away. From what I have been told by the Microsoft folks we should not use the DataStage BCP since it uses MSDBLIB (which is only being maintained for backwards compatibility). It is also my understanding that the new BCP utility is essentially just a wrapper for the SQL Bulk load.

By the way it seems that the BCP is now under the Sybase drop-down on the pallet.
Shawn Ramsey

"It is a mistake to think you can solve any major problems just with potatoes."
-- Douglas Adams
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

Kim,

You lost me on how to use this.

I've created a routine with your code, now what?

Looks interesting.

-Craig
This routine will ask for a id to a DS_METADATA record. The id is category : "\" : tablename like Hashed\AdSales\Daypart2StgHash. If you compile and test this routine then type in something that is in your metadata. Double click on results and you can see the bcp file.

You call this routine in a loop and write the results out so you could use these for all your tables. You could do this in a batch job, a normal job with a UV stage or a routine. If you want to do this then it may take a little tweaking. Test it first.
Mamu Kim
Post Reply