Creating control files for SQL Server BCP utility
Moderators: chulett, rschirm, roy
Creating control files for SQL Server BCP utility
Anyone out there have an "automatic" way of creating a control file for use by the BCP (bulk insert) utility in SQL Server? These files are called "format files", and the only way I can find to create them, so far, is via the interactive bcp command line. This is pretty labor intensive if I have several (or a hundred) tables that I need to define format files for. My next step is to create a PERL script, or the like, to read a CREATE statement or DESCRIBE output and create a format file.
I just don't want to invent the wheel again if I don't need to.
The "MS SQL Server Load" stage in DS isn't as "nice" as the older ORACLE stage that creates the format/control file on the fly...
Thanks for any input,
Craig
I just don't want to invent the wheel again if I don't need to.
The "MS SQL Server Load" stage in DS isn't as "nice" as the older ORACLE stage that creates the format/control file on the fly...
Thanks for any input,
Craig
craig,
1) If you are bulkloading in to all the columns of the table and your file contains only the columns which are present in the table and your file is a delimited file and your file is a character file ( no packed decimal or binary data) then you do not require a format to bulk load using the stored procedure BuLKInsert provided by sql server.Refer swl server help for details
2) if you are using datastage to bulk insert, i think format file is not required. Haven't tried this though
HTH
Dhiraj
1) If you are bulkloading in to all the columns of the table and your file contains only the columns which are present in the table and your file is a delimited file and your file is a character file ( no packed decimal or binary data) then you do not require a format to bulk load using the stored procedure BuLKInsert provided by sql server.Refer swl server help for details
2) if you are using datastage to bulk insert, i think format file is not required. Haven't tried this though
HTH
Dhiraj
If you saved these into DS_METADATA then I would modify my routine KgdCreateDdlMetadata. It will create the create table sql for records in the DataStage metadata hash file DS_METADATA. It should be easy to change this to build what you want. If you post an example of the control file then I might do it for you.
KgdCreateDdlMetadata is posted on my web site and ADN. If you cannot find it then let me know.
KgdCreateDdlMetadata is posted on my web site and ADN. If you cannot find it then let me know.
Mamu Kim
Dhiraj,
Thanks, the helps some. And yes, the DS stage doesn't need the format file, but it's not very performant.
I have several jobs that I need to "retrofit" their transactional stages into a flat file to be used for bulk loading.
The positions of the columns in the resultant flat file may not match the ordinal position of the columns in the table, and/or may not contain all the columns in the table, so I'll need a format file.![Crying or Very sad :cry:](./images/smilies/icon_cry.gif)
I think I have a plan in mind to combine the columns from a table definition and the column names from the load file and create the format file on the fly.
Thanks for the help,
Craig
Thanks, the helps some. And yes, the DS stage doesn't need the format file, but it's not very performant.
I have several jobs that I need to "retrofit" their transactional stages into a flat file to be used for bulk loading.
The positions of the columns in the resultant flat file may not match the ordinal position of the columns in the table, and/or may not contain all the columns in the table, so I'll need a format file.
![Crying or Very sad :cry:](./images/smilies/icon_cry.gif)
I think I have a plan in mind to combine the columns from a table definition and the column names from the load file and create the format file on the fly.
Thanks for the help,
Craig
Kim, here's a link for a good desciption of what's needed in the format file.
http://msdn.microsoft.com/library/defau ... p_9yat.asp
There are several rules to follow (i.e. prefix length).
If you come up with something, let me know.
Thanks,
Craig
http://msdn.microsoft.com/library/defau ... p_9yat.asp
There are several rules to follow (i.e. prefix length).
If you come up with something, let me know.
Thanks,
Craig
Kim,
Here's an example:
8.0
3
1 SQLINT 0 4 "" 1 calendar_key ""
2 SQLDATETIME 1 8 "" 2 calendar_date ""
3 SQLCHAR 0 20 "" 3 month_name SQL_Latin1_General_CP1_CI_AS
And it should be tab seperated.
The referenced URL is applicable, since there are many rules to comply with in creating the format file.
-Craig
Here's an example:
8.0
3
1 SQLINT 0 4 "" 1 calendar_key ""
2 SQLDATETIME 1 8 "" 2 calendar_date ""
3 SQLCHAR 0 20 "" 3 month_name SQL_Latin1_General_CP1_CI_AS
And it should be tab seperated.
The referenced URL is applicable, since there are many rules to comply with in creating the format file.
![Sad :(](./images/smilies/icon_sad.gif)
-Craig
-
- Participant
- Posts: 145
- Joined: Fri May 02, 2003 9:59 am
- Location: Seattle, Washington. USA
Craig,crouse wrote: Thanks, the helps some. And yes, the DS stage doesn't need the format file, but it's not very performant.
I am assuming that you have done some performance testing with BCP vs. DS bulk load. Are you using the SQL Bulk Load stage or the BCP stage? We have had some issues with performance of the bulk loads but have never gone to the point of spinning off a flat file and BCPing it. We have been able to get reasonable performance with the SQL Bulk load stage by tweaking the parameters. One other suggestion that I have is to insert a IPC stage before the bulk loader in the job stream and test your load performance.
Shawn Ramsey
"It is a mistake to think you can solve any major problems just with potatoes."
-- Douglas Adams
"It is a mistake to think you can solve any major problems just with potatoes."
-- Douglas Adams
Code: Select all
* -----------------------------------------------------------------
* KgdCreateBcpMetadata(DsMetadataId)
* Decription: Create Bcp file from records in DS_METADATA.
* Written by: Kim Duke
* Notes:
* -----------------------------------------------------------------
* $INCLUDE DSINCLUDE JOBCONTROL.H
Ans = ''
RoutineName = "KgdCreateBcpMetadata"
Usage = RoutineName:'(DsMetadataId)'
* -----------------------------------------------------------------
* initialize variables
* -----------------------------------------------------------------
True = 1
False = 0
Other = True
Forever = True
Today = oconv(@DATE, "D4-")
ErrMsg = ''
NullRec = ''
CrLf = char(13):char(10)
DsMetadataSep = '\'
DebugMode = False
* -----------------------------------------------------------------
FileName = 'DS_METADATA'
open FileName to FilePtr else
ErrMsg = 'Can not open ':FileName:' file.'
goto ErrRtn
end
DsMetadataPtr = FilePtr
* -----------------------------------------------------------------
GenRec = ''
PKeys = ''
* -----------------------------------------------------------------
read DsMetadataRec from DsMetadataPtr, DsMetadataId then
Occurrence = 1
* -----------------------------------------------------------------
FindColumns:
FindStr "CMetaColumn" In DsMetadataRec, Occurrence Setting AttrNo, MvNo, SvNo Then
* -----------------------------------------------------------------
* number of column definitions in collection
* -----------------------------------------------------------------
NoOfFields = Field(DsMetadataRec<AttrNo>, "/", 2, 1)
If NoOfFields = 0 Then
* Stored procedures use the second columns collection in the record.
* Go get the next one.
Occurrence += 1
GoTo FindColumns
End
TableName = field(DsMetadataId, DsMetadataSep, dcount(DsMetadataId, DsMetadataSep))
GenRec<-1> = '8.0'
GenRec<-1> = NoOfFields
Indent = spaces(3*1)
* NoOfFields = dcount(DsMetadataRec<AttrNo+1>,
for MvNo=1 to NoOfFields
ColumnName = DsMetadataRec<AttrNo+1, MvNo>
if DebugMode then print 'ColumnName=':ColumnName
ColumnDescription = DsMetadataRec<AttrNo+2, MvNo>
DataElement = DsMetadataRec<AttrNo+3, MvNo>
if DebugMode then print 'DataElement=':DataElement
SQLDataTypeCode = DsMetadataRec<AttrNo+4, MvNo>
if DebugMode then print 'SQLDataTypeCode=':SQLDataTypeCode
Precision = DsMetadataRec<AttrNo+5, MvNo>
Scale = DsMetadataRec<AttrNo+6, MvNo>
IsNullable = DsMetadataRec<AttrNo+7, MvNo>
if DebugMode then print 'IsNullable=':IsNullable
KeyPosition = DsMetadataRec<AttrNo+8, MvNo>
DisplayWidth = DsMetadataRec<AttrNo+9, MvNo>
Begin case
case SQLDataTypeCode = -7
SQLType = "BIT"
case SQLDataTypeCode = -6
SQLType = "TINYINT"
case SQLDataTypeCode = -5
SQLType = "BIGINT"
case SQLDataTypeCode = -4
SQLType = "LONGVARBINARY"
case SQLDataTypeCode = -3
SQLType = "VARBINARY"
case SQLDataTypeCode = -2
SQLType = "BINARY"
case SQLDataTypeCode = -1
SQLType = "LONGVARCHAR"
case SQLDataTypeCode = 0
SQLType = "VARCHAR"
case SQLDataTypeCode = 1
SQLType = "CHAR"
case SQLDataTypeCode = 2
SQLType = "NUMERIC"
case SQLDataTypeCode = 3
SQLType = "NUMERIC"
case SQLDataTypeCode = 4
SQLType = "INTEGER"
case SQLDataTypeCode = 5
SQLType = "SMALLINT"
case SQLDataTypeCode = 6
SQLType = "FLOAT"
case SQLDataTypeCode = 7
SQLType = "REAL"
case SQLDataTypeCode = 8
SQLType = "DOUBLE"
case SQLDataTypeCode = 9
SQLType = "DATE"
case SQLDataTypeCode = 10
SQLType = "TIME"
case SQLDataTypeCode = 11
SQLType = "TIMESTAMP"
case SQLDataTypeCode = 12
SQLType = "VARCHAR"
case True
SQLType = "VARCHAR"
end case
If SQLType = "NUMERIC" Then
If Precision = @NULL Then
DataType = SQLType
GenRec<-1> = MvNo:' SQL':SQLType:' "" 22 "" ':MvNo:' ':ColumnName:' ""'
End Else
If Scale = @NULL Then
DataType = SQLType : "(" : Precision : ")"
GenRec<-1> = MvNo:' SQL':SQLType:' "" ':Precision:' "" ':MvNo:' ':ColumnName:' ""'
End Else
DataType = SQLType : "(" : Precision : "," : Scale : ")"
GenRec<-1> = MvNo:' SQL':SQLType:' "" ':Precision:' "" ':MvNo:' ':ColumnName:' ""'
End
End
End Else
if SQLType = "DATE" or SQLType = "TIME" or SQLType = "TIMESTAMP" then
DataType = SQLType
GenRec<-1> = MvNo:' SQL':SQLType:' 1 8 "" ':MvNo:' ':ColumnName:' ""'
end else
DataType = SQLType : "(" : Precision : ")"
GenRec<-1> = MvNo:' SQL':SQLType:' 0 ':Precision:' "" ':MvNo:' ':ColumnName:' SQL_Latin1_General_CP1_CI_AS'
end
End
if DebugMode then print 'DataType=':DataType
next MvNo
if DebugMode then print
NoOfLines = dcount(GenRec, @FM)
for i=1 to NoOfLines
if DebugMode then print GenRec<i>
next i
Ans = GenRec
end else
ErrMsg = 'Could not find CMetaColumn in ':DsMetadataId
goto ErrRtn
end
end else
ErrMsg = 'Could not find ':DsMetadataId:' in DS_METADATA.'
goto ErrRtn
end
goto TheEnd
* -----------------------------------------------------------------
ErrRtn:
if DebugMode then print
if DebugMode then print ErrMsg
if DebugMode then print
Ans = ErrMsg
* -----------------------------------------------------------------
TheEnd:
if DebugMode then print
if DebugMode then print
if DebugMode then print Ans
if DebugMode then print
Mamu Kim
This is in the same order as your metadata in a link if you save a link or if you import. I suggest you create a type 19 hash file. You save all your metadata in a specific category. Select this category call this routine and write the results into this hash file. A type 19 hash file is a directory at the OS level. You can just copy these to where you need them and shell out and execute bcp in the same directory you write your sequential files.
Mamu Kim
This is how you should use metadata to help automate simple tasks. Now this creates an accurate control file. It is simple to use too. Metadata if used properly can help design the best solutions. I think something like MetaStage is overkill for most of us. Metadata should drive your designs from start to finish. Metadata should be proactive not a passive list of what fields exist where.
There is a difference in project tracking and project management. Sometimes tracking is all you need. In metadata what are you trying to do manage it or use it to help create your ETL. Most developers do not use metadata effectively or they waste time managing the descriptions.
On one of my recent projects we spent more time writing QA documents than writing the ETL. This is not a well managed Rapid Application Development project. Naturally it was by an ex-big accounting firm manager. RAD ETL tools need to managed differently than old COBOL shops where something took 6 months to write one small change and a whole group of developers.
I hope more people start to develop these metadata routines to automate simple admin functions like create tables or estimate table space. A simple routine can save a lot of guess work.
There is a difference in project tracking and project management. Sometimes tracking is all you need. In metadata what are you trying to do manage it or use it to help create your ETL. Most developers do not use metadata effectively or they waste time managing the descriptions.
On one of my recent projects we spent more time writing QA documents than writing the ETL. This is not a well managed Rapid Application Development project. Naturally it was by an ex-big accounting firm manager. RAD ETL tools need to managed differently than old COBOL shops where something took 6 months to write one small change and a whole group of developers.
I hope more people start to develop these metadata routines to automate simple admin functions like create tables or estimate table space. A simple routine can save a lot of guess work.
Mamu Kim
-
- Participant
- Posts: 145
- Joined: Fri May 02, 2003 9:59 am
- Location: Seattle, Washington. USA
We are running 7.5 under Windows and are using the SQL Bulk loader so I never noticed when it went away. From what I have been told by the Microsoft folks we should not use the DataStage BCP since it uses MSDBLIB (which is only being maintained for backwards compatibility). It is also my understanding that the new BCP utility is essentially just a wrapper for the SQL Bulk load.crouse wrote:Shawn,
I don't see a BCP stage in 7.1r1 for Windows (on the distribution CD). Only SQL Server bulk stage is "Microsoft SQL Server Bulk Loader".
Do you have both? On what platform and release?
-Craig
By the way it seems that the BCP is now under the Sybase drop-down on the pallet.
Shawn Ramsey
"It is a mistake to think you can solve any major problems just with potatoes."
-- Douglas Adams
"It is a mistake to think you can solve any major problems just with potatoes."
-- Douglas Adams
This routine will ask for a id to a DS_METADATA record. The id is category : "\" : tablename like Hashed\AdSales\Daypart2StgHash. If you compile and test this routine then type in something that is in your metadata. Double click on results and you can see the bcp file.Kim,
You lost me on how to use this.
I've created a routine with your code, now what?
Looks interesting.
-Craig
You call this routine in a loop and write the results out so you could use these for all your tables. You could do this in a batch job, a normal job with a UV stage or a routine. If you want to do this then it may take a little tweaking. Test it first.
Mamu Kim