On a fairly regular basis at work I have to take .csv files with data from a customer’s database and import them into my SQL Server instance. This should be a simple operation, but the design of the native SQL Server Data Import tool makes it much, much more complicated than it needs to be. The biggest pain point by far is having to specify the data types of every single column in the data source by hand. Not only does it assume that everything is a string, forcing me to go through and figure out which types are a date, an int, a float or whatever else I have in the database, but by default changing a column’s type to a decimal uses decimal(18,0), truncating all my floating point values.
It always seemed a bit absurd that I should have to manually specify the types of all my data since I was importing into a database where the data types of the columns are explicitly specified. Once the import tool knew what table I was importing into, shouldn’t it be able to figure out the types for me?
So I wrote a Python script to import data for me. It uses the pymssql module to connect to the database, and the main insert function requires only three arguments: a connection to a database, the name of the table to import into, and an object with a function to return the names of the columns and then be able to iterate over the data. Conveniently, I also wrote a CSVFile class that wraps around a .csv file and provides just that functionality. Continue reading Importing CSV Data into SQL Server →