An alternative to sqlcmd and more: bulk insert csv Data into Microsoft SQL Server

(, en)

The Microsoft SQL Server Command Line Tool, aka sqlcmd, is .. working, in … a way, somehow, but especially the csv bulk import, not as I expected.

The csv file is on my machine, the Microsoft SQL (MSSQL) or Azure SQL Edge Server somewhere else. Perhaps even in the cloud. Or locally running as Docker container. Or as part of my CI/CD pipeline.

My use case is simple:

I want to populate the database during my CD/CI pipeline run with a non-trivial amount of data. It needs to be simple, I want to use CSV. I don’t want to copy the data to the server itself to bulk load it.

Overall, these are the things I would like to have:

So basically a subset of the sqlcmd tool, only fast and easy to use.

I couldn’t find anything on »The Internet«, but the go version of the Microsoft SQL server driver, building your own tool is a piece of cake.

So, here it is: the go-mssql-load tool.

Installation

go get github.com/jwbargsten/go-mssql-load

make sure the go bin path is in your $PATH. You can get the path with

# $GOPATH/bin
go env GOPATH

The easiest way would be to add something like this to your bash/zsh/shell profile:

export PATH="$(go env GOPATH)/bin:$PATH"

Alternatively, on Mac (or Linux) you can also use homebrew:

brew install jwbargsten/misc/go-mssql-load

Test setup

Before you can run/test the commands below, start up an example container with Azure SQL Edge (Azure SQL edge is mostly compatible with mssql and it runs on M1 machines):

$ docker run --rm -p 1433:1433 \
  -e "ACCEPT_EULA=1" \
  -e "MSSQL_SA_PASSWORD=Passw0rd" \
  mcr.microsoft.com/azure-sql-edge:latest

Basic usage

With docker running, you can start playing:

# print the help/usage
$ go-mssql-load help

# print the DSN
$ go-mssql-load printdsn

# check the connection for localhost:1433
$ go-mssql-load --user sa --pass Passw0rd check

# load some example data
$ go-mssql-load --user sa --pass Passw0rd loadsql sql/init.sql

# try a query
$ echo "select * from pokemon.pokemon" | go-mssql-load --user sa --pass Passw0rd querysql -
# try a query, ignore logging
$ echo "select * from pokemon.pokemon" | go-mssql-load --user sa --pass Passw0rd querysql - 2>/dev/null

CSV loading

You can use this tool to do csv bulk loading. By default all columns are treated as string, but you can specify a data type as

loadcsv doesn’t have any parsing magic and uses the csv parser provided by the go std lib. So, if you don’t specify it, it won’t happen. The spec is as follows:

<name>::<datatype>[!]

with an optional ! indicating a nullable column.

column spec column name data type nullable
name::string! name string yes
name::! name string yes
name name string no
age::int! age int64 yes
height::float! age float64 yes
height::float age float64 no

Internally the golang parse functions are used.

Supported types:

Example: ./sql/pokemon_typed.csv

You can set the null string and the separator via cli flags. The null string is by default the empty string "".

The TAB character is a bit tricky to specify, but you can just supply a quoted TAB to parse TSV files:

$ go-mssql-load \
    --user sa --pass Passw0rd loadcsv --sep "	" \
    pokemon.pokemon sql/pokemon_typed.csv

Only columns that have the nullable flag ! will use the nullstr flag.

External Type Spec

As mentioned in the beginning, you can also supply an external types file in JSON format. There are two options supported, as dict or as list.

The dict looks as follows (order is not important):

{
  "colname2": "int",
  "colname1": "string!",
  "colname3": "float!"
}

Or the same as list (order is important):

[
  "string!",
  "int",
  "float!"
]

You can add the types via the --types parameter:

$ go-mssql-load --user sa --pass Passw0rd loadcsv \
    --sep ";" \
    --types sql/pokemon_types.json \
    pokemon.pokemon sql/pokemon.csv

SQL execution

go-mssql-load uses the batch mechanism of the go-mssqldb lib. This means that each file read and split into statements separated by a GO keyword.

Example: ./sql/init.sql.

$ go-mssql-load --user sa --pass Passw0rd loadsql sql/init.sql

SQL querying

Similar to SQL execution, query scripts are split by the keyword GO. This means you can have multiple query statements per file. Each query results in a set of Newline Delimited JSON records, separated by “---”.

Example: ./sql/select.sql

$ go-mssql-load --user sa --pass Passw0rd querysql sql/select.sql  2>/dev/null
{"name":"Wartortle"}
---
{"hp":4}