Export XML from SQL Server Using bcp

*** Update – see the comments below for a solution to an issue with unexpected CR/LF characters in the XML file generated with this method. Thanks to reader Anthony for diagnosing the issue and providing the solution. ***

A recent project at work included a requirement to generate a well-formed XML file from the output of a stored procedure. I knew I could create the XML in SQL Server using the FOR XML clause with nested queries (topics for another time) to create a single-row result set with a single XML column containing the entire well-formed XML. For simplicity and ease of maintenance, I also wanted to automate the creation of the XML file without having to create a .NET application or an SSIS package to bridge the gap. A command-line utility seemed like just the thing, and I first tried the sqlcmd utility but quickly learned that it truncates variable-length datatypes (including XML) to 1 MB, far smaller than the anticipated output of my stored procedure.

When I explained my problem to a colleague, he suggested that I try the bcp utility. I have used bcp to move data between SQL Server instances but because I was always dealing with source tables and destination tables with the same schema, I had never tried to manipulate the shape of the bcp output file. After a little digging in Books Online, I found a page titled “Examples of Bulk Import and Export of XML Documents”.  Way down at the bottom of the page was a section headed “E. Bulk Exporting XML Data”, which, with no further explanation, gave this example of a bcp command that would export XML data from a table to a file:

bcp bulktest..xTable out a-wn.out -N -T -S<server_name>\<instance_name>

I knew that the -T and -S switches specify that bcp should connect to the indicated server using integrated authentication and that this syntax would export all the data from the specified table, but didn’t recognize the -N switch. Looking back at the bcp utility reference, I learned that the -N switch specifies that noncharacter data should be exported as native SQL Server datatypes and character data should be exported as Unicode characters and “is intended for transferring data from one instance of SQL Server to another using a data file.” I didn’t want to transfer data from one instance of SQL Server to another, though, and a quick test of this syntax showed that the resulting output file flummoxed XML parsers. 

In the description of the -N switch, though, I noticed a reference to the -w switch and scrolled down to the -w section of the page to discover that this switch specifies that all data should be exported as Unicode characters. I wondered whether converting the SQL Server XML to Unicode would produce a well-formed XML file, so I tried this syntax, using the queryout  argument to execute my stored procedure and export the results to a file:

bcp "exec dbo.MyStoredProc" queryout "\\myfileserver.mydomain.com\myfolder\storedprocoutput.xml" -S MyServer\MyInstance -d MyDatabase -T -w

Success! The resulting file contained well-formed XML easily understood by XML parsers, achieving my goal of creating and exporting well-formed XML from SQL Server simply and easily with a command-line utility. I can now schedule the creation of the required XML file using a task scheduler like SQL Server Agent, causing it to appear at the appointed time and place like magic.

Advertisements

7 comments

  1. Jeff Moden · · Reply

    Simple and nicely done. Thanks for taking the time to post it.

  2. Thanks very much for posting this Jason.

  3. Great, this is what I need. Thanks a lot.

  4. What version of SQL Server were you using? Unfortunately bcp’s -w switch doesn’t seem to work for me on SQL Server 2014.

    In my case bcp inserts a Unicode pair into the stream every ~2,030 characters (4,060 bytes). Sometimes this happens inside data but other times it happens right in the middle of a tag name, which obviously breaks the XML file. e.g.:

    EXEC xp_cmdshell ‘bcp “SELECT * FROM Database.dbo.LocalisedString FOR XML PATH(”LocalisedString”), ROOT(”Root”)” queryout D:\LocalisedStrings-unicode.xml -w -T’

    1. The fix for the embedded CR LF issue is to set an empty record separator by adding a “-r” switch at the very end of the BCP command line, e.g.:

      EXEC xp_cmdshell ‘bcp “SELECT * FROM Database.dbo.LocalisedString FOR XML PATH(”LocalisedString”), ROOT(”Root”)” queryout D:\LocalisedStrings-unicode.xml -w -T -r’

      1. Hi Anthony,

        I’m pretty sure that I developed this using the SQL Server 2012 version of the bcp utility and don’t recall any issue with CR/LF characters in the output. When I try both my code and your code today, though, I see the issue you mentioned. Your solution (using the -r switch with nothing specified to override the default row terminator) works perfectly, though. Thanks for the update!

        Jason

  5. That should have read “Unicode CR LF pair” but I guess WordPress eats less-than and greater-than tags.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: