CI Artifact Creation For SQL Server

This article is part of my SQL Server CI/CD Series. In it I will be talking about creating artifacts related to SQL Server in your the CI part of the build chain.

What is an artifact?

First lets define an artifact within the narrow scope of Continuous Integration (CI) an artefact is anything created during a build that is retained for download. Typically  this is a zip file or other package of the compiled application and a html rendering of test reports.

Artifacts we will produce

We are going to produce the following artifacts:

  • A backup of the database taken after all tests are run
  • A dacpac of the schema
  • Output of some queries we run through SQLCMD

We can produce more artifacts like SQL Unit test reports, database diagrams, or even scripting out the entire database. We will do that in a future article. For now, we will stick to these three artifacts.

Why do we want to produce these artifacts?

Just because you can’ doesn’t mean you should. You don’t want to create artifacts unless they will be used. So what is the point of producing each artifact.

Database Backup

In a CI process that involves SQL Server, you’re probably going to create a database. You are then going to modify it as part of your testing. In the end you are going to end up with a “gently used” database. You will likely have all your schema setup, any “static” data, and some test data. You’ll have some statistics, and fragmentation as well. You might want to restore the backup to your laptop and inspect all this stuff. Perhaps manually verify that your assertions in your tests are covering what you think they should.

You also might want a copy of the database that someone can restore to run a local copy of their code, or reports against.

Finally, if any of your CD pipelines are designed to create a fresh database, your going to need this artifact in that CD pipeline. As a matter of fact you might end up creating two backups. One at the point where your database is in a clean “initial state,” and another after all your automated testing has been run.

Dacpac

I hold the strong opinion that the best way to deploy schema changes to a database is a dacpac. You’re welcome to disagree with me, and like everything else in SQL Server, the answer it depends. However, My advice in these articles will always be around the assumption that the dacpac is the gold standard for automated deployments of schema changes.

That being said, if you prefer to use scripts or another tool to manage schema changes, then you are wasting storage and build server CPU creating a dacpac.

Sqlcmd output

Sometime you just want to generate a quick and dirty report in T-SQL to look at. Maybe there is information you want to be aware of, but don’t necessarily want to fail a build, like index fragmentation after the tests are run. Perhaps someone just wants a report to look at after a test run.

Regardless, sometimes sqlcmd –i inputfile.sql –o outputfile.txt is good enough to get you home.

How do we create these artifacts in AppVeyor?

In our introduction I stated we would use AppVeyor as our CI tool, because its free for open source projects, and because its simplicity makes it easy to explicitly show how we accomplish tasks. The steps are all segments of appveypr.yml.

First of all, we need to start SQL Server. We use the services section of appveyor.yml to specify what services to start. I happen to use SQL Server 2016 because that’s the version of SQL Server that Always Encrypted was introduced on.

services:
- mssql2016

Now, every step we use to get the database to its final step is a subject for another blog. I’ll just simply this.

  1. We use environment variables to configure the connection information.
  2. We create the database with sqlcmd.
  3. Other stuff happens to the database.

The relevant environment variables in our case are:

environment:
  SQL_SERVER_INSTANCE: (local)\SQL2016
  SQL_SERVER_USER: AlwaysEncryptedOwner
  SQL_SERVER_PASSWORD: 7aO!z@xUu!4r6EvD#D&l$sz6&h^rhxL6fzAHMpnOga@LO*WdsEdpfh4^Egtl
  SQL_SERVER_DATABASE: AlwaysEncryptedSample
  SQL_SERVER_BACKUP_FILE: $(SQL_SERVER_DATABASE).bak
  SQL_SERVER_DACPAC: $(SQL_SERVER_DATABASE).dacpac
  SQL_SERVER_VERIFICATION_LOG: $(SQL_SERVER_DATABASE).verification.log

Our creation script is run here:

before_build:
- cmd: sqlcmd -S "%SQL_SERVER_INSTANCE%" -i .\appveyor\init.sql

The script itself is:

CREATE DATABASE [$(SQL_SERVER_DATABASE)];
GO
CREATE LOGIN $(SQL_SERVER_USER) WITH PASSWORD = '$(SQL_SERVER_PASSWORD)';
GO
-- Needed for dacpac creation
-- GRANT VIEW ANY DEFINITION TO $(SQL_SERVER_USER);
-- GO
USE [$(SQL_SERVER_DATABASE)];
GO
CREATE USER $(SQL_SERVER_USER);
ALTER ROLE  db_owner  
       ADD MEMBER $(SQL_SERVER_USER); 
GO

It goes without saying that all SQL Server services on AppVeyor build machines use mixed mode authentication and the user that executes AppVeyor steps is a system administrator on all SQL Server instances.

Creating the sql login/user is superfluous, and a step I might remove from that script.

Other stuff happening in the database is the part I’m going to brush over. We now have an empty database that we can perform artifact creating operations on.

Creating the Artifacts

As of the time I wrote this article all my sql artifacts are created in after_build. This is before unit tests are run. None of my tests or any subsequent steps currently modify the database. At this point these artifacts operate on a “fresh” database.  Moving them to after_test would give me artifacts that have the test result data in them, if any.

after_build:
- cmd: sqlcmd -S "%SQL_SERVER_INSTANCE%" -d %SQL_SERVER_DATABASE% -W -i .\appveyor\schema_verification.sql -o %SQL_SERVER_VERIFICATION_LOG%
- cmd: sqlcmd -S "%SQL_SERVER_INSTANCE%" -Q "BACKUP DATABASE [$(SQL_SERVER_DATABASE)] TO DISK='$(SQL_SERVER_BACKUP_FILE)' WITH FORMAT, COMPRESSION, STATS=10;"
- ps: Move-Item -Path (Join-Path $env:SQL_SERVER_BACKUP_DIRECTORY "$($env:SQL_SERVER_BACKUP_FILE)") -Destination $env:APPVEYOR_BUILD_FOLDER
- cmd: sqlpackage.exe /Action:Extract /TargetFile:"%SQL_SERVER_DACPAC%" /SourceServerName:"%SQL_SERVER_INSTANCE%" /SourceDatabaseName:%SQL_SERVER_DATABASE%

Right now schema_verification.sql looks like this:

PRINT 'Tables in database:'
SELECT 
        QUOTENAME(s.name) + '.' + QUOTENAME(t.name) AS [Table Name] 
    FROM 
        sys.tables t INNER JOIN sys.schemas s
            ON s.schema_id = t.schema_id;

I was actually using this to manually verify that the tables were being created. This is a stop gap for actual automated tests of the SQL Schema.

Also a note on the Move-Item. AppVeyor only lets you create artifacts from files inside the folder where the git repo is checked out. I probably could just specify the full path of the backup file in the CREATE BACKUP T-SQL statement. I’m pretty sure the image is setup so there would be no security issues.

Finally, a note on the environment variables used by the Move-Item invocation. SQL_SERVER_BACKUP_DIRECTORY is set from data in the registry so I need powershell to set it dynamically like so.

install:
- ps: $env:SQL_SERVER_BACKUP_DIRECTORY=(Get-ItemProperty "HKLM:\Software\Microsoft\Microsoft SQL Server\MSSQL13.SQL2016\MSSqlServer").BackupDirectory

And APPVEYOR_BUILD_FOLDER is provided to us already. That’s the folder where the git repo is checked out.

So now that we have created these artifacts, we must publish them. This is where the artifacts section of the yaml files comes in.

artifacts:
  - path: $(SQL_SERVER_BACKUP_FILE)
    name: Database backup
    type: file
  - path: $(SQL_SERVER_DACPAC)
    name: DACPAC
    type: file
  - path: $(SQL_SERVER_VERIFICATION_LOG)
    name: Verifiction Query Results
    type: file
  - path: $(MSBUILD_LOG_FILE)
    name: MSBuild BInary Log
    type: file

So as you can see, any file we can make via powershell, the command line or any of the tooling the AppVeyor images provides us can become an artifact. Artifacts are by no means limited to SQL Server, that’s simply the focus of this blog series.

Introducing the SQL CI/CD series

Articles in the series

About the series

Perhaps the hardest part of your software stack to integrate into a CI/CD (Continuous Integration/Continuous Deploy) chain is your database objects. Now Microsoft has made some great strides in this, and its actually not that bad at all if you happen to use all the Microsoft best practices, and have a simple use case, ie:

  • You’re using Agile, Scrum, Kanban, etc. on a closed source project with a single production instance.
  • A private git repo with a .NET Core MVC App and an OLTP database preferably hosted on GitHub or Azure DevOps (formerly Visual Studio Team Services online).
  • Schema managed by a SSDT Project in Visual Studio
  • Any ORMs are not generating schema, and developers are manually handling creating models (C# classes that map to the RDBMS tables).
  • All code coverage and unit tests reports will be viewed in Visual Studio or from the Azure DevOps Dashboard.
  • Build Pipeline Creating a DACPAC in Azure DevOps
  • Release Pipeline does a full DB backup and Deploys a DACPAC in Azure DevOps
  • Release pipeline failure plan is to restore DB backup and rerun last good release
  • Every merged PR gets deployed to dev
  • Manual gates for Production Deploys
  • Database deployed to Azure SQL DB PaaS

If you stay within these lines, and you don’t have any schema or data changes that require a specific migration strategy that present a problem for SSDT, then things are easy.

However, reality doesn’t always match this ideal, especially if the DBAs were brought in after the fact. Even if  DBAs were brought in you could be in a shop that doesn’t bleed Microsoft Blue. Or you could simple be overruled on something. Heck, you could even be a DBA that doesn’t want to do some of the things on that list.

Perhaps your reality, for whatever reason looks like the following:

  • This is an open source project.
  • This project is a demo for training, conference talks, user group talks, and blog articles, so you need many different artefacts produced.
  • Your Git repo is a complex mono repo or contains submodule with projects in different languages
  • You’re main web app is .NET 4.6.1 with MVC 5 and EF 6.2
  • You’re using a third party CI/CD tool such as AppVeyor
  • You’re using other third party services in your CI/CD chain such as codecov.io.
  • Part of your schema is from a Code First EF6 application
  • Your build pipelines is creating several artifacts
    • Website deploy zip file
    • Nuget Packages for some of the DLLs
    • Noget Packages of Powershell Modules
    • SQL DACPAC
    • SQL Schema Generation Script
    • Backup of database Unit test were run against.
    • Code Coverage and unit test reports in HTML format and junit
    • Merges to master trigger a github release and all generated artifacts are sent there.

I happen to have such a project, the AlwaysEncrypted sample application I wrote around the release of SQL Server 2016 that I’ve recently dusted off to improve my DevOps skills. I’ve decided to document some of the steps in my journey along the way as a series of articles on this blog.

About the sample project

While I’m no fan of code first ORMs, I wanted this app to demonstrate AlwaysEncrypted “just worked” even if your app wasn’t built around it. Instead of Azure DevOps, I went with AppVeyor because its free, and since its more simple than Azure DevOps build pipelines, and configured by YAML, I can illustrate the how and why better than “here is a GUI step in Azure DevOps”