
SQL Server containers on Kubernetes with S3-compatible object storage - Getting started


Posted by Guest amvin87

One of the new features introduced with SQL Server 2022 is the ability to connect to any S3-compatible object storage; SQL Server supports both backup/restore to S3 and data lake virtualization through PolyBase integration. In this blog, we will demonstrate both of these features for SQL Server 2022 containers running on Kubernetes. As usual, I will use Azure Kubernetes Service (AKS) as my Kubernetes environment :)

 

 

To learn more about these features, refer to SQL Server backup to URL for S3-compatible object storage - SQL Server | Microsoft Learn and Access external data: S3-compatible object storage - PolyBase - SQL Server | Microsoft Learn.

 

 

 

Before we enable Polybase for SQL Server containers, let's go over the prerequisites for this demo:

 

 

 

Prerequisites:

 

 

You need an S3-compatible object storage endpoint that you can connect to. For this blog, I have configured a MinIO-based S3 object storage server on Ubuntu with TLS enabled. For details on how to configure this, refer to: Deploy MinIO: Single-Node Single-Drive — MinIO Object Storage for Linux and Network Encryption (TLS) — MinIO Object Storage for Linux.

 

  • MinIO setting details:
     
    • The MinIO console is configured at https://10.0.0.82:9001/ and the server endpoint at https://10.0.0.82:9000/.
    • On the MinIO server, I have opened ports 9000 and 9001 in the firewall.
    • The public certificate used to configure TLS on the MinIO server is called "public.crt"; I have downloaded it and saved it on my Windows client.

  • Customized SQL Server container:

     

    • Let's create the customized SQL Server container image that will be used to deploy the SQL Server containers on the Kubernetes cluster. Here are the steps for creating the custom image; for this demonstration, I built it on the same Ubuntu-based Azure VM that I used to configure the MinIO server.
      ## Step 1: Create a Dockerfile with the below content
      # Base image: latest SQL Server 2022 image on Ubuntu
      FROM mcr.microsoft.com/mssql/server:2022-latest
      USER root
      # Install prerequisites needed to get the repo config for SQL Server
      RUN apt-get update && \
          apt-get install -y software-properties-common wget && \
          rm -rf /var/lib/apt/lists/*
      # Add the SQL Server 2022 repository, refresh the package lists, and install the PolyBase package
      RUN add-apt-repository "$(wget -qO- https://packages.microsoft.com/config/ubuntu/20.04/mssql-server-2022.list)" && \
          apt-get update && \
          apt-get install -y mssql-server-polybase
      EXPOSE 1433
      USER mssql
      # Run SQL Server process
      CMD ["/opt/mssql/bin/sqlservr"]
       
      # Step 2: Let's build the container image
      sudo docker build -t mssql-pb-update .
       
      # Step 3: Tag the image and push it to your container registry; in this case I am using an Azure Container Registry (ACR) that I have.
      sudo docker tag mssql-pb-update amvinacr.azurecr.io/sql-pb-update:latest
      sudo docker push amvinacr.azurecr.io/sql-pb-update:latest
       
      # To know more about ACR refer: Tutorial - Quick container image build - Azure Container Registry

  • Azure Kubernetes Service (AKS) cluster creation and attaching it to the Azure Container Registry (ACR); a rough Azure CLI sketch is shown below.
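
    As a minimal sketch, the cluster can be created and attached to the registry with the Azure CLI. The resource group name, region, and node count below are placeholders for my environment (only the cluster name 'sqlk8s' and registry 'amvinacr' come from this walkthrough), so adjust them to yours.
     
    # Create a resource group and an AKS cluster, attaching it to the ACR so it can pull the custom image
    az group create --name sqlk8sRG --location eastus
    az aks create --resource-group sqlk8sRG --name sqlk8s --node-count 2 --attach-acr amvinacr
     
    # Download the cluster credentials so kubectl on the Windows client can talk to the cluster
    az aks get-credentials --resource-group sqlk8sRG --name sqlk8s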

     

Let's get started

 

 

With the prerequisites done, we can now deploy SQL Server containers on our AKS cluster using the custom image that installs PolyBase. Before we deploy, we need two configmaps and one secret object in the AKS cluster, with the details shown below. I am running all of these commands from my Windows client, which has kubectl installed and can connect to the AKS cluster 'sqlk8s' created in the previous steps.

 

  • Because the endpoint will be validated against its certificate, we will use one configmap object to load the public.crt certificate required for configuring the secure channel between SQL Server and the MinIO storage. You can use the command below to generate the configmap from the public.crt file. I have copied the certificate public.crt to the following location on my Windows machine: "C:\pbdemo\certs\public.crt"
     
    # Use the below command to create the configmap using the file "public.crt"
    kubectl create configmap mssql-tls --from-file="C:\pbdemo\certs\public.crt"
     
    # You can verify that the configmap was created using the command below; sample output is shown with some lines obfuscated.
     
    C:\>kubectl describe configmap mssql-tls
    Name: mssql-tls
    Namespace: default
    Labels: <none>
    Annotations: <none>
     
    Data
    ====
    public.crt:
    ----
    -----BEGIN CERTIFICATE-----
    MIIDjjCCAnagAwIBAgIUf/RP+g+9uFAKD4TB0R2ot6jZMJAwDQYJKoZIhvcNAQEL
    BQAwZjELMAkGA1UEBhMCVVMxCzAJBgNVBAgMAlZBMRIwEAYDVQQHDAlTb21ld2hl
    cmUxDjAMBgNVBAoMBU15T3JnMQ0wCwYDVQQLDARNeU9VMRcwFQYDVQQDDA5taW5p
    b3NlcnZlcnVidTAeFw0yMzAxMDYxNTA0MjhaFw0zMzAxMDMxNTA0MjhaMGYxCzAJ
    BgNVBAYTAlVTMQswCQYDVQQIDAJWQTESMBAGA1UEBwwJU29tZXdoZXJlMQ4wDAYD
    VQQKDAVNeU9yZzENMAsGA1UECwwETXlPVTEXMBUGA1UEAwwObWluaW9zZXJ2ZXJ1
    YnUwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQCcC0irpVFJQaVqiuKP
    LuRQUdYxehTvFCuCJLpGhS03XczCWKzemd0a3W/UC01N+tbFzpmCi0/l+R5SFsVC
    3tRPKlngtDOICnYPpKSaWz03o3BgwNqtuKtucboXp3Qo+5DSd7XDi1qY+plY6LwC
    vbjLtjNBfyWlwS6nhTjDebvUY6w0xNmG8w1H9fPZcq/gsSGUnPyvywEENYmk5R6X
    xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    1J/rLOJUVeKRvpbLraN25VoEoS8hz07DT3gmtVyJgpEK1oJ+mEs94ae76ujr5MGm
    M4HgV4+o9wxqjWdPyFDny/1EPsv30VqbcwmbkuPE7UHG4JH/5gxqxnXJcPnr3BQy
    4GU=
    -----END CERTIFICATE-----
     
  • The second configmap object is used to load mssql.conf, which contains the PolyBase trace flag 13702 required for SQL Server on Linux. This configmap is generated from the YAML shown below:
    # Create a yaml file called mssql_config.yaml and save it to the location "C:\pbdemo\certs\mssql_config.yaml"
     
    kind: ConfigMap
    apiVersion: v1
    metadata:
      name: mssql
    data:
      mssql.conf: |
        [EULA]
        accepteula = Y
        accepteulaml = Y

        [coredump]
        captureminiandfull = true
        coredumptype = full

        [language]
        lcid = 1033

        [filelocation]
        defaultdatadir = /var/opt/mssql/userdata
        defaultlogdir = /var/opt/mssql/userlog

        [traceflag]
        traceflag0 = 13702
     
    # We are ready to create the configmap using the command shown below:
    kubectl apply -f "C:\pbdemo\certs\mssql_config.yaml"
     
    # You can inspect the configmap using the command shown below with the sample output as well
    C:\>kubectl describe configmap mssql
    Name: mssql
    Namespace: default
    Labels: <none>
    Annotations: <none>
     
    Data
    ====
    mssql.conf:
    ----
    [EULA]
    accepteula = Y
    accepteulaml = Y
     
    [coredump]
    captureminiandfull = true
    coredumptype = full
     
    [language]
    lcid = 1033
     
    [filelocation]
    defaultdatadir = /var/opt/mssql/userdata
    defaultlogdir = /var/opt/mssql/userlog
     
    [traceflag]
    traceflag0 = 13702
     
    Events: <none>
  • The secret object in the cluster is used to store the sa password for SQL Server; the command to create the secret is shown below, followed by a quick verification sketch.
    kubectl create secret generic mssql --from-literal=MSSQL_SA_PASSWORD="MyC0m9l&xP@ssw0rd"
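     
    As a small verification step (my addition, not part of the original walkthrough), you can list the secret; kubectl only reports the size of the stored value, not the password itself.
     
    # Verify the secret exists; the data is shown as byte counts, not plain text
    kubectl get secret mssql
    kubectl describe secret mssql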

Deploy SQL Server Containers

 

 

We are now ready to deploy SQL Server containers. Below is the deployment YAML, called "sqldeploy.yaml", that deploys one SQL Server container as a StatefulSet using the Azure Disk storage class.

 

Using this YAML we deploy:

 

  • One SQL Server instance, with separate persistent volumes (PVs) and persistent volume claims (PVCs) for the user database, user log, and tempdb files.
  • We load the public.crt certificate into the folder /var/opt/mssql/security/ca-certificates inside the container.
  • We load the mssql.conf file with the various parameters and the required trace flag 13702.
  • At the end, we create an external load balancer service to be able to connect to SQL Server.
    # Create sqldeploy.yaml at the location "C:\pbdemo\certs\sqldeploy.yaml" and paste the content below; you can modify it according to your requirements.
     
    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: azure-disk
    provisioner: kubernetes.io/azure-disk
    parameters:
      storageaccounttype: Standard_LRS
      kind: Managed
    ---
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: mssql
      labels:
        app: mssql
    spec:
      serviceName: "mssql"
      replicas: 1
      selector:
        matchLabels:
          app: mssql
      template:
        metadata:
          labels:
            app: mssql
        spec:
          securityContext:
            fsGroup: 10001
          containers:
            - name: mssql
              command:
                - /bin/bash
                - -c
                - cp /var/opt/config/mssql.conf /var/opt/mssql/mssql.conf && /opt/mssql/bin/sqlservr
              image: amvinacr.azurecr.io/sql-pb-update:latest
              ports:
                - containerPort: 1433
              env:
                - name: ACCEPT_EULA
                  value: "Y"
                - name: MSSQL_ENABLE_HADR
                  value: "1"
                - name: MSSQL_SA_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: mssql
                      key: MSSQL_SA_PASSWORD
              volumeMounts:
                - name: mssql
                  mountPath: "/var/opt/mssql"
                - name: userdata
                  mountPath: "/var/opt/mssql/userdata"
                - name: userlog
                  mountPath: "/var/opt/mssql/userlog"
                - name: tempdb
                  mountPath: "/var/opt/mssql/tempdb"
                - name: mssql-config-volume
                  mountPath: "/var/opt/config"
                - name: mssql-tls-configmap
                  mountPath: "/var/opt/mssql/security/ca-certificates/"
          volumes:
            - name: mssql-config-volume
              configMap:
                name: mssql
            - name: mssql-tls-configmap
              configMap:
                name: mssql-tls
      volumeClaimTemplates:
        - metadata:
            name: mssql
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 8Gi
        - metadata:
            name: userdata
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 8Gi
        - metadata:
            name: userlog
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 8Gi
        - metadata:
            name: tempdb
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 8Gi
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: mssql-0
    spec:
      type: LoadBalancer
      selector:
        statefulset.kubernetes.io/pod-name: mssql-0
      ports:
        - protocol: TCP
          port: 1433
          targetPort: 1433
          name: tcpsql
     
    # You can run the command below to deploy the SQL Server container based on the deployment yaml file called sqldeploy.yaml
     
    kubectl apply -f "C:\pbdemo\certs\sqldeploy.yaml"
     
    # You can run the kubectl get command to see all the SQL Server resources deployed; a sample command and output are shown below for reference
     
    C:\>kubectl get all
    NAME READY STATUS RESTARTS AGE
    pod/mssql-0 1/1 Running 0 92m
     
    NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
    service/kubernetes ClusterIP 10.0.0.1 <none> 443/TCP 16h
    service/mssql-0 LoadBalancer 10.0.252.15 20.235.55.7 1433:31551/TCP 13s
     
    NAME READY AGE
    statefulset.apps/mssql 1/1 97m

 

 

 

Connect to SQL Server - Demo time

 

 

We are ready to connect to SQL Server. Feel free to use your favorite SQL Server tool, such as SQL Server Management Studio (SSMS) or Azure Data Studio (ADS), to connect to the instance and run the commands below to virtualize data, access CSV or Parquet files, and/or take backups of the databases on the S3-based object storage.
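
As a quick connectivity check (an illustrative sketch, not from the original post), you could also connect with sqlcmd from the Windows client using the external IP assigned to the mssql-0 load balancer service (20.235.55.7 in the sample output above); any SQL client works equally well.

# Connect to the SQL Server container through the load balancer service using sqlcmd
sqlcmd -S 20.235.55.7,1433 -U sa -P "MyC0m9l&xP@ssw0rd" -Q "SELECT @@VERSION"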

 

 

 

I'm using SSMS and running the queries below. After connecting to SQL Server, I confirm that the TLS certificate public.crt has been loaded by SQL Server; I can do that by reading the errorlog with sp_readerrorlog from T-SQL, and the sample output below shows the certificate loaded:
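
For example (the search filter is my own addition), you can read the current errorlog and narrow it down to the certificate entries:

-- Read the current SQL Server errorlog (log 0, type 1 = SQL Server) and filter for the loaded certificate
EXEC sp_readerrorlog 0, 1, 'public.crt';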

 

[Screenshot: SQL Server errorlog output showing the public.crt certificate loaded]

 

 

 

Backup/Restore to S3-based object storage:

 

 

To take S3-based backups, I first need to create a bucket on the storage. In this case, I connect to the MinIO portal and create a bucket called "sqltest" as shown below, and then create a credential to access the storage.

 

[Screenshot: MinIO console showing the "sqltest" bucket]

 

 

 

Now, use the sample T-SQL script below to create the credential and run the backup; the backup should succeed as shown below.

 

 

 

CREATE CREDENTIAL [s3://10.0.0.82:9000/sqltest]
WITH
IDENTITY = 'S3 Access Key',
SECRET = 'amit:mypass12345';

BACKUP DATABASE S3_backup_demo
TO URL = 's3://10.0.0.82:9000/sqltest/S3_backup_demo_db.bak'
WITH STATS = 10

 

 

 

[Screenshot: BACKUP DATABASE to the S3 URL completing successfully with STATS output]

 

You can view the backup file created on the MinIO console as shown below:

 

[Screenshot: MinIO console showing the S3_backup_demo_db.bak file in the sqltest bucket]
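
Restoring from the same S3 URL works the same way. The sketch below is not part of the original demo: it restores to a new database name, and the logical file names in the MOVE clauses are assumptions, so check them with RESTORE FILELISTONLY first if yours differ.

-- Restore the backup from the S3-compatible object storage (sketch; logical file names are assumptions)
RESTORE DATABASE S3_backup_demo_copy
FROM URL = 's3://10.0.0.82:9000/sqltest/S3_backup_demo_db.bak'
WITH MOVE 'S3_backup_demo' TO '/var/opt/mssql/userdata/S3_backup_demo_copy.mdf',
     MOVE 'S3_backup_demo_log' TO '/var/opt/mssql/userlog/S3_backup_demo_copy_log.ldf',
     STATS = 10;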

 

 

 

Data Virtualization - PolyBase

 

 

Here are the steps to configure data virtualization; the sample T-SQL script below walks through the following:

 

  • We verify that the PolyBase feature is installed, then configure and enable PolyBase for SQL Server.
  • We create the database scoped credential to access the object storage.
  • We create the external data source and link it with the credential created in the preceding step.
  • We query the data from the external data source.

 

 

 

-- Verify the Polybase feature is installed:

SELECT SERVERPROPERTY ('IsPolyBaseInstalled') AS IsPolyBaseInstalled;

 

-- Enable the PolyBase feature using the commands below:

 

exec sp_configure @configname = 'polybase enabled', @configvalue = 1;

reconfigure with override

exec sp_configure @configname = 'polybase enabled'

 

--Next, let’s create a database and database scoped credential to access the object storage.

 

create database pb_demo

 

USE pb_demo

 

CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'mypass123@';

 

CREATE DATABASE SCOPED CREDENTIAL s3_dc WITH IDENTITY = 'S3 Access Key', SECRET = 'amit:mypass12345' ;

 

-- To verify, the credential is created you can run the below command:

SELECT * FROM sys.database_scoped_credentials;

 

-- Now go ahead and try creating the External data source:

 

CREATE EXTERNAL DATA SOURCE s3_ds WITH (LOCATION = 's3://10.0.0.82:9000/',CREDENTIAL = s3_dc)

 

-- When creating the external data source, if you see the error below, restart the container and run the same command again; that should resolve it (one way to restart the pod is sketched after the error message)

Msg 46530, Level 16, State 11, Line 20

External data sources are not supported with type GENERIC.
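
One way to restart the container in this setup (my suggestion, not from the original post) is to delete the pod; because it is managed by the StatefulSet, Kubernetes recreates it automatically and reattaches the persistent volume claims.

# Delete the pod; the StatefulSet recreates it and the PVCs are reattached
kubectl delete pod mssql-0
kubectl get pods -w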

 

-- We now have the credential and the external data source created to query files on the object storage. To query a CSV file that is already present on the object storage, you can run the command shown below

 

SELECT * FROM OPENROWSET( BULK '/sqltest/test.csv', FORMAT = 'CSV', DATA_SOURCE = 's3_ds') WITH ( c1 varchar(50) ) AS [Test1]

 

-- To query a Parquet file that is available on the object storage, you can run the query below. I copied a sample Parquet file from kylo/samples/sample-data/parquet at master · Teradata/kylo to my MinIO object storage.

 

SELECT *
FROM OPENROWSET
( BULK '/sqltest/userdata1.parquet'
, FORMAT = 'PARQUET'
, DATA_SOURCE = 's3_ds'
)
AS [Test12]

 

 

 

 

 

[Screenshot: OPENROWSET query results over the Parquet file on the S3 object storage]

 

Note: You can also use CREATE EXTERNAL TABLE AS SELECT (CETAS) to access both Parquet and CSV files for SQL Server on Linux; a sketch follows.
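
As a rough CETAS sketch against the same data source (the object names, the export path under the bucket, and the 'allow polybase export' step are my assumptions rather than part of the original demo):

-- Allow exporting data through PolyBase (assumed prerequisite for CETAS)
EXEC sp_configure 'allow polybase export', 1;
RECONFIGURE;

-- Define a Parquet file format, then write the query result back to the bucket as an external table
CREATE EXTERNAL FILE FORMAT parquet_ff WITH (FORMAT_TYPE = PARQUET);

CREATE EXTERNAL TABLE dbo.userdata_ext
WITH (
    LOCATION = '/sqltest/export/',
    DATA_SOURCE = s3_ds,
    FILE_FORMAT = parquet_ff
)
AS
SELECT *
FROM OPENROWSET(BULK '/sqltest/userdata1.parquet', FORMAT = 'PARQUET', DATA_SOURCE = 's3_ds') AS src;

-- The external table can now be queried like a regular table
SELECT TOP 10 * FROM dbo.userdata_ext;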

 

 

 

I hope you find this helpful to get started with PolyBase and S3 storage integration with SQL Server 2022 containers. Special thanks to my colleagues from the PolyBase team for helping me with this demo. Happy learning!

 
