Friday, March 30, 2012

XML Query Composition in Practice

XML composition using XML Query is not what you might call a popular subject, not even a frequently discussed one; well, at least as far as SQL Server is concerned. In this blog, I have discussed XML retrieval on numerous occasions, I have also touched SQL Server XML performance characteristics, but XML composition has so far been stuck on the back burner. Until now, that is.

The subject of XML composition is covered prominently in the XQuery W3C Recommendation. It can be used not only to create new XML documents, but also to transform existing ones. Even though SQL Server currently doesn't implement the entire XQuery Recommendation, the essentials are covered, and have been available since SQL Server 2005.


A Little Bit of Background

SQL Server provides two XML composition methods natively:

  • The FOR XML clause instructs the Database Engine to return the result of a SELECT query as an XML document; and
  • XQuery, principally used to retrieve XML data, can also be used to create XML documents (or fragments).

The SQL Server implementation of XQuery supports two XML composition methods (please, copy them to SSMS):

  • Using direct constructors – the XML document is created from a string, mimicking the resulting XML structure.
    E.g.:
    select cast(N'' as xml).query
      ('
      <collection>
       <item number="1">
        This is an item in the collection.
       </item>
      </collection>
      ')
  • Using computed constructors – rather using a string representation to build the resulting XML document or fragment, the result is created using special XQuery instructions.
    E.g.:
    select cast(N'' as xml).query
      ('
      element  collection
      {
       element  item
       {
        text  {"This is an item in the collection."}
        ,attribute number { "1" }
       }
      }
      ')

After copying the above examples to SSMS, execute them. Compare the results.

Both methods support the use of two extension functions, sql:column and sql:variable, which allow the data to be added to the resulting XML document dynamically (e.g. from a SQL variable, or from a column in the available row set). The part of the expression where an empty string is converted to XML (i.e. cast(N'' as xml)) is used simply to provide a reference to an empty XML document, required by the query XML function.

The following XQuery Computed Constructors are available in SQL Server:

  • element – instructing the Database Engine to construct an element node;
  • attribute – instructing the Database Engine to construct an attribute node; and
  • text – instructing the Database Engine to construct a text node.

Let this be enough background for now, as I will discuss more in upcoming posts. In this one, however, let's put XML composition to some good use, and also illustrate how to use it.

To help you get started in understanding this rather complex subject, I've prepared a T-SQL script, and you are free to use it to learn about XML composition. Please, open it and then copy it to a new query window in SSMS.

The script uses the data about the tables in your database, exposed in INFORMATION_SCHEMA system views, and builds an XML Schema document for each table. You can use the script to create XML Schema files for your tables that can then be used to transport the data using XML, or to provide a standard way of describing the tables in your SQL Server database. Of course, the principal purpose of the script is to demonstrate SQL Server XML composition.

But before executing any script on SQL Server, we should know exactly what it does.


Decomposing the XML Schema CTE

Let's have a look at the individual elements of the Common Table Expression (or CTE, for short) presented in the script.


The XML Namespaces

...
with xmlnamespaces
  (
  'http://www.w3.org/2001/XMLSchema' as xs
  )
...

We are building an XML Schema, therefore we need the appropriate namespace declarations.


The Data Type Map

...
 ,TypeMap -- Mapping SQL Server data types to XML Schema data types
  (
  SqlType  -- SQL Server Data Type
  ,XmlSchemaType -- XML Schema Data Type
  ,IsFixed
  )
 as
 (
 select 'bigint' as SqlType
  ,'xs:long' as XmlSchemaType
  ,1 as IsFixed
 union
 select 'binary'
  ,'xs:hexBinary'
  ,0
 union
 select 'bit'
  ,'xs:boolean'
  ,1
 union
...(shortened to improve readability)...
 select 'varbinary'
  ,'xs:hexBinary'
  ,0
 union
 select 'varchar'
  ,'xs:string'
  ,0
 union
 select 'xml'
  ,'xs:string'
  ,1
 )
...

SQL Server Data Types, used by table columns, need to be translated to XML Schema Data Types. The purpose of the IsFixed column in the CTE above, is to divide the data types into two groups:

  • The data types that do not require dimensioning, such as bit, int, ntext, xml, etc. – for these the value of IsFixed = 1; and
  • The data types that support dimensioning (length, or scale and precision), such as decimal, [n]varchar, [n]char, etc. for these the value of IsFixed = 0.

Nillability

...
 ,Nillability -- Mapping SQL Server nillability to XML Schema nillability
  (
  SqlNillable
  ,XmlNillable
  )
 as
 (
 select 'NO' as SqlNillable
  ,N'false' as XmlNillable
 union
 select 'YES'
  ,N'true'
 )
...

Column nillability must also be translated accordingly.


The Primary Key

...
 ,PrimaryKey
 as
 (
 select TABLE_CONSTRAINTS.TABLE_CATALOG as [Catalog]
  ,TABLE_CONSTRAINTS.TABLE_SCHEMA as [Schema]
  ,TABLE_CONSTRAINTS.TABLE_NAME as [Table]
  ,TABLE_CONSTRAINTS.CONSTRAINT_NAME as Name
  ,[Columns]
   = (
   select COLUMN_NAME as [@xpath]
    from INFORMATION_SCHEMA.KEY_COLUMN_USAGE
    where (KEY_COLUMN_USAGE.TABLE_CATALOG = TABLE_CONSTRAINTS.TABLE_CATALOG)
     and (KEY_COLUMN_USAGE.TABLE_SCHEMA = TABLE_CONSTRAINTS.TABLE_SCHEMA)
     and (KEY_COLUMN_USAGE.TABLE_NAME = TABLE_CONSTRAINTS.TABLE_NAME)
     and (KEY_COLUMN_USAGE.CONSTRAINT_NAME = TABLE_CONSTRAINTS.CONSTRAINT_NAME)
    order by KEY_COLUMN_USAGE.ORDINAL_POSITION
    for xml path('xs:field'), type
   )
  from INFORMATION_SCHEMA.TABLE_CONSTRAINTS
  where (TABLE_CONSTRAINTS.CONSTRAINT_TYPE = 'PRIMARY KEY')
 )
...

The SQL Server Primary Key constraint is translated to the XML Schema key constraint. Only a single key element is supported in XML Schema, just like in a SQL Server table, and the key can either be single-column, or a composite key.


Unique Constraints

...
 ,UniqueConstraint
 as
 (
 select TABLE_CONSTRAINTS.TABLE_CATALOG as [Catalog]
  ,TABLE_CONSTRAINTS.TABLE_SCHEMA as [Schema]
  ,TABLE_CONSTRAINTS.TABLE_NAME as [Table]
  ,TABLE_CONSTRAINTS.CONSTRAINT_NAME as Name
  ,[Columns]
   = (
   select COLUMN_NAME as [@xpath]
    from INFORMATION_SCHEMA.KEY_COLUMN_USAGE
    where (KEY_COLUMN_USAGE.TABLE_CATALOG = TABLE_CONSTRAINTS.TABLE_CATALOG)
     and (KEY_COLUMN_USAGE.TABLE_SCHEMA = TABLE_CONSTRAINTS.TABLE_SCHEMA)
     and (KEY_COLUMN_USAGE.TABLE_NAME = TABLE_CONSTRAINTS.TABLE_NAME)
     and (KEY_COLUMN_USAGE.CONSTRAINT_NAME = TABLE_CONSTRAINTS.CONSTRAINT_NAME)
    order by KEY_COLUMN_USAGE.ORDINAL_POSITION
    for xml path('xs:field'), type
   )
  from INFORMATION_SCHEMA.TABLE_CONSTRAINTS
  where (TABLE_CONSTRAINTS.CONSTRAINT_TYPE = 'UNIQUE')
 )
...

Multiple unique constraints in a SQL Server table can be translated to multiple XML Schema unique constraints. These can also be single-column, or composite constraints.


Self-referencing Foreign Keys

...
 ,SelfReference
 as
 (
 select FOREIGN_KEY.TABLE_CATALOG as [ForeignCatalog]
  ,FOREIGN_KEY.TABLE_SCHEMA as [ForeignSchema]
  ,FOREIGN_KEY.TABLE_NAME as [ForeignTable]
  ,FOREIGN_KEY.CONSTRAINT_NAME as ForeignKeyName
  ,PRIMARY_KEY.CONSTRAINT_NAME as PrimaryKeyName
  ,[ForeignColumns]
   = (
   select COLUMN_NAME as [@xpath]
    from INFORMATION_SCHEMA.KEY_COLUMN_USAGE
    where (KEY_COLUMN_USAGE.TABLE_CATALOG = FOREIGN_KEY.TABLE_CATALOG)
     and (KEY_COLUMN_USAGE.TABLE_SCHEMA = FOREIGN_KEY.TABLE_SCHEMA)
     and (KEY_COLUMN_USAGE.TABLE_NAME = FOREIGN_KEY.TABLE_NAME)
     and (KEY_COLUMN_USAGE.CONSTRAINT_NAME = FOREIGN_KEY.CONSTRAINT_NAME)
    order by KEY_COLUMN_USAGE.ORDINAL_POSITION
    for xml path('xs:field'), type
   )
  from INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS
   inner join INFORMATION_SCHEMA.TABLE_CONSTRAINTS FOREIGN_KEY
     on FOREIGN_KEY.CONSTRAINT_NAME = REFERENTIAL_CONSTRAINTS.CONSTRAINT_NAME
   inner join INFORMATION_SCHEMA.TABLE_CONSTRAINTS PRIMARY_KEY
     on PRIMARY_KEY.CONSTRAINT_NAME = REFERENTIAL_CONSTRAINTS.UNIQUE_CONSTRAINT_NAME
  where (FOREIGN_KEY.CONSTRAINT_TYPE = 'FOREIGN KEY')
   and (FOREIGN_KEY.TABLE_CATALOG = PRIMARY_KEY.TABLE_CATALOG)
   and (FOREIGN_KEY.TABLE_SCHEMA = PRIMARY_KEY.TABLE_SCHEMA)
   and (FOREIGN_KEY.TABLE_NAME = PRIMARY_KEY.TABLE_NAME)
 )
...

Foreign Key constraints are supported by XML Schema; however, in our particular case, we are creating schemas for individual tables. Therefore, only self-referencing foreing keys will be translated – i.e. keys, that reference one or more columns in the same table.


The Columns

...
select case
  when (TypeMap.IsFixed = 1)
   then cast(N'' as xml).query
     ('
     element  xs:element
     {
      attribute name {sql:column("COLUMNS.COLUMN_NAME")}
      ,attribute type {sql:column("TypeMap.XmlSchemaType")}
      ,attribute nillable {sql:column("Nillability.XmlNillable")}
     }
     ')
  when (COLUMNS.CHARACTER_MAXIMUM_LENGTH is not null)
   and (COLUMNS.CHARACTER_MAXIMUM_LENGTH != -1)
   then cast(N'' as xml).query
     ('
     element  xs:element
     {
      attribute name {sql:column("COLUMNS.COLUMN_NAME")}
      ,attribute nillable {sql:column("Nillability.XmlNillable")}
      ,element xs:simpleType
      {
       element  xs:restriction
       {
        attribute base {sql:column("TypeMap.XmlSchemaType")}
        ,element xs:maxLength
        {
         attribute value {sql:column("COLUMNS.CHARACTER_MAXIMUM_LENGTH")}
        }
       }
      }
     }
     ')
  when (COLUMNS.CHARACTER_MAXIMUM_LENGTH is not null)
   and (COLUMNS.CHARACTER_MAXIMUM_LENGTH = -1)
   then cast(N'' as xml).query
     ('
     element  xs:element
     {
      attribute name {sql:column("COLUMNS.COLUMN_NAME")}
      ,attribute nillable {sql:column("Nillability.XmlNillable")}
      ,element xs:simpleType
      {
       element  xs:restriction
       {
        attribute base {sql:column("TypeMap.XmlSchemaType")}
        ,element xs:maxLength
        {
         attribute value {2147483647}
        }
       }
      }
     }
     ')
  when (COLUMNS.CHARACTER_MAXIMUM_LENGTH is null)
   and (COLUMNS.NUMERIC_PRECISION is not null)
   and (COLUMNS.NUMERIC_SCALE is null or COLUMNS.NUMERIC_SCALE = 0)
   then cast(N'' as xml).query
     ('
     element  xs:element
     {
      attribute name {sql:column("COLUMNS.COLUMN_NAME")}
      ,attribute nillable {sql:column("Nillability.XmlNillable")}
      ,element xs:simpleType
      {
       element  xs:restriction
       {
        attribute base {sql:column("TypeMap.XmlSchemaType")}
        ,element xs:totalDigits
        {
         attribute value {sql:column("COLUMNS.NUMERIC_PRECISION")}
        }
       }
      }
     }
     ')
  when (COLUMNS.CHARACTER_MAXIMUM_LENGTH is null)
   and (COLUMNS.NUMERIC_PRECISION is not null)
   and (COLUMNS.NUMERIC_SCALE is not null)
   and (COLUMNS.NUMERIC_SCALE != 0)
   then cast(N'' as xml).query
     ('
     element  xs:element
     {
      attribute name {sql:column("COLUMNS.COLUMN_NAME")}
      ,attribute nillable {sql:column("Nillability.XmlNillable")}
      ,element xs:simpleType
      {
       element  xs:restriction
       {
        attribute base {sql:column("TypeMap.XmlSchemaType")}
        ,element xs:totalDigits
        {
         attribute value {sql:column("COLUMNS.NUMERIC_PRECISION")}
        }
        ,element xs:fractionDigits
        {
         attribute value {sql:column("COLUMNS.NUMERIC_SCALE")}
        }
       }
      }
     }
     ')
  else cast(N'' as xml).query
    ('
    element  xs:element
    {
     attribute name {sql:column("COLUMNS.COLUMN_NAME")}
     ,attribute type {"xs:string"}
     ,attribute nillable {sql:column("Nillability.XmlNillable")}
    }
    ')
  end
 from INFORMATION_SCHEMA.COLUMNS
  inner join Nillability
    on Nillability.SqlNillable = COLUMNS.IS_NULLABLE
  left join TypeMap
    on TypeMap.SqlType = COLUMNS.DATA_TYPE
 where (COLUMNS.TABLE_CATALOG = TABLES.TABLE_CATALOG)
  and (COLUMNS.TABLE_SCHEMA = TABLES.TABLE_SCHEMA)
  and (COLUMNS.TABLE_NAME = TABLES.TABLE_NAME)
 for xml path(''), root('Columns'), type
...

Table columns are translated to XML elements, each with appropriate XML Schema type extensions used to define the domain, which must be logically and functionally equivalent to the SQL Server domain used in each table column. Observe the case expression and see how the IsFixed column of the data type map is used, together with the actual SQL Server data type declarations, to translate SQL Server column domains to XML Schema domains.


Keys

...
select PrimaryKey.Name as [@name]
 ,'row' as [xs:selector/@xpath]
 ,(
 select PrimaryKey.[Columns]
 )
 from PrimaryKey
 where (PrimaryKey.[Catalog] = TABLES.TABLE_CATALOG)
  and (PrimaryKey.[Schema] = TABLES.TABLE_SCHEMA)
  and (PrimaryKey.[Table] = TABLES.TABLE_NAME)
 for xml path('xs:key'), root('PrimaryKey'), type
...
select UniqueConstraint.Name as [@name]
 ,'row' as [xs:selector/@xpath]
 ,(
 select UniqueConstraint.[Columns]
 )
 from UniqueConstraint
 where (UniqueConstraint.[Catalog] = TABLES.TABLE_CATALOG)
  and (UniqueConstraint.[Schema] = TABLES.TABLE_SCHEMA)
  and (UniqueConstraint.[Table] = TABLES.TABLE_NAME)
 for xml path('xs:unique'), root('UniqueConstraints'), type
...
select SelfReference.PrimaryKeyName as [@refer]
 ,SelfReference.ForeignKeyName as [@name]
 ,'row' as [xs:selector/@xpath]
 ,(
 select SelfReference.[ForeignColumns]
 )
 from SelfReference
 where (SelfReference.[ForeignCatalog] = TABLES.TABLE_CATALOG)
  and (SelfReference.[ForeignSchema] = TABLES.TABLE_SCHEMA)
  and (SelfReference.[ForeignTable] = TABLES.TABLE_NAME)
 for xml path('xs:keyref'), root('SelfReferences'), type
...

All three XML Schema key constraints follow the same principle: the selector defines the context for the key – the key is enforced inside the context of the node, designated by the XPath expression, and one or more fields define the nodes that constitute the key (again, via XPath expressions).


The XQuery Composition

...
.query
 ('
 element  xs:schema
 {
  element  xs:element
  {
   attribute name {concat(sql:column("TABLES.TABLE_SCHEMA")
      , ".", sql:column("TABLES.TABLE_NAME"))}
   ,element xs:complexType
   {
    element  xs:choice
    {
     attribute minOccurs {0}
     ,attribute maxOccurs {"unbounded"}
     ,element xs:element
     {
      attribute name {"row"}
      ,element xs:complexType
      {
       (: Table Columns :)
       element  xs:sequence {Table/Columns/xs:element}
      }
     }
    }
   }
   (: Primary Key :)
   ,Table/PrimaryKey/xs:key
   (: Unique Constraint(s) :)
   ,Table/UniqueConstraints/xs:unique
   (: Self-referencing Foreign Key(s) :)
   ,Table/SelfReferences/xs:keyref
  }
 }
 ')
...

The retrieval from INFORMATION_SCHEMA views in the example above is set-oriented – as many XML Schema definitions will be created as there are rows in the outer query (i.e. as there are user tables in the database) – all in a single result set. From the XML composition perspective, the retrieval operation is also nested – a table may contain multiple columns as well as a single constraint may be composed of multiple columns. Finally, all individual elements are placed into a single XML Schema definition.

Observe the comments inside the XQuery, enclosed in (: :) comment delimiters, to understand where and how individual T-SQL sub-queries are referenced to place the relevant XML fragments into the resulting XML document.


Ready?

When you're done reviewing the script, execute it. Observe the results, review the newly composed XML Schemas.


Homework

The best way to learn something is to try it out. :)

  • Modify the original query to see the results returned by individual CTEs: the TypeMap, the Nillability, the PrimaryKey, the UniqueConstraint, the SelfReference.
  • Modify the final query, and move individual XML columns, designated by the comments "Table Columns", "Primary Key", "Unique Constraint(s)", and "Self-reference Foreign Key(s)" to separate CTEs.

ML

Friday, March 23, 2012

SQL Server 2012 RTM

Earlier this month SQL Server 2012 RTM was announced, and the evaluation is now available for download:

The build number for the RTM is 11.0.2100.60.

Related downloads are also available:

  • Books Online for SQL Server 2012 – product documentation is available online, and can also be installed locally. Books Online are not installed as part of the SQL Server installation; the Help Viewer needs to be configured appropriately for you to access SQL Server documentation;
  • Microsoft SQL Server Data Tools – not simply a replacement for the SQL Server Business Intelligence Development Studio, this new – freely available – tool provides an improved database development experience, both to SQL Server as well as SQL Azure database developers;
  • Microsoft SQL Server 2012 Feature Pack – additional components are available separately (all 34 of them);
  • Adventure Works for SQL Server 2012 – sample databases are available on CodePlex. Read the installation instructions carefully; the deployment of data samples is not a trivial task, although it should not cause problems to a seasoned DBA.

Note to DQS users upgrading to RTM from RC0: before attempting to upgrade your Data Quality Services (DQS) installation, read the instructions available at TechNet:


May it serve you well!


ML