Friday, November 24, 2006

SQL Server 2005 XML Methods, Part One, Basic Retrieval

As mentioned in one of the previous posts XML became a native data type with SQL Server 2005. Opinions on its usefulness in a relational database are divided – "fragmented", even. ;) The purpose of this series of posts is to shed some light on how the XML data type could be utilized efficiently, and what are the caveats that should be avoided. In this first part specifically, we'll take a look at the built-in data retrieval methods concerning the XML data type.

So, you've decided on using the XML data type in your SQL Server database. Now your main concern is how to access parts of that XML in your queries. There are three data retrieval XML methods available in SQL Server 2005:

  • query() – returns the results of an XML Query (the entire result of the XML Query in one row). The result is untyped XML;
  • nodes() – returns a rowset of nodes returned by an XML Query (one row for each node matched by the XML Query expression). The result is a rowset of untyped XML values;
  • value() – returns the result of an XML Query as a value in the specified data type (the XML Query statement and the name of the data type for the result are passed as parameters). The result of the XML Query used in this function must be a singleton, otherwise an error is raised. The final result is value in the data type specified at the function's invocation.

Which one to use and when?

If values are needed rather than XML nodes then the value function should be used to access the data (obvious, isn't it?). When designing queries it is vital to take into account that the XQuery results evaluate to singletons before the value function can be applied.

Yes, an example might be of help here:

Let's start with the following XML:

declare @xml xml

set @xml = N'<?xml version="1.0" encoding="utf-8"?>
<attributeCollection>
 <attribute>
  <name>firstName</name>
  <value>Catherine</value>
 </attribute>
 <attribute>
  <name>middleName</name>
  <value>R.</value>
 </attribute>
 <attribute>
  <name>lastName</name>
  <value>Abel</value>
 </attribute>
</attributeCollection>'

Executing either of these two queries:

  • Query #1
    select @xml.query(
                   '//attribute'
                   ).value(
                           'value'
                           ,'varchar(max)'
                           ) as AttributeValue
    ...or:
  • Query #2
    select Attributes.Attribute.value(
                    'value'
                    ,'varchar(max)'
                    ) as AttributeValue
     from @xml.nodes(
            '//attribute'
            ) Attributes (Attribute)

...leads to the following exception:

Msg 2389, Level 16, State 1, Line #
XQuery [value()]: 'value()' requires a singleton (or empty sequence),
found operand of type 'xdt:untypedAtomic *'

The road to a valid XML singleton lies within the XPath expression – it must evaluate to a single XML node. If we correct the XPath expression in both queries, the exception won't occur:

  • Query #3
    select @xml.query(
                    '//attribute/value'
                    ).value(
                            '.'
                            ,'varchar(max)'
                            ) as AttributeValue
    ...and:
  • Query #4
    select Attributes.Attribute.value(
                    '.'
                    ,'varchar(max)'
                    ) as AttributeValue
     from @xml.nodes(
                '//attribute/value'
                ) Attributes (Attribute)

This brings us to the first consideration regarding the choice of XML data retrieval methods. The principal question we need to answer is purely conceptual and deals with the business purpose of a particular XML: what relationships exist inside the particular XML? And don't say none. ;) The goal of queries against the XML should be clear – clarify it by deciding whether the relationships inside the XML are of value to the relational model.

For example: Query #3 returns a concatenation of values from all nodes corresponding to the XPath expression, while Query #4 returns a set of values:

  • Results for Query #3:
    AttributeValue
    -------------------
    CatherineR.Abel
    
    (1 row(s) affected)
  • Results for Query #4:
    AttributeValue
    -------------------
    Catherine
    R.
    Abel
    
    (3 row(s) affected)

Usually, if the relationships in a particular XML model significantly influence the data use then it would be more appropriate (especially performance-wise) if they were properly catered for in the database model directly (stored separately, relationally constrained, and indexed if applicable) rather than left to dwell within the depths of the XML.

However, there are cases when it is more appropriate (from a business perspective) to leave data as XML entities even in the data store; e.g. document management systems (where each document is stored as an XML entity and a single data model is used to store several classes or types of documents), information platforms (where on the business layer the solution needs to cater to seemingly highly diverse requirements, often leaving the end-users to have uninhibitted control over the structure of their entities, while the solution actually allows for unified physical storage), etc.

In fact, once again there is no straight answer – the actual implementation depends on the requirements of the case in hand. This influences the design of the data model, and the latter influences the choice of appropriate retrieval methods.


Is that it?

Well, this has been an incredibly brief introduction to the subject of XML retrieval methods provided by SQL Server 2005, but we're not done yet! Next time we'll be looking at a few examples where the three built-in XML methods we've seen today are used in various combinations. And we might just as well make a few surprising discoveries. ;)


ML

16 comments:

Unknown said...

What if you wanted to return name and value? Meaning you want to return two columns per row via the query 4 method.

Rosie

Matija Lah said...

Rosie,

In your case the XPath expression of the nodes method should target the entire Attribute node, and then you could use the query method to target the appropriate child node in order to extract data from both.

E.g.:
select Attributes.Attribute.query
('
name
').value
(
'.'
,'varchar(max)'
) as AttributeName
,Attributes.Attribute.query
('
value
').value
(
'.'
,'varchar(max)'
) as AttributeValue
from @xml.nodes
('
//attribute
') Attributes (Attribute)


ML

Unknown said...

Thank you that worked like a charm.

Matija Lah said...

You're welcome!

Next time you might get better/more help in microsoft.public.sqlserver.programming.


ML

Unknown said...

hi ,

Please check the below code ,
At present, i am currently getting the data from a variable , is there any possibility that i can get the data from a single sql query. I dont want to have another variable in between.

declare @test xml

set @test = ( select request_xml from App_Xml_Request where app_id='456')

select Attributes.Attribute.value(
'.'
,'varchar(MAX)'
)as AttributeValue
from @test.nodes(
'/Application/BusinessOwner/SSN'
) Attributes (Attribute)

Matija Lah said...

A single SELECT statement is enough.

E.g. (untested):

select Attributes.Attribute.query('.').value('.', 'varchar(max)') as AttributeValue
from App_Xml_Request
cross apply App_Xml_Request.request_xml.nodes
('
/Application/BusinessOwner/SSN
') Attributes (Attribute)
where (app_id = '456')


ML

Anonymous said...

I just wanted to say a BIG thank you - this helped me out in a big way! I've been struggling with the XML query syntax for some time, and this post cleared up the issues I was having.

Anonymous said...

Is it possible to separate values with any character like '|', for example:
Results for Query #3:

AttributeValue
-------------------
CatherineR.Abel

(1 row(s) affected)

It should be:
Results for Query #3:

AttributeValue
-------------------
Catherine|R.|Abel

(1 row(s) affected)

Matija Lah said...

Would this work for you?

select @xml.query('
for $i in (//attribute/value)
return concat($i, "|")
') as AttributeValueDelimited


Result:

AttributeValueDelimited
-----------------------
Catherine| R.| Abel|


ML

Anonymous said...

Thanks Matija, its working but its working fine when there is only one attributecollection, what if there are multiple collections?
Here is the scenario:

declare @xml xml
set @xml = N'
Report
ReportParameter Name="CostType"
ValidValues
ParameterValues
ParameterValue
Value R /Value
Label R (REG) /Label
/ParameterValue
ParameterValue
Value P /Value
Label P (PROD) /Label
/ParameterValue
ParameterValue
Value M /Value
Label M (MO) /Label
/ParameterValue
/ParameterValues
/ValidValues
/ReportParameter
ReportParameter Name="IncludeAllow"
ValidValues
ParameterValues
ParameterValue
Value True /Value
Label Yes /Label
/ParameterValue
ParameterValue
Value False /Value
Label No /Label
/ParameterValue
/ParameterValues
/ValidValues
/ReportParameter
/Report'
select @xml.value('(/Report/ReportParameter/@Name)[1]', 'VARCHAR(50)') as ParamName
, @xml.query('for $i in (//Report/ReportParameter/ValidValues/ParameterValues/ParameterValue/Value) return concat($i, "|")') as ParamValue

Result:
ParamName ParamValue
CostType R| P| M| True| False|

Required Result:
ParamName ParamValue
CostType R| P| M|
IncludeAllow True| False|

Matija Lah said...

In that case you first need to get the ReportParameter node list (/Report/ReportParameter), extract the Name attribute, and then create a delimited list of corresponding values (ValidValues/ParameterValues/ParameterValue/Value).

E.g.:

select Report.ReportParameter.query('data(@Name)').value('.', 'varchar(50)') as ParamName
,Report.ReportParameter.query
('
for $v in ValidValues/ParameterValues/ParameterValue/Value
return concat($v, "|")
') as ParamValue
from @xml.nodes
('
/Report/ReportParameter
') Report (ReportParameter)


For practice, extend the above query to produce an additional column, named ParamLabel, containing a delimited list of labels (ValidValues/ParameterValues/ParameterValue/Label).


ML

Anonymous said...

Wao! Its working and have solved my problem. Thanks a lot!

Matija Lah said...

No, problem!

You can find more answers in the SQL Server XML MSDN forum.


ML

Doremon said...

Thank you so much for this article. Helped me fix that darn error!!

Doremon said...

Thank you so much for posting this article. Helped me fix the error. Very concise and useful!

Unknown said...

brillant piece of information, I had come to know about your web-page from my friend hardkik, chennai,i have read atleast 9 posts of yours by now, and let me tell you, your webpage gives the best and the most interesting information. This is just the kind of information that i had been looking for, i'm already your rss reader now and i would regularly watch out for the new posts, once again hats off to you! Thanx a million once again, Regards, Single Row Function in sql