Tag Archives: xQuery

xQuery and iTunes

It’s been a while since I’ve posted an xQuery. So I dusted off the old saxon processor and thought I’d have a bit of fun with the iTunes library.
I got me some well formed xml by selecting the Podcasts node in iTunes’ left column and exported it to Podcasts.xml (File/Library/Export Playlist). That gives you this xml file. So what now? Run some xQuery against it, is what!

You have to examine the xml a bit to get a feel for how it is laid out. The xQuery which I ran searches for all podcasts by the artist Jupiter Broadcasting and sets out several pieces of meta-data for those podcast episodes (Name, Kind, Bit Rate, Sample Rate). The xQuery is:

declare namespace dave = "http://www.localhost:8090/xquery";

declare function dave:getJupiterBroadcastingShows($allShows as element()+) as element()+
{	
	let $JBShows := $allShows[contains(child::string[preceding-sibling::key[1]/child::text() = 'Artist'], 'Jupiter Broadcasting')]
	return $JBShows
};


let $list := doc("Podcasts.xml")/plist/dict/key[.="Tracks"]/following-sibling::dict[1]
let $key := $list/child::key
let $pods := $key/following-sibling::dict[1]
let $podCastJupiter := dave:getJupiterBroadcastingShows($pods)
return
	<podcasts>
		{
			for $keyVal in $podCastJupiter
			return
				<podcast>
					<name>
						{$keyVal/child::key[child::text() = 'Name']/following-sibling::string[1]/child::text()}
					</name>
					<filetype>
						{$keyVal/child::key[child::text() = 'Kind']/following-sibling::string[1]/child::text()}
					</filetype>
					<bitrate>
						{$keyVal/child::key[child::text() = 'Bit Rate']/following-sibling::integer[1]/child::text()}
					</bitrate>
					<samplerate>
						{$keyVal/child::key[child::text() = 'Sample Rate']/following-sibling::integer[1]/child::text()}
					</samplerate>
				</podcast>
		}
		
	</podcasts>

and the result of this query was:

<?xml version="1.0" encoding="UTF-8"?>
<podcasts>
   <podcast>
      <name>The Computer Action Show! Season 2 Episode 1</name>
      <filetype>MPEG audio file</filetype>
      <bitrate>96</bitrate>
      <samplerate>44100</samplerate>
   </podcast>
   <podcast>
      <name>The Computer Action Show! Season 2 Episode 2</name>
      <filetype>MPEG audio file</filetype>
      <bitrate>96</bitrate>
      <samplerate>44100</samplerate>
   </podcast>
   <podcast>
      <name>The Computer Action Show! Season 2 Episode 3</name>
      <filetype>MPEG audio file</filetype>
      <bitrate>80</bitrate>
      <samplerate>44100</samplerate>
   </podcast>
   <podcast>
      <name>The Computer Action Show! Season 1 Episode 4</name>
      <filetype>MPEG audio file</filetype>
      <bitrate>96</bitrate>
      <samplerate>44100</samplerate>
   </podcast>
   <podcast>
      <name>The Computer Action Show! Season 1 Episode 5</name>
      <filetype>MPEG audio file</filetype>
      <bitrate>96</bitrate>
      <samplerate>44100</samplerate>
   </podcast>
   <podcast>
      <name>The Computer Action Show! Season 1 Episode 6</name>
      <filetype>MPEG audio file</filetype>
      <bitrate>96</bitrate>
      <samplerate>44100</samplerate>
   </podcast>
   <podcast>
      <name>The Computer Action Show! Season 1 Episode 7</name>
      <filetype>MPEG audio file</filetype>
      <bitrate>96</bitrate>
      <samplerate>44100</samplerate>
   </podcast>
   <podcast>
      <name>The Computer Action Show! Season 1 Episode 9</name>
      <filetype>MPEG audio file</filetype>
      <bitrate>96</bitrate>
      <samplerate>44100</samplerate>
   </podcast>
   <podcast>
      <name>The Computer Action Show! Season 1 Episode 10</name>
      <filetype>MPEG audio file</filetype>
      <bitrate>96</bitrate>
      <samplerate>44100</samplerate>
   </podcast>
   <podcast>
      <name>The Computer Action Show! Season 2 Episode 4</name>
      <filetype>MPEG audio file</filetype>
      <bitrate>96</bitrate>
      <samplerate>48000</samplerate>
   </podcast>
</podcasts>

I know I know. It’s awesome.

xQuery and SQL Server – a Basic Look

The database table which I am using in this post: xmlDB.zip

xQuery comes in quite handy when it comes to querying Xml fields in SQL Server 2005 or 2008. SQL Server supports many xQuery language features, and its Xml data type provides a variety of functions to enable the use of xQuery.

To demonstrate, I wanted to keep things very simple and created a relation with the following schema:

schema
The ItemId is just an artificial identifier which performs the role of primary key for each tuple. The SalesRep field (which is of the Xml type) contains information about the sales reps. The Xml fragment stored in that field is similar to the ones which I have been querying in previous posts.

Starting off with a simple example, I have used the query function of the Xml type to return the Name element from the SalesRep field for each tuple:

SELECT SalesRep.query('//Name') as [RepName]
FROM Employees

which returns:
repName
To return the actual text nodes of particular Name elements, we can use:

SELECT SalesRep.query('//Name/child::text()') as [RepName]
FROM Employees
WHERE Office = 'Brisbane'

If we want to use an xQuery query in a WHERE clause, we need to use the exist function of SQL Server’s Xml type. For example, we may want to return every REP element with an ID of 8:

SELECT SalesRep.query('/REP') as [Rep]
FROM Employees
WHERE SalesRep.exist('/REP/ID/child::text()[. = "8"]') = 1

The way we do that, is to use a predicate in the WHERE clause. The query has asked for every element that is returned when the predicate is true i.e. when the exist function equals 1 (1 being true).

We may want to find those sales reps whose Sales were less than $10,000:

SELECT SalesRep.query('/REP') as Rep
FROM Employees
WHERE SalesRep.exist('/REP/Sales/child::text()[xs:integer(.) < 10000]') = 1

This is just a taste of what can be done in terms of utilizing xQuery to query XML fields in SQL Server databases. Such queries are much faster, and more efficient, than ones which cast the Xml field to NVARCHAR and then perform a LIKE %xyz% in the WHERE clause. Cool, eh?!

xQuery and Recursion

In my last post, I gave a couple of examples of writing and calling functions. In this post, I want to look at how to solve a problem with recursion. Again, I will work with the same XML document from my previous post. Here, I will look at how to format the sales numbers as currency values.

For the sake of simplicity, I will not deal with cent values. So, no decimals. This is just so that we don’t lose focus from the main objective of the exercise.

Formatting a currency, using the Australian region style, uses commas as thousand separators i.e. $10,000.

So, my first pass at solving this problem would be a function such as:

declare function dave:formatCurrencyLimited($dollarAmt as xs:string) as xs:string
{
	let $nrDigits := string-length($dollarAmt)
	return 
		concat("$", concat(substring($dollarAmt, 1, $nrDigits - 3), concat(",", substring($dollarAmt, $nrDigits - 2))))
};

You can see that I have used the built-in, string-formatting functions concat , substring and replace to return values in the currency format that we want.

Using that in the following xquery:

let $reps := doc("reps.xml")/SampleDataset
let $sales := $reps//Sales
return
	<Result>
		{
			for $eachSalesNr in $sales 
			return 
				<SalesAmount>{dave:formatCurrencyLimited($eachSalesNr/child::text())}</SalesAmount>
				
		}
	</Result>

would return the following result:

<?xml version="1.0" encoding="UTF-8"?>
<Result>
   <SalesAmount>$10,440</SalesAmount>
   <SalesAmount>$17,772</SalesAmount>
   <SalesAmount>$23,880</SalesAmount>
   <SalesAmount>$7,663</SalesAmount>
   <SalesAmount>$21,773</SalesAmount>
   <SalesAmount>$32,294</SalesAmount>
   <SalesAmount>$11,983</SalesAmount>
   <SalesAmount>$14,991</SalesAmount>
</Result>

Now that works fine when the values are less than 1,000,000. But for any value above 999,999, my function would only return a string with one comma e.g. 1000,000. Not knowing how big the potential sales could be, we can’t really get the job done properly with my 1st attempt.

To solve this problem, we can use recursion. If you are unfamiliar with recursion, you should read about it elsewhere, and then come back to see what I have done with the following function:

declare function dave:formatCurrency($dollarAmt as xs:string) as xs:string
{
	let $nrDigits := string-length($dollarAmt)
	return 
		if($nrDigits > 3)
		then concat(dave:formatCurrency(substring($dollarAmt, 1, $nrDigits - 3)), replace($dollarAmt, $dollarAmt, concat(",", substring($dollarAmt, $nrDigits - 2))))
		else concat("$", $dollarAmt)
};

Looking at that version of the function, we can see that the base case is satisfied where $nrDigits equals 3 or less. If we have not reached the base case, the function calls itself until the base case is reached (it reduces the xs:string parameter by 3 each time, until the base case is satisfied).

Now, if I change Aaron’s sales amount to 10440000, my new function will return the following result:

<?xml version="1.0" encoding="UTF-8"?>
<Result>
   <SalesAmount>$10,440,000</SalesAmount>
   <SalesAmount>$17,772</SalesAmount>
   <SalesAmount>$23,880</SalesAmount>
   <SalesAmount>$7,663</SalesAmount>
   <SalesAmount>$21,773</SalesAmount>
   <SalesAmount>$32,294</SalesAmount>
   <SalesAmount>$11,983</SalesAmount>
   <SalesAmount>$14,991</SalesAmount>
</Result>

And this function will work every time, regardless of how big the sales figure becomes.

xQuery Functions

xQuery supports functions. In this post, by way of an example, I will demonstrate how to go about writing xQuery functions. For the example, I will be querying the following XML document (simple data about sales representatives):

<?xml version="1.0" standalone="yes"?>
<SampleDataset>
  <REPS>
    <ID>1</ID>
    <Name>Aaron</Name>
    <RegionID>1</RegionID>
    <Sales>10440</Sales>
  </REPS>
  <REPS>
    <ID>2</ID>
    <Name>Larry</Name>
    <RegionID>2</RegionID>
    <Sales>17772</Sales>
  </REPS>
  <REPS>
    <ID>3</ID>
    <Name>Andrew</Name>
    <RegionID>3</RegionID>
    <Sales>23880</Sales>
  </REPS>
  <REPS>
    <ID>4</ID>
    <Name>Mary</Name>
    <RegionID>1</RegionID>
    <Sales>7663</Sales>
  </REPS>
  <REPS>
    <ID>5</ID>
    <Name>Sally</Name>
    <RegionID>4</RegionID>
    <Sales>21773</Sales>
  </REPS>
  <REPS>
    <ID>6</ID>
    <Name>Nguyen</Name>
    <RegionID>2</RegionID>
    <Sales>32294</Sales>
  </REPS>
  <REPS>
    <ID>7</ID>
    <Name>Francis</Name>
    <RegionID>4</RegionID>
    <Sales>11983</Sales>
  </REPS>
  <REPS>
    <ID>8</ID>
    <Name>Jerry</Name>
    <RegionID>3</RegionID>
    <Sales>14991</Sales>
  </REPS>
</SampleDataset>

Let’s say we want to find out the average sales amount, along with the details of those reps who have achieved sales above that average amount. First, we need to calculate the average amount of sales. To achieve that, we can write the following function getAverageSales:

declare namespace dave = "http://www.localhost:8090/xquery";

declare function dave:getAverageSales($allReps as element()) as xs:decimal
{
	let $avgSales := xs:decimal(avg($allReps/REPS/Sales))
	return $avgSales
};

A couple of things to notice.

  • I have created a namespace for the function – dave.
  • the function takes one parameter, being an element. The element which will be passed to this function will be the SampleDataset element i.e. the root element of the XML document.
  • the function has an xs:decimal return-type.
  • the function uses the built-in function, avg, on the sequence of nodes comprising the Sales. It then casts the result of that to the xs:decimal type, using the xs:decimal constructor. That amount is returned.

I haven’t really added much value to the avg function there. But I wanted to start with a simple example.

Next, we need a function that will take as parameters:

  • a sequence of elements for the sales representatives; and
  • the average sales amount.

This function will return a sequence of elements representing those sales representatives who achieved sales greater than the average. This can be written as follows:

declare function dave:getEmployeesBeatAvg($allReps as element()+, $avgSales as xs:decimal) as element()+
{
	let $empsBeatAvg := $allReps[Sales > $avgSales]
	return $empsBeatAvg
};

Finally, we just need to write the query itself. Calling the functions is much like how it is done in other languages. You call the function and pass it actual parameters:

let $reps := doc("reps.xml")/SampleDataset
let $avgSales := dave:getAverageSales($reps)
let $repsBeatAvg := dave:getEmployeesBeatAvg($reps/REPS, $avgSales)
return
	<Result>
		{
			<AvergageSales>{$avgSales}</AvergageSales>,
			<EmployeesAboveAverage>
				{
					for $eachRep in $repsBeatAvg
					order by $eachRep/Sales descending
					return
						<Employee>{$eachRep/ID, $eachRep/Name, $eachRep/RegionID, $eachRep/Sales}</Employee>
				}
			</EmployeesAboveAverage>
		}
	</Result>

This will give us the following result:

<?xml version="1.0" encoding="UTF-8"?>
<Result>
   <AvergageSales>17599.5</AvergageSales>
   <EmployeesAboveAverage>
      <Employee>
         <ID>6</ID>
         <Name>Nguyen</Name>
         <RegionID>2</RegionID>
         <Sales>32294</Sales>
      </Employee>
      <Employee>
         <ID>3</ID>
         <Name>Andrew</Name>
         <RegionID>3</RegionID>
         <Sales>23880</Sales>
      </Employee>
      <Employee>
         <ID>5</ID>
         <Name>Sally</Name>
         <RegionID>4</RegionID>
         <Sales>21773</Sales>
      </Employee>
      <Employee>
         <ID>2</ID>
         <Name>Larry</Name>
         <RegionID>2</RegionID>
         <Sales>17772</Sales>
      </Employee>
   </EmployeesAboveAverage>
</Result>

In my next post about xQuery, I will take a look at how we can take those amounts and format them as currency.

xQuery

I see xQuery as XSLT’s less popular, XML-derived brother. It has not been embraced to the same extent as XSLT, and does not get talked about as much on the Web. Where there is a void, there is opportunity; an opportunity to write about xQuery! So, I plan to write the odd post about xQuery, with demonstrations and observations that may prove valuable to someone learning this very useful and powerful XML query language.

The first thing to note is that you really need to know the rules of XML (the basics) to write xQuery queries.

The second thing to note is that you need to learn XPath before you learn xQuery. XPath is used throughout xQuery queries. Whilst a very useful tool for returning information about an XML document, XPath has its limitations and does not have the same transformational cababilities of xQuery.

xQuery is a functional language. Being the first functional language which I learnt, it took a bit of time to learn how to think in a “functional language” kind of way. Previously, I only knew object oriented/procedural programming paradigms.

So, on with a demonstration! For this demo, I am using the following XML file from Priscilla Walmsley’s excellent book, “XQuery”:

<catalog>
  <product dept="WMN">
    <number>557</number>
    <name language="en">Fleece Pullover</name>
    <colorChoices>navy black</colorChoices>
  </product>
  <product dept="ACC">
    <number>563</number>
    <name language="en">Floppy Sun Hat</name>
  </product>
  <product dept="ACC">
    <number>443</number>
    <name language="en">Deluxe Travel Bag</name>
  </product>
  <product dept="MEN">
    <number>784</number>
    <name language="en">Cotton Dress Shirt</name>
    <colorChoices>white gray</colorChoices>
    <desc>Our <i>favorite</i> shirt!</desc>
  </product>
</catalog>

Now, we can use xQuery, not just to return information about each product, but to also transform the results into an html file which can be displayed. To that end, I ran the following query, using the open source xQuery processor Saxon-HE:

declare option saxon:output "doctype-public=-//W3C//DTD XHTML 1.0 Transitional//EN";
declare option saxon:output "doctype-system=http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd";

let $catalog := doc("catalog.xml")/catalog
return
	<html>
		<h1>Product Catalog</h1>
		<table style="border: 1px solid #C0C0C0; border-collapse: collapse;">
		<tr style="background-color: #C0C0C0;" >
			<td style="border: 1px solid #C0C0C0"><b>Product Nr</b></td>
			<td style="border: 1px solid #C0C0C0"><b>Description</b></td>
			<td style="border: 1px solid #C0C0C0"><b>Colours</b></td>
		</tr>
		{
			for $product in $catalog/product
				return

				<tr>
					<td style="border: 1px solid #C0C0C0">{$product/number/text()}</td>
					<td style="border: 1px solid #C0C0C0">{$product/name/text()}</td>
					<td style="border: 1px solid #C0C0C0">{$product/colorChoices/text()}</td>
				</tr>
		}
		</table>
	</html>

The output file, specified as an html document using Saxon at the command line, can be seen in this screen capture:

Screen shot of result as grid

Screen shot of result as grid


How cool is that?! It may not be all the rage, but I think xQuery is a very useful tool when working with XML documents, whether the documents are being used to store data, or to facilitate information exchange.