Thursday, June 10, 2010

Using Linq To XML to Parse XML with Multiple Namespaces

I am working on a SQL Server Reporting Services 2008 project. I decided to write a tool in C# that will document the reports that I have created.

What should make this easy is that the structure of the RDL (report) file is XML, which makes it a perfect candidate for using LINQ to XML to dig through its information.

Here’s the top of the RDL file I’m trying to process (C:\MyReport.RDL):

<?xml version="1.0" encoding="utf-8"?>
<Report xmlns:rd="http://schemas.microsoft.com/SQLServer/reporting/reportdesigner" xmlns="http://schemas.microsoft.com/sqlserver/reporting/2008/01/reportdefinition">
  <DataSources>
    <DataSource Name="SalesDM">
      <DataSourceReference>SalesDM</DataSourceReference>
      <rd:DataSourceID>60c7781c-ab9c-4c5f-b779-24afa5d5bac3</rd:DataSourceID>
      <rd:SecurityType>None</rd:SecurityType>
    </DataSource>
  </DataSources>
<DataSets>
...


I wrote the following code to parse the file:




// Load report into an XDocument
XDocument doc = XDocument.Load(@"C:\MyReport.RDL");
// Grab the DataSourceIDs of all the data sources
var dataSources = (from ds in doc.Descendants("DataSource")
       select ds.Element("DataSourceID").Value).ToList();


But for some reason, dataSources didn’t contain any results.



I found information on the web that said that I needed to create an XNamespace instance, and point it to the namespace defined in the xmlns: attribute at the top of the RDL file in the <Report> element. This XNamespace instance is then used when calling the Descendants() method of an XDocument.



What is slightly confusing is that at first glance, the <DataSource> element does not seem to have a namespace (as opposed to the DataSourceID element, which has the “rd:” prefix, signifying a namespace )



Here’s the trick: notice that the XML namespace:



xmlns=”http://schemas.microsoft.com/sqlserver/reporting/2008/01/reportdefinition”


doesn’t have a prefix (ex. “rd:”) – it is therefore the default XML namespace of all child elements of the <Report> element. Now that we know the namespace, we have to tell LINQ to XML what the namespace is when querying:






// Load report into an XDocument   
XDocument doc = XDocument.Load(@"C:\MyReport.RDL");   
// Get the XML namespace for the default namespace of the RDL file   
XNamespace ns = XNamespace.Get("http://schemas.microsoft.com/sqlserver/reporting/2008/01/reportdefinition");
// Grab the DataSourceIDs of all the data sources   
var dataSources = (from ds in doc.Descendants(ns+"DataSource")   
                   select ds).ToList();  


Prefixing “DataSource” with the namespace instance (ns) causes the LINQ query to return one item as expected.



All that we need to do to find the DataSourceID of the element. We just access the Element() method of ds, right? Close, but not quite. Take another look at the XML for the DataSourceID element:



<rd:DataSourceID>60c7781c-ab9c-4c5f-b779-24afa5d5bac3</rd:DataSourceID>


It has a different namespace than the element it is contained by. We must apply the same technique with a different XNamespace instance (ns2) when calling the Element() method. Here’s the final (working) code:



XDocument doc = XDocument.Load(@"C:\MyReport.RDL");
// Get the XML namespace for the default namespace of the RDL file
XNamespace ns = XNamespace.Get("http://schemas.microsoft.com/sqlserver/reporting/2008/01/reportdefinition");
// Get the XML namespace for the rd: namespace of the RDL file
XNamespace ns2 = XNamespace.Get("http://schemas.microsoft.com/SQLServer/reporting/reportdesigner");
// Grab the DataSourceIDs of all the data sources
var dataSources = (from ds in doc.Descendants(ns + "DataSource")
       select ds.Element(ns2+"DataSourceID").Value).ToList(); 


I hope this saves you some time!