Wednesday, July 3, 2013

XML Parser in SQL Server (T-SQL), How to parse XML in SQL

There are many cases when we want to read XML file and store data in SQL Server table

Suppose we have XML file as show below:

  
   
   Inside First Node  
   
   
   Inside Second Node 1  
   inside Second Node 2  
   
   
   Item 1  
   Item 2  
   Item 3  
   
 

There are two options to parse XML in T-SQL

1- nodes method msdn link

2- OPENXML msdn link

nodes method in T-SQL

nodes method is useful when you want to read a node data from a XML, Lets take an example for above xml

If I want to read data from a specific node 'insidesecondnode1' from above xml file

DECLARE @x xml 
SET @x=
'    
     
   Inside First Node    
     
     
   Inside Second Node 1    
   inside Second Node 2    
     
     
   Item 1    
   Item 2    
   Item 3    
     
'
SELECT T.c.query('text()') AS result
FROM   @x.nodes('/root/secondnode/insidesecondnode1') T(c)
GO

Result for above query is as shown below

In the line 18, what ever path you will give this query will result the data inside the node

If a node is present multiple time in xml for example 'item' node inside 'thirdnode' in above xml, then this query will fetch data from all the nodes

DECLARE @x xml 
SET @x=
'    
     
   Inside First Node    
     
     
   Inside Second Node 1    
   inside Second Node 2    
     
     
   Item 1    
   Item 2    
   Item 3    
     
'
SELECT T.c.query('text()') AS result
FROM   @x.nodes('/root/thirdnode/item') T(c)
GO

Result of above query:

If you want to fetch particular node in above case and id value also, then use below query:

      
   Inside First Node    
     
     
   Inside Second Node 1    
   inside Second Node 2    
     
     
   Item 1    
   Item 2    
   Item 3    
     
'
SELECT T.c.value('@id','int') ID
       ,T.c.query('text()') AS result
FROM   @x.nodes('/root/thirdnode/item[2]') T(c)
GO

In above query line 18, by appending with [number] will read only that specific node, in this case second node

Result of above query:

OPENXML in T-SQL

In nodes method there is one restriction, you can not pass SQL parameter in nodes method

Suppose you have many keys stored in one table as shown below

We can iterate over all the keys using this table to fetch corresponding data from xml file and store to another table as simple text

We can do this by applying a loop and passing all the keys stores in table

SQL Query to implement this approach is as shown below:

DECLARE @hdoc [int]
DECLARE @Key [varchar](1000)
DECLARE @XMLDoc [XML]
DECLARE @NumberOfKeys [int]
DECLARE @Counter [int] = 0;
DECLARE @Value [varchar](2000);

IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[TempTable]') AND TYPE IN (N'U'))
  BEGIN
   DROP TABLE [dbo].[TempTable]
  END
  -- Creating temp table to store values parsed from XML
  CREATE TABLE [dbo].[TempTable](
      [ID] [int],
   [Key] [varchar](2000),
   [Value] [varchar](2000)
  ) 
SET @XMLDoc=
'    
     
   Inside First Node    
     
     
   Inside Second Node 1    
   inside Second Node 2    
     
     
   Item 1    
   Item 2    
   Item 3    
     
'
SET @NumberOfKeys= (SELECT COUNT([Key]) FROM [JPDB].[dbo].[XMLKeysTable]);

EXEC sp_xml_preparedocument @hdoc OUTPUT, @XMLDoc

 WHILE(@Counter < @NumberOfKeys)
 BEGIN
   SELECT @Key = [Key] FROM [JPDB].[dbo].[XMLKeysTable] ORDER BY [Key] OFFSET @Counter ROWS FETCH NEXT 1 ROWS ONLY
   SELECT @Value = Value FROM OPENXML (@hdoc, @Key,1) WITH (Value [varchar](1000) 'text()[1]')
   INSERT INTO [dbo].[TempTable] 
   VALUES(@Counter, @Key, @Value)
   SET @Counter += 1;
  END

--Removes the internal representation of the XML document specified by the document handle and invalidates the document handle
EXEC sp_xml_removedocument @hdoc

SELECT * FROM [dbo].[TempTable]

Result of above query:

You can also refer my blog: Pass XML file in stored procedure as a input parameter from C#

2 comments: