Wednesday, July 3, 2013

XML Parser in SQL Server (T-SQL), How to parse XML in SQL

There are many cases when we want to read XML file and store data in SQL Server table

Suppose we have XML file as show below:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
<root
 <firstnode
   <insidefirstnode>Inside First Node</insidefirstnode
 </firstnode
 <secondnode
   <insidesecondnode1>Inside Second Node 1</insidesecondnode1
   <insidesecondnode2>inside Second Node 2</insidesecondnode2
 </secondnode
 <thirdnode
   <item id="0">Item 1</item
   <item id="1">Item 2</item
   <item id="2">Item 3</item
 </thirdnode
</root>

There are two options to parse XML in T-SQL

1- nodes method msdn link

2- OPENXML msdn link

nodes method in T-SQL

nodes method is useful when you want to read a node data from a XML, Lets take an example for above xml

If I want to read data from a specific node 'insidesecondnode1' from above xml file

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
DECLARE @x xml
SET @x=
'<root>   
 <firstnode>   
   <insidefirstnode>Inside First Node</insidefirstnode>   
 </firstnode>   
 <secondnode>   
   <insidesecondnode1>Inside Second Node 1</insidesecondnode1>   
   <insidesecondnode2>inside Second Node 2</insidesecondnode2>   
 </secondnode>   
 <thirdnode>   
   <item id="0">Item 1</item>   
   <item id="1">Item 2</item>   
   <item id="2">Item 3</item>   
 </thirdnode>   
</root>'
SELECT T.c.query('text()') AS result
FROM   @x.nodes('/root/secondnode/insidesecondnode1') T(c)
GO

Result for above query is as shown below

In the line 18, what ever path you will give this query will result the data inside the node

If a node is present multiple time in xml for example 'item' node inside 'thirdnode' in above xml, then this query will fetch data from all the nodes

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
DECLARE @x xml
SET @x=
'<root>   
 <firstnode>   
   <insidefirstnode>Inside First Node</insidefirstnode>   
 </firstnode>   
 <secondnode>   
   <insidesecondnode1>Inside Second Node 1</insidesecondnode1>   
   <insidesecondnode2>inside Second Node 2</insidesecondnode2>   
 </secondnode>   
 <thirdnode>   
   <item id="0">Item 1</item>   
   <item id="1">Item 2</item>   
   <item id="2">Item 3</item>   
 </thirdnode>   
</root>'
SELECT T.c.query('text()') AS result
FROM   @x.nodes('/root/thirdnode/item') T(c)
GO

Result of above query:

If you want to fetch particular node in above case and id value also, then use below query:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
  <firstnode>   
   <insidefirstnode>Inside First Node</insidefirstnode>   
 </firstnode>   
 <secondnode>   
   <insidesecondnode1>Inside Second Node 1</insidesecondnode1>   
   <insidesecondnode2>inside Second Node 2</insidesecondnode2>   
 </secondnode>   
 <thirdnode>   
   <item id="0">Item 1</item>   
   <item id="1">Item 2</item>   
   <item id="2">Item 3</item>   
 </thirdnode>   
'
SELECT T.c.value('@id','int') ID
       ,T.c.query('text()') AS result
FROM   @x.nodes('/root/thirdnode/item[2]') T(c)
GO

In above query line 18, by appending with [number] will read only that specific node, in this case second node

Result of above query:

OPENXML in T-SQL

In nodes method there is one restriction, you can not pass SQL parameter in nodes method

Suppose you have many keys stored in one table as shown below

We can iterate over all the keys using this table to fetch corresponding data from xml file and store to another table as simple text

We can do this by applying a loop and passing all the keys stores in table

SQL Query to implement this approach is as shown below:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
DECLARE @hdoc [int]
DECLARE @Key [varchar](1000)
DECLARE @XMLDoc [XML]
DECLARE @NumberOfKeys [int]
DECLARE @Counter [int] = 0;
DECLARE @Value [varchar](2000);
 
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[TempTable]') AND TYPE IN (N'U'))
  BEGIN
   DROP TABLE [dbo].[TempTable]
  END
  -- Creating temp table to store values parsed from XML
  CREATE TABLE [dbo].[TempTable](
      [ID] [int],
   [Key] [varchar](2000),
   [Value] [varchar](2000)
  )
SET @XMLDoc=
'<root>   
 <firstnode>   
   <insidefirstnode>Inside First Node</insidefirstnode>   
 </firstnode>   
 <secondnode>   
   <insidesecondnode1>Inside Second Node 1</insidesecondnode1>   
   <insidesecondnode2>inside Second Node 2</insidesecondnode2>   
 </secondnode>   
 <thirdnode>   
   <item id="0">Item 1</item>   
   <item id="1">Item 2</item>   
   <item id="2">Item 3</item>   
 </thirdnode>   
</root>'
SET @NumberOfKeys= (SELECT COUNT([Key]) FROM [JPDB].[dbo].[XMLKeysTable]);
 
EXEC sp_xml_preparedocument @hdoc OUTPUT, @XMLDoc
 
 WHILE(@Counter < @NumberOfKeys)
 BEGIN
   SELECT @Key = [Key] FROM [JPDB].[dbo].[XMLKeysTable] ORDER BY [Key] OFFSET @Counter ROWS FETCH NEXT 1 ROWS ONLY
   SELECT @Value = Value FROM OPENXML (@hdoc, @Key,1) WITH (Value [varchar](1000) 'text()[1]')
   INSERT INTO [dbo].[TempTable]
   VALUES(@Counter, @Key, @Value)
   SET @Counter += 1;
  END
 
--Removes the internal representation of the XML document specified by the document handle and invalidates the document handle
EXEC sp_xml_removedocument @hdoc
 
SELECT * FROM [dbo].[TempTable]

Result of above query:

You can also refer my blog: Pass XML file in stored procedure as a input parameter from C#

2 comments: