Lesson 7 - Reading XML via the SAX approach in C# .NET
In the previous lesson, Introduction to XML and writing via the SAX approach, we introduced the XML format and created a simple XML file using the SAX approach. In today's tutorial, we'll continue with the demo and implement the opposite process: loading the XML file with users and building the corresponding object structure (a list of users) from it.
For completeness' sake, we'll show what our file.xml XML file looks like at the moment:
<?xml version="1.0" encoding="utf-8"?>
<users>
    <user age="22">
        <name>John Smith</name>
        <registered>3/21/2000</registered>
    </user>
    <user age="31">
        <name>James Brown</name>
        <registered>10/30/2016</registered>
    </user>
    <user age="16">
        <name>Tom Hanks</name>
        <registered>1/12/2011</registered>
    </user>
</users>
Here's what our User.cs class looks like:
class User
{
    public string Name { get; private set; }
    public int Age { get; private set; }
    public DateTime Registered { get; private set; }

    public User(string name, int age, DateTime registered)
    {
        Name = name;
        Age = age;
        Registered = registered;
    }

    public override string ToString()
    {
        return Name;
    }
}
Now, let's create a new project. Once again, it'll be a console application.
We'll name it XmlSaxReading and copy our XML file to its bin/Debug folder.
We'll also need to add the User class to the project. Since we'll be loading
the users into a collection, let's create an empty List of users.
We'll write the code right into the Main() method for the sake of simplicity
(you already know how to design it properly in an object-oriented way).
List<User> users = new List<User>();
Reading XML via SAX
The .NET framework provides the XmlReader class for reading XML via SAX
(strictly speaking, XmlReader is a forward-only pull reader rather than a
callback-based SAX parser, but it serves the same streaming purpose).
Let's create an instance of it. Just like with the XmlWriter class, we'll use
the Create() factory method, whose parameter is the filename. Don't forget
to add using System.Xml. Everything will be in a using block, which will take
care of closing the file:
using (XmlReader xr = XmlReader.Create(@"file.xml"))
{
}
Let's prepare the variables for the user's properties. We can't assign values
directly to a User instance since its properties are read-only. Another option
would be to allow them to be modified from outside, but then we'd lose part of
the encapsulation. We'll initialize the variables with default values, which
will remain in place if a value is missing from the XML file. The current
element's name needs to be stored somewhere as well, so we'll declare a string
variable named element for it. All of this code goes inside the using block.
string name = "";
int age = 0;
DateTime registered = DateTime.Now;
string element = "";
Let's start parsing the file. The XmlReader class reads the file node by node,
from top to bottom. We call the Read() method on its instance, and each call
moves the reader to the next node. A node can be an element, an element's text
value, or a closing element (we'll mainly focus on Element, Text, and
EndElement). Attributes aren't visited as separate nodes by Read(); we read
them from the element the reader is currently positioned on. Another node type
is a comment, which isn't very important for us at the moment. Once the reader
reaches the end of the file, the Read() method returns false; otherwise, it
returns true.
We'll load the document nodes gradually using a while loop:
while (xr.Read())
{
}
There are several useful properties on the XmlReader instance. We'll be using
NodeType, which stores the type of the node the reader is currently positioned
on. We'll also use the Name and Value properties, which store the name of the
current node and its value (if it has one).
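To make the node-by-node traversal more tangible, here's a minimal sketch that walks a tiny in-memory document and records NodeType, Name, and Value after every Read() call. The NodeWalkDemo class, the DescribeNodes() helper, and the shortened XML string are assumptions made up for this illustration; they aren't part of the lesson's project.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper class, used only for this illustration.
static class NodeWalkDemo
{
    // Returns one description line per node visited by Read().
    public static List<string> DescribeNodes(string xml)
    {
        var lines = new List<string>();
        // A StringReader lets us parse a string instead of a file.
        using (XmlReader xr = XmlReader.Create(new StringReader(xml)))
        {
            while (xr.Read())
            {
                lines.Add($"{xr.NodeType}: name='{xr.Name}', value='{xr.Value}'");
            }
        }
        return lines;
    }

    static void Main()
    {
        // Shortened sample document without whitespace between the tags.
        string xml = "<users><user age=\"22\"><name>John Smith</name></user></users>";
        foreach (string line in DescribeNodes(xml))
        {
            Console.WriteLine(line);
        }
    }
}
```

For this input, the reader should visit three Element nodes, one Text node, and three EndElement nodes. The age attribute never shows up as a node of its own. Note that an indented file would also produce Whitespace nodes between the elements.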
We'll mainly focus on two types of nodes: Element and Text. Let's react to them. We'll add in empty condition bodies for now:
// reads the element
if (xr.NodeType == XmlNodeType.Element)
{
}
// reads the element value
else if (xr.NodeType == XmlNodeType.Text)
{
}
Now, let's add code to the first condition. Here we're reacting to the reader arriving at an element, and we need to perform two actions.
The key action is to save the element's name to the element variable. This
will let us determine, in the second condition, which element's text we're
reading.
Every time we encounter a user element, we load its age attribute using the
GetAttribute() method, whose parameter is the attribute's name. Reading the
current element's attribute is easy. Reading its text value, however, isn't
that simple. Although there are methods like ReadContentAsInt() or
ReadElementContentAsString(), beware: they move the reader forward past the
content they read, which interferes with the Read() call in our while loop.
Reading values like this would work properly in non-nested XML files. I tried
to find a workaround for this, but the solutions were so awkward that I ended
up not using the ReadContentAs...() methods at all. Here's what the first
condition looks like:
element = xr.Name; // the name of the current element
if (element == "user")
{
    age = int.Parse(xr.GetAttribute("age"));
}
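To illustrate the pitfall with the ReadContentAs...() family mentioned above, here's a small sketch that combines ReadElementContentAsString() with the usual while (xr.Read()) loop. The ReadContentPitfallDemo class, the CollectNames() helper, and the compact XML string are hypothetical and exist only for this demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper class, used only for this illustration.
static class ReadContentPitfallDemo
{
    public static List<string> CollectNames(string xml)
    {
        var names = new List<string>();
        using (XmlReader xr = XmlReader.Create(new StringReader(xml)))
        {
            while (xr.Read())
            {
                if (xr.NodeType == XmlNodeType.Element && xr.Name == "name")
                {
                    // Returns the element's text AND moves the reader past
                    // the closing </name> tag, onto whatever node follows.
                    names.Add(xr.ReadElementContentAsString());
                    // The Read() in the loop condition now skips that node.
                }
            }
        }
        return names;
    }

    static void Main()
    {
        // Two adjacent <name> elements with no whitespace between them.
        string xml = "<users><name>John</name><name>James</name></users>";
        Console.WriteLine(string.Join(", ", CollectNames(xml)));
    }
}
```

With this input, I'd expect only John to be collected. After the first call, the reader already sits on the second name element, and the Read() in the loop condition then steps over it to its text. In an indented file, the skipped node is often just whitespace, so the problem can stay hidden until the formatting changes.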
Let's move on to the next branch, i.e. processing the elements' values. We'll
use the element variable in a switch and assign the value to the corresponding
variable according to the value in element:
switch (element)
{
    case "name":
        name = xr.Value;
        break;
    case "registered":
        registered = DateTime.Parse(xr.Value);
        break;
}
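One thing worth keeping in mind is that DateTime.Parse() interprets the text using the current culture, so a date such as 3/21/2000 may fail to parse, or be read with the day and month swapped, on a machine with non-US regional settings. A possible alternative, assuming the file always stores dates in the month/day/year form shown in the sample, is to parse with an explicit format and the invariant culture. The DateParsingDemo class and the ParseRegistered() helper are hypothetical names used only in this sketch.

```csharp
using System;
using System.Globalization;

// Hypothetical helper class, used only for this illustration.
static class DateParsingDemo
{
    // Assumes the month/day/year format used in the sample file.
    public static DateTime ParseRegistered(string text)
    {
        return DateTime.ParseExact(text, "M/d/yyyy", CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        DateTime registered = ParseRegistered("10/30/2016");
        // Print in an unambiguous format, independent of regional settings.
        Console.WriteLine(registered.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));
    }
}
```

Whether this fits depends on how the file was written. If the writer used the current culture too, the format in the file may differ from machine to machine, and the more robust fix is to write the dates in a fixed format in the first place.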
We're already very close to finishing. The attentive ones among you have surely
noticed that we aren't adding the users anywhere yet. We'll do so once we reach
the closing user element. Now, let's add one last condition:
// reads the closing element
else if ((xr.NodeType == XmlNodeType.EndElement) && (xr.Name == "user"))
    users.Add(new User(name, age, registered));
We're done
For completeness' sake, here's all of the code needed to load the file:
using (XmlReader xr = XmlReader.Create(@"file.xml"))
{
    string name = "";
    int age = 0;
    DateTime registered = DateTime.Now;
    string element = "";
    while (xr.Read())
    {
        // reads the element
        if (xr.NodeType == XmlNodeType.Element)
        {
            element = xr.Name; // the name of the current element
            if (element == "user")
            {
                age = int.Parse(xr.GetAttribute("age"));
            }
        }
        // reads the element value
        else if (xr.NodeType == XmlNodeType.Text)
        {
            switch (element)
            {
                case "name":
                    name = xr.Value;
                    break;
                case "registered":
                    registered = DateTime.Parse(xr.Value);
                    break;
            }
        }
        // reads the closing element
        else if ((xr.NodeType == XmlNodeType.EndElement) && (xr.Name == "user"))
            users.Add(new User(name, age, registered));
    }
}
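The listing above assumes that every user element carries a valid age attribute. GetAttribute() returns null when the attribute is missing, and int.Parse() would then throw an exception instead of leaving the default value in place. Here's a hedged sketch of a more forgiving variant based on int.TryParse(). The SafeAgeDemo class and its ReadAge() and FirstUserAge() helpers are hypothetical and not part of the lesson's code.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper class, used only for this illustration.
static class SafeAgeDemo
{
    // Reads the age attribute of the current element, or returns the
    // fallback when the attribute is missing or isn't a valid integer.
    public static int ReadAge(XmlReader xr, int fallback)
    {
        int parsed;
        return int.TryParse(xr.GetAttribute("age"), out parsed) ? parsed : fallback;
    }

    // Finds the first user element in the document and returns its age.
    public static int FirstUserAge(string xml, int fallback)
    {
        using (XmlReader xr = XmlReader.Create(new StringReader(xml)))
        {
            while (xr.Read())
            {
                if (xr.NodeType == XmlNodeType.Element && xr.Name == "user")
                {
                    return ReadAge(xr, fallback);
                }
            }
        }
        return fallback; // no user element found
    }

    static void Main()
    {
        Console.WriteLine(FirstUserAge("<users><user age=\"22\" /></users>", 0));
        Console.WriteLine(FirstUserAge("<users><user /></users>", 0));
    }
}
```

In the loading loop, the same idea would mean replacing the int.Parse() call with a TryParse() check, so that age keeps its default when the attribute is absent.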
We still have to print the users out so we know that we loaded them properly. We'll modify the ToString() method in the User class so that it returns all of the values:
public override string ToString()
{
    return String.Format("{0}, {1}, {2}", Name, Age, Registered.ToShortDateString());
}
Then, we'll simply print the users:
// printing the loaded objects
foreach (User u in users)
{
    Console.WriteLine(u);
}
Console.ReadKey();
The result:
Console application
John Smith, 22, 3/21/2000
James Brown, 31, 10/30/2016
Tom Hanks, 16, 1/12/2011
If you didn't like the loading process much, I'm right there with you. Generating a new XML file via SAX is simple and natural, but loading one with SAX is awkward. In the next lesson, Working with XML files using the DOM approach in C# .NET, we'll look at DOM, the object-based approach to working with XML documents.