Lesson 8 - Reading XML via the SAX approach in Java
In the previous lesson, Writing XML Files via the SAX Approach in Java, we introduced the XML format and created a simple XML file using the SAX approach. In today's tutorial, we'll continue with the demo and implement the opposite process: loading the XML file with users and building the appropriate object structure (the user list) from it.
For completeness' sake, here's what our file.xml looks like at the moment:
    <?xml version="1.0" encoding="utf-8"?>
    <users>
        <user age="22">
            <name>John Smith</name>
            <registered>3/21/2000</registered>
        </user>
        <user age="31">
            <name>James Brown</name>
            <registered>10/30/2016</registered>
        </user>
        <user age="16">
            <name>Tom Hanks</name>
            <registered>1/12/2011</registered>
        </user>
    </users>
Here's what our User.java class looks like:
    import java.time.LocalDate;
    import java.time.format.DateTimeFormatter;

    public class User {

        private String name;
        private int age;
        private LocalDate registered;

        // Shared formatter for the date form used in the XML file, e.g. 3/21/2000
        public static DateTimeFormatter dateTimeFormatter = DateTimeFormatter.ofPattern("M/d/yyyy");

        public User(String name, int age, LocalDate registered) {
            this.name = name;
            this.age = age;
            this.registered = registered;
        }

        @Override
        public String toString() {
            return String.format("%s, %d, %s", name, age, dateTimeFormatter.format(registered));
        }

        public String getName() {
            return name;
        }

        public int getAge() {
            return age;
        }

        public LocalDate getRegistered() {
            return registered;
        }
    }
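The User class uses the M/d/yyyy pattern for printing dates, and we'll use the same formatter later to parse them. Here's a minimal sketch of that round trip. The DateFormatDemo class and its parse() and format() helpers are hypothetical names used only for illustration; only the pattern comes from the lesson:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

class DateFormatDemo {

    // Same pattern as User.dateTimeFormatter in the lesson
    static final DateTimeFormatter FORMATTER = DateTimeFormatter.ofPattern("M/d/yyyy");

    // Converts text such as "3/21/2000" to a LocalDate
    static LocalDate parse(String text) {
        return LocalDate.parse(text, FORMATTER);
    }

    // Converts a LocalDate back to the form used in the XML file
    static String format(LocalDate date) {
        return FORMATTER.format(date);
    }

    public static void main(String[] args) {
        LocalDate date = parse("3/21/2000");
        System.out.println(date);         // LocalDate prints in ISO form
        System.out.println(format(date)); // printed in the file's form again
    }
}
```

Text that doesn't match the pattern makes LocalDate.parse() throw a DateTimeParseException, so a malformed date in the file would interrupt the parsing unless we catch it.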
Now, let's create a new project. Once again, it'll be a console application. We'll name it XmlSaxReading and copy our XML file to the project folder.
Our main class will extend org.xml.sax.helpers.DefaultHandler. This gives us the callback methods we'll need later when parsing the file. We'll also add the User class to the project. We want to load the users into a collection, so we'll create an empty ArrayList named users:
private List<User> users = new ArrayList<>();
Constants
Before we move on to reading, we'll create a helper class that stores the names of the elements and attributes in the XML file as constants:
    public final class Constants {
        public static final String USERS = "users";
        public static final String USER = "user";
        public static final String AGE = "age";
        public static final String NAME = "name";
        public static final String REGISTERED = "registered";
    }
Reading XML via SAX
In the main class, we'll create a private parse(String file) method, which accepts the path to the XML file as a parameter:
    private void parse(String file) throws SAXException, IOException, ParserConfigurationException {
        // TODO implement the method body
    }
In the body of this method, we'll start parsing. Java provides the abstract SAXParser class for reading XML via SAX. We obtain an instance from a factory by calling SAXParserFactory.newInstance().newSAXParser(). Then we simply call parse() on the parser instance, passing the file we want to parse and the handler that processes the parsing events as parameters. Since our class extends DefaultHandler, we pass this as the handler. The method body will look like this:
    private void parse(String file) throws SAXException, IOException, ParserConfigurationException {
        // Create a parser instance
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        // Start parsing
        parser.parse(new File(file), this);
        // Print the users to the console
        users.forEach(System.out::println);
    }
Let's prepare the necessary variables for the user fields. We can't set the values on a User instance as we read them because the class has no setters. We could add setters, but we'd lose part of the encapsulation. We'll initialize the variables with default values, which will remain there in case a value isn't present in the XML file. Then we'll create two flags indicating that we're currently processing the name or the registration date:
    private String name = "";
    private int age = 0;
    private LocalDate registered = LocalDate.now();
    private boolean processingName = false;
    private boolean processingRegistered = false;
Now it's time to override the methods the DefaultHandler class provides us with. We'll override three of them: startElement(), endElement(), and characters():
    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        // This method is called every time the parser finds an opening tag
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        // This method is called every time the parser finds a closing tag
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // This method is called every time the parser offers us text from inside an element
    }
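To get a feel for when these callbacks fire, it can help to log them for a tiny document. The following is a minimal sketch, not part of the lesson's project: the SaxEventTrace class and its trace() helper are hypothetical, and it parses an in-memory string instead of file.xml. It uses the parse(InputStream, DefaultHandler) overload of SAXParser:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventTrace extends DefaultHandler {

    // Events in the order the parser reported them
    private final List<String> events = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        events.add("start:" + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end:" + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Skip chunks that contain only whitespace (e.g. indentation between tags)
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty()) {
            events.add("text:" + text);
        }
    }

    // Parses the XML given as a string and returns the recorded events
    static List<String> trace(String xml) {
        SaxEventTrace handler = new SaxEventTrace();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        trace("<user age=\"22\"><name>John Smith</name></user>").forEach(System.out::println);
    }
}
```

The trace shows the start event of an outer element first, then the events of its children, and the outer end event last. That nesting is why the handler in this lesson creates the User only once the closing user tag arrives.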
startElement()
In the startElement() method, two parameters are of particular interest: qName and attributes. The first contains the name of the element that is currently being processed. The second contains the attributes of this element. To find out which element is currently being processed, we use a simple switch:
    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        switch (qName) {
            case Constants.USER:
                // We'll get the user's age from the attribute
                age = Integer.parseInt(attributes.getValue(Constants.AGE));
                break;
            case Constants.NAME:
                // The name's value is read elsewhere, so we just note that we're processing it
                processingName = true;
                break;
            case Constants.REGISTERED:
                // We note the same for the registration date
                processingRegistered = true;
                break;
        }
    }
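Note that attributes.getValue() returns null when the attribute is missing, and Integer.parseInt() then throws a NumberFormatException. If the file may contain users without a valid age, one option is a small defensive helper. The sketch below assumes a fallback value of 0, matching the default used in this lesson. The AgeAttributeReader class and the parseAge() and readAges() methods are hypothetical names:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class AgeAttributeReader extends DefaultHandler {

    private final List<Integer> ages = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if ("user".equals(qName)) {
            // getValue() returns null if the element has no such attribute
            ages.add(parseAge(attributes.getValue("age")));
        }
    }

    // Returns 0 (an assumed default) when the attribute is absent or not a number
    static int parseAge(String raw) {
        if (raw == null) {
            return 0;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return 0;
        }
    }

    // Parses the XML given as a string and returns the age of every user element
    static List<Integer> readAges(String xml) {
        AgeAttributeReader handler = new AgeAttributeReader();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return handler.ages;
    }

    public static void main(String[] args) {
        System.out.println(readAges("<users><user age=\"22\"/><user/><user age=\"x\"/></users>"));
    }
}
```

Whether a silent fallback is appropriate depends on the application. For strictly validated input, letting the exception propagate may be the better choice.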
endElement()
In the endElement() method, which is called when a closing tag is being processed, we'll simply set the corresponding indicator back to false:
    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        switch (qName) {
            case Constants.NAME:
                // If we were processing the name, we'll set this indicator to false
                processingName = false;
                break;
            case Constants.REGISTERED:
                // If we were processing the registration date, we'll set this indicator to false
                processingRegistered = false;
                break;
            case Constants.USER:
                // Once we've read all the user data, we can create a new user instance and add it to the collection
                User user = new User(name, age, registered);
                users.add(user);
                break;
        }
    }
characters()
The last method we need to override is characters(), which reads the value between the element tags. We'll use our indicators to find out what we're reading. The method will look like this:
    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // We create a new string from the given part of the array
        String text = new String(ch, start, length);
        if (processingName) {
            // When processing the name, we simply assign it
            name = text;
        } else if (processingRegistered) {
            // When processing the registration date, we parse it
            registered = LocalDate.parse(text, User.dateTimeFormatter);
        }
    }
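One caveat about characters(): SAX parsers are allowed to deliver the text of a single element in several chunks, i.e. in several characters() calls. For short values like ours, this rarely shows up, but longer texts or values with entities such as &amp; can be split. A common way to stay safe is to collect the chunks in a StringBuilder and use the result in endElement(). The following is a minimal sketch of that variant, not the lesson's code; the BufferedTextHandler class and the readNames() helper are hypothetical:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class BufferedTextHandler extends DefaultHandler {

    // Collects all text chunks of the element currently being read
    private final StringBuilder buffer = new StringBuilder();
    private final List<String> names = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        // A new element starts, so we discard any text collected before it
        buffer.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may call this several times for one element, so we only append
        buffer.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("name".equals(qName)) {
            // The whole value is available only now
            names.add(buffer.toString().trim());
        }
        buffer.setLength(0);
    }

    // Parses the XML given as a string and returns the text of every name element
    static List<String> readNames(String xml) {
        BufferedTextHandler handler = new BufferedTextHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return handler.names;
    }

    public static void main(String[] args) {
        String xml = "<users><user><name>John Smith</name></user>"
                + "<user><name>Tom &amp; Jerry</name></user></users>";
        readNames(xml).forEach(System.out::println);
    }
}
```

With this approach, the boolean indicators are no longer needed for text elements, because endElement() already knows which element has just been completed.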
If we have a lot of fields to assign, the characters() method will grow out of control. An alternative is to use a HashMap: we create a lambda function to process each field and store it in the map, using the field name as the key. You can read more about the implementation in the lesson on zip files.
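The map-based idea might look roughly like the following sketch. It is only one possible shape and not necessarily the implementation from the lesson on zip files. The FieldMapHandler class, the fieldReaders map, and the readUser() helper are hypothetical names, and the sketch collects the text in a StringBuilder and dispatches in endElement() rather than in characters():

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class FieldMapHandler extends DefaultHandler {

    private static final DateTimeFormatter FORMATTER = DateTimeFormatter.ofPattern("M/d/yyyy");

    private String name = "";
    private LocalDate registered = LocalDate.now();

    // Element name -> lambda that stores the element's text in the right field
    private final Map<String, Consumer<String>> fieldReaders = new HashMap<>();
    private final StringBuilder buffer = new StringBuilder();

    FieldMapHandler() {
        fieldReaders.put("name", text -> name = text);
        fieldReaders.put("registered", text -> registered = LocalDate.parse(text, FORMATTER));
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        buffer.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        buffer.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Look up the lambda for this element; elements without one are ignored
        Consumer<String> reader = fieldReaders.get(qName);
        if (reader != null) {
            reader.accept(buffer.toString().trim());
        }
        buffer.setLength(0);
    }

    // Parses a single user element and returns "name, registered" as text
    static String readUser(String xml) {
        FieldMapHandler handler = new FieldMapHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return handler.name + ", " + FORMATTER.format(handler.registered);
    }

    public static void main(String[] args) {
        System.out.println(readUser(
                "<user age=\"22\"><name>John Smith</name><registered>3/21/2000</registered></user>"));
    }
}
```

Adding another field then means adding one more entry to the map instead of another branch in an if chain.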
And the parsing is done. Finally, we'll add the main() method to create a new instance and start parsing:
    public static void main(String[] args) {
        try {
            new XmlSaxReading().parse("file.xml");
        } catch (SAXException | IOException | ParserConfigurationException e) {
            e.printStackTrace();
        }
    }
Running the code prints the three users read from the file:
Console application
John Smith, 22, 3/21/2000
James Brown, 31, 10/30/2016
Tom Hanks, 16, 1/12/2011
If you didn't enjoy the reading part much, you're not alone. While generating a new XML file with SAX is simple and natural, reading one is rather confusing. Next time, in the lesson Working with XML files using the DOM approach in Java, we'll look at DOM, an object-oriented approach to XML documents.