
Lesson 4 - Working with XML files in PHP

In the previous lesson, Working with CSV files in PHP, we learned to work with files using resources, got to know the CSV format, and applied that knowledge in a practical example.

In this lesson, we will take a look at working with another file format that we mentioned in the introductory lesson of this course - XML. Here are two ways to access XML files:

  • SAX method (Simple API for XML) - classes XMLWriter and XMLReader and
  • SimpleXML class.

Unlike the previous lessons, in which we only worked with functions, we will now move to object-oriented programming.

XML generation by SAX method - XMLWriter

We take inspiration from the article Introduction to XML and writing by SAX in the C# course and write a similar script in PHP. We will also use the same input data:

$users = new Users();
$users->Add(new User("Paul Goodman", 22, new OurDate(2000, 3, 21)));
$users->Add(new User("John Newlands", 31, new OurDate(2012, 10, 30)));
$users->Add(new User("Thomas Heart", 16, new OurDate(2011, 1, 12)));

We use the XMLWriter class to generate XML using the SAX method:

$out = new XMLWriter();
$out->openMemory();
$out->startDocument('1.0', 'UTF-8');
$out->setIndent(TRUE);

$users->exportXML($out);

$out->endDocument();
echo $out->outputMemory(TRUE);

The most interesting method is probably exportXML() of the $users object. Other commands only set output parameters (XML version and encoding, indentation, etc.).
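To see the XMLWriter calls in isolation, here is a minimal sketch that writes a single user without any helper classes. The element and attribute names mirror the lesson's data; the variable names are only illustrative.

```php
<?php
// Minimal sketch: build a small XML document in memory with XMLWriter.
$writer = new XMLWriter();
$writer->openMemory();                    // write into a memory buffer, not a file
$writer->setIndent(true);                 // put each element on its own line
$writer->startDocument('1.0', 'UTF-8');   // emits the XML declaration

$writer->startElement('users');           // opens <users>
$writer->startElement('user');            // opens <user>
$writer->writeAttribute('age', '22');     // adds age="22" to <user>
$writer->writeElement('name', 'Paul Goodman');
$writer->endElement();                    // closes <user>
$writer->endElement();                    // closes <users>

$writer->endDocument();
$xml = $writer->outputMemory();           // returns the buffer as a string
echo $xml;
```

Every startElement() call needs a matching endElement(); writeElement() is a shortcut that opens the element, writes its text, and closes it in one step.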

The Users class has two methods:

  • the first adds another user and
  • the second exports the resulting XML:

class Users {
  private $list = array();

  public function Add($user) {
    $this->list[] = $user;
  }

  public function exportXML($out) {
    $out->startElement('users');
    foreach($this->list as $user) {
      $user->exportXML($out);
    }
    $out->endElement();
  }
}

The User class contains 3 attributes, but it's no problem to add more as needed:

class User {
  private $name;
  private $age;
  private $registration_date;

  public function __construct($name, $age, $registration_date) {
    $this->name = $name;
    $this->age = $age;
    $this->registration_date = $registration_date;
  }

  public function exportXML($out) {
    $out->startElement('user');
    $out->writeAttribute('age', $this->age);
    $out->writeElement('name', $this->name);
    $this->registration_date->exportXML($out);
    $out->endElement();
  }
}

We can see that $name is written as a child element, while $age is written only as an attribute. In reality, it would be better to store the year of birth instead of the age, but I wanted the data structure to stay similar to the referenced article.

The private $registration_date object stores the date as three numbers. It is just an example of how to work with nested objects; in practice, it would be easier to store the whole date as a single value:

class OurDate {
  private $day, $month, $year;

  public function __construct($year, $month, $day) {
    $this->day = $day;
    $this->month = $month;
    $this->year = $year;
  }

  public function exportXML($out) {
    $out->writeElement('registration_date', "$this->month/$this->day/$this->year");
  }
}
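The single-value alternative mentioned above could look roughly like this. It is only a sketch: the class name UserWithDate and the choice of the ISO 8601 format (Y-m-d) are assumptions, not part of the lesson's code.

```php
<?php
// Sketch: keep the registration date as one DateTimeImmutable value.
// UserWithDate is a hypothetical class used only for this illustration.
class UserWithDate {
    private $name;
    private $registered;

    public function __construct(string $name, DateTimeImmutable $registered) {
        $this->name = $name;
        $this->registered = $registered;
    }

    public function exportXML(XMLWriter $out): void {
        $out->startElement('user');
        $out->writeElement('name', $this->name);
        // Y-m-d is unambiguous across locales and sorts correctly as text.
        $out->writeElement('registration_date', $this->registered->format('Y-m-d'));
        $out->endElement();
    }
}

$out = new XMLWriter();
$out->openMemory();
$user = new UserWithDate('Paul Goodman', new DateTimeImmutable('2000-03-21'));
$user->exportXML($out);
$xml = $out->outputMemory();
echo $xml, "\n";
```

With a real date object, the output format becomes a one-line decision inside exportXML() instead of being baked into three separate properties.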

That is all. After running the script, we get the following result:

<?xml version="1.0" encoding="UTF-8"?>
<users>
 <user age="22">
  <name>Paul Goodman</name>
  <registration_date>3/21/2000</registration_date>
 </user>
 <user age="31">
  <name>John Newlands</name>
  <registration_date>10/30/2012</registration_date>
 </user>
 <user age="16">
  <name>Thomas Heart</name>
  <registration_date>1/12/2011</registration_date>
 </user>
</users>

SAX is a fast and memory-efficient method for directly generating an XML or XHTML document. However, if we want to further process the output with a template system, it will be more advantageous to use a DOM with which the template systems work directly.
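For comparison, here is a rough sketch of the DOM approach using PHP's DOMDocument class. It is not taken from the lesson; it only illustrates that the whole tree stays in memory and can still be modified before it is serialized.

```php
<?php
// Sketch: build the same structure as a DOM tree instead of streaming it.
$doc = new DOMDocument('1.0', 'UTF-8');
$doc->formatOutput = true;                 // indent the serialized XML

$users = $doc->createElement('users');
$doc->appendChild($users);

$user = $doc->createElement('user');
$user->setAttribute('age', '22');
$user->appendChild($doc->createElement('name', 'Paul Goodman'));
$user->appendChild($doc->createElement('registration_date', '3/21/2000'));
$users->appendChild($user);

// Unlike with XMLWriter, already created nodes can still be changed.
$user->setAttribute('age', '23');

$xml = $doc->saveXML();
echo $xml;
```

The price for this flexibility is memory: the complete document has to fit into RAM, which is exactly what the streaming approach avoids.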

Note: The input data for the XMLWriter class must be in the UTF-8 encoding. Otherwise, characters outside ASCII, which occur in many human languages, will not be written correctly. The output encoding can be chosen as required.
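If the input comes from a legacy source, it can be converted before it is written. The following sketch assumes the source string is known to be ISO-8859-1 and uses iconv(); the sample name is made up for the illustration.

```php
<?php
// Sketch: convert legacy input to UTF-8 before handing it to XMLWriter.
// Assumption: the source string is known to be ISO-8859-1 encoded.
$latin1Name = "Ren\xE9 Dupont";            // hypothetical name, "é" as one ISO-8859-1 byte
$utf8Name = iconv('ISO-8859-1', 'UTF-8', $latin1Name);

$out = new XMLWriter();
$out->openMemory();
$out->startDocument('1.0', 'UTF-8');
$out->writeElement('name', $utf8Name);     // XMLWriter now receives valid UTF-8
$out->endDocument();
$xml = $out->outputMemory();
echo $xml;
```

The conversion only works if the source encoding is really known; guessing it is a common cause of garbled characters.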

XML reading by SAX method - XMLReader

Again, we take inspiration from the article Reading XML SAX in C# and write a similar application in PHP. We will use the XML data we generated a while ago:

<?xml version="1.0" encoding="UTF-8"?>
<users>
 <user age="22">
  <name>Paul Goodman</name>
  <registration_date>3/21/2000</registration_date>
 </user>
 <user age="31">
  <name>John Newlands</name>
  <registration_date>10/30/2012</registration_date>
 </user>
 <user age="16">
  <name>Thomas Heart</name>
  <registration_date>1/12/2011</registration_date>
 </user>
</users>

This time, the task is considerably harder than generating XML. We have to read individual tokens one by one and decide where to store the data based on their order. I recommend this method only for specific cases, for example when only some data needs to be selected from a huge XML file. In other cases, it is more convenient to use the DOM method. The main reading script looks like this:

$data = new XMLReader();
$data->open('data.xml');

while($data->read()) {
  switch($data->name) {
    case 'users':
      $users = new Users($data);
      break;
  }
}

echo $users, "\n";

Attentive readers will have noticed that the switch block contains only one case, where a shorter if statement would do. This is because we read the document in the style of a finite-state automaton, for which a switch is the usual construct. When parsing a more complex document, we will certainly appreciate how easily it can be extended:

class Users {
  private $list = array();

  public function __construct($data) {
    while($data->read()) {
      switch($data->name) {
        case 'user':
          if($data->nodeType == XMLReader::ELEMENT) {
            $this->list[] = new User($data);
          }
          break;

        case 'users':
          return;
      }
    }
  }

  public function __toString() {
    $out = array();
    foreach($this->list as $user) {
      $out[] = $user->__toString();
    }
    return implode("\n", $out);
  }
}

Here, too, the use of a switch does not look very attractive, but when using a more complex XML structure, we will certainly appreciate the ease of adding other elements:

class User {
  private $name;
  private $age;
  private $registration_date;

  public function __construct($data) {
    $this->age = $data->getAttribute('age');

    while($data->read()) {
      switch($data->name) {
        case 'name':
          $data->read();
          $this->name = $data->value;
          $data->read();
          break;

        case 'registration_date':
          $data->read();
          $this->registration_date = $data->value;
          $data->read();
          break;

        case '#text':
          break;

        default:
          return;
      }
    }
  }

  public function __toString() {
    return sprintf("%-20s %2d %10s", $this->name, $this->age, $this->registration_date);
  }
}

In the end, the most complicated class is User. First, we load the age attribute of the user element and store it in a property. Then we do the same with the contents of the name and registration_date elements. If the parser encounters any other node, such as the closing tag of user, the constructor returns. The #text pseudo-element holds the whitespace that sits between the elements of the source XML, which we need to skip.
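To make the whitespace nodes visible, the following sketch lists the node names XMLReader reports for one small fragment, once in full and once with whitespace filtered out. The variable names are illustrative, and checking both whitespace node types is a defensive assumption rather than something the lesson's code does.

```php
<?php
// Sketch: list the nodes XMLReader reports, with and without whitespace.
$xml = "<user age=\"22\">\n  <name>Paul Goodman</name>\n</user>";

$reader = new XMLReader();
$reader->XML($xml);                        // parse from a string instead of a file

$all = [];
$meaningful = [];
while ($reader->read()) {
    $all[] = $reader->name;
    // Indentation between elements arrives as whitespace nodes named "#text".
    $isWhitespace = $reader->nodeType === XMLReader::WHITESPACE
        || $reader->nodeType === XMLReader::SIGNIFICANT_WHITESPACE;
    if (!$isWhitespace) {
        $meaningful[] = $reader->name;
    }
}
$reader->close();

echo implode(' ', $all), "\n";
echo implode(' ', $meaningful), "\n";
```

Both the opening and the closing tag report the element's name, which is why the User constructor above relies on the order of the nodes rather than on the name alone.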

The __toString() methods are for diagnostic output. After running the script, we get the following result:

Paul Goodman         22  3/21/2000
John Newlands        31 10/30/2012
Thomas Heart         16  1/12/2011

This example is not programmed as cleanly as it could be. A combination of valid input data that would not pass this process could certainly be found. It was supposed to be just a demonstration that even in PHP it is possible to use the SAX method for reading XML documents.
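The use case recommended earlier, picking only some data out of a large document, can be sketched more compactly. The example below keeps only the names of users aged 18 or more; the age limit and the variable names are assumptions made for the illustration, and the XML is passed as a string so that the sketch runs on its own.

```php
<?php
// Sketch: stream through the document and keep only the names of adult users.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<users>
 <user age="22">
  <name>Paul Goodman</name>
 </user>
 <user age="31">
  <name>John Newlands</name>
 </user>
 <user age="16">
  <name>Thomas Heart</name>
 </user>
</users>
XML;

$reader = new XMLReader();
$reader->XML($xml);   // for a real file, $reader->open('data.xml') would be used

$adults = [];
$isAdult = false;
while ($reader->read()) {
    if ($reader->nodeType !== XMLReader::ELEMENT) {
        continue;                          // skip text, whitespace and end tags
    }
    if ($reader->name === 'user') {
        // Remember the decision for the name element that follows.
        $isAdult = (int) $reader->getAttribute('age') >= 18;
    } elseif ($reader->name === 'name' && $isAdult) {
        $adults[] = $reader->readString(); // text content of the current element
    }
}
$reader->close();

print_r($adults);
```

Only one node is held in memory at a time, so the same loop would also work on a file far larger than the available RAM.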

Reading XML with SimpleXML

We said that reading XML using the SAX method is not suitable for everyday use. Now let's look at a more convenient way, the SimpleXML class.

The SimpleXML class is intended for easy conversion of an XML document into objects in PHP. Unlike the XMLReader class, however, we do not read the document in a loop one element at a time, but we load it whole into the object structure. This is very convenient because the slowest operation is performed by a standard library, which is optimized for this purpose.

We will use the same data again:

<?xml version="1.0" encoding="UTF-8"?>
<users>
 <user age="22">
  <name>Paul Goodman</name>
  <registration_date>3/21/2000</registration_date>
 </user>
 <user age="31">
  <name>John Newlands</name>
  <registration_date>10/30/2012</registration_date>
 </user>
 <user age="16">
  <name>Thomas Heart</name>
  <registration_date>1/12/2011</registration_date>
 </user>
</users>

The script listing the data is very short. Here, in contrast to the SAX method, we only need to create one Users class:

<?php
$data = new Users('data.xml');
echo $data, "\n";

class Users {
  private $list;

  public function __construct($xmlFile) {
    // 0 = no extra libxml options; TRUE = the first argument is a path or URL
    $this->list = new SimpleXMLElement($xmlFile, 0, TRUE);
  }

  public function __toString() {
    $out = array();

    foreach($this->list as $user) {
      $out[] = sprintf("%-20s %2d %10s", $user->name, $user['age'], $user->registration_date);
    }

    return implode("\n", $out);
  }
}

If necessary, we can, of course, add methods for searching for users, password verification, etc. For our purpose, a simple list of users will suffice. After running the script, the following will appear in the browser:

Paul Goodman         22  3/21/2000
John Newlands        31 10/30/2012
Thomas Heart         16  1/12/2011
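A search method like the one mentioned above could be sketched as follows. The function findByName() is hypothetical and written as a standalone function to keep the example short; in the lesson's design it would more naturally be a method of the Users class.

```php
<?php
// Sketch: a hypothetical findByName() helper on top of SimpleXMLElement.
$xml = <<<XML
<users>
 <user age="22"><name>Paul Goodman</name></user>
 <user age="31"><name>John Newlands</name></user>
</users>
XML;

function findByName(SimpleXMLElement $users, string $name): ?SimpleXMLElement {
    foreach ($users->user as $user) {
        // Child elements are objects, so cast to string before comparing.
        if ((string) $user->name === $name) {
            return $user;
        }
    }
    return null;                           // no user with that name
}

$users = new SimpleXMLElement($xml);
$found = findByName($users, 'John Newlands');
echo $found === null ? "not found\n" : 'age: ' . (int) $found['age'] . "\n";
```

The explicit casts matter: both $user->name and $found['age'] are SimpleXMLElement objects, not plain strings or integers.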

As we can see, reading a document with the SimpleXMLElement class is much easier than reading with the SAX method. This class is also about 10 times faster than XMLReader and better documented. It is therefore much more convenient for processing common XML documents.

The complete code of all the examples from this lesson can be downloaded as always at the bottom of the article:-)

In the next lesson, Working with INI files in PHP, we'll learn about text files in the INI format and how to work with them in PHP.


 

Article has been written for you by Jan Štěch