Lesson 13 - String functions in PHP
In the previous exercise, Solved tasks for PHP lesson 12, we practiced what we had learned in the previous lessons.
In the previous lesson of the PHP basics course, we learned how to work with arrays using loops. In today's lesson, we're going to round out our knowledge of strings and introduce the most important PHP string functions.
Strings and UTF-8
Some PHP functions start with the "mb_" prefix. The prefix shows that those functions support multi-byte encodings such as UTF-8 (MB stands for MultiByte). Most websites use UTF-8 nowadays because it supports the characters of non-Latin alphabets. This way, you are able to use languages such as Russian or Chinese, or special characters like ☺ or ♥. Most IDEs (e.g. NetBeans) have UTF-8 set as the default encoding. If you don't use UTF-8, you'll run into big issues eventually. For example, you might need to use a third-party library whose author almost certainly expects you to be using UTF-8. In other words, you wouldn't be able to use their library just because you refuse to use UTF-8.
When using those functions in applications, we have to set the right encoding first. Otherwise, they won't work properly in older versions of PHP. The encoding only needs to be set once per request. If your entire website is displayed through the index file, as we have things set up now, setting the encoding at the beginning of the index file will do.
mb_internal_encoding("UTF-8");
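As a minimal sketch of this setup, assuming the mbstring extension is installed: called with an argument, mb_internal_encoding() sets the encoding, and called without one it reports the current value. Most "mb_" functions also accept the encoding as an optional last argument, which avoids relying on the global setting. The variable names are illustrative only.

```php
<?php
// Set the internal encoding once, at the start of the request.
$wasSet = mb_internal_encoding("UTF-8"); // true on success

// Called without arguments, the function returns the current encoding.
$current = mb_internal_encoding();

// The encoding can also be passed explicitly to an individual call.
$length = mb_strlen("PHP", "UTF-8");

echo "Encoding: $current, length: $length";
```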
Length of a string
We can determine the length of a string (the number of characters stored in it) using the mb_strlen() function. Let's make a little example, before which we'll have to set up the right encoding.
{PHP}
mb_internal_encoding("UTF-8");
$text = "Blackholes are where God divided by zero";
$length = mb_strlen($text);
echo("Text length is $length characters.");
{/PHP}
The output:
Text length is 40 characters.
Note: In some old textbooks and tutorials, you will find functions without the "mb_" prefix, for example strlen() instead of mb_strlen(). Avoid those functions for text that may contain non-ASCII characters: they work with bytes rather than characters and will output incorrect values. In UTF-8, the letter "Č" is stored as 2 bytes. The function with the "mb_" prefix treats Č as one character, whereas the function without the prefix counts it as 2. In other words, it returns the number of bytes, which is the wrong length for strings with non-Latin characters. PHP, due to backward compatibility, has a lot of functions that don't support UTF-8. You should always check whether a function is multi-byte safe and look for a multi-byte version if it turns out that it isn't. For all of you anglophones: not using these characters yourself doesn't mean that users with names that include accents won't register on your site. Don't exclude people that could potentially be in your target market just because you refuse to use UTF-8.
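The difference described in the note can be checked with a short sketch. It assumes PHP 7 or newer (for the "\u{...}" escape) and the mbstring extension. The variable names are only illustrative.

```php
<?php
mb_internal_encoding("UTF-8");

// "\u{10C}" is the letter Č, written as an escape so the example
// does not depend on the encoding of the source file.
$letter = "\u{10C}";

$bytes = strlen($letter);         // counts bytes
$characters = mb_strlen($letter); // counts characters

echo "Bytes: $bytes, characters: $characters";
```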
Working with substrings
A specific part of a string is referred to as a substring. Let's look at a couple of examples since we'll be working with them a lot.
Finding position of substring
If we wanted to find the position of a specific substring in a string, or to determine whether it is in the string at all, we would use the mb_strpos() function. To make it more interesting, we'll make the search case-insensitive. First of all, we convert the entire string to uppercase with the help of the mb_strtoupper() function. After that, we search for an upper-cased substring.
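{PHP}
mb_internal_encoding("UTF-8");
// Uppercase both strings so that the comparison ignores case
$string = mb_strtoupper('PHP tutorials at ICT.social');
$substring = mb_strtoupper('ict.social');
// Compare strictly so that a match at position 0 is not mistaken for false
if (mb_strpos($string, $substring) !== false) {
    echo "Found";
} else {
    echo "Not Found";
}
{/PHP}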
The output:
Found
mb_strpos() returns the zero-based position of the first occurrence of the substring, and false if it is not found at all. When the string starts with the substring, the returned position is 0. This is why you have to compare the result strictly, including its data type (use !== instead of !=). With a loose comparison, the values false and 0 are treated as equal, so the substring would be reported as not found whenever the string started with it. Go ahead and try it out.
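Here is a small sketch of that pitfall, with a needle placed at the very start of the haystack on purpose. The variable names are hypothetical. It only illustrates the difference between the two comparison operators.

```php
<?php
mb_internal_encoding("UTF-8");

// The needle sits at the very start of the haystack, so the position is 0.
$position = mb_strpos('ICT.social', 'ICT');

$looseCheck = ($position != false);   // 0 is loosely equal to false
$strictCheck = ($position !== false); // 0 is not identical to false

var_dump($position, $looseCheck, $strictCheck);
```

The loose check wrongly reports that nothing was found, while the strict check gives the expected answer.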
Along with the mb_strpos() function, there is the mb_strrpos() function (the additional r stands for reverse), which works in pretty much the same way. The main difference is that it finds the last occurrence of the substring instead of the first. This one's useful for finding file extensions.
In programming, the string being searched is usually called the haystack, and the substring being searched for is called the needle.
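One possible way to use mb_strrpos() for file extensions is sketched below. The file name is a made-up example, and the sketch also relies on the mb_substr() function, which returns the rest of the string when no length is given.

```php
<?php
mb_internal_encoding("UTF-8");

// Hypothetical file name used only for illustration.
$fileName = 'photo.final.jpeg';

// Position of the last dot, or false if there is none.
$dotPosition = mb_strrpos($fileName, '.');

$extension = '';
if ($dotPosition !== false) {
    // Everything after the last dot.
    $extension = mb_substr($fileName, $dotPosition + 1);
}

echo $extension;
```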
Retrieving a substring at given position
We can retrieve a substring using the mb_substr() function, which takes a string, a zero-based starting index and the length of the substring as parameters. Let's go ahead and give it a try:
{PHP}
mb_internal_encoding("UTF-8");
$text = "Blackholes are where God divided by zero.";
echo mb_substr($text, 1, 8);
{/PHP}
The output:
lackhole
We echoed out a substring that starts out from the 2nd character and is 8 characters long.
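A few more variations, as a sketch: mb_substr() also accepts a negative starting index, which counts from the end of the string, and the length can be left out to read up to the end. The variable names are illustrative.

```php
<?php
mb_internal_encoding("UTF-8");
$text = "Blackholes are where God divided by zero.";

$first = mb_substr($text, 0, 10); // the first 10 characters
$rest = mb_substr($text, 21);     // from index 21 to the end
$last = mb_substr($text, -5);     // the last 5 characters

echo "$first | $rest | $last";
```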
Accessing a specified character
In PHP, we can access the individual positions of a string with square brackets, just like with arrays:
{PHP}
mb_internal_encoding("UTF-8");
$text = "Some text";
echo $text[0];
{/PHP}
A similar result could be achieved in the past using curly brackets, but that syntax was removed in PHP 8.0. The code above will print out the 1st character in a string. Unfortunately, this approach works with bytes rather than characters, so it doesn't support UTF-8. Do not use it for text that may contain multi-byte characters. If you need to access a character, just copy it as a substring using the aforementioned mb_substr() function.
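To see why, here is a short sketch that uses a word starting with the two-byte letter Č. It assumes PHP 7 or newer for the "\u{...}" escape. The byte is printed in hexadecimal because, on its own, it is not valid UTF-8 text.

```php
<?php
mb_internal_encoding("UTF-8");

$text = "\u{10C}au"; // the word "Čau"

$firstByte = $text[0];                    // only the first byte of Č
$firstCharacter = mb_substr($text, 0, 1); // the whole letter Č

echo bin2hex($firstByte) . " vs " . $firstCharacter;
```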
Replacing substrings
We can replace a substring with another very easily using the str_replace() function. For example, we can easily protect an e-mail address against spam-bots with it. We'll replace "@" with "(at)" so bots won't be able to recognize e-mails in our forms. We will, however, be missing out on all of those ever so wonderful loans and spam.
{PHP}
mb_internal_encoding("UTF-8");
$address = '[email protected]';
$securedAddress = str_replace('@', '(at)', $address);
echo $securedAddress;
{/PHP}
The output:
If there are any more occurrences of the substring, the function will replace all of them.
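The following sketch shows the replacement of several occurrences at once. The addresses are placeholders on the example.com domain, not taken from the lesson. The optional fourth argument of str_replace() receives the number of replacements made.

```php
<?php
// Placeholder text; the addresses are made up for illustration.
$text = 'Write to john@example.com or jane@example.com';

// $count is filled in by str_replace() with the number of replacements.
$secured = str_replace('@', '(at)', $text, $count);

echo "$secured ($count replacements)";
```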
Replacing using a dictionary
Let's say that we want to replace multiple substrings at once. PHP provides the strtr() function, whose name stands for STRing TRanslate, which does just that. Yeah, I know, a lot of PHP's function names are very confusing. The function takes a string and a dictionary as parameters. A dictionary is an associative array whose keys are the substrings that we want to replace and whose values are the substrings that we want to be used in their place.
The function is commonly used to replace emoticons in a text with HTML images. Let's go ahead and give it a try:
{PHP}
mb_internal_encoding("UTF-8");
$dictionary = array(
    ':)' => '<img src="smiling.png" alt="smiling" />',
    ':D' => '<img src="laughing.png" alt="laughing" />',
);
echo strtr('Hi :) I feel good, cause I just discovered ict.social :D', $dictionary);
{/PHP}
The output HTML:
Hi <img src="smiling.png" alt="smiling" /> I feel good, cause I just discovered ict.social <img src="laughing.png" alt="laughing" />
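One reason to prefer strtr() with a dictionary over calling str_replace() with arrays is that strtr() never replaces text it has already inserted. The sketch below uses a deliberately tricky, made-up dictionary to show the difference.

```php
<?php
// A made-up dictionary in which one replacement is also a key.
$dictionary = array(
    'Hi' => 'Hello',
    'Hello' => 'Bye',
);

// strtr() does not search again inside the text it has just inserted.
$translated = strtr('Hi all', $dictionary);

// str_replace() applies the pairs one after another, so "Hi" first
// becomes "Hello", which the second pair then turns into "Bye".
$chained = str_replace(array_keys($dictionary), array_values($dictionary), 'Hi all');

echo "$translated / $chained";
```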
Splitting strings into array of substrings
Another pair of important string functions is explode() and implode(). The explode() function splits a string into an array of substrings using a specified delimiter. The implode() function, on the other hand, joins an array of substrings into a single string and inserts a delimiter between the items. Sometimes, the delimiter is referred to as glue.
The following program receives a string containing several comma-separated numbers as the input, and computes the sum of their values:
{PHP}
mb_internal_encoding("UTF-8");
$input = "1,5,87,65,42,4,456,8,5,98,54,89";
$numbers = explode(',', $input);
echo array_sum($numbers);
{/PHP}
Here, explode() splits the string using commas and returns an array of its parts. Then, with the help of the array_sum() function, we compute the sum of the array's content. If you're coming from a lower-level language than PHP, you will most likely be amazed at how much simpler everything is. This is because PHP is a high-level programming language. With such languages, we focus on actually developing applications, not on fixing low-level issues.
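The opposite direction, implode(), could be sketched like this. The input is shortened and the variable names are illustrative. The array is split, summed and then joined back together with a different glue.

```php
<?php
$input = "1,5,87";

$numbers = explode(',', $input); // an array of three numeric strings
$sum = array_sum($numbers);

// Join the items back into one string, using " + " as the glue.
$expression = implode(' + ', $numbers) . " = $sum";

echo $expression;
```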
PHP string functions
To top it all off, we'll give you a list of the most important string functions which PHP offers. You can look each of them up in the official PHP manual to see how to use it. You don't have to remember them all. Just remember that they are there, and if you happen to need them, you can look them up.
mb_internal_encoding | Sets/Gets the internal character encoding. |
mb_strlen | Gets a string length. |
mb_strpos | Finds the position of the first occurrence of a substring in a string. |
mb_substr | Gets a part of a string. |
mb_strtoupper | Converts a string to uppercase. |
mb_strtolower | Converts a string to lowercase. |
trim | Strips whitespace (or other characters) from the beginning and from the end of a string. |
htmlspecialchars | Converts the special characters to HTML entities. |
htmlspecialchars_decode | Converts HTML entities back to the special characters. |
strip_tags | Strips HTML and PHP tags from a string. |
nl2br | Inserts HTML line breaks before all newlines in a string. |
str_replace | Replaces all occurrences of the search string with the replacement string. |
strtr | Translates characters or replaces substrings. |
parse_str | Parses the string into variables. |
explode | Splits a string into an array of substrings. |
implode | Joins an array of substrings to a single string. |
hash | Generates a hash value (message digest). For storing passwords, PHP's dedicated password_hash() function is the recommended choice. |
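A few of the functions from the table that were not shown earlier in the lesson can be tried out with the following sketch. The input is a made-up user comment, and the sketch is only meant to show what each call returns.

```php
<?php
// Hypothetical user comment with stray whitespace, HTML and a newline.
$comment = "  <b>Hi</b>\nthere  ";

$trimmed = trim($comment);             // whitespace removed from both ends
$escaped = htmlspecialchars($trimmed); // < and > become HTML entities
$withBreaks = nl2br($escaped);         // <br /> inserted before the newline
$plain = strip_tags($trimmed);         // HTML tags removed entirely

echo $withBreaks;
```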
In the next lesson, you'll learn how to declare your own functions in PHP.
In the following exercise, Solved tasks for PHP lesson 13, we're going to practice our knowledge from previous lessons.