Lesson 9 - Strings in the C language - Working with single characters
Lesson highlights
Are you looking for a quick reference on characters and ASCII codes in C rather than a full lesson? Here it is:
Iterating through all the string characters:
{C_CONSOLE}
int i;
char sentence[] = "Hello ICT.social";
for (i = 0; sentence[i] != '\0'; i++)
    printf("%c ", sentence[i]);
{/C_CONSOLE}
Converting between characters and their ASCII value:
{C_CONSOLE}
char c; // character
int i; // ordinal (ASCII) value of a character
// conversion from text to ASCII value
c = 'a';
i = (int)c;
printf("The character '%c' was converted to its ASCII value of %d\n", c, i);
// conversion from an ASCII value to text
i = 98;
c = (char)i;
printf("The ASCII value of %d was converted to its textual value of '%c'\n", i, c);
{/C_CONSOLE}
Would you like to learn more? A complete lesson on this topic follows.
In the last lesson, Strings in the C language, we learned that strings (texts) in the C language are just char arrays terminated by the null character. In today's lesson, we'll work with individual string characters, learn to use ASCII values, and create a sentence analyzer and a cipher program.
Printing text character by character
First of all, let's test that we can really treat text as a char
array. We'll start by printing a string, character by character:
{C_CONSOLE}
int i;
char sentence[] = "Hello ICT.social";
for (i = 0; sentence[i] != '\0'; i++)
    printf("%c ", sentence[i]);
{/C_CONSOLE}
The output:
Console application
H e l l o   I C T . s o c i a l
The loop iterates over the characters of the string until it encounters the null character at the end. As a result, every character is printed to the console. I've added a space after each character to make the output easier to read.
The ASCII value
Maybe you've already heard of the ASCII table. In the MS-DOS era, there was practically no other way to store text. Individual characters were stored as numbers of the byte datatype, i.e. in the range from 0 to 255. Standard ASCII defines the codes 0 to 127, and the system's extended table filled the remaining values, giving 256 characters in total. Each numerical code was assigned to exactly one character.
Perhaps you understand why this method is no longer as relevant. The table simply could not contain all the characters of all international alphabets, so today we use Unicode (typically in the UTF-8 encoding), where characters are represented differently. However, the C language still works with single-byte char values by default. In UTF-8, characters outside the ASCII range take up several bytes, so to treat them as single characters we'd have to use so-called wide characters or a dedicated library. The key advantage of plain ASCII codes is that the characters are stored in the table next to each other, alphabetically. For example, at position 97 we'd find "a", at 98 "b", etc. It's the same with digits. Unfortunately, accented characters aren't part of ASCII, and in the extended tables they don't follow any alphabetical order.
Now, let's convert a character into its ASCII value, and then create the character according to its ASCII value:
{C_CONSOLE}
char c; // character
int i; // ordinal (ASCII) value of a character
// conversion from text to ASCII value
c = 'a';
i = (int)c;
printf("The character %c was converted to its ASCII value of %d\n", c, i);
// conversion from an ASCII value to text
i = 98;
c = (char)i;
printf("The ASCII value of %d was converted to its textual value of %c\n", i, c);
{/C_CONSOLE}
Console application
The character a was converted to its ASCII value of 97
The ASCII value of 98 was converted to its textual value of b
Analyzing character occurrences in a sentence
Let's write a simple program that analyzes a given sentence for us. We'll search for the number of vowels, consonants, digits, and other characters (e.g. spaces or punctuation marks).
We'll hard-code the input string into our code, so we won't have to type it again every time. Once the program is complete, we'll replace the string with scanf(). We'll iterate over the characters using a loop. I should start out by saying that we won't focus much on program speed here; we'll choose practical and simple solutions.
First, let's define vowels, consonants, and digits. We don't have to count the other characters, since their number is the string length minus the number of vowels, consonants, and digits. Let's set up variables for the individual counters. Since the code is a bit more complex, we'll also add comments.
// Counters initialization
int vowels_count = 0;
int consonants_count = 0;
int digits_count = 0;
// the string that we want to analyze
char s[] = "A programmer gets stuck in the shower because the instructions on the shampoo were: Lather, Wash, and Repeat.";
// definition of character groups
char vowels[] = "aeiouyAEIOUY";
char consonants[] = "bcdfghjklmnpqrstvwxzBCDFGHJKLMNPQRSTVWXZ";
char digits[] = "0123456789";
// indexes
int i;
printf("The original message: %s\n", s);
// the main loop iterating over the characters until it reaches the end of the string
for (i = 0; s[i] != '\0'; i++)
{
}
First of all, we initialize the counters to zero. For the definitions of the character groups, ordinary char arrays are all we need. The main loop iterates over each character of the char array s.
Now, let's increment the counters. For simplicity's sake, I'll focus on the loop instead of rewriting the code:
// the main loop iterating over characters until it gets to the end
for (i = 0; s[i] != '\0'; i++)
{
    if (contains_character(s[i], vowels) == 1)
        vowels_count++;
    else if (contains_character(s[i], consonants) == 1)
        consonants_count++;
    else if (contains_character(s[i], digits) == 1)
        digits_count++;
}
Notice that we use the contains_character() function, which determines whether a string contains a given character. We'll get to functions like that at the end of this course, but we'll skip ahead a little here and add the contains_character() function to make our program a bit more interesting.
Insert the following code block above the main() function. If you have any problems doing so, just download the attached source code at the end of the article.
int contains_character(char c, char s[])
{
    int i;
    for (i = 0; s[i] != '\0'; i++)
        if (s[i] == c)
            return 1;
    return 0;
}
We won't describe the function now; let's get back to our code in the main() function. We look for the current character of the sentence among the vowels first and, if we find it, increment the vowel counter. If it isn't a vowel, we look among the consonants and, if it's there, increment the consonant counter. We do the same with digits.
Now, all we're missing is the printing part at the end, i.e. displaying text:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int contains_character(char c, char s[])
{
    int i;
    for (i = 0; s[i] != '\0'; i++)
        if (s[i] == c)
            return 1;
    return 0;
}
int main(int argc, char** argv)
{
    // Counters initialization
    int vowels_count = 0;
    int consonants_count = 0;
    int digits_count = 0;
    // the string that we want to analyze
    char s[] = "A programmer gets stuck in the shower because the instructions on the shampoo were: Lather, Wash, and Repeat.";
    // definition of character groups
    char vowels[] = "aeiouyAEIOUY";
    char consonants[] = "bcdfghjklmnpqrstvwxzBCDFGHJKLMNPQRSTVWXZ";
    char digits[] = "0123456789";
    // indexes
    int i;
    printf("The original message: %s\n", s);
    // the main loop iterating over characters until it gets to the end
    for (i = 0; s[i] != '\0'; i++)
    {
        if (contains_character(s[i], vowels) == 1)
            vowels_count++;
        else if (contains_character(s[i], consonants) == 1)
            consonants_count++;
        else if (contains_character(s[i], digits) == 1)
            digits_count++;
    }
    printf("Vowels: %d\n", vowels_count);
    printf("Consonants: %d\n", consonants_count);
    printf("Digits: %d\n", digits_count);
    // strlen() returns size_t, so we cast it to int to match the %d format
    printf("Other characters: %d\n", (int)strlen(s) - vowels_count - consonants_count - digits_count);
    return 0;
}
Console application
The original message: A programmer gets stuck in the shower because the instructions on the shampoo were: Lather, Wash, and Repeat.
Vowels: 33
Consonants: 55
Digits: 0
Other characters: 21
That's it, we're done!
The Caesar cipher
Let's create a simple program that encrypts text. If you've ever heard of the Caesar cipher, then you already know exactly what we're going to program. This form of text encryption is based on shifting characters in the alphabet by a certain fixed number of characters. For example, if we shift the word "hello" by 1 character forwards, we'd get "ifmmp". The user will be allowed to select the number of character shifts.
Let's get right into it! We need variables for the original text, the encrypted message, and the shift. Then, we need a loop iterating over each character and printing an encrypted message. We'll hard-code the message in the code, so we won't have to type it over and over during the testing phase. After we finish the program, we'll replace the hard-coded value with input read by the scanf() function. The cipher doesn't work with accented characters, spaces, or punctuation marks. We'll just assume the user won't enter them. To keep things simple, we'll also assume the user enters lowercase letters only. Ideally, we should remove accents before encrypting, remove anything other than letters, and convert all letters to lowercase.
// variable initialization
char s[] = "blackholesarewheregoddividedbyzero";
int shift = 1;
int i;
printf("Original message: %s\n", s);
// loop iterating over characters
for (i = 0; s[i] != '\0'; i++)
{
}
// printing
printf("Encrypted message: %s", s);
Now, let's move to the loop. We'll increase the value of the current character by the shift.
{C_CONSOLE}
// variable initialization
char s[] = "blackholesarewheregoddividedbyzero";
int shift = 1;
int i;
printf("Original message: %s\n", s);
// loop iterating over characters
for (i = 0; s[i] != '\0'; i++)
{
    s[i] = s[i] + shift;
}
// printing
printf("Encrypted message: %s", s);
{/C_CONSOLE}
Console application
Original message: blackholesarewheregoddividedbyzero
Encrypted message: cmbdlipmftbsfxifsfhpeejwjefecz{fsp
Let's try it out! The result looks pretty good. However, we can see that characters shifted past "z" run over into the ASCII values of other characters ("{" in the output above). The message therefore no longer consists of letters only, but contains other, unwanted characters. Let's make the alphabet cyclical, so that the shift flows smoothly from "z" back to "a" and so on. We'll get by with a simple condition that decreases the ASCII value by the length of the alphabet, so we end up back at "a".
// loop iterating over characters
for (i = 0; s[i] != '\0'; i++)
{
    s[i] = s[i] + shift;
    if (s[i] > 'z') // overflow control
        s[i] = s[i] - 26;
}
If the shifted character exceeds the ASCII value of 'z', we reduce it by 26 (the number of letters in the English alphabet). It's simple, and our program is now operational. Notice that we don't use character codes directly anywhere. There's a 'z' in the condition even though we could write 122 there directly. I set it up this way so that our program doesn't depend on explicit ASCII values and it's clearer how it works. Try to code a decryption program as practice for yourself.
In the next lesson, we'll see that strings can do a couple more things we haven't touched on yet. Spoiler: We'll learn how to decode "Morse code".
In the following exercise, Solved tasks for C lessons 8-9, we'll practice what we've learned in the previous lessons.
Did you have a problem with anything? Download the sample application below and compare it with your project; you'll find the error easily.
Download
The application includes source code in the C language.