Lesson 4 - More on the Python type system: Data types
In the previous exercise, Solved tasks for Python lessons 1-3, we practiced what we'd learned so far, including the basic data types of Python (int, float, and str). In today's tutorial, we're going to look at them in more detail and explain how to use them correctly. Today is going to be more theoretical, and the next lesson will be very practical. At the end, we'll make a few simple examples.
Python recognizes two kinds of data types: mutable and immutable.
Mutable and immutable data types
All the data types in Python work as references. This means the value is stored somewhere in computer memory and we access it using a reference to that exact place. This is a significant difference from languages such as C, where a variable directly holds its value rather than the address of that value.
Internally, every Python variable holds the address of its actual data in memory. This makes it possible to reference a single value from multiple variables using the same address. Let's introduce a simple example:
{PYTHON}
s1 = "This is a sample text"
s2 = s1
print(s1)
print(s2)
In the code above, the text "This is a sample text" is stored in memory only once and there are two variables, s1 and s2, referencing it. This saves memory and will be very useful for us once we get to objects. Since it'd be confusing for multiple variables to change when a single one of them is altered, basic data types in Python are immutable. Don't be intimidated by this technical term, it simply means that their value is guaranteed not to change. There is no way to change the existing text in place, but we can assign a new one to a variable if we want to change its value:
{PYTHON}
s1 = "This is a sample text"
s2 = s1
s2 = "This is some other text"
print(s1)
print(s2)
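As a small sketch of what happens in the two examples above, we can ask Python whether two variables refer to the very same object using the is operator. The list at the end is only a preview of a mutable type that this lesson hasn't introduced; it is included here as our own addition to show the contrast:

```python
s1 = "This is a sample text"
s2 = s1
same_before = s1 is s2           # both names refer to one string object
print(same_before)               # True

s2 = "This is some other text"   # s2 now refers to a new string
same_after = s1 is s2
print(same_after)                # False
print(s1)                        # This is a sample text (unchanged)

# Preview (not covered in this lesson): a list is mutable, so a change
# made through one name is visible through every name referring to it.
numbers = [1, 2, 3]
alias = numbers
alias.append(4)
print(numbers)                   # [1, 2, 3, 4]
```

Assigning to s2 never touched the original text. It only made s2 refer to something else, which is why s1 still prints the first sentence.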
We'll get back to this topic in the object-oriented course. If you didn't get it at all, don't worry; we won't need it for a while.
Other built-in data types
Let's look at the other data types that Python offers:
Boolean
Variables of the Boolean type can only contain two values: True or False. We'll use them when we get to conditions. In a variable of the Boolean type, we can store either True/False or the result of a logical expression. Let's use them in a simple example:
{PYTHON}
b = False
expression = 15 > 5
print(b)
print(expression)
The program output:
Console application
False
True
Notice that the expression holds, i.e. it evaluates to True, since 15 really is more than 5. Going from expressions to conditions isn't a far stretch, but we'll go into them in the next lesson.
Python can natively handle more types like complex numbers and sets, which we'll get to later on.
String functions
Since we'll need to work with strings in our applications, let's look at the built-in methods which Python provides for this type. The difference between a function and a method is that a method always belongs to an object (here, the value stored in a variable) and we call it on that object. We'll show some of them and try them out:
startswith(), endswith(), and the in operator
We can ask if a string starts with or ends with a substring. A substring is a part of a string. Both of these methods take a substring as a parameter and return a Boolean (True/False) value. If we want to determine whether a string contains a substring (anywhere), we use the in operator to do so. We can't react to the results yet, but we'll print the return values anyway:
{PYTHON}
s = "Rhinopotamus"
print(s.startswith("rhin"))
print(s.endswith("tamus"))
print("pot" in s)
print("lol" in s)
The program output:
Console application
False
True
True
False
We can see that everything works as expected. The first check returned False because the string actually starts with a capital letter, and the comparison is case-sensitive.
upper(), lower(), capitalize(), and title()
Distinguishing between capital and lowercase letters is not always what we want. We'll often need to check for the presence of a substring in a case-insensitive way. This situation can be solved using the upper() and lower() methods, which return a copy of the string all in uppercase or all in lowercase, respectively. Let's make a more realistic example than Rhinopotamus. The variable will contain a line from a configuration file, which was written by the user. Since we can't rely on the user's input, we'll try to eliminate any possible errors (by ignoring letter case).
{PYTHON}
config = "Fullscreen shaDows autosave"
config = config.lower()
print("Will the game run in fullscreen?")
print("fullscreen" in config)
print("Will shadows be turned on?")
print("shadows" in config)
print("Will sound be turned off?")
print("nosound" in config)
print("Would the player like to use autosave?")
print("autosave" in config)
The program output:
Console application
Will the game run in fullscreen?
True
Will shadows be turned on?
True
Will sound be turned off?
False
Would the player like to use autosave?
True
We can see that we're able to detect the presence of particular words in the string. First, we convert the entire string to lowercase (uppercase would work just as well). Then, we check for the presence of each word written in the same case. By the way, simple processing of a configuration script could actually look like this.
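Two small follow-ups to the example above, as a sketch. The first part fixes the earlier startswith() check by lowercasing the text first. The second part shows a caveat of the in operator: it finds a substring anywhere, even inside another word. The split() method used to work around that isn't covered in this lesson, and the assumption that options are separated by spaces is ours:

```python
s = "Rhinopotamus"
starts = s.lower().startswith("rhin")
print(starts)                       # True

config = "Fullscreen shaDows noSound".lower()
substring_hit = "sound" in config   # matches inside "nosound"
print(substring_hit)                # True

# Assumption: options are separated by whitespace.
options = config.split()            # ['fullscreen', 'shadows', 'nosound']
whole_word_hit = "sound" in options
print(whole_word_hit)               # False
```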
We can also easily capitalize the first letter of a text using the capitalize() method, or even the first letter of each word using title():
{PYTHON}
name = input("Enter your name: ")
print("Hi " + name.capitalize())
book = "Alice through the looking glass"
print(book.title())
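The program above will greet you with the first letter of your input capitalized and write the book name in title case. Note that capitalize() also converts all the remaining letters to lowercase, which is why "John Smith" comes out as "John smith":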
Console application
Enter your name: John Smith
Hi John smith
Alice Through The Looking Glass
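To compare the four case methods side by side, here is a minimal sketch on one made-up input. Since strings are immutable, each call returns a new string and the original stays as it was:

```python
name = "john SMITH"
capitalized = name.capitalize()
titled = name.title()
print(capitalized)    # John smith
print(titled)         # John Smith
print(name.upper())   # JOHN SMITH
print(name.lower())   # john smith
print(name)           # john SMITH (the original is untouched)
```

For a full name, title() is usually the closer fit, because capitalize() only keeps the very first letter in uppercase.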
strip(), lstrip(), and rstrip()
Another pitfall can be whitespace characters. Spaces are not visible to users, but they can cause program errors. Generally, it's a good idea to strip them from any input from the user. The strip() method removes whitespace from both ends of a string, lstrip() only from the beginning, and rstrip() only from the end. None of them touch whitespace inside the text. Python's conversion functions, such as int(), ignore leading and trailing whitespace on their own before they start parsing. Enter some spaces before the number and after the number in the following application:
{PYTHON}
s = input("Enter a number:")
print("Here's what you originally wrote: " + s)
print("Your text after the strip() method: " + s.strip())
a = int(s)
print("I converted the text you entered to a number. Here it is: " + str(a))
The result:
Console application
Enter a number: 10
Here's what you originally wrote: 10
Your text after the strip() method: 10
I converted the text you entered to a number. Here it is: 10
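Because spaces are hard to see in console output, the following sketch wraps each result in square brackets. The input value is made up for illustration:

```python
s = "   10  "
both = s.strip()
left_only = s.lstrip()
right_only = s.rstrip()
print("[" + s + "]")            # [   10  ]
print("[" + both + "]")         # [10]
print("[" + left_only + "]")    # [10  ]
print("[" + right_only + "]")   # [   10]

# int() ignores the surrounding whitespace by itself
number = int(s)
print(number + 1)               # 11

# whitespace inside the text is never removed
inner = " a b ".strip()
print("[" + inner + "]")        # [a b]
```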
replace()
Probably the most important string method is replace(), which replaces parts of a string with another text. We enter two substrings as parameters. The first one is the text we want to replace and the second one is the text that will replace it. Since we know strings are immutable, the method returns a new string in which the replacement occurred. When the method doesn't find the substring, it returns the original string. We can also specify the maximum number of replacements to be performed as an optional third parameter.
Let's try it out:
{PYTHON}
s = "Java is the best!"
s = s.replace("Java", "Python")
print(s)
We'd get:
Console application
Python is the best!
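The remaining behaviors described above (the optional maximum number of replacements, and what happens when nothing is found) can be sketched like this, with a made-up sentence:

```python
s = "Java is the best! I love Java!"

all_replaced = s.replace("Java", "Python")
print(all_replaced)   # Python is the best! I love Python!

# The third parameter limits how many occurrences get replaced.
first_only = s.replace("Java", "Python", 1)
print(first_only)     # Python is the best! I love Java!

# Nothing to replace: the text comes back unchanged.
not_found = s.replace("Ruby", "Python")
print(not_found)      # Java is the best! I love Java!

# The original string is never modified.
print(s)              # Java is the best! I love Java!
```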
Formatting operator
Python provides a very useful way to insert multiple variables into different places in a string using placeholders. A placeholder is written as a percent character (%) followed by a data type symbol. Let's try it out:
{PYTHON}
a = 10
b = 20
c = a + b
s = "When we add up %d and %d, we get %d" % (a, b, c)
print(s)
The program output:
Console application
When we add up 10 and 20, we get 30
This is a very useful and clear way to build complex strings from a larger number of variables. Here are the most basic placeholders you can use:
%d - Integers
%s - Strings (will be converted using the str() function)
%f - Floats
You might be wondering what to do if you need to write a percent character followed by some of these letters without invoking this functionality. You'd simply write %% and it would be treated as simple text (it would print out a single percent sign).
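Here is a short sketch combining all three placeholders with the %% escape. The values are made up. The %.1f form, which limits a float to one decimal place, is an extra that the list above doesn't mention:

```python
name = "Alice"
tests = 3
score = 92.5

# %f prints six decimal places by default
s = "%s passed %d tests with a score of %f" % (name, tests, score)
print(s)       # Alice passed 3 tests with a score of 92.500000

# %.1f keeps one decimal place, %% prints a literal percent sign
short = "%s scored %.1f%%" % (name, score)
print(short)   # Alice scored 92.5%
```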
ljust(), rjust(), and center()
Now, let's cover the methods which do the exact opposite of strip(), i.e. add whitespace to text. What are these good for? Imagine that we have 100 variables and we want to arrange them into a table. We'd modify the text using the rjust() method with a column width parameter, e.g. 20 characters. If the text only had 12 characters, 8 spaces would be inserted before it to make it 20 characters long. The ljust() method would add 8 spaces after the text, and center() would put 4 spaces on each side. Since we haven't gone over what is needed to make such a table, we'll just keep these methods in mind and save them for later.
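The numbers from the paragraph above (a 12-character text in a 20-character column) can be tried out directly. This is only a sketch with a made-up text; the square brackets are there to make the added spaces visible:

```python
text = "Hello, world"            # 12 characters
width = 20

right = text.rjust(width)        # 8 spaces before the text
left = text.ljust(width)         # 8 spaces after the text
middle = text.center(width)      # 4 spaces on each side

print("[" + right + "]")         # [        Hello, world]
print("[" + left + "]")          # [Hello, world        ]
print("[" + middle + "]")        # [    Hello, world    ]
print(len(right))                # 20
```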
len()
Last but not least, we have the global len() function (not a method). It returns an integer that represents the number of characters in the string.
{PYTHON}
name = input("Type in your name: ")
print("Your name is %d characters long." % (len(name)))
The program's output:
Console application
Type in your name: John Smith
Your name is 10 characters long.
is*() methods
We can ask whether a string can be converted into a given type (e.g. if it's a number). Here are the methods used to do so (we'll use them later to sanitize user inputs):
isalnum() - Returns whether the string contains alphanumeric characters only (returns False for empty strings).
isalpha() - Returns whether the string contains alphabetical characters only.
isdecimal() - Returns whether the string contains decimal characters only.
isdigit() - Returns whether the string contains digits only.
islower() - Returns whether all the letters in the string are in lowercase (returns False if there are no alphabetical characters).
isnumeric() - Returns whether the string contains numeric characters only.
isspace() - Returns whether the string contains whitespace characters only.
istitle() - Returns whether the string is in title case.
isupper() - Returns whether all the letters in the string are capitalized (returns False if there are no alphabetical characters).
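A few of these checks in action, as a sketch with made-up inputs. Note that isdigit() accepts digits only, so a decimal point or a minus sign makes it return False:

```python
whole_number = "123".isdigit()
decimal_number = "12.5".isdigit()
negative_number = "-5".isdigit()
print(whole_number)              # True
print(decimal_number)            # False
print(negative_number)           # False

print("abc123".isalnum())        # True
print("abc 123".isalnum())       # False (the space is not alphanumeric)
print("".isalnum())              # False (empty string)

print("   ".isspace())           # True
print("Hello World".istitle())   # True
print("hello".islower())         # True
print("123".islower())           # False (no letters at all)
```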
There's still a lot to go over and lots of other data types that we haven't covered, but there is a time for everything. In the next lesson, Conditions (branching) in Python, we'll introduce conditions and then loops. After that, we'll have enough knowledge to create interesting programs.