sec01 - String

In this lecture, we will learn about strings (sequences of characters).

String

Basics of Strings

Python handles text using the string data type, which is named str in code. There is no separate type for a single character: a string is treated the same way whether it holds one character or many.

In other words, Python treats every line of the following code as a value of the same type, a string.

"a"
"ABC"
"123.4"
"$%&"
"漢"
"にほんごOK"
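
As a quick check, the following sketch asks Python for the type of each value above. The list name samples is made up for illustration, and type() is a built-in function that this lecture has not introduced yet.

```python
# Hypothetical list holding the six example values from above.
samples = ["a", "ABC", "123.4", "$%&", "漢", "にほんごOK"]

for sample in samples:
    # type(...).__name__ gives the name of the value's data type.
    print(sample, type(sample).__name__)  # every line ends with: str
```

Note that "123.4" is also a string here, not a number, because it is enclosed in quotation marks.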

Strings are enclosed in double quotation marks (") or single quotation marks ('). The opening and closing quotation marks must be the same type.

Python recognizes the first quotation mark as the start of a string and the next quotation mark of the same type as the end of the string.

Let's look at the code. Python recognizes both of the following lines as strings.

"I am Nico."
'I am Nico.'

In Visual Studio Code, text that is recognized as a string is highlighted in a light orange color, as shown in the figure below. (The exact color depends on the color theme in use.)

If the quotation marks at both ends are not the same type, Visual Studio Code will display a red wavy line to indicate an error. In the figure below, the red wavy line indicates a warning that “the string definition has a starting point but is not closed with the same type of quotation marks.”

Strings containing quotation marks

Next, let's shorten "I am" in the previous sentence to "I'm". Let's write the code as follows.

"I'm Nico."
'I'm Nico.'

First, as shown in the first line, a single quotation mark inside a string enclosed in double quotation marks is recognized as part of the string without any errors. Since both ends use double quotation marks (the same type), a single quotation mark in between does not end the string.

As shown in the second line, problems occur when single quotation marks are used inside single quotation marks. Visual Studio Code also displays an error message as shown in the figure below. (The red wavy line indicates the location of the problem.)

Why is the notation in the second line incorrect? In Python, the next quotation mark of the same type as the opening one is treated as the end of the string. The single quotation mark before I is therefore taken as the starting point, and the apostrophe right after I is taken as the end point, so only I is recognized as a string. The following part, m Nico., falls outside any string, and the final single quotation mark opens a new string that is never closed, resulting in an error.
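
The behavior can be confirmed without relying on the editor. The sketch below is only an illustration: the helper name is_valid_source and the variable names are made up, and it uses the built-in compile() function and try/except, which are not covered in this lecture. Each candidate line is stored as text and Python is asked whether it can parse it.

```python
def is_valid_source(source):
    """Return True if Python can parse `source` as an expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# The candidate lines are built with + so that each one can itself be
# stored as a string containing both kinds of quotation marks.
double_outside = '"' + "I'm Nico." + '"'  # source text: "I'm Nico."
single_outside = "'I'm Nico.'"            # source text: 'I'm Nico.'
mismatched = '"' + "I am Nico.'"          # source text: "I am Nico.'

print(is_valid_source(double_outside))  # True
print(is_valid_source(single_outside))  # False
print(is_valid_source(mismatched))      # False
```

Only the line whose outer quotation marks match, and whose inner quotation mark is of the other type, is accepted.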

Which should I use, double quotation marks or single quotation marks?

Python recognizes text enclosed in either type of quotation mark as a string, as long as both ends match. The resulting value is the same in both cases, so either can be used without issue.
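
A minimal sketch of this point, using made-up variable names: the same text written with each type of quotation mark compares as equal. The built-in repr() function, which is not covered in this lecture, shows how Python itself writes a string back out.

```python
double_quoted = "Nico"
single_quoted = 'Nico'

# The quotation marks are not part of the value, so the two are equal.
print(double_quoted == single_quoted)  # True

# Python displays strings with single quotation marks by default and
# switches to double quotation marks when the text contains a single one.
print(repr(single_quoted))  # 'Nico'
print(repr("I'm Nico."))    # "I'm Nico."
```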

As the author of this site, I generally use single quotation marks and switch to double quotation marks only when the string itself contains a single quotation mark. The reason is simple: to my eye, single quotation marks make the code look cleaner than double quotation marks. Many other programmers prefer single quotation marks as well.

Assigning to variables

The format for assigning strings to variables is the same as for numeric values.

greeting_by_nico = "I'm Nico. Nice to meet you."
print(greeting_by_nico)

In the above example, the string "I'm Nico. Nice to meet you." is assigned to the variable greeting_by_nico, and print() then displays its contents.

In the case of strings, use variable names that immediately indicate what kind of information (text) is assigned to the variable.
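
As a rough illustration of this naming advice, compare the variables below. The names s and farewell_by_nico, and the farewell text, are made up for this example.

```python
# Vague: the name says nothing about the text it holds.
s = "I'm Nico. Nice to meet you."

# Descriptive: the name tells the reader what kind of text to expect.
greeting_by_nico = "I'm Nico. Nice to meet you."
farewell_by_nico = 'See you next time.'

print(greeting_by_nico)
print(farewell_by_nico)
```

Both styles run the same way. The difference only matters to the person reading the code later.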