What is a string in Python?
In Python 3, a string (type str) is an immutable sequence of Unicode code points; in Python 2, str was a sequence of 8-bit bytes, and Unicode text had a separate unicode type. Python has no dedicated character type, so a single character is simply a string of length 1. Square brackets "[]" are used to access the characters in a string by index. Because strings are immutable, they cannot be changed once they are created: string methods never modify the original, and those that transform text return a new string. String literals can be enclosed in single, double, or triple quotes. You can use single quotes inside double-quoted and triple-quoted strings, and vice versa.
The built-in Python str type provides essential methods out of the box for searching, concatenating, splitting, and comparing strings, and more.
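The properties described above can be checked with a short sketch. The sample values and variable names are illustrative assumptions, not part of any API:

```python
# Illustrative values only; the variable names are hypothetical.
text = "Hello"

first_char = text[0]       # indexing returns a string of length 1
upper_text = text.upper()  # returns a new string; text itself is unchanged

# Strings are immutable, so item assignment raises TypeError.
try:
    text[0] = "J"
    assignment_failed = False
except TypeError:
    assignment_failed = True

# Single quotes can appear unescaped inside a double-quoted literal.
quoted = "It's a 'quoted' word"

print(first_char, upper_text, text, assignment_failed)
```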
Splitting a Python string using the string.split() method
The easiest and most common way to split a string is to use the string.split(separator, maxsplit) method. By default, string.split() splits on runs of whitespace (spaces, tabs, and line breaks) and ignores leading and trailing whitespace. To split a string by another character, or even by another string, pass the delimiter as a parameter to string.split().
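As a minimal sketch of the default behavior, the following call passes no arguments, so the string is split on whitespace. The sample text is an illustrative assumption:

```python
# Mixed whitespace: leading/trailing spaces, a double space, a tab, a newline.
text = "  one  two\tthree\nfour  "

# With no separator, runs of whitespace act as a single delimiter
# and leading/trailing whitespace produces no empty items.
words = text.split()

print(words)  # ['one', 'two', 'three', 'four']
```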
An example of splitting a string at a custom separator:
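A minimal sketch, assuming a comma as the custom separator; the sample data and variable names are illustrative:

```python
csv_line = "apple,banana,cherry"

# Split at every comma.
fruits = csv_line.split(",")
print(fruits)  # ['apple', 'banana', 'cherry']

# With an explicit separator, adjacent separators are not merged:
# the empty field between them is kept as an empty string.
sparse = "a,,b".split(",")
print(sparse)  # ['a', '', 'b']
```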
An example of splitting a string using a substring as a delimiter:
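A minimal sketch, assuming the multi-character delimiter " -> "; the sample data is illustrative:

```python
text = "red -> green -> blue"

# The whole substring " -> " is treated as one delimiter,
# not as a set of individual characters.
colors = text.split(" -> ")

print(colors)  # ['red', 'green', 'blue']
```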
Python string.split() method syntax
The string.split() method accepts two parameters: string.split(separator, maxsplit)
Where:
- separator (optional) - the delimiter at which the string will be split (its keyword name is sep). If no separator is specified, string.split() splits the string at runs of whitespace: spaces, tabs, and line breaks.
- maxsplit (optional) - the maximum number of splits that string.split() will perform, so the result contains at most maxsplit + 1 items. The default is -1, which means there is no limit and the string is split at every occurrence of the separator.
An example of limiting the number of splits by passing the maxsplit parameter to the string.split() method.
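A minimal sketch of maxsplit in use; the sample string is an illustrative assumption:

```python
text = "a,b,c,d"

# At most two splits: the remainder stays in the last item.
two_splits = text.split(",", 2)
print(two_splits)  # ['a', 'b', 'c,d']

# maxsplit can also be passed by keyword.
one_split = text.split(",", maxsplit=1)
print(one_split)  # ['a', 'b,c,d']

# The default (-1) places no limit on the number of splits.
all_splits = text.split(",")
print(all_splits)  # ['a', 'b', 'c', 'd']
```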
Splitting a Python string in reverse order (right to left)
The string.rsplit() method is the counterpart of string.split() and has the same signature, but it splits the string from right to left. The difference is visible only when the maxsplit parameter is specified; without it, both methods return the same list.
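The following sketch compares the two directions on the same input. The path-like sample string is an illustrative assumption:

```python
path = "usr/local/lib/python"

# One split, counted from the right: separates the last component.
from_right = path.rsplit("/", 1)
print(from_right)  # ['usr/local/lib', 'python']

# One split, counted from the left: separates the first component.
from_left = path.split("/", 1)
print(from_left)  # ['usr', 'local/lib/python']

# Without maxsplit, both directions give the same result.
same_result = path.rsplit("/") == path.split("/")
print(same_result)  # True
```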
Splitting a string at line breaks using the string.splitlines() method
The Python splitlines() method splits a text at line boundaries. Recognized line boundaries include "\n", "\r", "\r\n", "\v", and "\f", as well as several less common separators such as "\x1c", "\x1d", "\x1e", "\x85", "\u2028", and "\u2029".
Python string.splitlines() method syntax
The string.splitlines() method accepts one parameter: string.splitlines(keepends)
Where:
- keepends (optional) - specifies whether line breaks should be included (True) or not (False) in list items (default is False).
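A minimal sketch of both modes. Note that the standard library spells this flag keepends; the sample text is an illustrative assumption:

```python
# Text with mixed Unix (\n) and Windows (\r\n) line endings.
text = "first\nsecond\r\nthird"

# Default: line breaks are removed from the items.
lines = text.splitlines()
print(lines)  # ['first', 'second', 'third']

# keepends=True: each item keeps its original line break.
lines_with_breaks = text.splitlines(keepends=True)
print(lines_with_breaks)  # ['first\n', 'second\r\n', 'third']
```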
Splitting Python strings using Regular Expressions
Besides using the string.split() and string.splitlines() methods, you can also split strings using regular expressions with Python's built-in re module. To do this, import re and call re.split(), passing the regular expression as the first parameter and the string to be split as the second.
Regular expressions are more challenging to use and maintain, but they provide more flexibility for splitting strings based on complex conditions.
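As a minimal sketch, the pattern below treats any run of commas, semicolons, or whitespace as one delimiter, something a single string.split() call cannot do. The pattern and sample text are illustrative assumptions:

```python
import re

text = "apple, banana;cherry  orange"

# [,;\s]+ matches one or more commas, semicolons, or whitespace characters,
# so mixed and repeated delimiters are consumed as a single split point.
parts = re.split(r"[,;\s]+", text)

print(parts)  # ['apple', 'banana', 'cherry', 'orange']
```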
Splitting a Python string using slicing
Since strings in Python are sequences, you can use slice notation (string[start:end]) to take a range of characters from a string, just like you do for lists and other sequences.
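A minimal sketch, assuming a fixed-width date string in which each field always occupies the same positions; the sample value and variable names are illustrative:

```python
date = "2024-01-15"

# text[start:end] takes characters from index start up to, but not
# including, index end. Omitting a bound means "from the beginning"
# or "to the end".
year = date[:4]
month = date[5:7]
day = date[8:]

print(year, month, day)  # 2024 01 15
```

This approach only fits input whose layout is fixed; for variable-width fields, string.split("-") is the more robust choice.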
Conclusion
Python provides an essential set of methods for splitting strings. In most cases, it is sufficient to use string.split() and string.splitlines() to split a string at whitespace, custom delimiters, and line breaks. For complex conditions, you can use regular expressions, and in some simple cases it may be enough to take a range of characters from a string with slice notation.