Python provides three string methods for stripping characters from the ends of a string: strip(), lstrip() and rstrip(). The basic syntax is as follows :-
str.strip([chars])
str.lstrip([chars])
str.rstrip([chars])
chars
Optional. String specifying the set of characters to be removed.
If omitted or None, the chars argument defaults to removing whitespace.
The chars argument is not a prefix; rather, all combinations of its
values are stripped.
The strip() method returns a copy of the string with the leading and trailing characters removed. Likewise, lstrip() and rstrip() remove the specified characters from the left and right ends of the string respectively.
Since the chars argument is optional, these methods remove only whitespace when it is not provided. strip() removes both leading and trailing whitespace and returns a new copy of the string. Likewise, lstrip() and rstrip() return the string with leading and trailing whitespace removed respectively. Let's have a look at the following example :-
# input string is word 'spacious' with 3 spaces to left and 5 spaces to the right
>>> a = '   spacious     '
>>> len(a)
16
>>> b = a.strip()
>>> b
'spacious'
>>> len(b)
8
>>> len(a)
16
>>> c = a.lstrip()
>>> c
'spacious     '
>>> len(c)
13
>>> d = a.rstrip()
>>> d
'   spacious'
>>> len(d)
11
>>> len(a)
16
So, on a closer look, we assigned b = a.strip(), which removed the leading and trailing whitespace from string a and assigned the resulting string to b. But a itself is still the original string, as the len(a) that follows confirms. len(b) gives 8, since the 3 spaces on the left and the 5 spaces on the right were removed. c = a.lstrip() removed whitespace only on the left, hence c has length 13 (8 characters from the word and 5 spaces on the right). In the same way, d = a.rstrip() removed whitespace only on the right of the string, hence len(d) gives 11 (3 spaces on the left and 8 characters from the word). So all of these strip methods act on the given string and return the resulting string without modifying the original.
With this, I guess we are pretty clear about how strip(), lstrip() and rstrip() behave when no argument is passed. Now let's proceed to passing a character set to these methods and see how they work.
When a character set is passed as an argument to any of these strip methods, Python removes characters from the relevant end(s) of the string one at a time, as long as each character belongs to the given set. In effect, any combination of the given characters is removed. Stripping continues until a character that is not in the given character set is found. So, let's understand it with an example :-
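The same point can be checked in a short script. This is only a sketch: the variable names are illustrative, and the second part assumes you also want tabs and newlines treated as whitespace, which is what the default behaviour does.

```python
# Each strip method returns a new string; the original is never modified.
a = '   spacious     '   # 3 leading spaces, 5 trailing spaces
b = a.strip()
c = a.lstrip()
d = a.rstrip()

print(len(a), len(b), len(c), len(d))  # 16 8 13 11

# With no argument, tabs and newlines are stripped as well as spaces.
messy = '\t  spacious \n'
print(repr(messy.strip()))  # 'spacious'
```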
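To make the rule concrete, here is a minimal sketch of how rstrip() with a character set could be written by hand. The function name rstrip_sketch is made up for illustration, it assumes chars is always a string, and it is not how CPython actually implements the method.

```python
def rstrip_sketch(text, chars):
    """Illustrative re-implementation of str.rstrip(chars), not CPython's code."""
    allowed = set(chars)          # order and repetition of chars do not matter
    end = len(text)
    # Walk leftwards from the right end while each character is in the set.
    while end > 0 and text[end - 1] in allowed:
        end -= 1
    return text[:end]             # a new string; text itself is unchanged

print(repr(rstrip_sketch('ABCBBAA', 'AB')))                      # 'ABC'
print(rstrip_sketch('ABCBBAA', 'AB') == 'ABCBBAA'.rstrip('AB'))  # True
```

The loop stops at the first character that is not in the set, which is why nothing to the left of that character is ever touched.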
>>> 'ABBA'.strip('AB')
''
>>> 'ABCBA'.strip('AB')
'C'
>>> 'ABCBBAA'.strip('AB')
'C'
>>> 'ABCBBAA'.lstrip('AB')
'CBBAA'
>>> 'ABCBBAA'.rstrip('AB')
'ABC'
In the first example, 'ABBA'.strip('AB'), every character of the string is either 'A' or 'B', so any combination of them ('AB', 'BA', 'AA' or 'BB') is stripped. strip() removes all of them, so the output is the empty string ''.
In the next two examples, 'ABCBA' and 'ABCBBAA', the runs of 'A' and 'B' at both ends (such as 'AB', 'BA' and 'BBAA') are removed. But since 'C' is not in the given character set, stripping stops there on each side, and 'C' is returned.
The last two examples show the behavior of lstrip() and rstrip(). In the case of lstrip(), the 'A' and 'B' characters at the left of the string were stripped. When 'C' was encountered and not found in the given character set, stripping stopped and the remaining string 'CBBAA' was returned. Notice that this did not remove the characters ('BB' and 'AA') to the right of 'C'. Likewise, rstrip() worked from right to left and stripped 'AA' and 'BB'. On encountering 'C', stripping stopped, and nothing was removed from the left. So the returned string was 'ABC'.
By now, we are familiar with how these three strip methods differ from each other in behavior. So let's focus on one method and go deeper into it. Once we are clear on one of them, we can relate it to the other two. Let's proceed with the rstrip() method.
>>> 'ABCABAB'.rstrip('M')
'ABCABAB'
>>> 'ABCABAB'.rstrip('B')
'ABCABA'
>>> 'ABCABAB'.rstrip('A')
'ABCABAB'
In the first case, the given character 'M' was not found at the right end of the string. So nothing was stripped and the string was returned as it is. In the next one, the given character 'B' was found at the rightmost end, so it was stripped. But moving further left, rstrip() encountered 'A', which was not in the given character set. So stripping stopped there and 'ABCABA' was returned.
In the last example, the given character 'A' was not found at the rightmost end of the string, so stripping stopped there without moving further left. Even though there are 'A's elsewhere in the string, they were not stripped, since rstrip() works from right to left and expects one of the given characters at the rightmost end. Otherwise, it stops immediately.
Now let's go one level deeper and try some slightly harder examples :-
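One way to see exactly where stripping stops is to count how many characters were removed. The helper stripped_count below is hypothetical, written only for this illustration.

```python
def stripped_count(text, chars):
    """Hypothetical helper: number of trailing characters rstrip(chars) removes."""
    return len(text) - len(text.rstrip(chars))

for chars in ('M', 'B', 'A', 'AB'):
    print(chars, stripped_count('ABCABAB', chars))
# M 0
# B 1
# A 0
# AB 4
```

With 'A' alone the count is 0 because the rightmost character is 'B'; with 'AB' the whole trailing run 'ABAB' goes, up to the 'C'.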
>>> 'www.docomo.org'.rstrip('rgo')
'www.docomo.'
Here, the given character set 'rgo' does not match the stripped portion 'org' letter for letter. But the string 'org' was still removed. This is because the order in which the given characters occur in the string doesn't matter for stripping. These strip methods don't search for the sequence (or order) of the given characters in the string. The characters passed as the parameter form a set, not a substring to be matched exactly. So rstrip() keeps removing any of the given characters, in any order, until it finds a character that does not belong to the given character set. Now, let's look at the code below :-
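A quick way to convince yourself that order is irrelevant is to try every ordering of the same characters. This is just a sketch; the names url and results are chosen for the example.

```python
from itertools import permutations

url = 'www.docomo.org'

# Map every ordering of the characters 'r', 'g', 'o' to the rstrip() result.
results = {''.join(p): url.rstrip(''.join(p)) for p in permutations('rgo')}

for chars, stripped in results.items():
    print(chars, '->', stripped)   # each of the 6 orderings gives www.docomo.

# Repeating characters in the argument changes nothing either.
print(url.rstrip('ooggrr'))        # www.docomo.
```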
>>> 'www.docomo.org'.rstrip('rg')
'www.docomo.o'
>>> 'www.docomo.org'.rstrip('zrgp')
'www.docomo.o'
>>> 'www.docomo.org'.rstrip('or')
'www.docomo.org'
The first example is pretty clear. In the second, rstrip() stripped whichever characters of the given set it found at the end of the string ('r' and 'g'). The characters that are in the character set but not at the end of the string ('z' and 'p' in this case) are simply ignored. In the last example, even though both characters from the character set ('o' and 'r') are present in the string 'www.docomo.org', the first character found when traversing from right to left (since rstrip() is used) is 'g', which is not in the given character set 'or'. So stripping stopped right there and nothing was removed. Hence, the input string was returned as it is.
In the below case :-
>>> 'www.docomo.com'.rstrip('cm')
'www.docomo.co'
Even though both characters ('c' and 'm') are present in the string, rstrip() removed only 'm', not 'c'. This is because, after stripping 'm', it encountered 'o' (right to left traversal in rstrip()), which is not in the given character set. So stripping stopped and the string was returned with only 'm' removed.
Now, let's try to work out the output of the following one :-
>>> 'www.docomo.com'.rstrip('com')
Many of us will give 'www.d' as the answer to the above one, reasoning that the given character set 'com' matches 'com' as well as 'ocomo' from 'docomo' in the string. But on a closer look, there is a '.' (dot) between 'docomo' and 'com', which is also a character. This dot is not in the character set to be stripped. So rstrip() stops on encountering the dot and returns 'www.docomo.'.
But what if we pass '.com' (or any shuffled set of these characters like 'mo.c' or 'c.mo') as the character set? Then it will strip the dot as well and proceed leftward to find further matches of the given characters. The run it removes is then 'ocomo.com', leaving 'www.d' as the output.
>>> 'www.docomo.com'.rstrip('com')
'www.docomo.'
>>> 'www.docomo.com'.rstrip('mc.o')
'www.d'
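If what you actually want is to remove the exact ending '.com' and nothing more, a set-based rstrip() is the wrong tool. Below is a minimal sketch using endswith() and slicing; the helper name drop_suffix is hypothetical. On Python 3.9 and later, the built-in str.removesuffix() does the same job.

```python
def drop_suffix(text, suffix):
    """Hypothetical helper: remove suffix only when text ends with that exact substring."""
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text

print(drop_suffix('www.docomo.com', '.com'))  # www.docomo
print('www.docomo.com'.rstrip('.com'))        # www.d
```

The first call treats '.com' as an ordered substring; the second treats it as the character set '.', 'c', 'o', 'm', which is why it eats into 'docomo'.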
Now that we are clear on rstrip(), we can also work out how strip() and lstrip() would behave.
Hope this helps you gain a deeper understanding of Python's strip(), lstrip() and rstrip() methods.
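As a quick check of that claim, here is a small sketch showing lstrip() following the same rule from the left, and strip() giving the same result as applying lstrip() and then rstrip() with the same character set. The variable names are illustrative.

```python
url = 'www.docomo.com'

# lstrip() applies the same rule, but starting from the left end.
left = url.lstrip('w.')
print(left)                  # docomo.com

# strip() works on both ends with the same character set.
both = url.strip('w.com')
print(both)                  # d

# Equivalent to stripping the left end first, then the right end.
print(both == url.lstrip('w.com').rstrip('w.com'))  # True
```

In the second call everything except 'd' belongs to the set, so both ends are eaten away until only 'd' remains.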
Cheers… 🙂