Added new post.
This commit is contained in:
parent
d9d979db1c
commit
beeddfc597
|
@ -0,0 +1,105 @@
|
|||
---
|
||||
title: An O(n) Python 3 algorithm that halves the number of characters to be removed.
|
||||
tags: [algorithm, string, python]
|
||||
updated: 2018-03-02 17:00
|
||||
description: A Python algorithm that halves the number of characters to be removed
|
||||
---
|
||||
|
||||
It's been a while.
|
||||
|
||||
Some days ago I thought to use this algorithm to remove duplicate escape
|
||||
characters in my [md-toc](http://github.com/frnmst/md-toc) program.
|
||||
I then realized I didn't need it and also noticed that dealing correctly with
|
||||
escape characters seems very hard. So I generalized the algorithm to get
|
||||
any number of a specified charatcter in a string, except `\` !!, to be
|
||||
halvened.
|
||||
|
||||
## Algorithm
|
||||
|
||||
Here's the algorithm along with some unit tests:
|
||||
|
||||
```python
|
||||
import math
|
||||
import unittest
|
||||
|
||||
|
||||
def halve_characters(s, remove_char):
|
||||
assert isinstance(s, str)
|
||||
assert isinstance(remove_char, str)
|
||||
assert len(remove_char) == 1
|
||||
assert remove_char != '\\'
|
||||
|
||||
i = 0
|
||||
final_string = str()
|
||||
while i < len(s):
|
||||
if s[i] == remove_char:
|
||||
j = i
|
||||
count = 1
|
||||
match = True
|
||||
while j < len(s) - 1 and match:
|
||||
if s[j] == s[j + 1]:
|
||||
count += 1
|
||||
else:
|
||||
match = False
|
||||
j += 1
|
||||
remove_char_count = math.floor(count / 2)
|
||||
final_string += remove_char_count * remove_char
|
||||
i += count
|
||||
else:
|
||||
final_string += s[i]
|
||||
i += 1
|
||||
|
||||
return final_string
|
||||
|
||||
|
||||
|
||||
class Test(unittest.TestCase):
|
||||
def test_halve_characters(self):
|
||||
self.assertEqual(halve_characters('thiis is a stringiioo. Hiiii', 'i'), 'this s a strngioo. Hii')
|
||||
self.assertEqual(halve_characters('****\n', '*'), '**\n')
|
||||
self.assertEqual(halve_characters('\\\n\\\\\n', '\n'), '\\\\\\')
|
||||
self.assertEqual(halve_characters('', 'a'), '')
|
||||
self.assertEqual(halve_characters('abcdefffghifffff', 'f'), 'abcdefghiff')
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
unittest.main()
|
||||
```
|
||||
|
||||
## Explanation
|
||||
|
||||
The idea behind it is that we iterate character by
|
||||
character through the string until we find the an element that needs
|
||||
to be removed: `if s[i] == remove_char`
|
||||
|
||||
If this condition is true, we set a counter ``count = 1`` which is a
|
||||
counter for the number of `remove_char` characters.
|
||||
|
||||
We assume that the next character in the string will also be `remove_char`.
|
||||
If it is it then we continue to count the occurrencies of `remove_char`.
|
||||
|
||||
Once we reach the end of the string or if there are no more
|
||||
consecutive `remove_chars` we compute the halved number of
|
||||
remove characters with `math.floor(count / 2)` and concatenate
|
||||
`remove_char * math.floor(count / 2)` to the final string. Using the floor
|
||||
function implies that the count is approximated to the nearest smaller integer
|
||||
number.
|
||||
|
||||
If the current character is not `if s[i] != remove_char` we simply add it
|
||||
to the final string.
|
||||
|
||||
## Complexity
|
||||
|
||||
The complexity is O(n) because each character is inspected only once:
|
||||
|
||||
``` python
|
||||
while i < len(s):
|
||||
if s[i] == remove_char:
|
||||
[...]
|
||||
# Remember what count is.
|
||||
i += count
|
||||
else:
|
||||
i += 1
|
||||
```
|
||||
|
||||
Enjoy!
|
Loading…
Reference in New Issue