dev Franco Masotti 4 years ago
parent
commit
beeddfc597
1 changed files with 105 additions and 0 deletions
1. 105
_posts/2018-03-02-algorithm.md

#### 105 _posts/2018-03-02-algorithm.md Unescape Escape View File

 `@ -0,0 +1,105 @@` `---` `title: An O(n) Python 3 algorithm that halves the number of characters to be removed.` `tags: [algorithm, string, python]` `updated: 2018-03-02 17:00` `description: A Python algorithm that halves the number of characters to be removed` `---` ``` ``` `It's been a while.` ``` ``` `Some days ago I thought to use this algorithm to remove duplicate escape` `characters in my [md-toc](http://github.com/frnmst/md-toc) program.` `I then realized I didn't need it and also noticed that dealing correctly with` `escape characters seems very hard. So I generalized the algorithm to get` `any number of a specified charatcter in a string, except `\` !!, to be` `halvened.` ``` ``` `## Algorithm` ``` ``` `Here's the algorithm along with some unit tests:` ``` ``` ````python` `import math` `import unittest` ``` ``` ``` ``` `def halve_characters(s, remove_char):` ` assert isinstance(s, str)` ` assert isinstance(remove_char, str)` ` assert len(remove_char) == 1` ` assert remove_char != '\\'` ``` ``` ` i = 0` ` final_string = str()` ` while i < len(s):` ` if s[i] == remove_char:` ` j = i` ` count = 1` ` match = True` ` while j < len(s) - 1 and match:` ` if s[j] == s[j + 1]:` ` count += 1` ` else:` ` match = False` ` j += 1` ` remove_char_count = math.floor(count / 2)` ` final_string += remove_char_count * remove_char` ` i += count` ` else:` ` final_string += s[i]` ` i += 1` ``` ``` ` return final_string` ``` ``` ``` ``` ``` ``` `class Test(unittest.TestCase):` ` def test_halve_characters(self):` ` self.assertEqual(halve_characters('thiis is a stringiioo. Hiiii', 'i'), 'this s a strngioo. Hii')` ` self.assertEqual(halve_characters('****\n', '*'), '**\n')` ` self.assertEqual(halve_characters('\\\n\\\\\n', '\n'), '\\\\\\')` ` self.assertEqual(halve_characters('', 'a'), '')` ` self.assertEqual(halve_characters('abcdefffghifffff', 'f'), 'abcdefghiff')` ``` ``` ``` ``` `if __name__ == '__main__':` ` unittest.main()` ````` ``` ``` `## Explanation` ``` ``` `The idea behind it is that we iterate character by ` `character through the string until we find the an element that needs ` `to be removed: `if s[i] == remove_char`` ``` ``` `If this condition is true, we set a counter ``count = 1`` which is a` `counter for the number of `remove_char` characters.` ``` ``` `We assume that the next character in the string will also be `remove_char`.` `If it is it then we continue to count the occurrencies of `remove_char`.` ``` ``` `Once we reach the end of the string or if there are no more ` `consecutive `remove_chars` we compute the halved number of` `remove characters with `math.floor(count / 2)` and concatenate ` ``remove_char * math.floor(count / 2)` to the final string. Using the floor ` `function implies that the count is approximated to the nearest smaller integer ` `number.` ``` ``` `If the current character is not `if s[i] != remove_char` we simply add it` `to the final string.` ``` ``` `## Complexity` ``` ``` `The complexity is O(n) because each character is inspected only once:` ``` ``` ```` python` `while i < len(s):` ` if s[i] == remove_char:` ` [...]` ` # Remember what count is.` ` i += count` ` else:` ` i += 1` ````` ``` ``` `Enjoy!`