|
1 | 1 | """ |
2 | 2 | bmh_search.py |
3 | 3 |
|
4 | | - This module implements bmh search to find a substring in a string |
| 4 | + Implementation of bmh search to find a substring in a string |
5 | 5 |
|
6 | 6 | BMH Search Overview: |
7 | 7 | -------------------- |
8 | 8 | Uses a bad-character shift of the rightmost character of the window to |
9 | 9 | compute shifts. |
10 | 10 |
|
11 | | - The trick to this algorithm is `bmbc`, a lookup table with a default |
12 | | - value equal to the length of the pattern to be searched, so that |
13 | | - the algorithm can skip `len(pattern)` indices through the string |
14 | | - for efficiency's sake. For example, if we're searching through the |
15 | | - string "cotton milled paper" for the pattern "grumble", we look at |
16 | | - the last letter "r" (BMH goes backwards through a string) and notices |
17 | | - that it is not equal to "e". Thus, we can afford to jump our search |
18 | | - index back a whole seven characters. |
19 | | -
|
20 | | - However, not all the entries in `bmbc` are equal to `len(pattern)`. |
21 | | - If we searched the string "adventure time" for "grumble", we'd find |
22 | | - the "e" to match but mismatch the "m" and "l" in the string and |
23 | | - pattern, respectively. In this case, we can only jump back six |
24 | | - characters safely, which is why `bmbc` contains values that are not |
25 | | - simply `len(pattern)`. |
26 | | -
|
27 | | - Pre: a string > substring. |
28 | | -
|
29 | | - Post: returns a list of indices where the substring was found. |
30 | | -
|
31 | 11 | Time: Complexity: O(m + n), where m is the substring to be found. |
32 | 12 |
|
33 | 13 | Space: Complexity: O(m), where m is the substring to be found. |
34 | 14 |
|
35 | 15 | Psuedo Code: https://github.com/FooBarWidget/boyer-moore-horspool |
36 | | - (converted from C++) |
37 | 16 |
|
38 | | - bmh_search.search(string, substring) -> list[integers] |
39 | | - bmh_search.search(string, substring) -> list[empty] |
40 | 17 | """ |
41 | 18 |
|
42 | 19 |
|
|
0 commit comments