Skip to content

Commit 273702e

Browse files
committed
Add number theory, string matching, dynamic programming codes
1 parent bbe68d7 commit 273702e

23 files changed

Lines changed: 641 additions & 123 deletions

README.md

Lines changed: 20 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -17,17 +17,6 @@ So,if you wannt to view the notes which contain latex math formulas and are in m
1717
* [.](.)
1818
* [README.md](./README.md)
1919
* [backtracking](./backtracking)
20-
* [computationalMethod](./computationalMethod)
21-
* [README.md](./computationalMethod/README.md)
22-
* [interplotion.py](./computationalMethod/interplotion.py)
23-
* [iteration.py](./computationalMethod/iteration.py)
24-
* [least_square.py](./computationalMethod/least_square.py)
25-
* [linear_equation.py](./computationalMethod/linear_equation.py)
26-
* [numerical_differential.py](./computationalMethod/numerical_differential.py)
27-
* [numerical_integration.py](./computationalMethod/numerical_integration.py)
28-
* [solve-linear-by-iteration.py](./computationalMethod/solve-linear-by-iteration.py)
29-
* [tongyu_equation.py](./computationalMethod/tongyu_equation.py)
30-
* [vector_norm.py](./computationalMethod/vector_norm.py)
3120
* [dataStructure](./dataStructure)
3221
* [allOone](./dataStructure/allOone)
3322
* [bTree.py](./dataStructure/bTree.py)
@@ -45,7 +34,7 @@ So,if you wannt to view the notes which contain latex math formulas and are in m
4534
* [redBlackTree.py](./dataStructure/redBlackTree.py)
4635
* [redBlackTree0.py](./dataStructure/redBlackTree0.py)
4736
* [splayTree.py](./dataStructure/splayTree.py)
48-
* [trie.py](./dataStructure/trie.py)
37+
* [trie](./dataStructure/trie)
4938
* [winnerTree.py](./dataStructure/winnerTree.py)
5039
* [divideAndConquer](./divideAndConquer)
5140
* [min_distance_of_n_points.py](./divideAndConquer/min_distance_of_n_points.py)
@@ -59,22 +48,40 @@ So,if you wannt to view the notes which contain latex math formulas and are in m
5948
* [red-black-tree.md](./docs/red-black-tree.md)
6049
* [sort.md](./docs/sort.md)
6150
* [src](./docs/src)
51+
* [string-matching.md](./docs/string-matching.md)
6252
* [tree.md](./docs/tree.md)
6353
* [dynamicProgramming](./dynamicProgramming)
6454
* [Vec2d.hs](./dynamicProgramming/Vec2d.hs)
6555
* [lcs.py](./dynamicProgramming/lcs.py)
6656
* [matrixChainMultiply.py](./dynamicProgramming/matrixChainMultiply.py)
6757
* [splitStripe.hs](./dynamicProgramming/splitStripe.hs)
6858
* [splitStripe.py](./dynamicProgramming/splitStripe.py)
59+
* [stoneGame.py](./dynamicProgramming/stoneGame.py)
6960
* [testVec2d.hs](./dynamicProgramming/testVec2d.hs)
61+
* [wildcard_matching.py](./dynamicProgramming/wildcard_matching.py)
7062
* [math](./math)
63+
* [README.md](./math/README.md)
64+
* [euler.py](./math/euler.py)
65+
* [factor.py](./math/factor.py)
66+
* [gcd.py](./math/gcd.py)
7167
* [isPrime.py](./math/isPrime.py)
68+
* [modulo_equation.py](./math/modulo_equation.py)
7269
* [num_weight.py](./math/num_weight.py)
7370
* [permute_back_track.py](./math/permute_back_track.py)
7471
* [permute_cantor.c](./math/permute_cantor.c)
7572
* [permute_divide_and_conquer.py](./math/permute_divide_and_conquer.py)
7673
* [permute_next_arrangement.c](./math/permute_next_arrangement.c)
7774
* [primesLEn.hs](./math/primesLEn.hs)
75+
* [numericalAnalysis](./numericalAnalysis)
76+
* [README.md](./numericalAnalysis/README.md)
77+
* [interplotion.py](./numericalAnalysis/interplotion.py)
78+
* [iteration.py](./numericalAnalysis/iteration.py)
79+
* [least_square.py](./numericalAnalysis/least_square.py)
80+
* [linear_equation.py](./numericalAnalysis/linear_equation.py)
81+
* [numerical_differential.py](./numericalAnalysis/numerical_differential.py)
82+
* [numerical_integration.py](./numericalAnalysis/numerical_integration.py)
83+
* [solve-linear-by-iteration.py](./numericalAnalysis/solve-linear-by-iteration.py)
84+
* [vector_norm.py](./numericalAnalysis/vector_norm.py)
7885
* [search](./search)
7986
* [8Astar.py](./search/8Astar.py)
8087
* [BFS_knight.hs](./search/BFS_knight.hs)
@@ -96,6 +103,7 @@ So,if you wannt to view the notes which contain latex math formulas and are in m
96103
* [rabin_karp.py](./string/rabin_karp.py)
97104
* [src](./string/src)
98105
* [sunday.py](./string/sunday.py)
106+
* [wildcard_matching.py](./string/wildcard_matching.py)
99107
* [utils](./utils)
100108
* [config.py](./utils/config.py)
101109
* [genReadme.py](./utils/genReadme.py)

dataStructure/trie/mapSum.py

Lines changed: 81 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,81 @@
1+
#coding: utf-8
2+
''' mbinary
3+
#######################################################################
4+
# File : mapSum.py
5+
# Author: mbinary
6+
7+
# Blog: https://mbinary.coding.me
8+
# Github: https://github.com/mbinary
9+
# Created Time: 2018-12-14 23:11
10+
# Description:
11+
实现一个 MapSum 类里的两个方法,insert 和 sum。
12+
13+
对于方法 insert,你将得到一对(字符串,整数)的键值对。字符串表示键,整数表示值。如果键已经存在,那么原来的键值对将被替代成新的键值对。
14+
15+
对于方法 sum,你将得到一个表示前缀的字符串,你需要返回所有以该前缀开头的键的值的总和。
16+
17+
示例 1:
18+
19+
输入: insert("apple", 3), 输出: Null
20+
输入: sum("ap"), 输出: 3
21+
输入: insert("app", 2), 输出: Null
22+
输入: sum("ap"), 输出: 5
23+
leetcode-ch:677 https://leetcode-cn.com/problems/map-sum-pairs/
24+
#######################################################################
25+
'''
26+
27+
class node:
28+
def __init__(self,ch,n=0):
29+
self.ch = ch
30+
self.n = n
31+
self.children={}
32+
def __getitem__(self,ch):
33+
return self.children[ch]
34+
def __setitem__(self,ch,nd):
35+
self.children[ch]=nd
36+
def __len__(self):
37+
return len(self.children)
38+
def __iter__(self):
39+
return iter(self.children.values())
40+
def __delitem(self,ch):
41+
del self.children[ch]
42+
def __contains__(self,ch):
43+
return ch in self.children
44+
class MapSum:
45+
46+
def __init__(self):
47+
"""
48+
Initialize your data structure here.
49+
"""
50+
self.root = node('')
51+
52+
def insert(self, key, val):
53+
"""
54+
:type key: str
55+
:type val: int
56+
:rtype: void
57+
"""
58+
nd = self.root
59+
for i in key:
60+
if i not in nd:
61+
nd[i] = node(i)
62+
nd = nd[i]
63+
nd.n = val
64+
65+
def visit(self,nd):
66+
ret = nd.n
67+
for chd in nd:
68+
ret+=self.visit(chd)
69+
return ret
70+
def sum(self, prefix):
71+
"""
72+
:type prefix: str
73+
:rtype: int
74+
"""
75+
nd = self.root
76+
for ch in prefix:
77+
if ch in nd:
78+
nd = nd[ch]
79+
else:return 0
80+
return self.visit(nd)
81+

docs/string-matching.md

Lines changed: 110 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,110 @@
1+
# String Matching algorithm
2+
3+
![](https://upload-images.jianshu.io/upload_images/7130568-e10dc137e9083a0e.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
4+
5+
## Rabin-Karp
6+
We can view a string of k characters (digits) as a length-k decimal number. E.g., the string “31425” corresponds to the decimal number 31,425.
7+
- Given a pattern P [1..m], let p denote the corresponding decimal value.
8+
- Given a text T [1..n], let $t_s$ denote the decimal value of the length-m substring T [(s+1)..(s+m)] for s=0,1,…,(n-m).
9+
- let `d` be the radix of num, thus $d = len(set(s))$
10+
- $t_s$ = p iff T [(s+1)..(s+m)] = P [1..m].
11+
- p can be computed in O(m) time. p = P[m] + d\*(P[m-1] + d\*(P[m-2]+…)).
12+
- t0 can similarly be computed in O(m) time.
13+
- Other $t_1,\ldots,t_{n-m}$ can be computed in O(n-m) time since $t_{s+1} can be computed from ts in constant time.
14+
Namely,
15+
16+
$$
17+
t_{s+1} = d*(t_s-d^{m-1} * T[s+1])+T[s+m+1]
18+
$$
19+
However, it's no need to calculate $t_{s+1}$ directly. We can use modulus operation to reduce the work of caculation.
20+
21+
We choose a small prime number. Eg 13 for radix( noted as d) 10.
22+
Generally, d\*q should fit within one computer word.
23+
24+
We firstly caculate t0 mod q.
25+
Then, for every $t_i (i>1)$
26+
assume
27+
$$
28+
t_{i-1} = T[i+m-1] + 10*T[i+m-2]+\ldots+10^{m-1}*T[i-1]
29+
$$
30+
denote $ d' = d^{m-1}\ mod\ q$
31+
thus,
32+
$$
33+
\begin{aligned}
34+
t_i &= (t_{i-1} - d^{m-1}*T[i-1]) * d + T[i+m]\\
35+
&\equiv (t_{i-1} - d^{m-1}*T[i-1]) * d + T[i+m] (mod\ q)\\
36+
&\equiv (t_{i-1}- ( d^{m-1} mod \ q) *T[i-1]) * d + T[i+m] (mod\ q)\\
37+
&\equiv (t_{i-1}- d'*T[i-1]) * d + T[i+m] (mod\ q)
38+
\end{aligned}
39+
$$
40+
41+
So we can compare the modular value of each ti with p's.
42+
Only if they are the same, then we compare the origin chracter, namely $T[i],T[i+1],\ldots,T[i+m-1]$ and the pattern.
43+
Gernerally, this algorithm's time approximation is O(n+m), and the worst case is O((n-m+1)\*m)
44+
45+
**Problem: this is assuming p and ts are small numbers. They may be too large to work with easily.**
46+
47+
## FSM
48+
A FSM can be represented as (Q,q0,A,S,C), where
49+
- Q is the set of all states
50+
- q0 is the start state
51+
- $A\in Q$ is a set of accepting states.
52+
- S is a finite input alphabet.
53+
- C is the set of transition functions: namely $q_j = c(s,q_i)$.
54+
55+
Given a pattern string S, we can build a FSM for string matching.
56+
Assume S has m chars, and there should be m+1 states. One is for the begin state, and the others are for matching state of each position of S.
57+
58+
Once we have built the FSM, we can run it on any input string.
59+
## KMP
60+
>Knuth-Morris-Pratt method
61+
62+
The idea is inspired by FSM. We can avoid computing the transition functions. Instead, we compute a prefix functi`Next` on P in O(m) time, and Next has only m entries.
63+
> Prefix funtion stores info about how the pattern matches against shifts of itself.
64+
65+
- String w is a prefix of string x, if x=wy for some string y
66+
- String w is a suffix of string x, if x=yw for some string y
67+
- The k-character prefix of the pattern P [1..m] denoted by Pk.
68+
- Given that pattern prefix P [1..q] matches text characters T [(s+1)..(s+q)], what is the least shift s'> s such that P [1..k] = T [(s'+1)..(s'+k)] where s'+k=s+q?
69+
- At the new shift s', no need to compare the first k characters of P with corresponding characters of T.
70+
Method: For prefix pi, find the longest proper prefix of pi that is also a suffix of pi.
71+
next[q] = max{k|k\<q and pk is a suffix of pq}
72+
73+
For example: p = ababaca, for p5 = ababa, Next[5] = 3. Namely p3=aba is the longest prefix of p that is also a suffix of p5.
74+
75+
Time approximation: finding prefix function `next` take O(m), matching takes O(m+n)
76+
77+
## Boyer-Moore
78+
- The longer the pattern is, the faster it works.
79+
- Starts from the end of pattern, while KMP starts from the beginning.
80+
- Works best for character string, while KMP works best for binary string.
81+
- KMP and Boyer-Moore
82+
- Preprocessing existing patterns.
83+
- Searching patterns in input strings.
84+
## Sunday
85+
### features
86+
- simplification of the Boyer-Moore algorithm;
87+
- uses only the bad-character shift;
88+
- easy to implement;
89+
- preprocessing phase in O(m+sigma) time and O(sigma) space complexity;
90+
- searching phase in O(mn) time complexity;
91+
- very fast in practice for short patterns and large alphabets.
92+
### description
93+
The Quick Search algorithm uses only the bad-character shift table (see chapter Boyer-Moore algorithm). After an attempt where the window is positioned on the text factor y[j .. j+m-1], the length of the shift is at least equal to one. So, the character y[j+m] is necessarily involved in the next attempt, and thus can be used for the bad-character shift of the current attempt.
94+
95+
The bad-character shift of the present algorithm is slightly modified to take into account the last character of x as follows: for c in Sigma, qsBc[c]=min{i : 0 < i leq m and x[m-i]=c} if c occurs in x, m+1 otherwise (thanks to Darko Brljak).
96+
97+
The preprocessing phase is in O(m+sigma) time and O(sigma) space complexity.
98+
99+
During the searching phase the comparisons between pattern and text characters during each attempt can be done in any order. The searching phase has a quadratic worst case time complexity but it has a good practical behaviour.
100+
101+
For instance,
102+
![image.png](https://upload-images.jianshu.io/upload_images/7130568-76d130ae24603d51.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
103+
104+
In this example, t0, ..., t4 = a b c a b is the current text window that is compared with the pattern. Its suffix a b has matched, but the comparison c-a causes a mismatch. The bad-character heuristics of the Boyer-Moore algorithm (a) uses the "bad" text character c to determine the shift distance. The Horspool algorithm (b) uses the rightmost character b of the current text window. The Sunday algorithm (c) uses the character directly right of the text window, namely d in this example. Since d does not occur in the pattern at all, the pattern can be shifted past this position.
105+
106+
107+
# Reference:
108+
1. Xuyun, ppt, String matching
109+
2. [Sunday-algorithm](http://www.inf.fh-flensburg.de/lang/algorithmen/pattern/sunday.htm)
110+
3. GeeksforGeeks, [KMP Algorithm](https://www.geeksforgeeks.org/kmp-algorithm-for-pattern-searching/)

dynamicProgramming/stoneGame.py

Lines changed: 36 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,36 @@
1+
#coding: utf-8
2+
''' mbinary
3+
#######################################################################
4+
# File : stoneGame.py
5+
# Author: mbinary
6+
7+
# Blog: https://mbinary.coding.me
8+
# Github: https://github.com/mbinary
9+
# Created Time: 2018-12-14 14:32
10+
# Description:
11+
亚历克斯和李用几堆石子在做游戏。偶数堆石子排成一行,每堆都有正整数颗石子 piles[i] 。游戏以谁手中的石子最多来决出胜负。石子的总数是奇数,所以没有平局。
12+
亚历克斯和李轮流进行. 每回合,玩家从行的开始或结束处取走整堆石头。 这种情况一直持续到没有更多的石子堆为止,此时手中石子最多的玩家获胜。
13+
那么先手一定会赢吗? 是的, 求出先手比后手多的石子数.
14+
leetcode-cn 877: https://leetcode-cn.com/problems/stone-game/
15+
#######################################################################
16+
'''
17+
def stoneGame(li):
18+
'''li: list, len(li)%2==0, sum(li)%2==1'''
19+
def f(p,q):
20+
ret = dp[p][q]
21+
if ret is None:
22+
if p+1==q:
23+
ret = abs(li[p]-li[q])
24+
else:
25+
# max min
26+
# take the first one
27+
n1 = li[p] + min(-li[p+1]+f(p+2,q),-li[q]+f(p+1,q-1))
28+
# take the last one
29+
n2 = li[q] + min(-li[p]+f(p+1,q-1),-li[q-1]+f(p,q-2))
30+
ret = max(n1,n2)
31+
dp[p][q] = ret
32+
#print(li[p:q+1],ret)
33+
return ret
34+
n = len(li)
35+
dp = [[None for i in range(n)] for i in range(n)]
36+
return f(0,n-1)
Lines changed: 57 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,57 @@
1+
#coding: utf-8
2+
''' mbinary
3+
#######################################################################
4+
# File : wildcard_matching.py
5+
# Author: mbinary
6+
7+
# Blog: https://mbinary.coding.me
8+
# Github: https://github.com/mbinary
9+
# Created Time: 2018-12-13 22:46
10+
# Description:
11+
wild card '*' matches 0 or any chars, and '?' matches any single char.
12+
#######################################################################
13+
'''
14+
15+
16+
'''
17+
idea
18+
19+
dynamic programming
20+
21+
dp[m+1][n+1]: bool
22+
23+
i:n, j:m
24+
dp[j][i] indicates if s[:i+1] matches p[:j+1]
25+
26+
initial: dp[0][0] = True, dp[0][i],dp[j][0] = False
27+
only if p startswith '*', dp[1][0] = True.
28+
29+
if p[j] = '*': dp[j][i] = dp[j-1][i] or dp[j][i-1]
30+
elif p[j] = '?': dp[j][i] = dp[j-1][i-1]
31+
else : dp[j][i] = dp[j-1][i-1] and s[i] == p[j]
32+
'''
33+
34+
# leetcode: q44 https://leetcode.com/problems/wildcard-matching/description/
35+
36+
def isMatch(self, s, p):
37+
"""
38+
:type s: str
39+
:type p: str pattern str including wildcard
40+
:rtype: bool
41+
"""
42+
n,m = len(s),len(p)
43+
last = [False]*(n+1)
44+
last[0] = True
45+
for j in range(m):
46+
if p[j]=='*':
47+
for i in range(n):
48+
last[i+1] = last[i+1] or last[i]
49+
elif p[j]=='?':
50+
last.pop()
51+
last.insert(0,False)
52+
else:
53+
li = [False]
54+
for i in range(n):
55+
li.append( last[i] and p[j]==s[i])
56+
last = li
57+
return last[-1]

math/README.md

Lines changed: 32 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,32 @@
1+
# Number Theory
2+
>See more details on [my blog](https://mbinary.coding.me/number-theory.html)
3+
4+
## gcd.py
5+
- gcd(a,b)
6+
- xgcd(a,b): return gcd(a,b),x,y, where ax+by = gcd(a,b)
7+
8+
## isPrime
9+
- isPrime(n): using Miller-Rabin primalrity test.
10+
- primeSieve
11+
12+
## euler.py
13+
- phi(n): euler function
14+
- sigma(n): return the sum of all factors of n
15+
16+
## factor.py
17+
- factor(n): return a list of numbers that are the factorization of n
18+
19+
## modulo_equation.py
20+
Solving modulo equations
21+
22+
```python
23+
solver = solve()
24+
res = solver.solveLinear(3,6,9)
25+
26+
res = solver.solveHigh(1,8,3,11)
27+
28+
res = solver.solveGroup([(5,11,2),(3,8,5),(4,1,7)])
29+
```
30+
31+
# Permutation
32+

0 commit comments

Comments
 (0)