String - Substring

Catalogue
  1. 1. is Substring
    1. 1.1. 28. Implement strStr()
      1. 1.1.1. Solution1. Naive
      2. 1.1.2. Solution2. KMP
      3. 1.1.3. Solution3. Robin-Karp
  2. 2. Sliding Window
    1. 2.1. 3. Longest Substring Without Repeating
    2. 2.2. 159. Longest Substring with At Most Two Distinct Characters
    3. 2.3. 340. Longest Substring with At Most K Distinct Characters
    4. 2.4. 76. Minimum Window Substring
    5. 2.5. 438. Find All Anagrams in a String

is Substring

28. Implement strStr()

  • input : 2 string

    Solution1. Naive

  • Time : O(m * n)
  • try every possible start index in haystack
1
2
3
4
5
6
7
class Solution(object):
def strStr(self, haystack, needle):
m, n = len(needle), len(haystack)
for i in xrange(n-m+1):
if haystack[i:i+m] == needle:
return i
return -1

Solution2. KMP

  • Time : O(m + n)
  • know the basic idea of KMP and can walk through some test cases
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    class Solution(object):
    def getNext(self, pattern):
    m = len(pattern)
    next = [0] * m
    k = 0
    for i in xrange(1, m):
    while k > 0 and pattern[k] != pattern[i]:
    k = next[k-1]
    if pattern[k] == pattern[i]:
    k += 1
    next[i] = k
    return next

    def strStr(self, haystack, needle):
    if not needle:
    return 0
    next = self.getNext(needle)
    n = len(haystack)
    p = 0
    for i in xrange(n):
    while p > 0 and needle[p] != haystack[i]:
    p = next[p-1]
    if needle[p] == haystack[i]:
    p += 1
    if p == len(needle):
    return i-len(needle)+1
    return -1

Solution3. Robin-Karp

  • Time : O(m+n)
  • Principle: if we can hash the short string to a hashed value (e.g. integer, which is unique). Then we can just compare each substring of s1’s hashed value and compare it with s2’s hash value.
  • Assumption: only lower case letter (base = 26 a–z)
  • Each time you slide the window, add right, delete left and times 26 in the middle!!!

Sliding Window

A string has $n^2$ substrings! These type of question ask you to find a subtring meet some certain condition!

  1. When to move the right pointer and when to move the left pointer
  2. use hashset
1
2
3
4
5
6
7
8
9
10
11
12
13
14
class Solution(object):
def lengthOfLongestSubstring(self, s):
visited = set()
left = right = 0
res = 0
while right < len(s):
if s[right] not in visited:
visited.add(s[right])
res = max(res, right-left+1)
right += 1
else:
visited.remove(s[left])
left += 1
return res

159. Longest Substring with At Most Two Distinct Characters

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
class Solution(object):
def lengthOfLongestSubstringTwoDistinct(self, s):
dic = {}
n = len(s)
cnt = 0
left = right = 0
res = 0
while right < n:
if s[right] in dic:
dic[s[right]] += 1
right += 1
else:
while cnt >= 2:
dic[s[left]] -= 1
if dic[s[left]] == 0:
cnt -= 1
del dic[s[left]]
left += 1
dic[s[right]] = 1
cnt += 1
right += 1
res = max(res, right-left)
return res

340. Longest Substring with At Most K Distinct Characters

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
class Solution(object):
def lengthOfLongestSubstringKDistinct(self, s, k):
chars = [0] * 256
start = 0
res = cnt = 0
for i in xrange(len(s)):
idx = ord(s[i])
if chars[idx] == 0:
cnt += 1
chars[idx] += 1

while cnt > k:
idx = ord(s[start])
chars[idx] -= 1
if chars[idx] == 0:
cnt -= 1
start += 1
res = max(res, i-start+1)
return res

76. Minimum Window Substring

  • it will be easy to let right pointer be the iteration pointer
  • cnt is the flag to show whether it satisfies the condition
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    class Solution(object):
    def minWindow(self, s, t):
    tmap = {}
    for c in t:
    tmap[c] = tmap.get(c, 0) + 1

    cnt = len(t)
    start = 0
    res, ret = len(s)+1, ""
    for i in xrange(len(s)):
    if s[i] in tmap:
    tmap[s[i]] -= 1
    if tmap[s[i]] >= 0:
    cnt -= 1

    while cnt == 0:
    win = i-start+1
    if win < res:
    res = win
    ret = s[start:i+1]
    if s[start] in tmap:
    if tmap[s[start]] >= 0:
    cnt += 1
    tmap[s[start]] += 1
    start += 1
    return ret

438. Find All Anagrams in a String

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
class Solution(object):
def findAnagrams(self, s, p):
if len(s) < len(p):
return []

dic = collections.defaultdict(int)
for ch in p:
dic[ch] += 1

cnt = len(dic.keys())
k = len(p)
for i in range(k):
if s[i] in dic:
dic[s[i]] -= 1
if dic[s[i]] == 0:
cnt -= 1

res = []
if cnt == 0:
res.append(0)

for i in range(k, len(s)):
if s[i] in dic:
dic[s[i]] -= 1
if dic[s[i]] == 0:
cnt -= 1

if s[i-k] in dic:
dic[s[i-k]] += 1
if dic[s[i-k]] == 1:
cnt += 1

if cnt == 0:
res.append(i-k+1)
return res
Share