Reservoir Sampling
Shuffling Algorithm
pokers(spades, hearts, diamonds, clubs)
Iteration 1
Call random(0-51), lets say, random_numbers1 = 5
P(every card can showup in position 51) = 1/52Iteration 2
Call random(0-50), lets say, random_numbers2 = 8
Every card was NOT selected during previous iteration = 1 - 1/52
P(every card showup in position 50) = (1-1/52) * 1/51 = 1/52Iteration 3
Every card was NOT selected during previous iteration = 1 - 1/52
Every card was NOT selected during previous iteration = 1 - 1/51
P(every card showup in position 50) = (1-1/52) (1-1/51) 1/50 = 1/52
…..
- Select random k elements in an array of size n
Sampling for an unlimited data flow
How to do sampling for an unlimited data flow and when reading the n-th element we are required to return one random number among all numbers read so far, such that the probability of returning any element read so far is 1/n.
- O(1) sapce complexity
- t = 0 call random[0-0] = random_num1, if random_num1 == 0 -> solu = input[0]
P(input[0] will be returned as the solu) = 100% - t = 1 call random[0-1] = random_num2, if random_num2 == 0 -> solu = input[1]
P(input[0] will be returned as the solu) = 1 - 50%
P(input[1] will be returned as the solu) = 50% - t = 2 call random[0-2] = random_num3, if random_num3 == 0 -> solu = input[2]
P(input[0] will be returned as the solu) = 1/2 (1 - 1/3) = 1/3
P(input[1] will be returned as the solu) = 1/2 (1 - 1/3) = 1/3
P(input[2] will be returned as the solu) = 1/3
….
Return k out of n log
- same thing as above!
- first k element put into the reservoir
- when new element comes in, generate a random number from 0 to its index
- if index < k, replace the new element with the k postion element
398. Random Pick Index
1 | class Solution(object): |
Return a random largest number’s index
Randomization
Random(5) -> Random(25) -> Random(7)
0 1 2 3 4
5 6 7 8 9
10 11 12 13 14
15 16 17 18 19
20 21 22 23 24
- call Random(5) for the 1st time = random_num1[0—4] = row number
- call Random(5) for the 2nd time = random_num2[0—4] = col number
generated_number = r1 * 5 + r2
470. Implement Rand10() Using Rand7()
1 | class Solution(object): |
Design Random(1,000,000) with Random(5)
- $log_5(1000000)$
528. Random Pick with Weight
- PreSum + Binary Search
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21class Solution(object):
def __init__(self, w):
if not w: return
s = [w[0]]
for weight in w[1:]:
s.append(s[-1] + weight)
self.sum = s
print(self.sum)
def pickIndex(self):
num = random.randint(1, self.sum[-1])
start, end = 0, len(self.sum)-1
while start < end:
mid = (start + end) / 2
if self.sum[mid] < num:
start = mid + 1
else:
end = mid
return start
Random
380. Insert Delete GetRandom O(1)
1 | class RandomizedSet(object): |