In: Computer Science
Consider a one-dimensional random walker (using numpy) that can move every second. With probability pl = 1/3 it moves to the left, with probability pr = 1/3 it moves to the right, and with probability ps = 1/3 it rests and does not move. Assuming that at time t = 0 the random walker is at x = 0, plot (using matplotlib) the probability density function and the cumulative probability function for t = 10, t = 100, and t = 1000 seconds. Make just two plots, each showing all three time points. Remember that you need to simulate the random walk many times to get good statistics. Make the same two plots for pl = 0, pr = 1/2, and ps = 1/2. Do you understand why these plots look different? The plots that you make should be designed well. For example, they should label curves, axes, etc. (USING PYTHON)
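A minimal sketch of the simulation part of this exercise (the function name `simulate_walkers` and the walker counts are our own choices, not part of the question). It draws all steps at once with `numpy` and builds the empirical PDF and CDF from the final positions; plotting each (values, pdf) pair with matplotlib, one labeled curve per time point, would follow directly from these arrays.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_walkers(n_walkers, t, probs=(1/3, 1/3, 1/3)):
    """Simulate n_walkers independent walks for t seconds.

    Each second a walker steps -1 (left), +1 (right), or 0 (rest)
    with probabilities (pl, pr, ps). Returns the final positions.
    """
    steps = rng.choice([-1, 1, 0], size=(n_walkers, t), p=probs)
    return steps.sum(axis=1)

# Many repetitions are needed for good statistics.
positions = simulate_walkers(100_000, 100)

values, counts = np.unique(positions, return_counts=True)
pdf = counts / counts.sum()   # empirical probability mass at each position
cdf = np.cumsum(pdf)          # empirical cumulative distribution
```

For the second parameter set, pass `probs=(0, 1/2, 1/2)`: the walker can never move left, which is why those curves are shifted to positive x and look different from the symmetric case.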
Imagine we stand on the node for document 1 and randomly (uniformly) decide which edge to take to reach another node. Since there are only two edges, the probability that we go to topic 1 is 0.5, just like the probability that we go to topic 2. In total, the probabilities sum to 1.
If we repeat this thought experiment for every node, we end up with a matrix M whose columns correspond to the starting nodes and whose rows hold the respective transition probabilities. This matrix can be created in Python with the following code:
import numpy as np

M = np.array([
    [0,   0, 0,   1./3., 0.5],
    [0,   0, 0,   1./3., 0  ],
    [0,   0, 0,   1./3., 0.5],
    [0.5, 1, 0.5, 0,     0  ],
    [0.5, 0, 0.5, 0,     0  ],
])
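Because each column holds the outgoing probabilities of one node, every column of M must sum to 1 (the matrix is column-stochastic). A quick sanity check catches typos in the matrix entries:

```python
import numpy as np

M = np.array([
    [0,   0, 0,   1./3., 0.5],
    [0,   0, 0,   1./3., 0  ],
    [0,   0, 0,   1./3., 0.5],
    [0.5, 1, 0.5, 0,     0  ],
    [0.5, 0, 0.5, 0,     0  ],
])

# Each column is a probability distribution over the next node,
# so all column sums must equal 1.
col_sums = M.sum(axis=0)
```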
A real random walk takes not just one step but several. The probability of ending up at document 3 when starting from document 1 after 2 steps is therefore a combination of these probabilities: the product of the probabilities along each path, summed over all possible paths. To calculate where a walker is likely to end up after a number of steps, the following script helps.
v = np.array([1., 0., 0., 0., 0.])  # starting point: document 1
n = 2  # number of steps to take
for step in range(n):
    v = np.dot(M, v)
The vector v holds the probability of currently being at each node. As we start at document 1, the initial probability of being at that node is 100%. np.dot(M, v) performs a matrix-vector multiplication: it combines the current probabilities with the transition probabilities of the outgoing edges. Hence, we get a vector that tells us where we could end up, based on where we could have been before. After the first step, this is simply the first column of M. From there on, we could be at topic 1 or topic 2 and have to multiply those probabilities with the probabilities of the next paths.
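Putting the pieces together (assuming the node order document 1, document 2, document 3, topic 1, topic 2, which matches the matrix above), two iterations reproduce the example from the text: after one step the walker is at topic 1 or topic 2 with probability 0.5 each, and after two steps the probability of being at document 3 works out to 1/2 · 1/3 + 1/2 · 1/2 = 5/12.

```python
import numpy as np

M = np.array([
    [0,   0, 0,   1./3., 0.5],
    [0,   0, 0,   1./3., 0  ],
    [0,   0, 0,   1./3., 0.5],
    [0.5, 1, 0.5, 0,     0  ],
    [0.5, 0, 0.5, 0,     0  ],
])

v = np.array([1., 0., 0., 0., 0.])  # start at document 1

v = np.dot(M, v)   # after 1 step: the first column of M
v = np.dot(M, v)   # after 2 steps

# v[2] is the probability of being at document 3 after 2 steps: 5/12.
```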