Given the set of actions that a member of the network can take and the probability that each action is taken at the given time step, this function randomly selects an action according to those probabilities.
kero.multib.nDnet.py
def choose_action_continuous_probability(action_set, probability_set, N=1):
    return action_out, action_index
|action_set||List of strings [a1, a2, …, an], where each ak is the identifier of an action.|
|probability_set||List of floats [p1, p2, …, pn], where each pk is a (possibly unnormalized) probability that action ak will occur. Normalization is done by dividing each pk by the sum p1+p2+…+pn.|
|N||Integer, the number of actions to be chosen. It is independent of n, the number of available actions. Default = 1.|
|return action_out||If N=1, a string, one chosen out of a1, a2, …, an. If N>1, a list of strings [act1, …, actN], each chosen out of a1, a2, …, an.|
|return action_index||If N=1, an integer; if N>1, a list of integers. Same as action_out, except that the integer index of each chosen action is returned rather than its string identifier. In example 2 below, the actions and corresponding indices are “x”:0, “y”:1, “z”:2, “GG”:3.|
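The behaviour described in the table can be approximated with the Python standard library. The following is a minimal sketch, not kero's actual implementation: the names `normalize` and `choose_action_sketch` are hypothetical, and the use of `random.choices` is an assumption about how such sampling could be done.

```python
import random

def normalize(probability_set):
    """Divide each pk by the sum p1+p2+...+pn, as described in the table."""
    total = sum(probability_set)
    return [p / total for p in probability_set]

def choose_action_sketch(action_set, probability_set, N=1):
    """Hypothetical stand-in for the documented function (an assumption,
    not kero's code): draw N actions according to the given weights."""
    weights = normalize(probability_set)
    # Draw N indices with replacement, weighted by the normalized probabilities.
    indices = random.choices(range(len(action_set)), weights=weights, k=N)
    actions = [action_set[i] for i in indices]
    if N == 1:
        # Single draw: return a string and an integer instead of lists.
        return actions[0], indices[0]
    return actions, indices

if __name__ == "__main__":
    action, index = choose_action_sketch(["x", "y", "z", "GG"], [0.25, 0.25, 0.5, 1])
    print(action, ":", index)
```

With N=1 the sketch returns a single string and integer; with N>1 it returns two lists of length N, mirroring the return values described above.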
Example Usage 1
Example Usage 2
In this example, the function draws 100000 actions in a single call (N=100000). For each action, the fraction of draws in which it was chosen approaches its normalized probability.
import numpy as np
import kero.multib.nDnet as nd

action_set = ["x", "y", "z", "GG"]
probability_set = [0.25, 0.25, 0.5, 1]  # unnormalized; sums to 2
number_of_appearances = np.zeros(len(action_set))
# print(number_of_appearances)

# Draw 100000 actions in one call.
ac_set, ac_ind = nd.choose_action_continuous_probability(action_set, probability_set, N=100000)

count = 0
for ac, i in zip(ac_set, ac_ind):
    # Print only the first 10 chosen actions.
    if count < 10:
        print(ac, ":", i)
        count = count + 1
    # Tally how often each action index was chosen.
    number_of_appearances[i] = number_of_appearances[i] + 1

s = np.sum(number_of_appearances)
fraction_of_appearances = [x/s for x in number_of_appearances]
print("...")
print("fraction list: ", fraction_of_appearances)

# Normalized probabilities, for comparison with the observed fractions.
p_norm = np.sum(probability_set)
probability_norm = [x/p_norm for x in probability_set]
print("probability norm list: ", probability_norm)
An example output is the following. The first 10 chosen actions are printed, followed by the observed fractions and the normalized probabilities. As the sample size grows, the fraction of draws in which each action was chosen gets closer to the normalized probability of that action.
x : 0
GG : 3
GG : 3
y : 1
z : 2
z : 2
GG : 3
GG : 3
GG : 3
z : 2
...
fraction list: [0.1272, 0.12569, 0.25048, 0.49663]
probability norm list: [0.125, 0.125, 0.25, 0.5]
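To judge how close the fractions are, each one can be compared with the spread expected from sampling noise alone. The sketch below assumes the draws are independent, which this page does not state about kero's sampling. Under that assumption, the standard error of an observed fraction is sqrt(p(1-p)/n) for an action with normalized probability p and n draws. The helper name `standard_error` is hypothetical, and the numbers are copied from the example output above.

```python
import math

n_draws = 100000
probability_norm = [0.125, 0.125, 0.25, 0.5]
fraction_list = [0.1272, 0.12569, 0.25048, 0.49663]  # from the example output

def standard_error(p, n):
    """Standard error of an observed fraction, assuming n independent draws."""
    return math.sqrt(p * (1.0 - p) / n)

if __name__ == "__main__":
    for p, f in zip(probability_norm, fraction_list):
        se = standard_error(p, n_draws)
        print(f"p={p} observed={f} deviation={f - p:+.5f} standard error={se:.5f}")
```

Because the standard error shrinks like 1/sqrt(n), a sample 100 times larger would be expected to reduce the typical deviation by about a factor of 10.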
kero version: 0.5.1 and above