Sampling action when using deep policy #99

@sunghwan87

Description

Hi, thank you for this package; I'm really enjoying learning about active inference.

I deeply appreciate the contributors to this package.

However, I have a question that came up when I tried to implement the explore-exploit task from this article (Smith et al., "A step-by-step tutorial on active inference and its application to empirical data", https://doi.org/10.1016/j.jmp.2021.102632), which is already implemented in both MATLAB and "pymdp".

I tried to run an active inference loop with a deep policy (two timesteps), following the "complete recipe for active inference" in the "pymdp" tutorial notebook. However, I found that the "sample_action" method of the "Agent" class only samples the action from the first timestep of each policy (each policy has shape (2, 2), where the first dimension is the number of timesteps and the second is the number of control factors), via the "control.sample_policy" function, as shown below:

(line 674-675, control.py)

for factor_i in range(num_factors):
    selected_policy[factor_i] = policies[policy_idx][0, factor_i]

My setting of the agent class was:

timepoints = [0,1,2]
agent = Agent(
    A = A_gm,
    B = B,
    C = C,
    D = D_gm,
    E = E,
    pA = pA,
    pD = pD,
    policies = policies,
    policy_len = policies[0].shape[0],
    inference_horizon = len(timepoints),  
    inference_algo="MMP",
    sampling_mode="full",
    modalities_to_learn=[1],
    use_BMA = True,
    policy_sep_prior = False,
)

In my opinion, to sample the action for the other timesteps of each policy, line 675 would be better changed like this:

selected_policy[factor_i] = policies[policy_idx][timestep, factor_i]
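To illustrate the proposed change, here is a standalone sketch (not the actual "pymdp" library code; the function name, signature, and example policies are my own) of selecting the action at an arbitrary timestep of the chosen policy rather than always at timestep 0:

```python
import numpy as np

def select_action_at_timestep(policies, policy_idx, timestep, num_factors):
    """Return the action for each control factor at the given timestep
    of the chosen policy, instead of always taking row 0."""
    selected_policy = np.zeros(num_factors)
    for factor_i in range(num_factors):
        # The proposed indexing: use `timestep` rather than the constant 0
        selected_policy[factor_i] = policies[policy_idx][timestep, factor_i]
    return selected_policy

# Example: two policies, each with 2 timesteps and 2 control factors
policies = [np.array([[0, 1], [1, 0]]), np.array([[1, 1], [0, 0]])]

# timestep=0 reproduces the current behaviour; timestep=1 samples
# the second step of the same policy
print(select_action_at_timestep(policies, 0, 0, 2))  # [0. 1.]
print(select_action_at_timestep(policies, 0, 1, 2))  # [1. 0.]
```

Here `timestep` would presumably be the agent's current position within the policy horizon, which the calling loop would need to track.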

If I have misunderstood the package, please let me know how to correct this.

Thank you!
