@marimeireles
Collaborator

No description provided.

…observations = 0.5 for partial observable agents

return histSjA_RewardTensor(self.baseenv, self.h)

def ObservationTensor(self):
Collaborator Author

I changed this function so that it can generate a different observation tensor for each agent.
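For readers following along, here is a minimal sketch of what a per-agent observation tensor could look like. It is not the code in this PR: the helper name `build_observation_tensor`, the `observes_fully` flag list, and the index layout `O[i, s, o]` (agent, true state, observation) are all assumptions made for illustration. With two states, the uninformed agent ends up with 0.5 in every entry, which is one way to read the commit title.

```python
import numpy as np

def build_observation_tensor(n_states, observes_fully):
    """Hypothetical helper: O[i, s, o] = P(agent i observes o | true state s)."""
    n_agents = len(observes_fully)
    O = np.empty((n_agents, n_states, n_states))
    for i, full in enumerate(observes_fully):
        if full:
            # Fully observing agent: the observation always equals the state.
            O[i] = np.eye(n_states)
        else:
            # Uninformed agent: every observation is equally likely in every state.
            O[i] = np.full((n_states, n_states), 1.0 / n_states)
    return O

# Agent 0 sees the true state, agent 1 sees pure noise.
O = build_observation_tensor(n_states=2, observes_fully=[True, False])
```

Each slice `O[i]` is row-stochastic, so every agent's observation probabilities sum to 1 for each state.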

@@ -0,0 +1,170 @@
# AUTOGENERATED! DO NOT EDIT! File to edit: ../../nbs/Environments/02_HeterogeneousObservationsEnv.ipynb.
Collaborator Author

Most of the changes are in this file. It largely adapts the ebase file to handle multiple (per-agent) observations.

@@ -0,0 +1,127 @@
# AUTOGENERATED! DO NOT EDIT! File to edit: ../../nbs/Environments/12_MultipleObsSocialDilemma.ipynb.
Collaborator Author

This file simply implements the social dilemma layer on top of the heterogeneous observation environment file.
I initially tried to incorporate the "contract" idea because I saw it in the Uncertain Environment file. However, I don't really understand the dynamics of contract and I don't think it's fully functional, so I still need to work on it.
I thought it wasn't relevant for our project, since the IPD only has one state. Please let me know if I misunderstood this.

@marimeireles marimeireles marked this pull request as draft March 15, 2024 15:09
@marimeireles
Collaborator Author

marimeireles commented Mar 15, 2024

I'm also a bit confused about whether observation tensors can have entries that sum to more than 1 or less than 1. I guess the only reason they cannot is generate_stochastic_observations... But we could change that.
I'm not sure it makes sense, but I thought an agent could have, for example, a 0% chance of observing something, or tensors that look like [0, 0.8, 0.6, 0.4] or [0.7, 0., 0., 0.]. If this is not possible, are there reasons other than the use of generate_stochastic_observations in the step function?
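To make the question concrete, here is a small sketch of one possible workaround, under the assumption that each row is sampled as a categorical distribution over mutually exclusive observations (NumPy's `Generator.choice` requires `p` to sum to 1). The helper `add_null_observation` is hypothetical and is not part of this PR or of generate_stochastic_observations: it appends a "nothing observed" entry that absorbs the missing probability mass, so a row like [0.7, 0., 0., 0.] becomes a valid distribution.

```python
import numpy as np

def add_null_observation(row):
    """Hypothetical helper: append a 'nothing observed' entry holding the missing mass."""
    row = np.asarray(row, dtype=float)
    missing = 1.0 - row.sum()
    if missing < -1e-12:
        # A row summing to more than 1 cannot describe exclusive observations.
        raise ValueError("row sums to more than 1")
    return np.append(row, max(missing, 0.0))

rng = np.random.default_rng(0)

# [0.7, 0, 0, 0] gains a fifth entry of about 0.3 meaning "no observation".
padded = add_null_observation([0.7, 0.0, 0.0, 0.0])
sample = rng.choice(len(padded), p=padded)

# add_null_observation([0, 0.8, 0.6, 0.4]) would raise ValueError (sum is 1.8).
```

A row such as [0, 0.8, 0.6, 0.4] only makes sense under a different reading, for instance independent detection probabilities per observation, which would need a different sampling step than a single categorical draw.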
