This repository was archived by the owner on Nov 1, 2024. It is now read-only.

Difference between train and test #39

@FlashPlayer13

Hi Anton,

I am not clear about what we use as input to the value network in the training and test phases.

In the training phase we use both public beliefs as inputs. For example, in poker we use a range for each agent. These ranges are vectors that, in most cases, contain mostly non-zero entries.
But in the test phase we know our exact infostate. For example, in poker our range is all zeros except for a 1 at the hand we actually hold. At the same time, our opponent's range is still a vector with mostly non-zero entries.
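
To make the two cases concrete, here is a minimal sketch of how I picture the inputs. The function names and the concatenated input layout are my own assumptions for illustration, not code from this repository, and the uniform ranges are just placeholders for "mostly non-zero" beliefs.

```python
import numpy as np

NUM_HANDS = 1326  # C(52, 2) two-card starting hands in Texas hold'em


def uniform_range(num_hands=NUM_HANDS):
    """Placeholder belief: equal probability on every hand."""
    return np.full(num_hands, 1.0 / num_hands)


def one_hot_range(hand_index, num_hands=NUM_HANDS):
    """Belief of a player who knows their exact hand."""
    r = np.zeros(num_hands)
    r[hand_index] = 1.0
    return r


def value_net_input(hero_range, opp_range):
    """Hypothetical input layout: both ranges concatenated into one vector."""
    return np.concatenate([hero_range, opp_range])


# Training-style input: both ranges are dense.
train_input = value_net_input(uniform_range(), uniform_range())

# Test-style input as I describe it: hero range is one-hot, opponent range is dense.
test_input = value_net_input(one_hot_range(hand_index=0), uniform_range())

print(np.count_nonzero(train_input))  # 2652
print(np.count_nonzero(test_input))   # 1327
```

So in the second case almost the entire hero half of the input is zero, which looks very different from what the network would see during training.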

And my question is:
Is it OK to train on inputs that are filled with non-zeros, but test on inputs where half of the vector (our range) is almost all zeros?

Or should we instead sample hole cards at each training iteration, so that the input is a hero range that is all zeros except for a single 1, together with an opponent range that is a full distribution over all possible hands?
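
The alternative I have in mind could look roughly like the sketch below. `sample_training_input` is a hypothetical helper, not something from this repository, and for simplicity it ignores card removal (it does not zero out opponent hands that share a card with the sampled hero hand).

```python
import numpy as np


def sample_training_input(hero_range, opp_range, rng):
    """Sample one hero hand from hero_range and return a one-hot hero input.

    Hypothetical helper: returns the concatenated input vector and the
    index of the sampled hand. The opponent range is passed through as-is.
    """
    hand = rng.choice(len(hero_range), p=hero_range)
    hero_one_hot = np.zeros_like(hero_range)
    hero_one_hot[hand] = 1.0
    return np.concatenate([hero_one_hot, opp_range]), hand


rng = np.random.default_rng(0)
num_hands = 1326  # C(52, 2) two-card starting hands
hero_range = np.full(num_hands, 1.0 / num_hands)  # placeholder belief
opp_range = np.full(num_hands, 1.0 / num_hands)   # placeholder belief

x, hand = sample_training_input(hero_range, opp_range, rng)
print(x.shape)  # (2652,)
```

With this, every training input would have the same shape as what we see at test time: a one-hot hero range followed by a dense opponent range.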
