Hey there! First of all: thank you so much for releasing this, documenting things, putting it on pypi, etc. etc., really appreciate it :)
I've been trying to get a fun "search and rescue" example working, where a drone with a search radius explores a map until it finds the objective. Right now I'm having trouble formatting the state input properly, assuming I've understood the rest. It seems like I should keep calling DQNAgent.learn over and over until I'm satisfied? I was kind of confused by the DQNA example with all the socketIO stuff and wasn't sure how it was driving the learning.
```python
# some pseudo code
import numpy as np

height = width = 40
map = np.zeros((height, width))
ACTIONS = ('up', 'down', 'left', 'right')
agent = DQNAgent(height * width, len(ACTIONS))

while True:
    state = map.copy()
    action = agent.get_action(state)
    drone.do_action(ACTIONS[action])
    # Get state after the action has changed it
    next_state = map.copy()
    reward = drone.get_current_reward()  # not sure what to set this to on the first "learn"
    agent.learn(state, action, reward, next_state)
```

This raises:

> Wrong number of dimensions: expected 2, got 3 with shape (1, 40, 40).
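For what it's worth, my guess is that the agent expects a flat feature vector rather than the 2-D grid (since I constructed it with `height * width` inputs). Here's a minimal numpy-only sketch of the shape handling I mean; `grid` stands in for my map, and no DQNAgent is involved:

```python
import numpy as np

height = width = 40
grid = np.zeros((height, width))

# Passing the raw 2-D grid with a batch axis prepended gives shape
# (1, 40, 40), which matches the "expected 2, got 3" error above.
batched_2d = grid[np.newaxis, ...]
print(batched_2d.shape)  # (1, 40, 40)

# Flattening first gives a (height * width,) vector, so with the batch
# axis the input is the 2-D matrix the network presumably wants.
state = grid.flatten()
batched_flat = state[np.newaxis, ...]
print(batched_flat.shape)  # (1, 1600)
```

So maybe I just need `state = map.copy().flatten()` in my loop? Not sure if that's the intended usage.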
If you're feeling crazy, here's the actual source.
I may have this all ass backwards, apologies if this is a silly question.
