Hey there! First of all: thank you so much for releasing this, documenting things, putting it on pypi, etc. etc., really appreciate it :)
I've been trying to get a fun "search and rescue" example working, where a drone with a search radius explores a map until it finds the objective. Right now I'm having trouble formatting the state input properly, assuming I've understood the rest. It seems like I should keep calling DQNAgent.learn over and over until I'm satisfied? I was kind of confused by the DQNA example with all the socketIO stuff and wasn't sure how it was driving the learning.
```python
# some pseudo code
import numpy as np

height = width = 40
map = np.zeros((height, width))
ACTIONS = ('up', 'down', 'left', 'right')
agent = DQNAgent(height * width, len(ACTIONS))

while True:
    state = map.copy()
    action = agent.get_action(state)
    drone.do_action(ACTIONS[action])
    # Get state after the action has changed it
    next_state = map.copy()
    reward = drone.get_current_reward()  # not sure what to set this to on the first "learn"
    agent.learn(state, action, reward, next_state)
```

This raises:

> Wrong number of dimensions: expected 2, got 3 with shape (1, 40, 40).
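For what it's worth, my guess is that the agent expects a flat feature vector rather than the 2-D grid (since I constructed it with `height * width` inputs). Here's a minimal numpy-only sketch of the shape handling I mean; `grid` stands in for my map, and no DQNAgent is involved:

```python
import numpy as np

height = width = 40
grid = np.zeros((height, width))

# Passing the raw 2-D grid with a batch axis prepended gives shape
# (1, 40, 40), which matches the "expected 2, got 3" error above.
batched_2d = grid[np.newaxis, ...]
print(batched_2d.shape)  # (1, 40, 40)

# Flattening first gives a (height * width,) vector, so with the batch
# axis the input is the 2-D matrix the network presumably wants.
state = grid.flatten()
batched_flat = state[np.newaxis, ...]
print(batched_flat.shape)  # (1, 1600)
```

So maybe I just need `state = map.copy().flatten()` in my loop? Not sure if that's the intended usage.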
If you're feeling crazy, here's the actual source.
I may have this all ass backwards, apologies if this is a silly question.
