Some attempt towards making sentience
This project implements an advanced reinforcement learning agent that solves the CartPole environment using a unique combination of deep Q-learning and emotional state adaptation. The agent incorporates metacognitive feedback and emotional influences to enhance its learning process.
- Emotional Learning System: Dynamic emotional state adjustment based on rewards
- Curiosity
- Fear
- Reward-seeking behavior
- Advanced Neural Network Architecture:
- Multi-layer perceptron with dropout layers
- Experience replay buffer
- Target network for stable learning
- Metacognitive Feedback Loop: Real-time performance monitoring and adaptation
- Visual Rendering: Real-time visualization of the CartPole environment
python class ImprovedNN(nn.Module): def init(self, input_size=4, hidden_size=128, output_size=2): super(ImprovedNN, self).init() self.network = nn.Sequential( nn.Linear(input_size, hidden_size), nn.ReLU(), nn.Dropout(0.1), nn.Linear(hidden_size, hidden_size), nn.ReLU(), nn.Dropout(0.1), nn.Linear(hidden_size, output_size) )
-
Experience Replay Buffer
- Stores and samples past experiences
- Enables batch learning
- Capacity: 10,000 experiences
-
Emotional State System
- Dynamic adjustment based on performance
- Influences exploration vs exploitation
- Affects reward scaling
-
Training Parameters
- Learning rate: 0.0005
- Batch size: 64
- Epsilon decay: 0.995
- Target network update frequency: 10 episodes
- Achieves high scores (475+) in CartPole environment
- Stable learning through emotional adaptation
- Quick convergence to optimal policy
- Python 3.x
- PyTorch
- OpenAI Gym
- Pygame (for visualization)
- NumPy
-
Initialization:
- Sets up the CartPole environment
- Initializes neural networks and emotional state
- Prepares experience replay buffer
-
Training Loop:
- Collects experiences through environment interaction
- Updates emotional state based on rewards
- Trains neural network using experience replay
- Updates target network periodically
-
Emotional Adaptation:
def update_emotional_state(reward): if reward > 0: emotional_state['reward_seeking'] += 0.1 emotional_state['fear'] = max(0, emotional_state['fear'] - 0.05) emotional_state['curiosity'] = min(1.0, emotional_state['curiosity'] + 0.02)
Feel free to submit issues and enhancement requests!
- OpenAI Gym for the CartPole environment
- PyTorch team for the deep learning framework