Pong Agent with Reinforcement Learning

This project implements an AI agent that learns to play a version of Pong, built with PyTorch and PyGame. The agent controls the left paddle and uses policy-gradient reinforcement learning to maximize its score over training epochs by interacting with the environment.

Features

Reinforcement Learning Framework:
- The agent uses a policy gradient-based reinforcement learning approach.
- Actions are sampled from a probability distribution generated by the neural network.
Reward Mechanism:
- +100 points for scoring a goal.
- -500 points for missing the ball.
- +50 points for hitting the ball.
- A small reward for moving toward the ball and a small penalty for moving away from it, to encourage good positioning.
Training Process:
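- Rewards are accumulated over each game and discounted, so that every action is credited with the rewards that follow it, with nearer rewards weighted more heavily than distant ones.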
- The loss function uses the log-probabilities of the agent's actions, scaled by the discounted reward, to optimize the policy.
Neural Network Design:
- Input: The state of the game (paddle position, ball position, and ball velocity).
- Output: A probability distribution over possible actions (up, down, or no action).
- Architecture: Fully connected layers with ReLU activations and a Softmax output layer.
PyGame Visualization:
- Real-time display of the Pong game, including paddle and ball movements.
- Scores for both the agent (left paddle) and the opponent (right paddle) are displayed.

How It Works

The game starts, and the agent observes the game state: paddle position, ball position, and ball velocity.
The agent outputs a probability for each action: moving the paddle up, moving it down, or taking no action.
An action is sampled from the probability distribution, and the paddle is updated accordingly.
Rewards are calculated based on the agent's actions and the game outcome.
At the end of each game, the agent uses the accumulated rewards to optimize its policy.
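The discounting step at the end of each game can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the function names `discounted_returns` and `normalize`, the default `gamma`, and the example episode are assumptions. Normalizing the returns is a common variance-reduction trick that the project may or may not use.

```python
from typing import List


def discounted_returns(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Return G_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ... for each step t."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def normalize(values: List[float], eps: float = 1e-8) -> List[float]:
    """Rescale to zero mean and unit variance (optional; an assumption here)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / (var ** 0.5 + eps) for v in values]


if __name__ == "__main__":
    # Hypothetical three-step episode: nothing, a hit (+50), then a goal (+100).
    print(discounted_returns([0.0, 50.0, 100.0], gamma=0.5))
    # prints [50.0, 100.0, 100.0]
```

Each entry is the reward at that step plus the discounted sum of everything after it, so an action taken just before a goal receives more credit than one taken many frames earlier.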

Neural Network

The agent is implemented as a neural network with the following architecture:

Input Layer: 5 features (paddle position, ball position, and ball velocity).
Hidden Layers: Two fully connected layers with ReLU activations.
Output Layer: A probability distribution over 3 actions (up, down, or no action) using a Softmax function.
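The architecture above could look roughly like the following in PyTorch. This is a sketch under assumptions rather than the project's implementation: the class name `PolicyNetwork`, the helper names, the hidden width of 64, and the ordering of the 5 input features are all hypothetical. The loss shown is the plain policy-gradient (REINFORCE-style) form that matches the description of log-probabilities scaled by discounted reward.

```python
import torch
import torch.nn as nn
from torch.distributions import Categorical


class PolicyNetwork(nn.Module):
    """5 inputs -> two hidden ReLU layers -> softmax over 3 actions."""

    def __init__(self, state_dim: int = 5, hidden_dim: int = 64, n_actions: int = 3):
        super().__init__()
        # hidden_dim=64 is an assumed value; the README does not state the width.
        self.layers = nn.Sequential(
            nn.Linear(state_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, n_actions),
            nn.Softmax(dim=-1),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.layers(state)


def select_action(policy: PolicyNetwork, state: torch.Tensor):
    """Sample an action index and keep its log-probability for the update."""
    dist = Categorical(probs=policy(state))
    action = dist.sample()
    return action.item(), dist.log_prob(action)


def policy_gradient_loss(log_probs, returns: torch.Tensor) -> torch.Tensor:
    """Negative sum of log-probabilities weighted by discounted returns."""
    return -(torch.stack(log_probs) * returns).sum()


if __name__ == "__main__":
    policy = PolicyNetwork()
    # Assumed feature order: paddle y, ball x, ball y, ball vx, ball vy.
    state = torch.tensor([0.5, 0.4, 0.6, -0.01, 0.02])
    action, log_prob = select_action(policy, state)
    loss = policy_gradient_loss([log_prob], torch.tensor([1.0]))
    loss.backward()  # gradients are now available for an optimizer step
```

Sampling from the distribution, instead of always taking the most probable action, is what lets the agent explore during training.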

Reward System

The reward system incentivizes the agent to:

Score points by returning the ball effectively.
Position itself optimally near the ball to increase its chances of returning it.
Avoid the penalties incurred by missing the ball or moving away from it.
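As a rough illustration, the per-step reward could be computed as below. The goal, hit, and miss values come from the Features list; everything else is an assumption: the function name `step_reward`, its arguments, and the shaping magnitude of 1.0 (the README only describes the positioning reward as "small").

```python
GOAL_REWARD = 100.0    # from the Features list
HIT_REWARD = 50.0      # from the Features list
MISS_PENALTY = -500.0  # from the Features list
SHAPING_REWARD = 1.0   # assumed magnitude; the README only says "small"


def step_reward(scored: bool, hit: bool, missed: bool,
                prev_distance: float, distance: float) -> float:
    """Combine event rewards with a small positioning term.

    prev_distance and distance are the vertical paddle-to-ball gaps
    before and after the agent's move (hypothetical inputs).
    """
    reward = 0.0
    if scored:
        reward += GOAL_REWARD
    if hit:
        reward += HIT_REWARD
    if missed:
        reward += MISS_PENALTY
    # Reward closing the gap to the ball, penalize widening it.
    if distance < prev_distance:
        reward += SHAPING_REWARD
    elif distance > prev_distance:
        reward -= SHAPING_REWARD
    return reward


if __name__ == "__main__":
    print(step_reward(False, True, False, prev_distance=10.0, distance=5.0))
    # prints 51.0
```

Because a miss costs five times what a goal earns, an agent trained on these values is pushed toward defending first and scoring second.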

Usage

Install Dependencies:
- PyTorch: pip install torch
- PyGame: pip install pygame
Run the Code:
- Execute the Python script to start training and visualizing the Pong game.
Adjust Parameters:
- Modify hyperparameters like learning rate, discount factor, and paddle speed to experiment with different training behaviors.
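One way to keep these hyperparameters in a single place is a small config object. The sketch below is hypothetical: the class name `TrainingConfig`, the field names, and every default value are placeholders, not the values used in the script.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TrainingConfig:
    learning_rate: float = 1e-3    # placeholder value
    discount_factor: float = 0.99  # placeholder value
    paddle_speed: int = 5          # placeholder value, pixels per frame


baseline = TrainingConfig()
# Derive a variant for an experiment without mutating the baseline.
fast_paddle = replace(baseline, paddle_speed=10)

if __name__ == "__main__":
    print(baseline)
    print(fast_paddle)
```

Keeping each experiment as its own immutable config makes it easier to compare training runs that differ in only one setting.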

Visualization

The game interface includes:

A dynamic Pong game environment with moving paddles and a bouncing ball.
Real-time updates of scores for both the AI agent and the opponent.

Future Enhancements

Improve the reward function for more sophisticated strategies.
Train the agent using more advanced reinforcement learning algorithms like DDPG or PPO.
Add multiplayer support or implement a more competitive opponent.
