Inverted-Pendulum Project

What is an inverted pendulum?

An inverted pendulum is a dynamic system whose center of mass sits above its pivot point, generally constrained to one rotational degree of freedom. While a hanging pendulum is stable, an inverted pendulum is unstable and requires active balancing. Generally, this is done by applying torque to the pendulum directly, or by shifting the base longitudinally.
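
To make the instability concrete, here is a minimal sketch under simplifying assumptions that are not taken from this project: a point mass on a massless rod, a fixed pivot, zero initial angular velocity, and small angles. Under those assumptions the linearized angle grows as theta0 * cosh(sqrt(g / l) * t). The function name and parameters are illustrative only.

```python
import math

def time_to_reach(theta0_deg: float, theta1_deg: float, length_m: float, g: float = 9.81) -> float:
    """Time for a released inverted pendulum to tip from theta0 to theta1.

    Assumes a point mass on a massless rod, a fixed pivot, zero initial
    angular velocity, and the small-angle linearization theta'' = (g / l) * theta.
    """
    omega = math.sqrt(g / length_m)  # growth rate of the unstable mode, 1/s
    # theta(t) = theta0 * cosh(omega * t)  =>  t = acosh(theta1 / theta0) / omega
    return math.acosh(theta1_deg / theta0_deg) / omega

# A shorter rod tips over faster: the time scales with sqrt(length).
t_short = time_to_reach(1.0, 10.0, length_m=0.25)
t_long = time_to_reach(1.0, 10.0, length_m=1.0)
print(t_short, t_long)
```

In this simplified model the rod length sets how quickly the controller must react, which is one reason a simulation with different physical parameters can feel slower or faster than real hardware.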

History of Project

Final TD3 Performance

TD3.Inverted.Pendulum.v2.0.1.-.Carwyn.Collinsworth.720p.h264.youtube.1.mp4

Postsite / Retrospective Analysis

Reinforcement learning is INCREDIBLY CHALLENGING to transfer from simulation to reality - especially if you don't know the dynamics of the environment.

The biggest difficulty was matching my simulation to the environment:

- Most inverted pendulum simulations (such as CartPole-v1 from gymnasium) use force as the input quantity. My hardware used a stepper motor, with discrete choices for step sizes and timing.
- The stepper motor also constrained me to a maximum speed.
- The default parameters of the gymnasium environment do not seem to match the true dynamics: my pendulum appeared to fall much faster than the simulated rod.

Traditional methods such as PID control loops are well known for a reason: they work and are simple to set up. The only annoyance is tuning.
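
For comparison, a PID loop of the kind mentioned above can be sketched in a few lines. This is a generic textbook form, not the controller used in this project; the gains, the `to_step_rate` helper, and the maximum step rate are hypothetical placeholders that only illustrate how a continuous command might be clamped to a stepper motor's speed limit.

```python
from dataclasses import dataclass

@dataclass
class PID:
    """Textbook PID controller; gains are placeholders that need tuning."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def update(self, error: float, dt: float) -> float:
        # Accumulate the integral term and estimate the derivative by finite difference.
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def to_step_rate(command: float, max_rate: float) -> int:
    """Hypothetical helper: clamp a command to the stepper's maximum speed
    and round it to a whole number of steps per second."""
    return int(round(max(-max_rate, min(max_rate, command))))

pid = PID(kp=2.0, ki=0.0, kd=0.0)        # assumed gains, for illustration only
command = pid.update(error=0.5, dt=0.02)  # error = pole angle offset from upright
print(to_step_rate(command, max_rate=1000.0))
```

The clamp makes the speed limit explicit: whatever the controller asks for, the motor can only deliver up to its maximum rate.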

Presite Updates

The DQN model balances the pendulum, but training does not converge from a hanging (stable) initialization within a reasonable amount of time and resources.

TD3 trained to convergence on episodes of 1000 steps (0.02 s per step, so 20 s per episode). Video below. Note: the test includes perturbations, which explain some unexpected movements. These represent external forces acting on the pole, or noise in the system.

TD3_best_policy.mp4

Reward function that worked well for TD3:

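import math

# Base term in [0, 2]; it peaks when the pole is upright (theta = 0 rad).
reward = math.cos(theta % (2*math.pi)) + 1
# Bonus when the pole is within +/- pi/24 rad (7.5 degrees) of upright.
if theta > -1 * math.pi/24 and theta < math.pi/24:
    reward += 10
if reward < 0: reward = 0  # safety clamp; cos + 1 is never negative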

Note the lack of an explicit x-position component, which explains the full use of horizontal space by the agent.
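
As a minimal sketch, the reward above can be wrapped in a function so it is easy to probe at a few angles. The function name `upright_reward` is hypothetical, and the angle wrapping into [-pi, pi) is an assumption added here so that the bonus also applies after full rotations; the snippet above applies the bonus to the raw `theta`.

```python
import math

def upright_reward(theta: float) -> float:
    """Hypothetical wrapper around the reward above.

    theta is the pole angle in radians, with 0 meaning upright.
    """
    # Assumption (not in the original snippet): wrap theta into [-pi, pi)
    # so that angles such as 2*pi are treated the same as 0.
    wrapped = (theta + math.pi) % (2 * math.pi) - math.pi
    reward = math.cos(wrapped) + 1  # base term in [0, 2]
    if -math.pi / 24 < wrapped < math.pi / 24:
        reward += 10  # bonus near upright
    return reward

for angle in (0.0, math.pi / 12, math.pi):
    print(angle, upright_reward(angle))
```

The large step between the base term (at most 2) and the bonus (10) strongly favors staying inside the narrow upright band once the agent gets there.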

2024

After a long intermission, I have decided to revisit the project, this time with an agenda in mind. Since I will be traveling back to where the hardware is for one week only, I want to simulate a solutions-type job, in which an employee is dispatched to a site and is responsible for quickly getting a system up and running. I therefore plan to prepare for this trip in advance and see how far I can minimize the on-site time needed to get the pendulum balanced. I also want the goals to be specific and as quantifiable as possible, so they are the following:

Enable the pendulum to balance "continuously". For testing purposes, I define this as "achieved" when the following tests are completed successfully:
- 5 individual tests; all 5 must pass on the first try with a unique codebase.
- Sample an initial pendulum angle within [-10, 10] degrees of vertical.
- Sample an initial longitudinal position within the middle 30% of the total range.
- Set up the initial conditions by hand, and release the pole as soon as movement is perceived after starting the program.
- The pole must balance for 60 seconds for "success" to be achieved.
The same algorithm should enable the pole to swing up from a stationary position. I define this as "achieved" when the following test is completed successfully:
- Begin with the pendulum hanging stationary at its lowest point (180 degrees from vertically upward).
- Execute the program; without any contact from the tester, the pendulum should reach a balanced state (varying <10 degrees from vertical over at least 10 seconds) within 30 seconds.
Just for funsies, a separate algorithm should enable the pendulum to make full rotations. I define this as "achieved" when the following test is completed successfully:
- Starting with the rod stationary in the downward position, the rod should make >=10 full rotations in under 45 seconds.
Reach goal: if I achieve this, consider me a god (lowercase g). Modify the system somehow (make it vision-based, perhaps?) to enable a double pendulum to balance. This will involve hardware modifications to add another joint.
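
The acceptance criteria above could be checked automatically from a logged angle trace. The following is a rough sketch, not part of the repository: the function names, the 0.02 s sample period, and the normalized track length are assumptions, and the swing-up check reads "within 30 seconds" as "the 10-second balanced window must begin within 30 seconds".

```python
import random

DT = 0.02  # assumed sample period in seconds

def sample_initial_conditions(track_length: float, rng: random.Random):
    """Draw a start angle in [-10, 10] degrees and a position in the middle 30% of the track."""
    angle_deg = rng.uniform(-10.0, 10.0)
    x = rng.uniform(0.35 * track_length, 0.65 * track_length)
    return angle_deg, x

def swing_up_achieved(angles_deg, dt=DT, tol_deg=10.0, hold_s=10.0, deadline_s=30.0) -> bool:
    """True if the angle stays within tol_deg of vertical for hold_s seconds,
    with that window starting no later than deadline_s (an interpretation)."""
    needed = round(hold_s / dt)  # number of consecutive in-tolerance samples required
    run_start = None
    for i, angle in enumerate(angles_deg):
        if abs(angle) < tol_deg:
            if run_start is None:
                run_start = i
            if i - run_start + 1 >= needed:
                return run_start * dt <= deadline_s
        else:
            run_start = None  # tolerance violated; restart the window
    return False

rng = random.Random(0)
print(sample_initial_conditions(1.0, rng))
# Synthetic trace: hanging for 10 s, then within 2 degrees of upright for 10 s.
print(swing_up_achieved([180.0] * 500 + [2.0] * 500))
```

The same windowed check, with `hold_s=60.0` and no deadline concern, would cover the 60-second balancing test.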

2018

In the naive implementation (2018), a deterministic model was created as a decision tree, with a few possible movements chosen by binning the encoder angle. The hardware was also constructed during this year, and the results were presented in an independent research project class.

Unfortunately, the feedback loop was not responsive enough to maintain a balanced state. However, the proof of concept was there.
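
The binning approach described above can be sketched as a lookup from angle range to a fixed move. The bin edges, move sizes, and sign convention below are hypothetical and only illustrate the idea; they are not the values used in 2018.

```python
import bisect

# Hypothetical bin edges, in degrees from vertical (positive = leaning right).
BIN_EDGES_DEG = [-10.0, -2.0, 2.0, 10.0]
# One fixed move per bin, in motor steps (positive = move the base right).
# There is one more move than there are edges.
MOVES = [-8, -3, 0, 3, 8]

def choose_move(angle_deg: float) -> int:
    """Pick the move for the bin that the encoder angle falls into."""
    return MOVES[bisect.bisect_right(BIN_EDGES_DEG, angle_deg)]

for angle in (-15.0, -5.0, 0.0, 5.0, 15.0):
    print(angle, choose_move(angle))
```

A controller like this is fully deterministic and easy to reason about, but its response is coarse: every angle inside a bin produces the same move.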

During this phase, a few design iterations were explored. Testing was performed on both a Raspberry Pi and an Arduino Mega. Moments of significant garbage collection during testing justified use of the Arduino. Furthermore, the initial hardware design used a drawer slide and a servo, which was not an ideal combination: the servo was hard to control, and the significant mass of the drawer slide left the system with minimal ability to impart movement to the pendulum.

y2mate.so.-.Inverted.Pendulum.Run-dlMxPfKm4ik-360p-1718676656.mp4

Linear.Inverted.Pendulum.mp4

These can also be viewed on my YouTube channel.
