91 commits
a6cb33a
Code organization
farari7 Nov 16, 2022
7dd2c5b
Code Changes after first Discussion
farari7 Nov 21, 2022
10dd270
Implement Iteration Loop for Learning Estimation
farari7 Nov 21, 2022
5704da8
Better comments for the code
farari7 Nov 22, 2022
a41ff81
Error Correction
farari7 Nov 22, 2022
7fb91c6
learning plot for different strategies added
sahar-jahani Nov 23, 2022
a5dbb45
Error Correction
farari7 Nov 23, 2022
516e4a1
merge branches
farari7 Nov 23, 2022
3b0ed0b
Final Results
farari7 Nov 23, 2022
4637b8a
Final Results
farari7 Nov 23, 2022
79573a9
fixed a mistake
sahar-jahani Nov 23, 2022
3f12b97
Update environmentModel.py
EdwardPlumb Nov 23, 2022
e907664
Update learningAgents.py
EdwardPlumb Nov 23, 2022
e532cc4
Update LearningAgent_Simulations.ipynb
EdwardPlumb Nov 24, 2022
af20112
Update environmentModel.py
EdwardPlumb Nov 24, 2022
1a1ba8a
number of plots corrected
sahar-jahani Nov 24, 2022
7532734
Update LearningAgent_Simulations.ipynb
EdwardPlumb Nov 24, 2022
c243e8a
Update environmentModel.py
KaterinaPapadaki Nov 24, 2022
b1d7d70
Update environmentModel.py
EdwardPlumb Nov 24, 2022
669697b
returnsComputation updated
sahar-jahani Nov 25, 2022
c5549bf
Update environmentModel.py
EdwardPlumb Nov 25, 2022
ef57777
Update environmentModel.py
EdwardPlumb Nov 25, 2022
05fada7
Update learningAgents.py
EdwardPlumb Nov 25, 2022
a4eeee7
Merge branch 'temp' into learningAgents
sahar-jahani Nov 25, 2022
1634a4e
Update Results.ipynb
EdwardPlumb Nov 25, 2022
80fff8a
Update learningAgents.py
EdwardPlumb Nov 25, 2022
2179bbb
Solver Class
farari7 Nov 30, 2022
f3af4ac
fixed some syntax errors to be able to run the code
sahar-jahani Nov 30, 2022
97fe10f
mixed adversary
sahar-jahani Dec 13, 2022
2b948c0
error in opening the file
sahar-jahani Dec 17, 2022
a94039e
saving the nns
sahar-jahani Dec 17, 2022
6482cb5
Update environmentModel.py
EdwardPlumb Dec 18, 2022
ad7c319
Update learningAgents.py
EdwardPlumb Dec 18, 2022
1cf7382
Update environmentModel.py
EdwardPlumb Dec 18, 2022
228936f
Update environmentModel.py
EdwardPlumb Dec 18, 2022
2df1fed
Update learningAgents.py
EdwardPlumb Dec 18, 2022
93bde44
stage added in the state
sahar-jahani Dec 19, 2022
5772ed2
Update learningAgents.py
EdwardPlumb Dec 21, 2022
8d203e7
adversary history added to state
sahar-jahani Dec 21, 2022
f80f715
Merge branch 'learningAgents' of https://github.com/stengel/EquiLearn…
sahar-jahani Dec 21, 2022
ade17f0
normalized states
sahar-jahani Dec 22, 2022
fa754f4
check trained network is added
sahar-jahani Dec 23, 2022
965f8d8
Small changes + testing
EdwardPlumb Jan 8, 2023
f89b230
new test folder
sahar-jahani Jan 18, 2023
093b4d8
Create learningAgentsBase.py
sahar-jahani Jan 20, 2023
658363f
added actor critic
sahar-jahani Feb 1, 2023
58b044c
mainGame is added. writing PGM
sahar-jahani Mar 31, 2023
0f0d13f
delete the files that are not needed
sahar-jahani Apr 11, 2023
a78e26e
end effect can be seen in base
sahar-jahani Apr 24, 2023
80bc144
Template file can be used to run
sahar-jahani Apr 24, 2023
e7f3ab5
working version
sahar-jahani May 3, 2023
c0271be
population game is debugged
sahar-jahani May 15, 2023
cc99a51
global variables class added. small errors fixed
sahar-jahani May 15, 2023
3c04c2e
Update globals.py
sahar-jahani May 15, 2023
9f3de55
small changes
sahar-jahani May 16, 2023
4442f18
actionStep bug fixed
sahar-jahani May 17, 2023
114aeea
adding multiprocessing. not complete yet
sahar-jahani May 24, 2023
c6f27d1
mp not working yet.
sahar-jahani May 30, 2023
b3d66fa
multi processing is working
sahar-jahani May 31, 2023
07f2819
reduced the memory and some small changes
sahar-jahani Jun 1, 2023
7e72093
delete cache
sahar-jahani Jun 1, 2023
efc6b75
txs auto checkin
sahar-jahani Jun 19, 2023
40e4b32
replay buffer + multiprocessing
sahar-jahani Jul 10, 2023
f63bc76
txs auto checkin
sahar-jahani Jul 12, 2023
7ec39a8
txs auto checkin
sahar-jahani Jul 22, 2023
afe2cef
baseline3 is added
sahar-jahani Aug 1, 2023
a07d841
txs auto checkin
sahar-jahani Aug 1, 2023
61f2490
baseline3 gamma is set NOW
sahar-jahani Aug 2, 2023
4a2806f
small changes
sahar-jahani Aug 2, 2023
dc3dbd2
short onehot encoding failed
sahar-jahani Aug 8, 2023
db1c9ca
error with seed
sahar-jahani Aug 9, 2023
f7b5323
fixed sum errors
sahar-jahani Aug 9, 2023
f11d510
continuous model with multiprocessing added
sahar-jahani Aug 21, 2023
5f4aece
CoSAC vs CoPPO works
sahar-jahani Aug 29, 2023
a0eaeae
ready to run the population_game
sahar-jahani Sep 8, 2023
53ba4ac
giving No of processes from arguments
sahar-jahani Sep 15, 2023
03bebc1
txs auto checkin
sahar-jahani Sep 16, 2023
f0282b8
txs auto checkin
sahar-jahani Sep 18, 2023
8d7b741
txs auto checkin
sahar-jahani Sep 18, 2023
e7bc2c7
txs auto checkin
sahar-jahani Sep 18, 2023
b81c268
running checkpoint
sahar-jahani Sep 19, 2023
a075710
Oct2 checkpoint added
sahar-jahani Oct 10, 2023
118ea46
old files that I forgot to upload
sahar-jahani Oct 16, 2023
274d9ea
new structure and changes to test different params
sahar-jahani Nov 2, 2023
9614f6d
for tuning lr and memory the src files are different
sahar-jahani Nov 8, 2023
ba2e2f1
double oracle with new structure
sahar-jahani Nov 28, 2023
c205444
source files for extend_game are added
sahar-jahani Mar 1, 2024
987f686
new changes
sahar-jahani Mar 21, 2024
1835fe1
Latest Double Oracle code
sahar-jahani Apr 10, 2024
568c7a5
equilibria structure changed, the changes should be applied to DO.py
sahar-jahani Apr 18, 2024
ad36d51
equi changes applied to DO.py
sahar-jahani Apr 18, 2024
42 changes: 42 additions & 0 deletions .gitignore
@@ -0,0 +1,42 @@
__pycache__/*
*.aux
*.nav
*.snm
*.toc
*.blg
*.bbl
*.gif

*.out
*.log
*.sync*

*/*.aux
*/*.nav
*/*.snm
*/*.toc
*/*.blg
*/*.bbl
*/*.gif

*/*.out
*/*.log
*/*.sync*

*/*/*.aux
*/*/*.nav
*/*/*.snm
*/*/*.toc
*/*/*.blg
*/*/*.bbl
*/*/*.gif

*/*/*.out
*/*/*.log
*/*/*.sync*


*.pyc
tex/PGM.pdf
tex/PGM.pdf
tex/PGM.pdf
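A note on the ignore rules above: in `.gitignore`, a pattern containing no slash already matches at every directory depth, so the `*/` and `*/*/` variants (and the triplicated `tex/PGM.pdf`) are redundant. A roughly equivalent collapsed form — a sketch, not part of this PR, and slightly broader because `__pycache__/` then matches at any depth rather than only the top level — would be:

```gitignore
__pycache__/
*.aux
*.nav
*.snm
*.toc
*.blg
*.bbl
*.gif
*.out
*.log
*.sync*
*.pyc
tex/PGM.pdf
```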
162 changes: 162 additions & 0 deletions DoubleOracle/DO.py
@@ -0,0 +1,162 @@

import numpy as np
from stable_baselines3 import SAC, PPO
import time
from src.environments import ConPricingGame
import src.globals as gl
import src.classes as cl
import os


def initial_matrix(env_class, random=False):
    if not random:
        strt1 = cl.Strategy(
            cl.StrategyType.static, model_or_func=cl.myopic, name="myopic")
        strt2 = cl.Strategy(
            cl.StrategyType.static, model_or_func=cl.const, name="const", first_price=132)
        strt3 = cl.Strategy(
            cl.StrategyType.static, model_or_func=cl.guess, name="guess", first_price=132)
        # strt4 = cl.Strategy(
        #     cl.StrategyType.static, model_or_func=cl.spe, name="spe")
        init_low = [strt1, strt2, strt3]
        init_high = [strt1, strt2, strt3]
    else:
        model_name = f"rndstart_{job_name}"
        log_dir = f"{gl.LOG_DIR}/{model_name}"
        model_dir = f"{gl.MODELS_DIR}/{model_name}"
        if not os.path.exists(f"{model_dir}.zip"):
            train_env = env_class(tuple_costs=None, adversary_mixed_strategy=None, memory=12)
            model = SAC('MlpPolicy', train_env,
                        verbose=0, tensorboard_log=log_dir, gamma=gl.GAMMA, target_entropy=0)
            model.save(model_dir)

        strt_rnd = cl.Strategy(strategy_type=cl.StrategyType.sb3_model,
                               model_or_func=SAC, name=model_name, action_step=None, memory=12)

        init_low = [strt_rnd]
        init_high = [strt_rnd]

    low_strts, high_strts = db.get_list_of_added_strategies()
    return cl.BimatrixGame(
        low_cost_strategies=init_low+low_strts, high_cost_strategies=init_high+high_strts, env_class=env_class)


if __name__ == "__main__":

    gl.initialize()

    env_class = ConPricingGame

    num_rounds = 3
    num_procs = 1
    start_random = True
    job_name = "test"

    db_name = job_name+".db"
    db = cl.DataBase(db_name)
    cl.set_job_name(job_name)
    cl.create_directories()
    equilibria = []

    # params
    lrs = [0.0003, 0.00016]
    memories = [12, 18]
    algs = [SAC]

    start_game = initial_matrix(env_class=env_class, random=start_random)

    bimatrix_game = cl.load_latest_game(game_data_name=f"game_{job_name}", new_game=start_game)

    cl.prt("\n" + time.ctime(time.time())+"\n"+("-"*50)+"\n")

    all_equilibria = bimatrix_game.compute_equilibria()
    equilibria = all_equilibria[:min(len(all_equilibria), gl.NUM_TRACE_EQUILIBRIA)]
    game_size = bimatrix_game.size()

    for round in range(num_rounds):
        cl.prt(f"Round {round} of {num_rounds}")

        added_low = 0
        added_high = 0
        for equi in equilibria:
            new_equi_low = 0
            new_equi_high = 0

            cl.prt(
                f'equi: {str(equi.row_support)}, {str(equi.col_support)}\n payoffs= {equi.row_payoff:.2f}, {equi.col_payoff:.2f}')

            # train a low-cost agent against the high-cost equilibrium mix
            high_mixed_strat = cl.MixedStrategy(
                strategies_lst=bimatrix_game.high_strategies,
                probablities_lst=((equi.col_probs+([0]*added_high)) if added_high > 0 else equi.col_probs))

            for alg in algs:
                for lr in lrs:
                    for mem_i, memory in enumerate(memories):

                        print(f'training low-cost agents with alg={str(alg)}, lr={lr:.4f}, memory={memory}')

                        results = cl.train_processes(db=db, env_class=env_class, costs=[gl.LOW_COST, gl.HIGH_COST],
                                                     adv_mixed_strategy=high_mixed_strat, target_payoff=equi.row_payoff,
                                                     num_procs=num_procs, alg=alg, lr=lr, memory=memory)
                        for result in results:
                            acceptable, agent_payoffs, adv_payoffs, agent_strategy, expected_payoff, base_agent_name = result
                            if acceptable:
                                new_equi_low += 1
                                added_low += 1
                                bimatrix_game.low_strategies.append(agent_strategy)
                                bimatrix_game.add_low_cost_row(agent_payoffs, adv_payoffs)
                                cl.prt(
                                    f'low-cost player {agent_strategy.name} , payoff= {expected_payoff:.2f} added, base={base_agent_name} ,alg={str(alg)}, lr={lr:.4f}, memory={memory}')

            # train a high-cost agent against the low-cost equilibrium mix
            low_mixed_strat = cl.MixedStrategy(
                strategies_lst=bimatrix_game.low_strategies,
                probablities_lst=((equi.row_probs+([0]*added_low)) if added_low > 0 else equi.row_probs))

            for alg in algs:
                for lr in lrs:
                    for memory in memories:
                        print(f'training high-cost player with alg={str(alg)}, lr={lr:.4f}, memory={memory}')

                        results = cl.train_processes(db=db, env_class=env_class, costs=[gl.HIGH_COST, gl.LOW_COST],
                                                     adv_mixed_strategy=low_mixed_strat, target_payoff=equi.col_payoff,
                                                     num_procs=num_procs, alg=alg, lr=lr, memory=memory)
                        for result in results:
                            acceptable, agent_payoffs, adv_payoffs, agent_strategy, expected_payoff, base_agent_name = result
                            if acceptable:
                                new_equi_high += 1
                                added_high += 1
                                bimatrix_game.high_strategies.append(agent_strategy)
                                bimatrix_game.add_high_cost_col(adv_payoffs, agent_payoffs)

                                cl.prt(
                                    f'high-cost player {agent_strategy.name} , payoff= {expected_payoff:.2f} added, base={base_agent_name}, alg={str(alg)}, lr={lr:.4f}, memory={memory}')

            # high_mixed_strat was built before the new strategies were appended to
            # bimatrix_game.high_strategies, so pad its probability vector to match
            # (otherwise str(high_mixed_strat) below fails on mismatched lengths)
            if new_equi_high > 0:
                high_mixed_strat.strategy_probs += [0]*new_equi_high

            db.insert_new_equi(game_size=game_size, low_strategy_txt=str(low_mixed_strat), high_strategy_txt=str(
                high_mixed_strat), low_payoff=equi.row_payoff, high_payoff=equi.col_payoff, low_new_num=new_equi_low, high_new_num=new_equi_high)

        if added_low == 0 and added_high == 0:
            # nothing was accepted this round: train for longer next round
            gl.N_EPISODES_BASE *= 1.1
            gl.N_EPISODES_LOAD *= 1.1
        else:
            all_equilibria = bimatrix_game.compute_equilibria()
            equilibria = all_equilibria[:min(len(all_equilibria), gl.NUM_TRACE_EQUILIBRIA)]
            game_size = bimatrix_game.size()

    all_equilibria = bimatrix_game.compute_equilibria()
    equilibria = all_equilibria[:min(len(all_equilibria), gl.NUM_TRACE_EQUILIBRIA)]
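The loop in `DO.py` follows the classic double-oracle pattern: solve the restricted game, compute best responses against the equilibrium mixes, and grow the strategy sets until no new best response is accepted. Stripped of the RL training and the repo's `src.classes` machinery (not shown in this diff), the control flow can be sketched on a toy zero-sum game. Everything here — `FULL`, `payoff`, and the uniform-mixing stand-in "solver" — is hypothetical illustration, not the repo's API:

```python
# Toy double-oracle loop (hypothetical illustration; FULL, payoff and the
# uniform stand-in solver are NOT part of the repo's classes).
FULL = [0, 1, 2, 3, 4]                 # the full strategy space: pick a number

def payoff(r, c):
    # zero-sum row payoff: the row player wins by playing (c + 1) mod 5
    return 1.0 if r == (c + 1) % 5 else 0.0

def solve_restricted(rows, cols):
    """Uniform mixing as a crude stand-in for an equilibrium solver."""
    return [1 / len(rows)] * len(rows), [1 / len(cols)] * len(cols)

def best_response_row(cols, pc):
    # exhaustive search here plays the role of the SAC/PPO training runs
    return max(FULL, key=lambda r: sum(p * payoff(r, c) for c, p in zip(cols, pc)))

def best_response_col(rows, pr):
    return max(FULL, key=lambda c: sum(p * -payoff(r, c) for r, p in zip(rows, pr)))

def double_oracle(max_rounds=20):
    rows, cols = [FULL[0]], [FULL[0]]  # restricted strategy sets, one strategy each
    for _ in range(max_rounds):
        pr, pc = solve_restricted(rows, cols)
        br_r = best_response_row(cols, pc)
        br_c = best_response_col(rows, pr)
        grew = False
        if br_r not in rows:
            rows.append(br_r); grew = True
        if br_c not in cols:
            cols.append(br_c); grew = True
        if not grew:                   # no new best responses: stop growing
            break
    return sorted(rows), sorted(cols)

print(double_oracle())                 # → ([0, 1], [0, 1])
```

The real `DO.py` replaces `solve_restricted` with `bimatrix_game.compute_equilibria()` and the exhaustive best responses with `cl.train_processes` training runs judged against a target payoff; the termination logic — stop expanding when neither side produces an acceptable new strategy — is the same.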
Binary file added DoubleOracle/NOV24.db
Binary file not shown.
Binary file added DoubleOracle/game_NOV24.pickle
Binary file not shown.