
[pull] master from QueensGambit:master #18

Open
pull[bot] wants to merge 538 commits into DrMeosch:master from
QueensGambit:master

Conversation


@pull pull bot commented Jul 8, 2020

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

QueensGambit and others added 19 commits May 5, 2021 23:14
* Thread which prints out "readyok" after a given amount of ms unless killed.
* This avoids running into timeouts of e.g. cutechess on multi-GPU systems when deserializing complex NN architectures.
* Removed unused variable "updateIntervalMS"
UCI-Option "Timeout_MS" (#99)
* Replaced constant TIME_OUT_IS_READY_MS by UCI option "Timeout_MS"

* Default value is 13000 ms; after this time "readyok" is sent to avoid running into a timeout
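The watchdog described above can be sketched in Python (a minimal sketch; the actual engine is C++, and the helper name is hypothetical):

```python
import threading

def start_isready_watchdog(timeout_ms=13000):
    # Replies "readyok" after timeout_ms unless cancelled first, so GUIs
    # such as cutechess do not time out while a complex NN deserializes.
    # The default mirrors the Timeout_MS UCI option (13000 ms).
    timer = threading.Timer(timeout_ms / 1000.0, lambda: print("readyok"))
    timer.daemon = True
    timer.start()
    return timer

# Once the network has finished loading, cancel the watchdog and reply directly:
# watchdog = start_isready_watchdog()
# ... deserialize network ...
# watchdog.cancel(); print("readyok")
```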

* update to engine version 0.9.2.post0
* replaced `inCheck` parameter for `is_terminal()` by `position.checkers()`
* removed variant = UCI::variant_from_name(Options["UCI_Variant"]); in
set()
* removed check_result() in FairyState
* fixed tablebase overwrite of NodeType
* fixes blunder as seen in
https://tcec-chess.com/#div=q43t&game=293&season=21
* added Unit-Test

* bumped version
avoid overwriting NodeData for tablebase nodes
delete inCheck from Node
* updated requirements.txt
changed "python-chess" to "chess"

* updated requirements.txt
specified version number

* updated requirements.txt
set back to old python-chess version

* added bottleneck_residual_block_v2()
added efficient_channel_attention_module()
added ic_layer
added hard_sigmoid

* added sandglass_block
* added get_se_layer

* added efficient_scaling.py

* updated efficient_scaling.py

* update efficient_scaling.py

* changed learning rate

* added global_pool=True
* fixed "eca_se" look-up
* update kernel size
* update train_cnn.ipynb

* added preact_resnet_se.py

* enabled raw_features for pre_act_resnet_se
* added bn layer for value head
* fixed train_cnn.ipynb loading

* implemented Risev3.3

* clean-up and comments
* added AUTHORS
This PR disables flipping the board for the racing kings variant, as it is not necessary and lowers performance.
Both the Python and C++ code are affected.

* added flip_board()

* use boolean flipBoard

* add flip_board() into board_to_planes()

* added comments and simplified flip_board()
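A minimal Python sketch of the idea behind flip_board() (the plane layout and variant name are assumptions, not the project's exact API):

```python
def flip_board(planes, uci_variant):
    # Mirror each input plane along the rank axis so positions are encoded
    # from the side to move's perspective. Racing kings is excluded because
    # flipping is unnecessary there and lowers performance.
    if uci_variant == "racingkings":
        return planes
    # assumed layout: planes[channel][rank][file]
    return [plane[::-1] for plane in planes]
```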

* fix load_pgn_dataset return values for analyze_train_data.ipynb
treat tablebases as terminals 

TimeControl "5+0.1"

Score of ClassicAra 0.9.3-Dev - TB - Terminal - 3 Threads vs ClassicAra
0.9.3-Dev -TB - Default - 3 Threads: 50 - 8 - 38 [0.719]
Elo difference: 163.0 +/- 55.9, LOS: 100.0 %, DrawRatio: 39.6 %

96 of 1000 games finished.
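The reported Elo difference can be checked against the raw score with the standard logistic model:

```python
import math

def elo_from_score(wins, losses, draws):
    # Elo difference implied by a match score S = (W + D/2) / N,
    # using the standard logistic model: diff = 400 * log10(S / (1 - S)).
    games = wins + losses + draws
    score = (wins + draws / 2) / games
    return 400 * math.log10(score / (1 - score))

print(round(elo_from_score(50, 8, 38)))  # 163, matching the reported +163.0 Elo
```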
* Adding printouts to improve readability of logs

* Added tests for all lichess variants

* For RL: Explicitly set `model_contender_dir` by rl_loop.py

* Update nn_index logic. Now it is possible to use --nn_update_idx again when starting a GPU. From now on, it only needs to be passed to the trainer GPU.

* Deleting multiprocessing import in rl_loop as it is not needed

* Support different input names of the currently supported variants

* Readded multiprocessing

* Keep logs of every training (not only the last one)

* Create logs dir after the old one has been moved. Change the point when logs get moved, so the renaming process contains the current nn-update-index

* Implemented support for chess960 and logic for other 960 methods.

* Note that RL only works with gluon atm

* Delete gameidx and gamePGNSelfplay if it exists

* Add Timeout_MS to rl_config.py

* Deleted duplicate action_to_uci method.

* Using "open in write mode" to delete the content of RL files in Selfplay.

* Separate function to handle uci_variant names. Fixed a bug where the variant was not set correctly for some variants.

Co-authored-by: maxalexger <saaQQMBFCY8Putb>
QueensGambit and others added 30 commits October 31, 2023 16:52
* Add initial version of train_cli.py and its dependencies

* Add alpha_vile model to train_cli_util.py

* Update train_cli and related files
Rename validate_train_results.py into validate_train_results_util.py
Update validate_train_results_util.py
Add validate_train_results.ipynb

* Fix model_config.py initialization
Add fill_train_config()

* Add export of configs

* Update documentation

* Update rl_loop.py and rl_training.py according to new train_cli.py
Update documentation
Add model_type and use_custom_architecture to train_config.py

* Put test "Chess960 Input Planes V3" into MODE_CHESS block
…ne arguments (#214)

The info-strings are parsed as help information, and all the cmd-line arguments used when running the script are exported as well.
Added Artworks section
Added description about the artworks.
Added white space
Fixed link to artworks section
* - added game phase detection file
- adjusted initial Dockerfile
- minor changes to convert_pgn_to_planes.ipynb and pgn_to_planes_converter.py

* - changed openspiel git

* - changed openspiel git

* fixed phase ids

* added dataset creation option for specific phases

* param changes in train_cnn.ipynb

* - fixed plys_to_end list to only include values for moves that really have been used
- added counter for games without positions for current phase

* - changes to train_cnn to make it compatible
- added analyse_game_phases.py to analyse game phase distribution and other information
- minor changes

* mcts phase integration working, some improvements missing

* - added phase_to_nets map to make sure the right net is used for each phase
- board->get_phase now expects the total amount of phases as an argument
- phaseCountMap is now immediately filled

* - added game phase vector to created datasets
- added sample weighting to losses pytorch training files
- load_pgn_dataset() now returns a dict
- added file for easily generating normalized cutechess-cli commands
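The dict-returning load_pgn_dataset() can be sketched as follows (the key names are assumptions, not the project's exact API):

```python
def load_pgn_dataset_as_dict(planes, policy, value, plys_to_end, phase_vector):
    # Returning a dict instead of a positional tuple lets callers pick the
    # arrays they need (e.g. the newly added phase vector) without breaking
    # when fields are added later.
    return {
        "x": planes,                   # board input planes
        "y_policy": policy,            # move distribution targets
        "y_value": value,              # game outcome targets
        "plys_to_end": plys_to_end,    # distance to the game's end per sample
        "phase_vector": phase_vector,  # game phase id per sample
    }
```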

* minor fixes for weighted training

* - fixes and improvements to prs.py from cutechess-cli
- added file to generate plots based on cutechess results

* - changes for continuing training from tar file (pytorch)

* - added python file for training (exported notebook)

* - added python file for executing cutechess shell commands

* - added the option to specify additional eval sets (unweighted) to pass through the trainer agent
- you can now pass a phase to load_pgn_dataset to load a non default dataset

* - minor changes

* - minor changes for debugging

* - bugfix in train_cnn.py for additional dataloaders

* - bugfix in to correctly determine train iterations
- added printing total positions in dataset when loading

* - minor changes in prs.py

* - minor changes for chess 960

* - reverted mode and version back to 2 and 3

* fixed bug when executing isready multiple times consecutively while setting networkLoaded back to false

* alternative bugfix attempt for linux

* - temporary fix for chess960 wrong training representation
- adjusted cutechess run file to support 960 matches

* - changes to incorporate 960 dataset analysis
- new and adjusted graphs in game_phase_detector.py (should be put into a separate file)
- new plots in create_cutechess_plots.py

* chess960 input representation fix (c++ engine files still unadjusted and assuming a wrong input representation)

* - added plot generating notebooks to git (/etc folder)
- moved game phase analysis code from game_phase_detector.py to own file (analyse_game_phase_definition.py)
- minor changes in train_cnn.py
- adjusted .gitignore

* - added support for naive movecount phases

* - minor path fix in dataset_loader.py

* undone temporary fix for broken chess960 input representation

* - added support for phases by movecount in c++ code (currently always assumes phases by movecount)
- set default value for UCI_Chess960 back to false
- minor fixes

* - minor plotting adjustments
- added colorblind palette

* - adjusted run_cutechess_experiments.py to be able to do experiments against stockfish

* - added documentation

* - minor assertion change in train_cnn.py

* - cleaned code and removed sections that are not needed anymore

* - changed underscore naming to camelCase naming in several cases

* - added UCI option Game_Phase_Definition with options "lichess" and "movecount" and corresponding searchsettings enum GamePhaseDefinition

* - added searchSettings to RawNetAgent to access selected gamePhaseDefinition

* - aligned train_cnn.ipynb with code inside train_cnn.py

* - cleaned cell outputs of main notebooks

* - further notebook output cleanings

* - removed files unnecessary for pull request and reverted several files back to initial state of fork

* - reverted .gitignore and Dockerfile to older state

* - .gitignore update to different previous state

* Update crazyara.cpp

Fix compile error

* Update board.cpp

Fix error: control reaches end of non-void function

* Add GamePhase get_phase to states

* Add GamePhase OpenSpielState::get_phase()

* Update get_data_loader() to load dict instead

---------

Co-authored-by: Felix Helfenstein <f.helfenstein@yahoo.de>
* Add command line script that allows parsing lichess puzzles

* Add recursive python path

* Added model loading in train_cli.py
Fixed model export
Set kernels for alpha-vile-large

* Update get_validation_data

* Added processes command-line param

* Check for additional_loaders not None
* Fix rawNetAgent and compile bugs for RL mode
Change ordering of netBatchesVector (threads go first now)
Use raw pointers in NeuralNetAPIUser now to resolve ownership problem
Remove create_new_net_batches
Add fill_nn_vectors
Add fill_single_nn_vector

* Make Nodes_Limit available in RL mode

* remove unused code
Add git pull command before building
Call model.merge_bn() if available
- Return 0 if no phases are enabled

Set Game_Phase_Definition default to "lichess"
Only use phase selection for nets.size() > 1
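The selection rule above can be sketched as (function and argument names are hypothetical):

```python
def select_net_index(nets_size, phase, phases_enabled):
    # Phase selection is only meaningful with more than one loaded net;
    # otherwise, or when no phase definition is enabled, net 0 is used.
    if not phases_enabled or nets_size <= 1:
        return 0
    return phase % nets_size
```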
* Fix runtime errors in rl_loop.py

* Add save_cur_phase(const StateObj *pos)

 and all needed changes for it

* Update Selfplay() constructor

and make settings const

* add const to get_num_phases()

and remove const from SearchLimits

* add save_cur_phase() in header

and fix constructor in header of Selfplay()

* add const to rlSettings

* Add missing ) in traindataexporter.cpp

* Add phaseVector export

* Add save_cur_phase(pos); to traindataexporter.cpp

* Update rl_training.py: get_validation_data()

* Update trainer_agent_pytorch.py

 check if delete path is a file
* Update TensorrtAPI to TensorRT 10
* delete retrieve_indices_by_name()
* add member SampleUniquePtr<IRuntime> runtime
* replace getBindingDimensions() by getTensorShape()
* replace setBindingDimensions() by setInputShape()
* add link_libraries(stdc++fs) to CMakeLists.txt
* add include_directories("$ENV{TENSORRT_PATH}/samples/") to
CMakeLists.txt

* Introduce BACKEND_TENSORRT_10 and BACKEND_TENSORRT_8 for backup
* Change exporter into vector<unique_ptr> exporters

(one exporter object for each phase)

* Add phase id specifier for export

and make sure that only the appropriate phase exports the sample

* Use max_samples_per_iteration() to end generation

* Add check_for_moe(model_dir)

* Add directories for MoE

* Launch the training procedure multiple times for MoE

* Add logging message

* Start adding special cases for MoE

* Added _move_all_files_wrapper() for cleaner code

* Minor comment update

* Fix compile errors

* Add select_nn_index() to RawNetAgent

* Fix condition

* Add TODO, remove unnecessary '/'

* Update compress_dataset

* Update fileNameExport in selfplay.cpp

* update get_current_model_tar_file to use model 0 for MoE

* Reset generatedSamples in go()

* add missing /

* Skip game export for 0 samples

* use phase0 in get_number_generated_files()

* fix export of phases, update get_current_model_tar_file()

* update get_current_model_tar_file()

* update prepare_data_for_training()

* update _move_all_files_wrapper()

* remove unneeded /

* Update get_current_model_tar_file() with phases

* Update planes_train_dir and planes_val_dir in rl_loop.py

* Simplify code and use reversed() for training

* Add missing / in compress_dataset

* Add _retrieve_end_idx(data) for MoE

* Implement _include_data_from_replay_memory_wrapper(), which handles MoE and non-MoE cases

* Implement staged learning v2.0, i.e. first train on full data and then each phase separately
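Staged learning v2.0 as described above reduces to a simple training schedule (a sketch; the helper name is hypothetical):

```python
def staged_learning_schedule(num_phases):
    # v2.0: first one pass over the full, unsplit data (phase None),
    # then fine-tune one expert per phase for the mixture of experts.
    return [None] + list(range(num_phases))
```

For three phases this yields `[None, 0, 1, 2]`: one full-data run followed by three phase-specific runs.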

* Make use_moe_staged_learning auto detect

* Skip "phaseNone" for counting phases

* Fix condition for "phase_idx is None"

* Skip "phaseNone" when loading models

* Fix suffix

* Create model_dir_archive for phaseNone

* Add load checkpoint logging info

* Set q_value_ratio to 0 for RL and add Exception for wdl is True conflict

* Use middle phase for validation in staged learning on full data
Change default value for phase_definition to "lichess"
Update comment about "phase"
Add --device-id in example call for sl-training
Add get_rise_v33_large_model()
* Replace mutex with spinlock

* Optimize spinlock

* Used uint_fast8_t in Spinlock
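The engine's spinlock is C++ (the commits mention a uint_fast8_t flag); the same try-until-acquired idea can be illustrated in Python:

```python
import threading

class SpinLock:
    # Busy-waits instead of sleeping in the kernel; this can beat a mutex
    # when critical sections are very short and contention is low.
    def __init__(self):
        self._flag = threading.Lock()

    def acquire(self):
        while not self._flag.acquire(blocking=False):
            pass  # spin until the flag becomes free

    def release(self):
        self._flag.release()
```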

* Add check for numPhases
* Start with parallel RL

Add option "Number Parallel Games"
Add ->get_local_batch_size()
Make NeuralNetAPIUser a member

* Separate the neural user from SearchThread

* Create run_inference() wrapper

Rename id to agentID

* Add selfplayFileMutex

Change some variables to pointers

* Fix some compilation problems

* Fix remaining compile errors

* Add run_selfplay_thread and gameThreads

* Put MCTSAgent and RawNetAgent into vectors

* Fix compile errors

* Add mutex, condition variable and batch counter

* Use batch variables

* Make use of NeuralNetAPIUser object for SearchThread

* Fix compile problems

* Add helper method handle_fwd_pass()

* Fill information for batchCounter and batchMutex

* Use shared_ptr for batchCounter and batchMutex

* Add numberOfGames / NUMBER_OF_PARALLEL_GAMES

* Add agentID to name identifier

* Use unique_ptr instead of raw objects

* Add lock for initialisation + update offsets

* Initialize timeManager

* Add debug message

* Try recursive_mutex

* Fix compile problem recursive_mutex

* Change to recursive_mutex

* Revert recursive mutex

and make condition variable an MCTSAgent member

* Move cout statement

* Avoid iteration

* Use CXX 20 and barriers

* Use ReusableBarrier as a replacement for barrier<>
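A ReusableBarrier avoids C++20's std::barrier (the commits later set the standard back to 17); a generation-counting sketch in Python:

```python
import threading

class ReusableBarrier:
    # Cyclic barrier: all threads block until the last one arrives,
    # then the barrier resets itself for the next round.
    def __init__(self, parties):
        self._parties = parties
        self._count = 0
        self._generation = 0
        self._cond = threading.Condition()

    def wait(self):
        with self._cond:
            gen = self._generation
            self._count += 1
            if self._count == self._parties:
                # last arrival: reset for reuse and wake everyone
                self._count = 0
                self._generation += 1
                self._cond.notify_all()
            else:
                while gen == self._generation:
                    self._cond.wait()
```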

* Clean up of unused variables

* Increase default batch size for RL

* Simplify expression in compute_offset()

* Add set_agent_id() for SearchThread and MCTSAgent

* Set CXX standard back to 17

* Add InferenceQueue, InferenceWorker, InferenceRequest, InferenceResult
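The queue/worker split can be sketched as a batching consumer (names and the sentinel protocol are assumptions; the real implementation is C++):

```python
import queue

def inference_worker(requests, net, max_batch_size):
    # Drains up to max_batch_size pending requests per iteration and runs
    # them through the net in a single batched forward pass. A None item
    # is the shutdown sentinel. Each request is (input, result_queue).
    while True:
        req = requests.get()
        if req is None:
            return
        batch = [req]
        while len(batch) < max_batch_size:
            try:
                nxt = requests.get_nowait()
            except queue.Empty:
                break
            if nxt is None:
                requests.put(None)  # re-post sentinel; stop after this batch
                break
            batch.append(nxt)
        outputs = net([inp for inp, _ in batch])  # one batched inference
        for (_, result_q), out in zip(batch, outputs):
            result_q.put(out)
```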

* Fix typo

* Update CMakeLists.txt

Add missing inference files

* Move template function to header (InferenceQueue)

* Move if (newNodes->size() == 0)

and use offset for input values

* Change memcpy argument

* Use right policy size

* Update maxBatchSize for worker

* Use fwd_pass_queue()

* Use correct clipping

* Simplify expression for handle_fwd_pass()

* Give each MCTSAgent its own nnUser

* Maybe fix inference bug

* Remove unused variables

* Fix game sample export

* Integrate risev3-large into train loop

* Update safeguard condition

* Update rl_config.py

* Update MCTSAgent::set_root_node_predictions()

* Add get_alpha_zero_model_small(), resnet-small

* Make RL loop fully parallel

* Fix compile errors

* Compressing individual files

* Adjust export for num_parallel_games > 1

* Replace get_local_batch_size() with get_main_batch_size()

* Change local batch size in UCI-options

* Update default value for MaxInitPly

* Replace searchSettings->get_local_batch_size()

with searchSettings->batchSize

* Replace searchSettings->get_local_batch_size()

with searchSettings->batchSize

* readd increment of gameIdx

* Update go command for selfplay

* Update SelfPlay::max_samples_per_iteration()

* Change one logging.info call to logging.debug()

* Update generate_random_nn.py

Use get_default_model()

* Update generate_random_nn.py

Remove mxnet code

Labels

⤵️ pull merge-conflict Resolve conflicts manually


8 participants