[pull] master from QueensGambit:master#18
Open
pull[bot] wants to merge 538 commits into DrMeosch:master from QueensGambit:master
4c16f38 to fd4fcf2
* Thread which prints out "readyok" after a given amount of ms unless killed.
* This is to avoid running into time-outs of e.g. cutechess on multi-GPU systems when deserializing complex NN architectures.
removed unused variable "updateIntervalMS"
UCI-Option "Timeout_MS" (#99)
* replaced constant TIME_OUT_IS_READY_MS by UCI option "Timeout_MS"
* default value is 13000; after this time "readyok" will be replied to avoid running into a time-out
* update to engine version 0.9.2.post0
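The time-out guard described above can be sketched as follows. This is an illustration in Python rather than the engine's C++ code: `ReadyOkGuard` and `handle_isready` are hypothetical names, and replying exactly once (either from the timer or after loading finishes) is an assumption about the intended behaviour. Only the reply string "readyok" and the 13000 ms default come from the commit messages.

```python
import threading
import time


class ReadyOkGuard:
    """Replies "readyok" once: when the timeout fires or when the guard is killed."""

    def __init__(self, timeout_ms, emit=print):
        self._emit = emit
        self._lock = threading.Lock()
        self._replied = False
        # threading.Timer runs the callback in its own thread after the delay.
        self._timer = threading.Timer(timeout_ms / 1000.0, self._reply_once)
        self._timer.daemon = True

    def start(self):
        self._timer.start()

    def _reply_once(self):
        with self._lock:
            if not self._replied:
                self._replied = True
                self._emit("readyok")

    def kill(self):
        """Stop the timer and reply now, unless the timeout has replied already."""
        self._timer.cancel()
        self._reply_once()


def handle_isready(load_network, timeout_ms=13000, emit=print):
    """Hypothetical isready handler: guard a slow network load with the timer."""
    guard = ReadyOkGuard(timeout_ms, emit)
    guard.start()
    load_network()  # may take longer than the GUI is willing to wait
    guard.kill()


if __name__ == "__main__":
    # Simulated slow load: the timer answers first, kill() then stays silent.
    handle_isready(lambda: time.sleep(0.2), timeout_ms=50)
```

The lock makes sure a GUI such as cutechess never sees two replies for one `isready`, even if the timer fires at the moment loading completes.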
* replaced `inCheck` parameter for `is_terminal()` by `position.checkers()`
* removed variant = UCI::variant_from_name(Options["UCI_Variant"]); in set()
* removed check_result() in FairyState
* fixed tablebase overwrite of NodeType
* fixes blunder as seen in https://tcec-chess.com/#div=q43t&game=293&season=21
* added Unit-Test
* bumped version
avoid overwriting NodeData for tablebase nodes
delete inCheck from Node
* updated requirements.txt
  changed "python-chess" to "chess"
* updated requirements.txt
  specified version number
* updated requirements.txt
  set back to old python-chess version
* added bottleneck_residual_block_v2()
  added efficient_channel_attention_moduel()
  added ic_layer
  added hard_sigmoid
* added sandglass_block
* added get_se_layer
* added efficient_scaling.py
* updated efficient_scaling.py
* update efficient_scaling.py
* changed learning rate
* added global_pool=True
* fixed "eca_se" look-up
* update kernel size
* update train_cnn.ipynb
* added preact_resnet_se.py
* enabled raw_features for pre_act_resnet_se
* added bn layer for value head
* fixed train_cnn.ipynb loading
* implemented Risev3.3
* clean-up and comments
* added AUTHORS
This PR disables flipping the board for the racing kings variant, as it is not necessary and lowers performance. Both the Python and the C++ code are affected.
* added flip_board()
* use boolean flipBoard
* add flip_board() into board_to_planes()
* added comments and simplified flip_board()
* fix load_pgn_dataset return values for analyze_train_data.ipynb
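A minimal sketch of the flipping decision, assuming a simplified board given as eight text rows (rank 8 first, uppercase for white, lowercase for black). The signature of `flip_board()` and the helper `orient_rows()` are hypothetical and only illustrate the idea. The real `board_to_planes()` works on board objects and produces input planes.

```python
def flip_board(black_to_move: bool, variant: str) -> bool:
    """Return True if the board should be mirrored to the side to move."""
    if variant == "racingkings":
        # Per the PR description, flipping is disabled for racing kings.
        return False
    return black_to_move


def orient_rows(rows, black_to_move, variant):
    """Hypothetical stand-in for the orientation step inside board_to_planes()."""
    if flip_board(black_to_move, variant):
        # Mirror the ranks and swap piece colours so the side to move
        # always appears in the same orientation.
        return [row.swapcase() for row in reversed(rows)]
    return list(rows)
```

With a single boolean decided up front, the same check can be reused wherever planes are built instead of testing the side to move in several places.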
treat tablebases as terminals
TimeControl "5+0.1"
Score of ClassicAra 0.9.3-Dev - TB - Terminal - 3 Threads vs ClassicAra 0.9.3-Dev -TB - Default - 3 Threads: 50 - 8 - 38 [0.719]
Elo difference: 163.0 +/- 55.9, LOS: 100.0 %, DrawRatio: 39.6 %
96 of 1000 games finished.
* Adding printouts to improve readability of logs
* Added tests for all lichess variants
* For RL: Explicitly set `model_contender_dir` by rl_loop.py
* Update nn_index logic. Now it is possible to use --nn_update_idx again when starting a gpu. From now on, it only needs to be passed to the trainer gpu.
* Deleting multiprocessing import in rl_loop as it is not needed
* Support different input names of the currently supported variants
* Readded multiprocessing
* Keep logs of every training (not only the last one)
* Create logs dir after the old one has been moved. Change the point when logs get moved, so the renaming process contains the current nn-update-index
* Implemented support for chess960 and logic for other 960 methods.
* Note that RL only works with gluon atm
* Delete gameidx and gamePGNSelfplay if it exists
* Add Timeout_MS to rl_config.py
* Deleted duplicate action_to_uci method.
* Using "open in writing mode" to delete content of RL files in Selfplay.
* Separate function to handle uci_variant names. Fixed a bug where the variant was not set correctly for some variants.
Co-authored-by: maxalexger <saaQQMBFCY8Putb>
* Add initial version of train_cli.py and its dependencies
* Add alpha_vile model to train_cli_util.py
* Update train_cli and related files
  Rename validate_train_results.py into validate_train_results_util.py
  Update validate_train_results_util.py
  Add validate_train_results.ipynb
* Fix model_config.py initialization
  Add fill_train_config()
* Add export of configs
* Update documentation
* Update rl_loop.py and rl_training.py according to new train_cli.py
  Update documentation
  Add model_type and use_custom_architecture to train_config.py
* Put test "Chess960 Input Planes V3" into MODE_CHESS block
…ne arguments (#214)
The info-strings are parsed as help information, and all command-line arguments passed when running the script are exported as well.
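One way to turn per-option info strings into `--help` text and export the parsed arguments is sketched below. This is not the code from train_cli.py: the `TrainConfig` fields, their defaults and the use of dataclass metadata are assumptions chosen for illustration. Only `argparse`, `dataclasses` and `json` from the standard library are used.

```python
import argparse
import json
from dataclasses import asdict, dataclass, field, fields


@dataclass
class TrainConfig:
    # Hypothetical options; the real train_config.py defines its own fields.
    batch_size: int = field(default=512, metadata={"info": "Number of samples per batch."})
    device_id: int = field(default=0, metadata={"info": "GPU device used for training."})
    export_dir: str = field(default="./", metadata={"info": "Directory for exported configs."})


def build_parser(config_cls):
    """Create one command-line option per config field, reusing its info string as help."""
    parser = argparse.ArgumentParser(description="train_cli sketch")
    for f in fields(config_cls):
        parser.add_argument(
            "--" + f.name.replace("_", "-"),
            type=f.type,
            default=f.default,
            help=f.metadata["info"],
        )
    return parser


def parse_and_export(argv):
    """Parse argv into a TrainConfig and return it with a JSON export of all values."""
    args = build_parser(TrainConfig).parse_args(argv)
    config = TrainConfig(**vars(args))
    return config, json.dumps(asdict(config), indent=2, sort_keys=True)
```

For example, `parse_and_export(["--device-id", "1"])` yields a config with `device_id == 1` and a JSON string that records every option, including the ones left at their defaults.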
Added Artworks section
Added description about the artworks.
Added white space
Fixed link to artworks section
* - added game phase detection file
  - adjusted initial Dockerfile
  - minor changes to convert_pgn_to_planes.ipynb and pgn_to_planes_converter.py
* - changed openspiel git
* - changed openspiel git
* fixed phase ids
* added dataset creation option for specific phases
* param changes in train_cnn.ipynb
* - fixed plys_to_end list to only include values for moves that really have been used
  - added counter for games without positions for current phase
* - changes to train_cnn to make it compatible
  - added analyse_game_phases.py to analyse game phase distribution and other information
  - minor changes
* mcts phase integration working, some improvements missing
* - added phase_to_nets map to make sure the right net is used for each phase
  - board->get_phase now expects the total amount of phases as an argument
  - phaseCountMap is now immediately filled
* - added game phase vector to created datasets
  - added sample weighting to losses in pytorch training files
  - load_pgn_dataset() now returns a dict
  - added file for easily generating normalized cutechess-cli commands
* minor fixes for weighted training
* - fixes and improvements to prs.py from cutechess-cli
  - added file to generate plots based on cutechess results
* - changes for continuing training from tar file (pytorch)
* - added python file for training (exported notebook)
* - added python file for executing cutechess shell commands
* - added the option to specify additional eval sets (unweighted) to pass through the trainer agent
  - you can now pass a phase to load_pgn_dataset to load a non default dataset
* - minor changes
* - minor changes for debugging
* - bugfix in train_cnn.py for additional dataloaders
* - bugfix to correctly determine train iterations
  - added printing total positions in dataset when loading
* - minor changes in prs.py
* - minor changes for chess 960
* - reverted mode and version back to 2 and 3
* fixed bug when executing isready multiple times consecutively while setting networkLoaded back to false
* alternative bugfix attempt for linux
* - temporary fix for chess960 wrong training representation
  - adjusted cutechess run file to support 960 matches
* - changes to incorporate 960 dataset analysis
  - new and adjusted graphs in game_phase_detector.py (should be put into a separate file)
  - new plots in create_cutechess_plots.py
* chess960 input representation fix (c++ engine files still unadjusted and assuming a wrong input representation)
* - added plot generating notebooks to git (/etc folder)
  - moved game phase analysis code from game_phase_detector.py to own file (analyse_game_phase_definition.py)
  - minor changes in train_cnn.py
  - adjusted .gitignore
* - added support for naive movecount phases
* - minor path fix in dataset_loader.py
* undone temporary fix for broken chess960 input representation
* - added support for phases by movecount in c++ code (currently always assumes phases by movecount)
  - set default value for UCI_Chess960 back to false
  - minor fixes
* - minor plotting adjustments
  - added colorblind palette
* - adjusted run_cutechess_experiments.py to be able to do experiments against stockfish
* - added documentation
* - minor assertion change in train_cnn.py
* - cleaned code and removed sections that are not needed anymore
* - changed underscore naming to camelCase naming in several cases
* - added UCI option Game_Phase_Definition with options "lichess" and "movecount" and corresponding searchsettings enum GamePhaseDefinition
* - added searchSettings to RawNetAgent to access selected gamePhaseDefinition
* - aligned train_cnn.ipynb with code inside train_cnn.py
* - cleaned cell outputs of main notebooks
* - further notebook output cleanings
* - removed files unnecessary for pull request and reverted several files back to initial state of fork
* - reverted .gitignore and Dockerfile to older state
* - .gitignore update to different previous state
* Update crazyara.cpp
  Fix compile error
* Update board.cpp
  Fix error: control reaches end of non-void function
* Add GamePhase get_phase to states
* Add GamePhase OpenSpielState::get_phase()
* Update get_data_loader() to load dict instead
---------
Co-authored-by: Felix Helfenstein <f.helfenstein@yahoo.de>
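The "movecount" phase definition and the phase-to-net lookup mentioned in this commit list can be sketched roughly as follows. The function names mirror terms from the commit messages, but the signatures and the boundary of 40 plies per phase are assumptions made for illustration. The engine's own thresholds are not stated here.

```python
def get_phase_by_movecount(ply: int, num_phases: int, plies_per_phase: int = 40) -> int:
    """Map a ply counter to a phase id in [0, num_phases - 1].

    plies_per_phase is a hypothetical boundary, not the engine's value.
    """
    if num_phases <= 1:
        # No phases enabled: everything belongs to phase 0.
        return 0
    return min(ply // plies_per_phase, num_phases - 1)


def select_nn_index(phase_to_nets: dict, num_nets: int, ply: int) -> int:
    """Pick the network responsible for the current position."""
    if num_nets <= 1:
        # With a single net there is nothing to select.
        return 0
    phase = get_phase_by_movecount(ply, num_phases=len(phase_to_nets))
    return phase_to_nets[phase]
```

Passing the total number of phases into the phase function, as the commit for `board->get_phase` describes, lets the same code serve one-net and multi-net setups.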
* Add command line script that allows parsing lichess puzzles
* Add recursive python path
* Added model loading in train_cli.py
  Fixed model export
  Set kernels for alpha-vile-large
* Update get_validation_data
* Added processes command-line param
* Check for additional_loaders not None
* Fix rawNetAgent and compile bugs for RL mode
  Change ordering of netBatchesVector (threads go first now)
  Use raw pointers in NeuralNetAPIUser now to resolve ownership problem
  Remove create_new_net_batches
  Add fill_nn_vectors
  Add fill_single_nn_vector
* Make Nodes_Limit available in RL mode
* remove unused code
Add git pull command before building
Call model.merge_bn() if available
- return 0 if no phase is enabled
Set Game_Phase_Definition default to "lichess"
Only use phase selection for nets.size() > 1
* Fix runtime errors in rl_loop.py
* Add save_cur_phase(const StateObj *pos) and all needed changes for it
* Update Selfplay() constructor and make settings const
* add const to get_num_phases() and remove const from SearchLimits
* add save_cur_phase() in header and fix constructor in header of Selfplay()
* add const to rlSettings
* Add missing ) in traindataexporter.cpp
* Add phaseVector export
* Add save_cur_phase(pos); to traindataexporter.cpp
* Update rl_training.py: get_validation_data()
* Update trainer_agent_pytorch.py
  check if delete path is a file
* Update TensorrtAPI to TensorRT 10
* delete retrieve_indices_by_name()
* add member SampleUniquePtr<IRuntime> runtime
* replace getBindingDimensions() by getTensorShape()
* replace setBindingDimensions() by setInputShape()
* add link_libraries(stdc++fs) to CMakeLists.txt
* add include_directories("$ENV{TENSORRT_PATH}/samples/") to CMakeLists.txt
* Introduce BACKEND_TENSORRT_10 and BACKEND_TENSORRT_8 for backup
* Change exporter into vector<unique_ptr> exporters (one exporter object for each phase)
* Add phase id specifier for export and make sure that only the appropriate phase exports the sample
* Use max_samples_per_iteration() to end generation
* Add check_for_moe(model_dir)
* Add directories for MoE
* Launch the training procedure multiple times for MoE
* Add logging message
* Start adding special cases for MoE
* Added _move_all_files_wrapper() for cleaner code
* Minor comment update
* Fix compile errors
* Add select_nn_index() to RawNetAgent
* Fix condition
* Add TODO, remove unnecessary '/'
* Update compress_dataset
* Update fileNameExport in selfplay.cpp
* update get_current_model_tar_file to use model 0 for Moe
* Reset generatedSamples in go()
* add missing /
* Skip game export for 0 samples
* use phase0 in get_number_generated_files()
* fix export of phases, update get_current_model_tar_file()
* update get_current_model_tar_file()
* update prepare_data_for_training()
* update _move_all_files_wrapper()
* remove unneeded /
* Update get_current_model_tar_file() with phases
* Update planes_train_dir and planes_val_dir in rl_loop.py
* Simplify code and use reversed() for training
* Add missing / in compress_dataset
* Add _retrieve_end_idx(data) for MoE
* Implement _include_data_from_replay_memory_wrapper() which handles MoE and non MoE cases
* Implement staged learning v2.0, i.e. first train on full data and then each phase separately
* Make use_moe_staged_learning auto detect
* Skip "phaseNone" for counting phases
* Fix condition for "phase_idx is None"
* Skip "phaseNone" when loading models
* Fix suffix
* Create model_dir_archive for phaseNone
* Add load checkpoint logging info
* Set q_value_ratio to 0 for RL and add Exception for wdl is True conflict
* Use middle phase for validation in staged learning on full data
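The per-phase export described in the first two items of this list could look roughly like the sketch below. `PhaseExporter` and `export_game` are hypothetical stand-ins written in Python for brevity. The engine keeps its exporters in C++, and their actual interface is not shown in the commit messages.

```python
class PhaseExporter:
    """Hypothetical exporter that only accepts samples of its own phase."""

    def __init__(self, phase_id: int):
        self.phase_id = phase_id
        self.samples = []

    def save_sample(self, sample, phase: int) -> bool:
        if phase != self.phase_id:
            return False  # another exporter is responsible for this sample
        self.samples.append(sample)
        return True


def export_game(exporters, game_samples) -> int:
    """Offer every (sample, phase) pair to all exporters; return the number exported."""
    if not game_samples:
        return 0  # skip the export entirely for games without samples
    exported = 0
    for sample, phase in game_samples:
        for exporter in exporters:
            exported += exporter.save_sample(sample, phase)
    return exported
```

Keeping one exporter per phase means each mixture-of-experts net later trains only on positions from its own phase, at the cost of one output file set per phase.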
Change default value for phase_definition to "lichess"
Update comment about "phase"
Add --device-id in example call for sl-training
Add get_rise_v33_large_model()
* Replace mutex with spinlock
* Optimize spinlock
* Used uint_fast8_t in Spinlock
* Add check for numPhases
* Start with parallel RL
  Add option "Number Parallel Games"
  Add ->get_local_batch_size()
  Make NeuralNetAPIUser a member
* Separate the neural user from SearchThread
* Create run_inference() wrapper
  Rename id to agentID
* Add selfplayFileMutex
  Make some variables to *
* Fix some compilation problems
* Fix remaining compile errors
* Add run_selfplay_thread and gameThreads
* Put MCTSAgent and RawNetAgent into vectors
* Fix compile errors
* Add mutex, condition variable and batch counter
* Use batch variables
* Make use of NeuralNetAPIUser object for SearchThread
* Fix compile problems
* Add helper method handle_fwd_pass()
* Fill information for batchCounter and batchMutex
* Use shared_ptr for batchCounter and batchMutex
* Add numberOfGames / NUMBER_OF_PARALLEL_GAMES
* Add agentID to name identifier
* Use unique_ptr instead of raw objects
* Add lock for initialisation + update offsets
* Initialize timeManager
* Add debug message
* Try recursive_mutex
* Fix compile problem recursive_mutex
* Change to recursive_mutex
* Revert recursive mutex + make condition variable as MCTSAgent member
* Move cout statement
* Avoid iteration
* Use CXX 20 and barriers
* Use ReusableBarrier as a replacement for barrier<>
* Clean up of unused variables
* Increase default batch size for RL
* Simplify expression in compute_offset()
* Add set_agent_id() for SearchThread and MCTSAgent
* Set CXX standard back to 17
* Add InferenceQueue, InferenceWorker, InferenceRequest, InferenceResult
* Fix typo
* Update CMakeLists.txt
  Add missing inference files
* Move template function to header (InferenceQueue)
* Move if (newNodes->size() == 0) and use offset for input values
* Change memcpy argument
* Use right policy size
* Update maxBatchSize for worker
* Use fwd_pass_queue()
* Use correct clipping
* Simplify expression for handle_fwd_pass()
* Give each MCTSAgent its own nnUser
* Maybe fix inference bug
* Remove unused variables
* Fix game sample export
* Integrate risev3-large into train loop
* Update safeguard condition
* Update rl_config.py
* Update MCTSAgent::set_root_node_predictions()
* Add get_alpha_zero_model_small(), resnet-small
* Make RL loop fully parallel
* Fix compile errors
* Compressing individual files
* Adjust export for num_parallel_games > 1
* Replace get_local_batch_size() with get_main_batch_size()
* Change local batch size in UCI-options
* Update default value for MaxInitPly
* Replace searchSettings->get_local_batch_size() with searchSettings->batchSize
* Replace searchSettings->get_local_batch_size() with searchSettings->batchSize
* readd increment of gameIdx
* Update go command for selfplay
* Update SelfPlay::max_samples_per_iteration()
* Change one logging.info call to logging.debug()
* Update generate_random_nn.py
  Use get_default_model()
* Update generate_random_nn.py
  Remove mxnet code
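The idea behind a shared inference queue for parallel self-play games can be illustrated with the following sketch. It is written in Python for brevity and is not the engine's C++ implementation: the class below only borrows the name `InferenceQueue` from the commit list, and its methods, the `predict_batch` callback and the batching policy (take whatever is queued, up to a maximum batch size) are assumptions.

```python
import queue
import threading
from concurrent.futures import Future


class InferenceQueue:
    """Collects requests from many game threads and evaluates them in batches."""

    def __init__(self, predict_batch, max_batch_size):
        self._predict_batch = predict_batch  # hypothetical net: list of inputs -> list of outputs
        self._max_batch_size = max_batch_size
        self._requests = queue.Queue()
        self.batch_sizes = []  # recorded for inspection only
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, planes) -> Future:
        """Called by a game thread; the Future resolves to the net output."""
        result = Future()
        self._requests.put((planes, result))
        return result

    def close(self):
        """Ask the worker to stop after finishing the pending requests."""
        self._requests.put(None)
        self._worker.join()

    def _run(self):
        while True:
            item = self._requests.get()  # block until at least one request exists
            if item is None:
                return
            batch, stop = [item], False
            while len(batch) < self._max_batch_size:
                try:
                    nxt = self._requests.get_nowait()
                except queue.Empty:
                    break
                if nxt is None:
                    stop = True
                    break
                batch.append(nxt)
            outputs = self._predict_batch([planes for planes, _ in batch])
            for (_, result), output in zip(batch, outputs):
                result.set_result(output)
            self.batch_sizes.append(len(batch))
            if stop:
                return
```

A game thread would call `submit(planes).result()` and block until the worker has run the forward pass, so several games share one batch instead of each issuing small separate passes.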
See Commits and Changes for more details.
Created by
pull[bot]
Can you help keep this open source service alive? 💖 Please sponsor : )