From 88f44160119d751659d2a22eb2968a5ad4fa1460 Mon Sep 17 00:00:00 2001
From: MeowStlanik
Date: Thu, 29 Jan 2026 19:59:26 +0700
Subject: [PATCH 1/4] docs: fix typo and improve grammar in client versioning.

---
 architectures/decentralized/client_versioning.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/architectures/decentralized/client_versioning.md b/architectures/decentralized/client_versioning.md
index 7ef46b965..910448acb 100644
--- a/architectures/decentralized/client_versioning.md
+++ b/architectures/decentralized/client_versioning.md
@@ -8,7 +8,7 @@ There are two ways to specify the client version for a run:
 ## Docker RepoId hash

 Once the client docker image is uploaded to DockerHub, a `RepoId` hash is associated with that image. This string is what should be used for
-setting the client version in a run, toguether with the "sha256" part. For example, "sha256:ca978112ca1bbdcafac231b39a23dc4da786eff8147c4e72b9807785afee48bb".
+setting the client version in a run, together with the "sha256" part. For example, "sha256:ca978112ca1bbdcafac231b39a23dc4da786eff8147c4e72b9807785afee48bb".

 ## Docker version tag

@@ -16,11 +16,11 @@ For setting a docker version tag, the image should be built with that tag set be

 ## Updating client docker version for a run

-Once the new docker image uploaded to DockerHub and some version selected, you can update the client version required
+Once the new docker image is uploaded to DockerHub and a version is selected, you can update the client version required
 for a particular run with the following command:

-[!] You should have the run owner solana key to successfully run this command
-[!] The run must be paused beforehand to do the client version update
+[!] You should have the run owner Solana key to successfully run this command.
+[!] The run must be paused beforehand to do the client version update.

 ```bash
 cargo run --release --bin run-manager -- \

From 8cfe078efc3b31ea4758015e689231d2cfc211e4 Mon Sep 17 00:00:00 2001
From: MeowStlanik
Date: Thu, 29 Jan 2026 20:34:46 +0700
Subject: [PATCH 2/4] docs: fix informal language and improve formatting in data-provider

Refined the README by replacing informal terms ('wanna', 'thru') with professional alternatives, standardizing header capitalization, and ensuring consistent punctuation throughout the document.
---
 shared/data-provider/README.md | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/shared/data-provider/README.md b/shared/data-provider/README.md
index 6f9e1f63b..cb9af68df 100644
--- a/shared/data-provider/README.md
+++ b/shared/data-provider/README.md
@@ -1,22 +1,24 @@
 # data-provider

-there's a bunch of functionality here, but the http stuff is what you probably wanna try out.
+There's a bunch of functionality here, but the HTTP components are what you probably want to try out first.

-## http data provider fetch example
+## HTTP data provider fetch example

 ### Usage

-#### working example
+#### Working example

 First, an example:
+
 `cargo run --example http -- --file-size 40000004052 --batch-ids 103 --token-size 4 --tokenizer tests/resources/llama3_tokenizer.json urls https://storage.googleapis.com/nous-pretraining-public-us/fineweb-1pct-tokenized-llama3/000_fineweb.ds`

-This will fetch some fineweb data & output it using the llama3 tokenizer!
+This will fetch some FineWeb data and output it using the LLaMA 3 tokenizer.

-#### Basic Command Structure
+#### Basic command structure

 ```bash
 cargo run --example http --file-size [--sequence-length ] [--token-size ] --batch-ids [--tokenizer ]
+
 ```

 The tool supports two main modes of operation: template-based URLs and explicit URL lists.
@@ -29,7 +31,7 @@ The tool supports two main modes of operation: template-based URLs and explicit

 - `--sequence-length`: Length of each sequence (default: 2048)
 - `--token-size`: Size of each token in bytes (default: 2)
-- `--tokenizer`: Path to tokenizer file for decoding output
+- `--tokenizer`: Path to a tokenizer file for decoding output

 #### Subcommands

@@ -45,11 +47,11 @@ Example:
 cargo run --example http --batch-ids 1,2,3 template "http://example.com/{}.ds" --start 0 --end 10
 ```

-this will fetch urls http://example.com/0.ds thru http://example.com/10.ds
+This will fetch URLs http://example.com/0.ds through http://example.com/10.ds.

-###### left pad zeros
+###### Left pad zeros

-`--left-pad-zeros 3` will transform fetch URLs http://example.com/000.ds thru http://example.com/010.ds
+Using `--left-pad-zeros 3` will transform the fetched URLs to http://example.com/000.ds through http://example.com/010.ds.

 ##### URL List Mode

@@ -65,7 +67,7 @@ cargo run --example http --batch-ids 1,2,3 urls "http://example.com/1.ds" "http:

 ### Examples

-1. Fetch data using a template with tokenizer:
+1. Fetch data using a template with a tokenizer:

 ```bash
 cargo run --example http --batch-ids 1,2,3 --tokenizer ./tokenizer.json template "http://example.com/{}.ds" --start 0 --end 10

From a4d708ce43bc3667b22781a93d1d2417114a4679 Mon Sep 17 00:00:00 2001
From: MeowStlanik
Date: Thu, 29 Jan 2026 20:53:55 +0700
Subject: [PATCH 3/4] docs: fix typos and refine run-manager README

- Fixed typo: "Thsi" -> "This".
- Clarified connectivity terminology: "Psyche network".
- Improved punctuation and standardized hotkey formatting (Ctrl+C).
- Removed unnecessary modal verbs for clearer instructions.
---
 tools/rust-tools/run-manager/README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/rust-tools/run-manager/README.md b/tools/rust-tools/run-manager/README.md
index 101a69089..cfc34634e 100644
--- a/tools/rust-tools/run-manager/README.md
+++ b/tools/rust-tools/run-manager/README.md
@@ -1,4 +1,4 @@
-Thsi binary is a manager for Psyche client containers. It should allow users to connect to a Psyche without having to worry about client versions, as this performs the necessary checks beforehand.
+This binary is a manager for Psyche client containers. It allows users to connect to the Psyche network without having to worry about client versions, as it performs the necessary checks beforehand.

 One can run the run manager like this:

@@ -23,6 +23,6 @@ Where:
 WALLET_PRIVATE_KEY_PATH=keys/keypair.json # Optional
 ```

- - If `WALLET_PRIVATE_KEY_PATH` is defined it will use the specified keypair instead of the default `$HOME/.config/solana/id.json`
+ - If `WALLET_PRIVATE_KEY_PATH` is defined, it will use the specified keypair instead of the default `$HOME/.config/solana/id.json`

-The run manager will also try to restart the client a few times in case it encounters an error. If you notice it somehow is stuck you may close the process manually via `ctrl+c` and run it again.
+The run manager will also try to restart the client a few times in case it encounters an error. If you notice that it is stuck, you may close the process manually via Ctrl+C and run it again.

From 8af0846bb54167aab8e5be3846cd2d19b89a582d Mon Sep 17 00:00:00 2001
From: MeowStlanik
Date: Thu, 29 Jan 2026 21:25:24 +0700
Subject: [PATCH 4/4] docs: fix grammar and typos in Docker README

Fix minor grammatical errors and typos.
---
 docker/README.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/docker/README.md b/docker/README.md
index 2377920e9..3c0e8f714 100644
--- a/docker/README.md
+++ b/docker/README.md
@@ -1,12 +1,12 @@
 # Docker Psyche

 This folder contains some of the docker related files and scripts, mostly entrypoint scripts for the docker containers.
-All the used docker images are created via nix docker-tools and can be found in the `packages.nix` file inside the `nix` directory.
+All the docker images used are created via nix docker-tools and can be found in the `packages.nix` file inside the `nix` directory.

 The purpose of using docker is two-fold:

-- compartmentalize psyche client to be deployed and used in testing and production environments easily.
-- implementing end-to-end tests that are as close as possible to a production environment.
+- compartmentalize the Psyche client to be deployed and used in testing and production environments easily.
+- implement end-to-end tests that are as close as possible to a production environment.

 There are three concrete use-cases for the docker containers:

@@ -22,9 +22,9 @@ There are three concrete use-cases for the docker containers:
 ## Psyche Solana client

 The `docker-psyche-solana-client` package works as the dockerfile used to build the image for the client that would be used by
-end users, in a production-like environment `psyche-solana-cient binary` already built with nix.
+end users, in a production-like environment `psyche-solana-client binary` already built with nix.
 The `train_entrypoint.sh` script runs as the default entrypoint for the container, which is no more than a
-call to the `psyche-solana-client` binary to start training, and some logic for restart the client in case it crashes.
+call to the `psyche-solana-client` binary to start training, and some logic to restart the client in case it crashes.

 ## Psyche Solana test client

@@ -45,7 +45,7 @@ Its main task is to spawn the `solana-test-validator` and deploy the coordinator

 ### Starting solana-test-validator and deploying Coordinator

-If you want to running a validator in your machine, then you will need to start the `solana-test-validator`
+If you want to run a validator on your machine, then you will need to start the `solana-test-validator`
 binary and then deploy the coordinator program. If you have started the validator and deployed the Coordinator in another machine,
 you can skip to the next section.

@@ -96,7 +96,7 @@ This can be done using:
 just setup_gpu_clients
 ```

-where `` should be replaced with the number of clients you want spawn.
+where `` should be replaced with the number of clients you want to spawn.

 As soon as you run the previous `just` command, you will be prompted with a message saying something like

@@ -158,7 +158,7 @@ go directly to the `Join training run with the dockerized Psyche client` step.
 To create a run, you will need to specify the model configuration file and the wallet that will be used to pay for the creation of the run,
 as well as the devnet/mainnet RPC and websocket endpoint, and the **run ID** of the training run.

-Create an environment file in `config/client/.env`, if you don't already have one. There variables that should be present are:
+Create an environment file in `config/client/.env`, if you don't already have one. The variables that should be present are:

 - `RPC`: The url to the Solana RPC endpoint
 - `WS_RPC`: The url to the Solana websocket endpoint