Ridges Update: This repository is now archived. Please see [https://github.com/ridgesai/ridges](https://github.com/ridgesai/ridges) for the updated codebase. We will be working over the next few days to transition validators and miners to the new repository.
Currently, the base miner code pulls from two repos. Ahead of some major incentive mechanism changes, your agents will need to run edits on many other repos, as validators will generate questions from more than just pytest and seaborn moving forward.
If, after pulling, you're getting `AttributeError: module 'modal' has no attribute 'Mount'. Did you mean: 'mount'?`, try running `pip install modal==0.77.0` and `pip install "swebench>=4.0.3"`. Note that this changes the constants import in repo_environment.py: it needs to import `MAP_REPO_VERSION_TO_SPECS` from `swebench.harness.constants.python` instead of just `swebench.harness.constants`.
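Concretely, the import change described above is a one-line patch (the exact placement in repo_environment.py may differ in your checkout):

```diff
-from swebench.harness.constants import MAP_REPO_VERSION_TO_SPECS
+from swebench.harness.constants.python import MAP_REPO_VERSION_TO_SPECS
```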
Ridges's mission is to create a decentralized, self-sustaining marketplace of autonomous software engineering agents which solve real-world software problems. In a nutshell, we plan to do this by using Bittensor to incentivize SWE agents to solve increasingly difficult and general tasks.
The last few years have brought a remarkable increase in the quality of language models. With the rapid proliferation of autonomous software engineering companies such as Devin, an increasing number of people are becoming convinced that the highest-leverage way to direct this progress is toward using these models to write more code. The reason is simple: models that are better at writing code can produce even better language models, thus closing the loop on AGI.
But while this is a pretty argument, the current incentives are not well-aligned for this to happen in a safe and maximally productive way. Most of the progress is made by large companies and select startups, while individuals have no incentive to contribute. Open-source provides some escape, but is a weak alternative because of the lack of financial compensation. The Ridges subnet addresses this by creating an incentive structure which allows the individual to contribute to the bleeding edge of AI improvement.
The dynamic of this subnet is conceptually simple: validators create coding problems, miners solve them and submit solutions, and validators assign them a reward based on how well they solved the problem. Miners are rewarded for producing solutions better and faster.
As the subnet runs, a growing dataset of problems & solutions is created. This allows training of models for more accurately allocating rewards, serves as a dataset for miners to improve their models on, and allows the creation of models which estimate the difficulty and solvability of real-world issues.
The dataset generated by the operation of this subnet is a key commodity produced by the subnet, and one of the main reasons for the project's creation. It provides insight into how well language models are able to solve various coding tasks and how this performance varies as parameters are adjusted.
Our plan is to use this dataset, dubbed Cerebro, to train a model which can be used to answer questions such as the following:
- Given an issue, how difficult is it to solve? How much time would it take an average developer?
- How many subtasks does it contain?
- Is solving it intellectually difficult, or tedious and time-consuming?
- Is it well-defined? If not, what parts are ambiguous? What external information does the agent need?
- What is an appropriate reward for solving it?
Answers to these questions will address the bottlenecks of current agent frameworks, which are impressive in narrow use cases but fail to generalize well. With a precise estimate of a given task's difficulty, agents can learn to work around it. Moreover, it will eliminate the common issues which arise when these agents try to generalize: issues are often poorly defined, overly difficult, or relate to each other in unclear ways.
In short, the Cerebro dataset will:
- Open-source miner solutions and allow miners to collaborate and learn from one another.
- Serve as the foundational dataset for training the Cerebro model.
- Enable continuous improvement of the subnet's incentive mechanism, making reward assignment progressively more accurate.
Creating a deeper integration between Bittensor and open source is one of our goals, and we see this subnet as something which can do so very effectively. Not long after launch, we plan to expand our subnet to create PRs in open source repos. Miners will be able to submit PRs to open source repos, and receive large rewards when these PRs get merged.
Our first agent-developed product is @taogod_terminal—an autonomous Twitter agent which posts subnet updates in real-time. As a proof of concept, shortly after launch we will open source the code for this and use the Ridges subnet's agents to develop it further.
There is no shortage of demand for coding agents which save people precious time and can write working code by themselves. Once the subnet's autonomous agents grow to be competitive with state-of-the-art, we will launch an API on top of the subnet which allows miners to license out their developed agents to third parties willing to pay. The result will be an agent marketplace, where customers in search of an autonomous software engineer can shop around and purchase the best agent based on their specific needs.
The subnet will serve both as a training ground for development of these models and an evaluation suite, allowing customers to see which agents perform best on which problems and which is the right choice for them.
Miner:
- Processes problem statements with contextual information, including comments and issue history, and evaluates the difficulty as rated by Cerebro.
- Uses deep learning models to generate solution patches for the problem statement.
- Earns TAO rewards for correct and high-quality solutions.

Validator:
- Continuously generates coding tasks for miners, sampling top PyPI packages.
- Evaluates miner-generated solutions using LLMs and (soon) test cases. Solutions are scored based on:
  - Correctness, especially for issues with pre-defined tests.
  - Speed of resolution.
- Contributes evaluation results to the dataset used for training Cerebro.
These agents tackle code issues posted in a decentralized market, scour repositories for unresolved issues, and continuously enhance the meta-allocation engine driving this ecosystem: Cerebro. As the network grows, Cerebro evolves to efficiently transform problem statements into solutions. Simultaneously, miners become increasingly adept at solving advanced problems. By contributing to open and closed-source codebases across industries, Ridges fosters a proliferation of Bittensor-powered users engaging in an open-issue marketplace—directly enhancing the network's utility.
Epoch 1: Core
Objective: Establish the foundational dataset for training Cerebro.
- Launch a subnet that evaluates (synthetic issue, miner solution) pairs to build training datasets.
- Deploy Taogod Terminal as the initial open-issue source.
- Launch a website with observability tooling and a leaderboard.
- Publish open-source dataset on HuggingFace.
- Refine incentive mechanism to produce the best quality solution patches.
Epoch 2: Ground
Objective: Expand the capabilities of Ridges and release Cerebro.
- Evaluate subnet against SWE-bench as proof of quality.
- Release Cerebro issue classifier.
- Expand open-issue sourcing across more Ridges repositories.
Epoch 3: Sky
Objective: Foster a competitive market for open issues.
- Develop and test a competition-based incentive model for the public creation of high-quality (judged by Cerebro) open issues.
- Fully integrate Cerebro into the reward model.
- Incorporate non-Ridges issue sources into the platform.
Epoch 4: Space
Objective: Achieve a fully autonomous open-issue marketplace.
- Refine the open-issue marketplace design and integrate it into the subnet.
- Implement an encryption model for closed-source codebases, enabling validators to provide Ridges SWE as a service.
- Build a pipeline for miners to submit containers, enabling Ridges to autonomously generate miners for other subnets.
- Python 3.9+
- pip
- OpenAI or Anthropic API key (saved as `OPENAI_API_KEY` or `ANTHROPIC_API_KEY`, respectively)
- Docker installed and running (install guide)
- Clone the `ridges` repo, including the `SWE-agent` submodule:

```shell
git clone --recurse-submodules https://github.com/taoagents/ridges
cd ridges
```

- Install `ridges` and `sweagent`: `pip install -e SWE-agent -e .`
- Install pm2 if you don't have it: guide
- Set the required envars in the `.env` file, using .env.miner_example as a template: `cp .env.miner_example .env` and populate `.env` with the required credentials
- Pull the latest sweagent Docker image:

```shell
docker pull sweagent/swe-agent:latest
```
Run the miner script with pm2. Mainnet:

```shell
pm2 start neurons/miner.py --name ridges-miner -- \
    --netuid 62 \
    --wallet.name <wallet> \
    --wallet.hotkey <hotkey>
    [--model <model to use, default is gpt4omini> (optional)]
    [--instance-cost <max $ per miner query, default is 3> (optional)]
```

Testnet:

```shell
pm2 start neurons/miner.py --name ridges-miner -- \
    --netuid 244 \
    --subtensor.network test \
    --wallet.name <wallet> \
    --wallet.hotkey <hotkey>
    [--model <model to use, default is gpt4omini> (optional)]
    [--instance-cost <max $ per miner query, default is 3> (optional)]
```

If you are running in a virtual environment, remember to add the `--interpreter <venv>/bin/python3` flag before the `--`.
Here are some tips for improving your miner:
- Try a different autonomous agent framework, e.g. AutoCodeRover (Devin?)
- Switch to a cheaper LLM provider to reduce cost
- Experiment with different retrieval methods (BM25, RAG, etc.)
- If sweagent is not appearing for autocompletion in VSCode/Cursor, add a `.vscode/settings.json` file with the following:

```json
{
    "python.analysis.extraPaths": [
        "./SWE-agent"
    ]
}
```

- Python 3.9+
- OpenAI API key (saved as `OPENAI_API_KEY` envar)
- pip
- Clone the `ridges` repo, including the `SWE-agent` submodule:

```shell
git clone --recurse-submodules https://github.com/taoagents/ridges
cd ridges
```

- Install `ridges` and `sweagent`: `pip install -e SWE-agent -e .`
- Install pm2 if you don't have it: guide
- Set the required envars in the `.env` file, using .env.validator_example as a template: `cp .env.validator_example .env` and populate `.env` with the required credentials
Run the validator script via run_validator.sh, which will automatically keep it up to date. Mainnet:

```shell
./scripts/run_validator.sh \
    --name ridges-validator \
    -- \
    --netuid 62 \
    --wallet.name <wallet> \
    --wallet.hotkey <hotkey>
```

Testnet:

```shell
./scripts/run_validator.sh \
    --name ridges-validator \
    -- \
    --netuid 244 \
    --subtensor.network test \
    --wallet.name <wallet> \
    --wallet.hotkey <hotkey>
```

Arguments before the `--` are passed to the `pm2 start` command, and arguments after are passed to the `python neurons/validator.py` command. So if you are running in a virtual environment, add an `--interpreter <venv>/bin/python3` argument before the `--`.
You can optionally run both the validator and the auto-updater with pm2. This will create 2 separate pm2 processes: ridges-updater (everything related to auto-updates) and ridges-validator (the actual validator code).
You can run everything with pm2 like this:
```shell
pm2 start ./scripts/run_validator.sh \
    --name ridges-updater \
    -- \
    --name ridges-validator \
    -- \
    --netuid 62 \
    --wallet.name <wallet> \
    --wallet.hotkey <hotkey>
```

This script will automatically update the code as new updates are pushed. To run the validator with auto-updates off, you can run the script as `RIDGES_VALIDATOR_AUTO_UPDATE=0 ./scripts/run_validator.sh ...`. Note that this is not recommended, as it will make it more difficult for us to support you in the case of issues.
Sending logs is fully optional, but recommended. As a new subnet there may be unexpected bugs or errors, and it will be very difficult for us to help you debug if we cannot see the logs. Use the PostHog credentials given in .env.[miner|validator]_example in order to allow us to trace the error and assist.
For support, please message the Ridges channel in the Bittensor Discord.
Ridges is released under the MIT License.
Credits to princeton-nlp for creating SWE-Bench and SWE-agent, which served as useful references and bases for creating the miner infrastructure.