This setup is inspired by the [bitnami repository](https://github.com/bitnami/bitnami-docker-moodle) (see it for configuration parameters), but the docker images have been replaced with an official database image and a web server image from Moodle.
The purpose of this repo is to test a setup using large language models (LLMs) to provide automated feedback. For this purpose, CodeRunner is used. Even though CodeRunner is made for programming tasks, it is used here only for free-form text; the programming features are used to pass student responses to the LLM.
Under `jobe/ChatRunner` there is a Python package providing the API to call an LLM from CodeRunner.
- Jonas Julius Harang, idea and original prototype
- Hans Georg Schaathun, refactoring and documentation for reuse and publication
- Make sure you have git, docker, and docker-compose.
- Run `sh gitclone.sh` to clone the Moodle directory with the plugins required for CodeRunner.
- Run `docker compose up -d` to start the server.
- Connect to http://localhost:8080/
- You will have to go through the setup procedure. In the database setup, you have to choose mariadb as the server type, and mariadb as the hostname. The database user and password are found in the docker-compose.yml file.
- Moodle will complain that you are not using SSL (https). It still works, and for testing and prototyping there is no need to worry. For production, this has to change.
- Configure Site Administration -> Plugins -> CodeRunner. Set Jobe server to «jobe».
- Configure Site Administration -> General -> HTTP Security. Prune the «cURL blocked hosts list». It may suffice to remove the 172.* addresses, but this may depend on the configuration of docker.
- Run
  ```sh
  docker exec -it moodle-coderunner-docker-moodle-1 /usr/local/bin/php /var/www/html/admin/cli/cron.php
  ```
  Moodle usually requires a cron job, but cron works poorly in docker containers.
- You may have to rerun the above command regularly, but the critical point is to run it once so that the question bank works.
- In production it should be run from cron.
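As a sketch of the production case, a host crontab entry might look like the following. The container name is taken from the command above; the every-minute schedule is an assumption (Moodle generally recommends running cron frequently), so adjust it to your setup.

```
# Hypothetical host crontab entry: run Moodle cron inside the container
# every minute (adjust the schedule and container name to your setup).
* * * * * docker exec moodle-coderunner-docker-moodle-1 /usr/local/bin/php /var/www/html/admin/cli/cron.php
```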
To enter a sample question in CodeRunner, you can open a new question and make the following changes. This assumes that you have an API key with OpenAI.
- Under CodeRunner Question type
- Question Type, select Python3
- Customisation, tick Customise
- Enter the following under CodeRunner question type -> Template params:
  ```json
  { "API": "openai",
    "model": "gpt-4o",
    "url": "https://api.openai.com/v1/chat/completions",
    "OPENAI_API_KEY": "<your key>" }
  ```
- Enter the contents of the file `jobe/ChatRunner/chatgpt.py` under Customisation -> Template.
- Under Customisation -> Grading, select Template grader.
- Enter a time limit under Advanced Customisation -> Sandbox -> TimeLimit. In production you probably do not want more than 20, but for testing it may be useful to have, say, 180.
- Under General, give the question a name and question text. This does not matter for testing. You can use the sample question text from `Example/problem.md` and «Mikroskopet» for the question name.
- Under support files, add the files from the Example directory: literature.json, problem.md, question.md.
- Testing the question, you may use `Example/naiveanswer.md` as a dummy answer.
Developing your own question, you change the files used in steps 6-8; everything else is constant.
Using different language models, you change the sandbox parameters in Step 2.
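The Template params entered above are plain JSON, and a malformed blob is an easy mistake to make. As a hedged sketch (field names taken from the example above, the key is a placeholder), the blob can be checked before pasting it into Moodle:

```python
import json

# Validate a Template params blob before pasting it into CodeRunner.
# The field names come from the example above; the key is a placeholder.
params_text = '''
{ "API": "openai",
  "model": "gpt-4o",
  "url": "https://api.openai.com/v1/chat/completions",
  "OPENAI_API_KEY": "<your key>" }
'''

params = json.loads(params_text)  # raises ValueError on broken JSON

required = {"API", "model", "url", "OPENAI_API_KEY"}
missing = required - params.keys()
if missing:
    raise SystemExit(f"missing template params: {sorted(missing)}")
print("template params OK:", params["model"])
```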
To test ChatRunner without using Moodle, you should install it using pip. For instance, like this:

```sh
python3 -m venv venv
. venv/bin/activate
pip install build
cd jobe/ChatRunner
pip install -e .
```

This installs in editable mode, so that you can keep developing and testing the module.
To test against OpenAI/ChatGPT, you have to get an API key and edit the config file chatgpt.json to use this key before running:

```sh
sh test.sh --config chatgpt.json --markdown
```

At NTNU, you may be able to use Idun. This also requires an API key, and a sample config is idun.json.
There are two options to modify the output:
- `--markdown` formats the output in markdown
- `--verbose` gives additional debug output

The default is the format used internally within Moodle.
There are different modes to test different internal features. Use the --mode option with one of the following:
- `moodle` (alt. `--moodle`) runs the test in the sandbox as used in moodle.
- `dump` (alt. `--debug`) dumps and reparses the output as is required by the sandbox.
- `baseline` uses the old prompt, using plain text to describe the JSON format.
- `new` uses the new prompt, using the API to specify the JSON schema.
This is work in progress: the output is intended to be parsed by CodeRunner, and we have not yet managed to format it so that it is also readable for human users in the command line interface.
It is possible to add the -T option to run outside the sandbox, which
gives more debug information.
For batch testing, question/answer tests can be defined in a TOML file. There is no reference documentation, but there are examples to demonstrate the format.
- `Example/exphil.toml` is complete, but does not use grading criteria.
- `Example/optics.toml` has only nonsense answers, but demonstrates the use of grading criteria.
To run a batch test, the following command can be used. (Remember to add API key to the config file.)
```sh
python -m ChatRunner --config idun.toml --batch Example/exphil.toml --outfile Example/exphil-idun.toml --count 5
```

Sample output is included in the repo, showing how feedback from the AI is added to the original sample object from the input TOML file.
As always a config file is required, supplying URL, API key, and any other data the server requires. The TOML format allows listing multiple models, and the batch processor will test each model in turn.
The --count option specifies the number of queries made per student answer.
This is intended for consistency testing.
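The idea behind consistency testing can be illustrated with a small sketch: with `--count 5`, each answer is graded five times, and the spread of those grades indicates how stable the model is. The grade values below are placeholders, not real output.

```python
import statistics

# Placeholder grades for one answer, as if graded five times (--count 5).
grades = [0.8, 0.7, 0.8, 0.9, 0.8]

mean = statistics.mean(grades)     # central tendency of the grades
spread = statistics.stdev(grades)  # low stdev = consistent grading
print(f"mean grade: {mean:.2f}, stdev: {spread:.2f}")
```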
We have started experimenting using ollama, but this is still flaky and unstable.
You can run ollama in docker, using

```sh
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
docker exec -it ollama ollama pull llama3
```

This installs the llama3 model. You can install other models as desired.
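Once the model is pulled, the container exposes Ollama's HTTP API on port 11434. A sketch of the request body for its `/api/generate` endpoint (the prompt is a placeholder):

```python
import json

# Request body for Ollama's /api/generate endpoint; prompt is a placeholder.
body = {
    "model": "llama3",
    "prompt": "Give feedback on this student answer: ...",
    "stream": False,  # ask for a single JSON response instead of a stream
}
payload = json.dumps(body)
print(payload)
```

The body can then be POSTed to the server, e.g. with `curl http://localhost:11434/api/generate -d "$payload"`.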
If you want to test this as a chatbot, use

```sh
docker exec -it ollama ollama run llama3
```

To test ChatRunner against ollama, you can run

```sh
sh test.sh --config ollama.json --markdown
```

The main problem with ollama is that the available models are inferior to chatgpt and often produce syntactically unexpected output. To make it work in practice, two things are required:
- Improved prompting to reduce the error frequency.
- Improved error handling to manage the consequences of errors.
- Docker images
- jobe runs jobe with ChatRunner from the working copy
- jobe-production runs jobe with the latest release of ChatRunner
We have not found a good way to test continuous development within moodle. Whenever ChatRunner is edited, it is necessary to shut docker compose down, delete the jobe image, and restart docker compose; i.e.

```sh
docker compose down
docker rmi moodle-coderunner-docker-jobe
docker compose up -d
```

We tried installing ChatRunner in editable mode and mounting the ChatRunner directory from the host, but it seems that changes to the module do not affect jobe.