This setup is inspired by the [bitnami repository](https://github.com/bitnami/bitnami-docker-moodle) (see it for configuration parameters), but the docker images have been replaced with an official database image and a web server image from Moodle.
The purpose of this repo is to test a setup using large language models (LLMs) to provide automated feedback. For this purpose, CodeRunner is used. Even though CodeRunner is made for programming tasks, it is used here only for free-form text; the programming features are used to pass student responses to the LLM.
Under `jobe/ChatRunner` there is a Python package providing the API to call an LLM from CodeRunner.
- Jonas Julius Harang, idea and original prototype
- Hans Georg Schaathun, refactoring and documentation for reuse and publication
- Make sure you have git, docker, and docker-compose.
- Run `sh gitclone.sh` to clone the Moodle directory with the plugins required for CodeRunner.
- Run `docker compose up -d` to start the server.
- Connect to http://localhost:8080/
- You will have to go through the setup procedure. In the database setup, you have to choose mariadb as the server type, and mariadb as the hostname. The database user and password are found in the docker-compose.yml file.
- Moodle will complain that you are not using SSL (https). It still works, and for testing and prototyping there is no need to worry. For production, this has to change.
- Configure Site Administration -> Plugins -> CodeRunner. Set Jobe server to «jobe».
- Configure Site Administration -> General -> HTTP Security. Prune the «cURL blocked hosts list». It may suffice to remove the 172.* addresses, but this may depend on the configuration of docker.
- Run
  ```sh
  docker exec -it moodle-coderunner-docker-moodle-1 /usr/local/bin/php /var/www/html/admin/cli/cron.php
  ```
  Moodle usually requires a cron job, but cron works poorly in docker containers.
- You may have to rerun the above command regularly, but the critical point is to run it once so that the question bank works.
- In production it should be run from cron.
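As a sketch of the production case, a host crontab entry might look like the following. The container name is taken from the command above; the every-minute schedule is an assumption (Moodle generally recommends running cron frequently), so adjust it to your setup.

```
# Hypothetical host crontab entry: run Moodle cron inside the container
# every minute (adjust the schedule and container name to your setup).
* * * * * docker exec moodle-coderunner-docker-moodle-1 /usr/local/bin/php /var/www/html/admin/cli/cron.php
```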
To enter a sample question in CodeRunner, you can open a new question and make the following changes. This assumes that you have an API key with OpenAI.
- Under CodeRunner Question type
- Question Type, select Python3
- Customisation, tick Customise
- Enter the following under CodeRunner question type -> Template params:
  ```json
  { "API": "openai",
    "model": "gpt-4o",
    "url": "https://api.openai.com/v1/chat/completions",
    "OPENAI_API_KEY": "<your key>" }
  ```
- Enter the contents of the file `jobe/ChatRunner/chatgpt.py` under Customisation -> Template.
- Under Customisation -> Grading, select Template grader.
- Enter a time limit under Advanced Customisation -> Sandbox -> TimeLimit. In production you probably do not want more than 20, but for testing it may be useful to have, say, 180.
- Under General, give the question a name and question text. This does not matter for testing. You can use the sample question text from `Example/problem.md` and «Mikroskopet» for the question name.
- Under support files, add the files from the Example directory: literature.json, problem.md, question.md.
- Testing the question, you may use `Example/naiveanswer.md` as a dummy answer.
Developing your own question, you change the files used in steps 6-8; everything else is constant.
Using different language models, you change the sandbox parameters in Step 2.
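The Template params entered above are plain JSON, and a malformed blob is an easy mistake to make. As a hedged sketch (field names taken from the example above, the key is a placeholder), the blob can be checked before pasting it into Moodle:

```python
import json

# Validate a Template params blob before pasting it into CodeRunner.
# The field names come from the example above; the key is a placeholder.
params_text = '''
{ "API": "openai",
  "model": "gpt-4o",
  "url": "https://api.openai.com/v1/chat/completions",
  "OPENAI_API_KEY": "<your key>" }
'''

params = json.loads(params_text)  # raises ValueError on broken JSON

required = {"API", "model", "url", "OPENAI_API_KEY"}
missing = required - params.keys()
if missing:
    raise SystemExit(f"missing template params: {sorted(missing)}")
print("template params OK:", params["model"])
```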
To test ChatRunner without using Moodle, you should install it using pip. For instance, like this:

```sh
python3 -m venv venv
. venv/bin/activate
pip install build
cd jobe/ChatRunner
pip install -e .
```

This installs in editable mode, so that you can keep developing and testing the module.
To test against OpenAI/ChatGPT, you have to get an API key and edit the config file chatgpt.json to use this key before running:

```sh
sh test.sh --config chatgpt.json --markdown
```

At NTNU, you may be able to use Idun. This also requires an API key, and a sample config is idun.json.
There are two options to modify the output:
- `--markdown` formats the output in markdown
- `--verbose` gives additional debug output

The default is the format used internally within Moodle.
There are different modes to test different internal features. Use the --mode option with one of the following:
- `moodle` (alt. `--moodle`) runs the test in the sandbox as used in moodle.
- `dump` (alt. `--debug`) dumps and reparses the output as is required by the sandbox.
- `baseline` uses the old prompt, using plain text to describe the JSON format.
- `new` uses the new prompt, using the API to specify the JSON schema.
This is work in progress: the output is intended to be parsed by CodeRunner, and we have not yet managed to format it so that it is also readable for human users in the command line interface.
It is possible to add the -T option to run outside the sandbox, which
gives more debug information.
For batch testing, question/answer tests can be defined in a TOML file. There is no reference documentation, but there are examples to demonstrate the format.
- `Example/exphil.toml` is complete, but does not use grading criteria.
- `Example/optics.toml` has only nonsense answers, but demonstrates the use of grading criteria.
To run a batch test, the following command can be used. (Remember to add API key to the config file.)
```sh
python -m ChatRunner --config idun.toml --batch Example/exphil.toml --outfile Example/exphil-idun.toml --count 5
```

Sample output is included in the repo, showing how feedback from the AI is added to the original sample object from the input TOML file.
As always a config file is required, supplying URL, API key, and any other data the server requires. The TOML format allows listing multiple models, and the batch processor will test each model in turn.
The --count option specifies the number of queries made per student answer.
This is intended for consistency testing.
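The idea behind consistency testing can be illustrated with a small sketch: with `--count 5`, each answer is graded five times, and the spread of those grades indicates how stable the model is. The grade values below are placeholders, not real output.

```python
import statistics

# Placeholder grades for one answer, as if graded five times (--count 5).
grades = [0.8, 0.7, 0.8, 0.9, 0.8]

mean = statistics.mean(grades)     # central tendency of the grades
spread = statistics.stdev(grades)  # low stdev = consistent grading
print(f"mean grade: {mean:.2f}, stdev: {spread:.2f}")
```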
We have started experimenting using ollama, but this is still flaky and unstable.
You can run ollama in docker, using

```sh
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
docker exec -it ollama ollama pull llama3
```

This installs the llama3 model. You can install other models as desired.
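Once the model is pulled, the container exposes Ollama's HTTP API on port 11434. A sketch of the request body for its `/api/generate` endpoint (the prompt is a placeholder):

```python
import json

# Request body for Ollama's /api/generate endpoint; prompt is a placeholder.
body = {
    "model": "llama3",
    "prompt": "Give feedback on this student answer: ...",
    "stream": False,  # ask for a single JSON response instead of a stream
}
payload = json.dumps(body)
print(payload)
```

The body can then be POSTed to the server, e.g. with `curl http://localhost:11434/api/generate -d "$payload"`.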
If you want to test this as a chatbot, use

```sh
docker exec -it ollama ollama run llama3
```

To test ChatRunner against ollama, you can run

```sh
sh test.sh --config ollama.json --markdown
```

The main problem with ollama is that the available models are inferior to chatgpt and often produce syntactically unexpected output. To make it work in practice, two things are required:
- Improved prompting to reduce the error frequency.
- Improved error handling to manage the consequences of errors.
- Docker images
- jobe runs jobe with ChatRunner from the working copy
- jobe-production runs jobe with the latest release of ChatRunner
We have not found a good way to test continuous development within moodle. Whenever ChatRunner is edited, it is necessary to shut docker compose down, delete the jobe image, and restart docker compose; i.e.

```sh
docker compose down
docker rmi moodle-coderunner-docker-jobe
docker compose up -d
```

We tried installing ChatRunner in editable mode and mounting the ChatRunner directory from the host, but it seems that changes to the module do not affect jobe.