@ArthurCRodrigues ArthurCRodrigues commented Jan 28, 2026

I'll keep this to a few bullet points and quick explanations; for more details, check #151.

Context

We were running into some nasty problems when trying to run the project as a web API. Static variables, shared state, and speed were major concerns, and it was simply impossible to run it as a service.

Also, the previous architecture relied on an orchestrated grading workflow, which left the orchestration file (autograder_facade.py) tightly coupled to every step. Adding a new step meant adding orchestration logic to the grading process.

When it came to executing student code remotely, we were spinning up a container for every request with no proper control over how many were running, and container startup was taking up most of the request time.

Finally, we wanted to be able to store grading packages, so that teachers only have to send them once and each submission only carries a reference to the assignment configuration.

Solution

So, in this PR, we introduce an architecture that follows a pipeline pattern: each step knows that it takes place in a grading process, so steps are no longer simple service providers (they are choreographed, not orchestrated). This makes it much easier to add steps or change the order in which they run within the pipeline.
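To make the choreography idea concrete, here is a minimal sketch, not the code in this PR. The names `GradingContext`, `RequiredFilesStep`, `ScoreStep`, `FeedbackStep` and `run_steps` are all hypothetical. Each step receives the shared context, reads what earlier steps wrote, and records its own result, so adding or reordering steps only changes the list.

```python
from dataclasses import dataclass, field


@dataclass
class GradingContext:
    """Hypothetical per-run state that travels from step to step."""
    submission: dict                      # filename -> file contents
    results: dict = field(default_factory=dict)
    halted: bool = False                  # a step can stop the run


class RequiredFilesStep:
    name = "setup"

    def __init__(self, required):
        self.required = list(required)

    def run(self, ctx):
        missing = [f for f in self.required if f not in ctx.submission]
        ctx.results[self.name] = {"missing": missing}
        ctx.halted = bool(missing)        # later steps are skipped if files are missing
        return ctx


class ScoreStep:
    name = "grade"

    def run(self, ctx):
        # Toy scoring rule (assumption): share of submitted files that are non-empty.
        files = ctx.submission
        non_empty = sum(1 for body in files.values() if body.strip())
        score = 100.0 * non_empty / len(files) if files else 0.0
        ctx.results[self.name] = {"score": score}
        return ctx


class FeedbackStep:
    name = "feedback"

    def run(self, ctx):
        # Reads the output of an earlier step instead of asking an orchestrator for it.
        score = ctx.results["grade"]["score"]
        ctx.results[self.name] = "pass" if score >= 60 else "fail"
        return ctx


def run_steps(steps, ctx):
    """Generic runner: it knows nothing about what any individual step does."""
    for step in steps:
        if ctx.halted:
            break
        ctx = step.run(ctx)
    return ctx
```

With this shape, `run_steps([RequiredFilesStep(["main.py"]), ScoreStep(), FeedbackStep()], ctx)` and the same call without the setup step are both valid pipelines; no central facade has to change.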

Second, we introduce a sandbox management sub-system that's responsible for handling all sandbox containers. It uses an optimization technique that starts containers at application startup and keeps them "warm" and ready to receive code. This lets us control container usage within the system and removes container startup time from requests, since the containers are already up.
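The warm-container idea can be sketched with a simple blocking pool. This is an illustration under assumptions, not the planned implementation: `FakeContainer` stands in for a real container handle, and `WarmPool` and `lease` are hypothetical names. The pool is filled once at startup, and each request borrows a container and returns it.

```python
import queue
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass
class FakeContainer:
    """Stand-in for a real sandbox container handle (hypothetical)."""
    id: int
    language: str


class WarmPool:
    def __init__(self, language, size):
        self._idle = queue.Queue()
        # Pay the container startup cost once, at application startup.
        for i in range(1, size + 1):
            self._idle.put(FakeContainer(i, language))

    @contextmanager
    def lease(self, timeout=5.0):
        # Blocks until a container is free; raises queue.Empty after `timeout` seconds.
        container = self._idle.get(timeout=timeout)
        try:
            yield container
        finally:
            # A real pool would reset or replace the container before reuse.
            self._idle.put(container)

    def idle_count(self):
        return self._idle.qsize()
```

A request handler would then wrap execution in `with pool.lease() as container: ...`. Because the pool size is fixed, it also caps how many containers can run at once.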

Pipeline Architecture

An AutograderPipeline is an instance of a grading recipe built from the grading configuration. It contains all the steps (and associated data) needed to grade the assignment configured by the teacher. One pipeline may include the setup step that checks for required files while another may not; it all depends on what the teacher configured.

And what's cool about the AutograderPipeline is that it's really just the recipe. You can cook as many submissions with it as you want. It is a stateless representation of a specific grading workflow that can be executed for any submission. This fits the goal of storing grading packages and keeping only references to them for later grading.
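As a rough sketch of the "recipe" idea: `SketchPipeline` below is a hypothetical stand-in for AutograderPipeline, and `build_pipeline` and the `required_files` config key are assumptions made for illustration. The pipeline is built once from the configuration, holds no per-submission state, and can be run against any number of submissions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchPipeline:
    """Immutable recipe: an ordered tuple of (name, step function) pairs."""
    steps: tuple

    def run(self, submission):
        results = {}  # per-run state lives here, never on the pipeline itself
        for name, step in self.steps:
            results[name] = step(submission, results)
        return results


def build_pipeline(config):
    """Assemble only the steps the (hypothetical) configuration asks for."""
    steps = []
    required = config.get("required_files")
    if required:
        # Setup step: list the required files missing from the submission.
        steps.append(("setup", lambda sub, _res: [f for f in required if f not in sub]))
    # Toy grading step (assumption): zero if setup reported missing files.
    steps.append(("grade", lambda _sub, res: 0.0 if res.get("setup") else 100.0))
    return SketchPipeline(tuple(steps))
```

Since the pipeline object never changes after construction, it could be cached by assignment reference and reused across requests, which is the property the stored grading packages rely on.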

Finally, we fixed the static variable problems by dropping static member variables altogether.
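The offending code isn't shown here, so the following is only a generic illustration of why class-level state is a problem in a long-running service; both classes are hypothetical. A mutable class attribute is shared by every instance, so feedback from one request leaks into the next.

```python
class StaticGrader:
    feedback = []  # class attribute: one list shared by every instance

    def grade(self, student):
        self.feedback.append(f"graded {student}")
        return list(self.feedback)


class InstanceGrader:
    def __init__(self):
        self.feedback = []  # fresh list for each grader, so for each request

    def grade(self, student):
        self.feedback.append(f"graded {student}")
        return list(self.feedback)


# Simulate two independent requests, each creating its own grader.
StaticGrader().grade("alice")
leaked = StaticGrader().grade("bob")      # still contains alice's entry

InstanceGrader().grade("alice")
isolated = InstanceGrader().grade("bob")  # only bob's entry
```

In a script that grades one submission and exits, the shared list goes unnoticed; in a web API process that stays alive between requests, it accumulates.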

Sandbox Management

Since this is a draft PR, I haven't implemented this part yet. But I'm doing the research, and the key points are:

  • We'll use gVisor for enhanced isolation
  • The Sandbox Management sub-system will work as a background process
  • We'll keep 2 containers running per language by default (java, python, c, c++, js)
  • We'll add features for scaling under increased workload.
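Since none of this is implemented yet, the following is only a sketch of how the defaults listed above might be expressed as configuration. `SandboxPoolConfig`, `max_per_language` and the scaling rule in `target_size` are assumptions, not decisions made in this PR. The one external fact used is that gVisor's container runtime is named `runsc`.

```python
from dataclasses import dataclass

DEFAULT_LANGUAGES = ("java", "python", "c", "c++", "js")


@dataclass(frozen=True)
class SandboxPoolConfig:
    languages: tuple = DEFAULT_LANGUAGES
    warm_per_language: int = 2    # default from the list above
    max_per_language: int = 8     # hypothetical ceiling for scaling
    runtime: str = "runsc"        # gVisor's runtime name

    def initial_sizes(self):
        """Containers to start per language at application startup."""
        return {lang: self.warm_per_language for lang in self.languages}

    def target_size(self, in_use):
        """Hypothetical scaling rule: keep one spare warm container beyond
        those in use, never below the default and never above the ceiling."""
        wanted = in_use + 1
        return max(self.warm_per_language, min(self.max_per_language, wanted))
```

With the defaults, `initial_sizes()` asks for ten containers in total, and a background process could periodically compare `target_size(in_use)` with the current pool size to decide whether to start or stop containers.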

ArthurCRodrigues and others added 30 commits December 23, 2025 09:59
config grading WIP
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…twork/autograder into pipeline-architecture

# Conflicts:
#	autograder/autograder.py
@ArthurCRodrigues ArthurCRodrigues linked an issue Jan 28, 2026 that may be closed by this pull request
