Cursor Azure GPT-5


A service that allows Cursor to use Azure GPT-5 deployments by:

  • Adapting incoming Cursor completions API requests to the Responses API
  • Forwarding the requests to Azure
  • Adapting outgoing Azure Responses API streams into completions API streams

This project originates from Cursor's lack of support for Azure models that are only served through the Responses API. It will hopefully become obsolete as Cursor continues to improve its model support.

Warning

You still need an active paid Cursor subscription to use this project.

Important

Azure now supports the Completions API for the models gpt-5, gpt-5-mini, and gpt-5-nano.

They can now be used directly in Cursor, but without the ability to change the reasoning effort, verbosity, or summary level. To control those settings, you can still use this project.

The models gpt-5-pro and gpt-5-codex remain available only through the Responses API, but work great with this project (see list of specific model limitations in the next section).

Supported Models

The entire gpt-5 series is supported, although some models limit which reasoning effort, verbosity, truncation, and summary values they accept:

Each model (5.2, 5.2-chat, 5.1, 5.1-codex, 5.1-codex-mini, 5.1-codex-max, 5, 5-nano, 5-mini, 5-pro, 5-codex) accepts a subset of the following values:

  • Reasoning: minimal, low, medium, high
  • Verbosity: low, medium, high
  • Truncation: auto, disabled
  • Summary: auto, detailed, concise
(This matrix is automatically generated, and updated after every new model release.)

Feature highlights

  • Switching between high/medium/low/minimal reasoning effort levels by selecting different models in Cursor.
  • Configuring different reasoning summary levels (auto, detailed, concise).
  • Displaying reasoning summaries in Cursor natively, like any other reasoning model.
  • Production-ready, so you can share the service among different users in an organization.
  • When running from a terminal, rich logging of the model's context on every request, including Markdown rendering, syntax highlighting, tool calls/outputs, and more.

Feel free to create or vote on any project issues, and star the project to show your support.

Quick start

If you prefer to deploy the service (for example, to allow multiple members of your team to use it), check the Production section, as the project comes with production-ready containers using supervisord and gunicorn.

1. Service configuration

Copy .env.example to .env and update the following flags as needed:

| Flag | Description | Default |
|---|---|---|
| SERVICE_API_KEY | Arbitrary API key to protect your service. Set it to a random string. | change-me |
| AZURE_BASE_URL | Your Azure OpenAI endpoint base URL (no trailing slash), e.g. https://&lt;resource&gt;.openai.azure.com. | required |
| AZURE_API_KEY | Azure OpenAI API key. | required |
| AZURE_DEPLOYMENT | Name of the Azure model deployment to use. | gpt-5 |
| AZURE_VERBOSITY_LEVEL | Hint the model to be more or less expansive in its replies. Either high / medium / low. | medium |
| AZURE_SUMMARY_LEVEL | Set to none to disable summaries. You might have to disable them if your organization hasn't been approved for this feature. | detailed |
| AZURE_TRUNCATION | Truncation strategy for long inputs. Either auto or disabled. | disabled |

Alternatively, you can pass them through the environment where you run the application.
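For example, a minimal .env might look like this (all values are placeholders; use your own endpoint, key, and a freshly generated random string):

```shell
# .env — minimal example with placeholder values
SERVICE_API_KEY=some-long-random-string
AZURE_BASE_URL=https://my-resource.openai.azure.com
AZURE_API_KEY=your-azure-openai-key
AZURE_DEPLOYMENT=gpt-5
AZURE_VERBOSITY_LEVEL=medium
AZURE_SUMMARY_LEVEL=detailed
AZURE_TRUNCATION=disabled
```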

Optional Configuration

| Flag | Description | Default |
|---|---|---|
| AZURE_API_VERSION | Azure OpenAI Responses API version to call. | 2025-04-01-preview |
| FLASK_ENV | Flask environment. Use development for dev or production for prod. | production |
| RECORD_TRAFFIC | Toggle writing request/response traffic to recordings/. | off |
| LOG_CONTEXT | Enable rich pretty-printing of request context to the console. | on |
| LOG_COMPLETION | Enable logging of completion responses (not yet implemented). | on |

2. Exposing the service

Why do I have to?

Since Cursor routes requests through its external prompt-building service rather than directly from the IDE to your API, your custom endpoint must be publicly reachable on the Internet.

Consider using Cloudflare because its tunnels are free and require no account.

Install cloudflared and run:

cloudflared tunnel --url http://localhost:8080

Copy the URL of your tunnel from the output of the command. It looks something like this:

+----------------------------------------------------+
|  Your quick Tunnel has been created! Visit it at:  |
|  https://foo-bar.trycloudflare.com                 |
+----------------------------------------------------+

Then paste it into Cursor Settings > Models > API Keys > OpenAI API Key > Override OpenAI Base URL:

How to use Azure API key in Cursor for GPT-5

3. Configuring Cursor

In addition to updating the OpenAI Base URL, you need to:

  1. Set OpenAI API Key to the value of SERVICE_API_KEY in your .env

  2. Ensure the toggles for both options are on, as shown in the previous image.

  3. Add the custom models called exactly gpt-high, gpt-medium, and gpt-low, as shown in the previous image. You can also create gpt-minimal for minimal reasoning effort for models that support it. You don't need to remove other models.
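The model names above are how the service learns which reasoning effort to request. A hypothetical sketch of that mapping (the actual implementation in this project may differ):

```python
# Illustrative sketch: derive the Responses API reasoning effort from a
# Cursor model name like "gpt-high". Not the project's actual code.

EFFORTS = {"minimal", "low", "medium", "high"}

def effort_from_model_name(model: str) -> str:
    """Extract the reasoning effort encoded in a custom model name."""
    suffix = model.rsplit("-", 1)[-1]
    if suffix not in EFFORTS:
        raise ValueError(f"unknown reasoning effort in model name: {model!r}")
    return suffix
```

Selecting gpt-high in Cursor would then translate to `reasoning: {"effort": "high"}` on the forwarded request.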

4. Running the service

To run the production version of the app:

docker compose up flask-prod

For instructions on how to run locally without Docker, and the different development commands, see the Development section.

Development

Running locally


Bootstrap your local environment

python -m venv .venv
source .venv/bin/activate
pip install -r requirements/dev.txt

Running the development server

flask run -p 8080

Running the production server

export FLASK_ENV=production
export FLASK_DEBUG=0
export LOG_LEVEL=info
flask run -p 8080

This will only run the Flask server with the production settings. For a closer approximation of the production server running with supervisord and gunicorn, check Running with Docker.

Running tests

flask test

To run only specific tests, you can use the pytest -k argument:

flask test -k ...

Running linter

flask lint

The lint command will attempt to fix any linting/style errors in the code. If you only want to know if the code will pass CI and do not wish for the linter to make changes, add the --check argument.

flask lint --check

Running with Docker


Running the development server

docker compose up flask-dev

Running the production server

docker compose up flask-prod

This image runs the server through supervisord and gunicorn. See the Production section for more details.

When running flask-prod, the production flags are set in docker-compose.yml:

    FLASK_ENV: production
    FLASK_DEBUG: 0
    LOG_LEVEL: info
    GUNICORN_WORKERS: 4

The list of environment: variables in the docker-compose.yml file takes precedence over any variables specified in .env.

Running tests

docker compose run --rm manage test

To run only specific tests, you can use the pytest -k argument:

docker compose run --rm manage test -k ...

Running linter

docker compose run --rm manage lint

The lint command will attempt to fix any linting/style errors in the code. If you only want to know if the code will pass CI and do not wish for the linter to make changes, add the --check argument.

docker compose run --rm manage lint --check

Testing

To make generating test fixtures easier, the RECORD_TRAFFIC flag writes all incoming/outgoing traffic between this service and Cursor/Azure to the recordings/ directory.

To avoid violating Cursor's intellectual property, a redaction layer removes any sensitive data, such as system prompts, tool names, tool descriptions, and any context containing scaffolding from Cursor's prompt-building service.

Therefore, recorded traffic can be published under tests/recordings/ to be used as test fixtures while remaining MIT-licensed.
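The redaction idea can be sketched as a recursive walk over the recorded JSON. The key names here are illustrative, not this project's actual schema or redaction code:

```python
# Hypothetical sketch of a redaction layer: replace the values of
# sensitive keys in recorded traffic before publishing it as a fixture.

SENSITIVE_KEYS = {"system_prompt", "tool_name", "tool_description"}

def redact(obj):
    """Recursively replace values of sensitive keys with a placeholder."""
    if isinstance(obj, dict):
        return {
            k: "[REDACTED]" if k in SENSITIVE_KEYS else redact(v)
            for k, v in obj.items()
        }
    if isinstance(obj, list):
        return [redact(item) for item in obj]
    return obj
```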

Production


Configure server

You might want to review and modify the following configuration files:

| File | Description |
|---|---|
| supervisord/gunicorn.conf | Supervisor program config for Gunicorn (bind :5000, gevent; workers/log level from env; logs to stdout/stderr). |
| supervisord/supervisord_entrypoint.sh | Container entrypoint that execs supervisord (prepends it when args start with -). |
| supervisord/supervisord.conf | Main supervisord config: socket, logging, nodaemon; includes conf.d program configs. |

Build, tag, and push the image

docker compose build flask-prod
docker tag app-production your-tag
docker push your-tag
