A service that allows Cursor to use Azure GPT-5 deployments by:
- Adapting incoming Cursor completions API requests to the Responses API
- Forwarding the requests to Azure
- Adapting outgoing Azure Responses API streams into completions API streams
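As a rough illustration of the request-side adaptation (not the project's actual code; function and field names below are hypothetical), mapping a Chat Completions request onto a Responses API payload looks roughly like this:

```python
def completions_to_responses(body: dict, deployment: str) -> dict:
    """Sketch: map an OpenAI Chat Completions request to a Responses API payload.

    Real handling (tools, images, streaming options, etc.) is more involved.
    """
    # Chat "messages" become Responses "input" items.
    input_items = [
        {"role": m["role"], "content": m["content"]}
        for m in body.get("messages", [])
    ]
    payload = {
        "model": deployment,
        "input": input_items,
        "stream": body.get("stream", False),
    }
    # Reasoning effort is a Responses-API-side knob.
    effort = body.get("reasoning_effort")
    if effort:
        payload["reasoning"] = {"effort": effort}
    return payload
```

The reverse direction (turning Responses API stream events back into completion chunks) follows the same field-mapping idea.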
This project originates from Cursor's lack of support for Azure models that are only served through the Responses API. It will hopefully become obsolete as Cursor continues to improve its model support.
> [!WARNING]
> You still need an active paid Cursor subscription to be able to use this project.
> [!IMPORTANT]
> Azure now supports the Completions API for the models gpt-5, gpt-5-mini, and gpt-5-nano.
> They can now be used directly in Cursor, but without the ability to change the reasoning effort, verbosity, or summary level. To control those settings, you can still use this project.
> The models gpt-5-pro and gpt-5-codex remain available only through the Responses API, but work great with this project (see the list of model-specific limitations in the next section).
The entire gpt-5 series is supported, although some models have some limitations on the reasoning effort / verbosity / truncation / summary values they accept:
| Variable | Value | 5.2 | 5.2-chat | 5.1 | 5.1-codex | 5.1-codex-mini | 5.1-codex-max | 5 | 5-nano | 5-mini | 5-pro | 5-codex |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Reasoning | minimal | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ |
| | low | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ |
| | medium | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ |
| | high | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Verbosity | low | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ |
| | medium | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ |
| | high | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ |
| Truncation | auto | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ |
| | disabled | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ |
| Summary | auto | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ |
| | detailed | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ |
| | concise | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ |
(This matrix is automatically generated, and updated after every new model release.)
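In code, a service like this one might encode the matrix as a lookup table and reject unsupported combinations before calling Azure. The dictionary below is a partial, hand-copied excerpt of the reasoning column values from the matrix above, and the function name is hypothetical:

```python
# Partial excerpt of the matrix above (reasoning effort only).
SUPPORTED_REASONING = {
    "gpt-5": {"minimal", "low", "medium", "high"},
    "gpt-5-pro": {"high"},
    "gpt-5-codex": {"low", "medium", "high"},
}

def supports_reasoning(model: str, effort: str) -> bool:
    """Return True if the given model accepts the given reasoning effort."""
    return effort in SUPPORTED_REASONING.get(model, set())
```

Unknown models simply fail the check, which is a safe default for a validation layer.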
- Switching between `high`/`medium`/`low`/`minimal` reasoning effort levels by selecting different models in Cursor.
- Configuring different reasoning summary levels (`auto`, `detailed`, `concise`).
- Displaying reasoning summaries in Cursor natively, like any other reasoning model.
- Production-ready, so you can share the service among different users in an organization.
- When running from a terminal, rich logging of the model's context on every request, including Markdown rendering, syntax highlighting, tool calls/outputs, and more.
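One way to implement effort-switching via model selection (a sketch, not the project's actual code) is to parse the effort level out of the model name Cursor sends:

```python
EFFORT_LEVELS = {"minimal", "low", "medium", "high"}

def effort_from_model(model: str, default: str = "medium") -> str:
    """Derive the reasoning effort from a model name like 'gpt-high'.

    Falls back to a default when the name carries no effort suffix.
    """
    suffix = model.rsplit("-", 1)[-1]
    return suffix if suffix in EFFORT_LEVELS else default
```

This keeps the Cursor-side configuration limited to adding a few custom model names.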
Feel free to create or vote on any project issues, and star the project to show your support.
If you prefer to deploy the service (for example, to allow multiple members of your team to use it), check the Production section, as the project comes with production-ready containers using supervisord and gunicorn.
Copy the file `.env.example` to `.env` and update the following flags as needed:
| Flag | Description | Default |
|---|---|---|
| `SERVICE_API_KEY` | Arbitrary API key to protect your service. Set it to a random string. | `change-me` |
| `AZURE_BASE_URL` | Your Azure OpenAI endpoint base URL (no trailing slash), e.g. `https://<resource>.openai.azure.com`. | required |
| `AZURE_API_KEY` | Azure OpenAI API key. | required |
| `AZURE_DEPLOYMENT` | Name of the Azure model deployment to use. | `gpt-5` |
| `AZURE_VERBOSITY_LEVEL` | Hint the model to be more or less expansive in its replies. One of `high`, `medium`, or `low`. | `medium` |
| `AZURE_SUMMARY_LEVEL` | Set to `none` to disable summaries. You might have to disable them if your organization hasn't been approved for this feature. | `detailed` |
| `AZURE_TRUNCATION` | Truncation strategy for long inputs. Either `auto` or `disabled`. | `disabled` |
Alternatively, you can pass them through the environment where you run the application.
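For instance, with placeholder values (replace them with your own), the same flags can be exported in the shell before starting the service:

```shell
# Placeholder values; substitute your real endpoint and keys.
export SERVICE_API_KEY="change-me-to-a-random-string"
export AZURE_BASE_URL="https://my-resource.openai.azure.com"
export AZURE_DEPLOYMENT="gpt-5"
```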
Optional Configuration
| Flag | Description | Default |
|---|---|---|
| `AZURE_API_VERSION` | Azure OpenAI Responses API version to call. | `2025-04-01-preview` |
| `FLASK_ENV` | Flask environment. Use `development` for dev or `production` for prod. | `production` |
| `RECORD_TRAFFIC` | Toggle writing request/response traffic to `recordings/`. | off |
| `LOG_CONTEXT` | Enable rich pretty-printing of request context to the console. | on |
| `LOG_COMPLETION` | Enable logging of completion responses (not yet implemented). | on |
Why is this necessary?
Since Cursor routes requests through its external prompt-building service rather than directly from the IDE to your API, your custom endpoint must be publicly reachable on the Internet.
Consider using Cloudflare because its tunnels are free and require no account.
Install cloudflared and run:
```
cloudflared tunnel --url http://localhost:8080
```

Copy the URL of your tunnel from the output of the command. It looks something like this:

```
+----------------------------------------------------+
| Your quick Tunnel has been created! Visit it at:   |
| https://foo-bar.trycloudflare.com                  |
+----------------------------------------------------+
```
Then paste it into Cursor Settings > Models > API Keys > OpenAI API Key > Override OpenAI Base URL:
In addition to updating the OpenAI Base URL, you need to:
- Set OpenAI API Key to the value of `SERVICE_API_KEY` in your `.env`.
- Ensure the toggles for both options are on, as shown in the previous image.
- Add the custom models called exactly `gpt-high`, `gpt-medium`, and `gpt-low`, as shown in the previous image. You can also create `gpt-minimal` for minimal reasoning effort on models that support it. You don't need to remove other models.
To run the production version of the app:

```
docker compose up flask-prod
```

For instructions on how to run locally without Docker, and the different development commands, see the Development section.
```
python -m venv .venv
source .venv/bin/activate
pip install -r requirements/dev.txt
```

To run the development server:

```
flask run -p 8080
```

To run with production settings:

```
export FLASK_ENV=production
export FLASK_DEBUG=0
export LOG_LEVEL=info
flask run -p 8080
```

This will only run the Flask server with the production settings. For a closer approximation of the production server running with supervisord and gunicorn, check Running with Docker.
```
flask test
```

To run only specific tests, you can use the pytest `-k` argument:

```
flask test -k ...
```

```
flask lint
```

The lint command will attempt to fix any linting/style errors in the code. If you only want to know whether the code will pass CI, without letting the linter make changes, add the `--check` argument:

```
flask lint --check
```
To run the development image:

```
docker compose up flask-dev
```

To run the production image:

```
docker compose up flask-prod
```

The production image runs the server through supervisord and gunicorn. See the Production section for more details.
When running flask-prod, the production flags are set in docker-compose.yml:
```
FLASK_ENV: production
FLASK_DEBUG: 0
LOG_LEVEL: info
GUNICORN_WORKERS: 4
```

The list of `environment:` variables in the `docker-compose.yml` file takes precedence over any variables specified in `.env`.
```
docker compose run --rm manage test
```

To run only specific tests, you can use the pytest `-k` argument:

```
docker compose run --rm manage test -k ...
```

```
docker compose run --rm manage lint
```

The lint command will attempt to fix any linting/style errors in the code. If you only want to know whether the code will pass CI, without letting the linter make changes, add the `--check` argument:

```
docker compose run --rm manage lint --check
```

To make generating test fixtures easier, the `RECORD_TRAFFIC` flag creates files with all the incoming/outgoing traffic between this service and Cursor/Azure in the directory `recordings/`.
To avoid violating Cursor's intellectual property, a redaction layer removes sensitive data such as system prompts, tool names, tool descriptions, and any context containing scaffolding from Cursor's prompt-building service.
Therefore, recorded traffic can be published under tests/recordings/ to be used as test fixtures while remaining MIT-licensed.
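Conceptually, such a redaction layer can be thought of as a recursive scrub over the recorded JSON. The key names below are hypothetical, chosen only to illustrate the idea:

```python
# Hypothetical sensitive key names; the real set depends on the traffic format.
SENSITIVE_KEYS = {"system", "instructions", "tools"}

def redact(obj):
    """Recursively replace sensitive fields in recorded traffic with a placeholder."""
    if isinstance(obj, dict):
        return {
            k: "[REDACTED]" if k in SENSITIVE_KEYS else redact(v)
            for k, v in obj.items()
        }
    if isinstance(obj, list):
        return [redact(item) for item in obj]
    return obj
```

Non-sensitive structure (roles, ordering, event types) survives intact, which is what makes the recordings usable as test fixtures.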
You might want to review and modify the following configuration files:
| File | Description |
|---|---|
| `supervisord/gunicorn.conf` | Supervisor program config for Gunicorn (bind `:5000`, gevent; workers/log level from env; logs to stdout/stderr). |
| `supervisord/supervisord_entrypoint.sh` | Container entrypoint that execs supervisord (prepends it when args start with `-`). |
| `supervisord/supervisord.conf` | Main Supervisord config: socket, logging, nodaemon; includes `conf.d` program configs. |
```
docker compose build flask-prod
docker tag app-production your-tag
docker push your-tag
```