
Improve error message when running GPU models on a VM without GPU #30

@elsewhat

Description


I have tested the Foundry CLI, installed from FoundryLocal-windows-installer-x64-v0.2.9224.msix, in a virtual machine.
The VM runs under Hyper-V with a Windows 11 Enterprise image (version 10.0.26100, build 26100).

As far as I know, this setup with default settings does not share the host's GPU with the virtual machine.

The CPU model Phi-4-mini-cpu-int4-rtn-block-32-acc-level-4-onnx works well in this setup.

The GPU model Phi-4-mini-gpu-int4-rtn-block-32 fails with Status: 500 (Internal Server Error).
I think it is fine that it fails, but the error message should be more informative.
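To make the ask concrete, here is a rough sketch (my own, not Foundry Local code, and every name in it is a placeholder) of catching the underlying WebGPU failure and returning an actionable message instead of a bare 500:

```csharp
using System;
using System.Threading.Tasks;

// Hedged sketch, not Foundry Local code: translate the low-level WebGPU
// "device lost" failure (see the service log below) into a message the user
// can act on. RunGpuInferenceAsync is a stand-in for the real ONNX GenAI call.
class GpuErrorTranslationSketch
{
    static async Task Main()
    {
        try
        {
            await RunGpuInferenceAsync();
        }
        catch (Exception ex) when (ex.Message.Contains("[Device] is lost") ||
                                   ex.Message.Contains("wgpu::MapAsyncStatus"))
        {
            // Assumption: a WebGPU device-lost error here almost always means no usable
            // GPU is exposed to this machine (e.g. a Hyper-V VM without GPU passthrough).
            throw new InvalidOperationException(
                "No usable GPU is available to the WebGPU execution provider " +
                "(common in virtual machines without GPU partitioning/passthrough). " +
                "Try a CPU model such as Phi-4-mini-cpu-int4-rtn-block-32-acc-level-4-onnx.",
                ex);
        }
    }

    // Stand-in that fails the same way ONNX Runtime GenAI does in the service log below.
    static Task RunGpuInferenceAsync() =>
        Task.FromException(new Exception(
            "status == wgpu::MapAsyncStatus::Success was false. " +
            "Failed to download data from buffer: [Device] is lost."));
}
```

Anything along those lines, pointing the user at a CPU model or at the missing GPU, would save a lot of head-scratching. Details of the reproduction follow.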

C:\Windows\System32>foundry model run phi4-gpu
Model phi4-gpu is already stored in your local cache.
To force a re-download enter the following command: foundry download phi4-gpu --force
🕛 Loading model...
🟢 Model Phi-4-mini-gpu-int4-rtn-block-32 loaded successfully

Interactive Chat Help

/?, /help                       - Display this help message
/exit                           - Exit interactive chat
/info                           - Display model information
/get_config                     - Get all parameter values
/set_config <parameter>:<value> - Set the parameter (system_prompt, max_tokens, temperature, top_p, top_k, random_seed)

Interactive mode, please enter your prompt
> test
🤖 Exception: Error during chat
Service request failed.
Status: 500 (Internal Server Error)

Client log:

2025-04-22 03:16:32.480 -07:00 [INF] Starting service <C:\Program Files\WindowsApps\Microsoft.FoundryLocal_0.2.9224.7622_x64__8wekyb3d8bbwe\Inference.Service.Agent.exe --urls http://localhost:5272 --OpenAIServiceSettings:ModelDirPath=C:\Users\User\.aitk\models --JsonRpcServer:Run=true --JsonRpcServer:PipeName=inference_agent>
2025-04-22 03:16:33.537 -07:00 [INF] Downloading Phi-4-mini-gpu-int4-rtn-block-32 url:http://localhost:5272/openai/download
2025-04-22 03:17:50.410 -07:00 [INF] Starting Foundry Local CLI with '--help'
2025-04-22 03:17:59.265 -07:00 [INF] Starting Foundry Local CLI with 'model run phi4-gpu'
2025-04-22 03:17:59.383 -07:00 [INF] Loading model: http://localhost:5272/openai/load/Phi-4-mini-gpu-int4-rtn-block-32?ttl=600&ep=webgpu
2025-04-22 03:18:11.262 -07:00 [INF] 🟢 Model Phi-4-mini-gpu-int4-rtn-block-32 loaded successfully
2025-04-22 03:19:07.271 -07:00 [INF] LogException
Microsoft.AI.Foundry.Local.Common.FLException: Error during chat
 ---> System.ClientModel.ClientResultException: Service request failed.
Status: 500 (Internal Server Error)
   at OpenAI.ClientPipelineExtensions.<ProcessMessageAsync>d__0.MoveNext() + 0x29c
--- End of stack trace from previous location ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() + 0x20
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task) + 0xb2
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task, ConfigureAwaitOptions) + 0x4b
   at OpenAI.Chat.ChatClient.<CompleteChatAsync>d__16.MoveNext() + 0x198
--- End of stack trace from previous location ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() + 0x20
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task) + 0xb2
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task, ConfigureAwaitOptions) + 0x4b
   at OpenAI.Chat.ChatClient.<>c__DisplayClass10_0.<<CompleteChatStreamingAsync>b__0>d.MoveNext() + 0xf2
--- End of stack trace from previous location ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() + 0x20
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task) + 0xb2
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task, ConfigureAwaitOptions) + 0x4b
   at OpenAI.AsyncSseUpdateCollection`1.<GetRawPagesAsync>d__7.MoveNext() + 0xe1
--- End of stack trace from previous location ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() + 0x20
   at System.Threading.Tasks.Sources.ManualResetValueTaskSourceCore`1.ThrowForFailedGetResult() + 0x13
   at System.Threading.Tasks.Sources.ManualResetValueTaskSourceCore`1.GetResult(Int16) + 0x2c
   at System.ClientModel.AsyncCollectionResult`1.<GetAsyncEnumerator>d__1.MoveNext() + 0x5ff
--- End of stack trace from previous location ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() + 0x20
   at System.ClientModel.AsyncCollectionResult`1.<GetAsyncEnumerator>d__1.MoveNext() + 0x765
--- End of stack trace from previous location ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() + 0x20
   at System.Threading.Tasks.Sources.ManualResetValueTaskSourceCore`1.ThrowForFailedGetResult() + 0x13
   at System.Threading.Tasks.Sources.ManualResetValueTaskSourceCore`1.GetResult(Int16) + 0x2c
   at Microsoft.AI.Foundry.Local.Commands.ModelRunCommand.<InteractiveNewRoundAsync>d__4.MoveNext() + 0x2b5
--- End of stack trace from previous location ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() + 0x20
   at Microsoft.AI.Foundry.Local.Commands.ModelRunCommand.<InteractiveNewRoundAsync>d__4.MoveNext() + 0x3e2
   --- End of inner exception stack trace ---
   at Microsoft.AI.Foundry.Local.Commands.ModelRunCommand.HandleExceptionDuringChat(Exception) + 0xfd
   at Microsoft.AI.Foundry.Local.Commands.ModelRunCommand.<InteractiveNewRoundAsync>d__4.MoveNext() + 0x594
--- End of stack trace from previous location ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() + 0x20
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task) + 0xb2
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task, ConfigureAwaitOptions) + 0x4b
   at Microsoft.AI.Foundry.Local.Commands.ModelRunCommand.<RunInteractiveNewRoundAsync>d__2.MoveNext() + 0x132
--- End of stack trace from previous location ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() + 0x20
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task) + 0xb2
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task, ConfigureAwaitOptions) + 0x4b
   at Microsoft.AI.Foundry.Local.Commands.ModelRunCommand.<<Create>b__1_0>d.MoveNext() + 0x1dc4
--- End of stack trace from previous location ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() + 0x20
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task) + 0xb2
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task, ConfigureAwaitOptions) + 0x4b
   at Microsoft.AI.Foundry.Local.Common.CommandActionFactory.<>c__DisplayClass0_0`1.<<Create>b__0>d.MoveNext() + 0x1bb
--- End of stack trace from previous location ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() + 0x20
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task) + 0xb2
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task, ConfigureAwaitOptions) + 0x4b
   at System.CommandLine.NamingConventionBinder.CommandHandler.<GetExitCodeAsync>d__66.MoveNext() + 0xf1
--- End of stack trace from previous location ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() + 0x20
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task) + 0xb2
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task, ConfigureAwaitOptions) + 0x4b
   at System.CommandLine.NamingConventionBinder.ModelBindingCommandHandler.<InvokeAsync>d__11.MoveNext() + 0x228
--- End of stack trace from previous location ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() + 0x20
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task) + 0xb2
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task, ConfigureAwaitOptions) + 0x4b
   at System.CommandLine.Invocation.InvocationPipeline.<InvokeAsync>d__0.MoveNext() + 0x323
--- End of stack trace from previous location ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() + 0x20
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task) + 0xb2
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task, ConfigureAwaitOptions) + 0x4b
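In the meantime, a caller can only guess what the 500 means. As a client-side workaround, something like the following falls back to the CPU model when the GPU model fails; this is a hedged sketch using the OpenAI .NET client, and the /v1 endpoint path and dummy API key are assumptions about the local service, not confirmed values:

```csharp
using System;
using System.ClientModel;
using System.Threading.Tasks;
using OpenAI;
using OpenAI.Chat;

// Hedged sketch: retry the same prompt against the CPU model when the GPU model
// answers with a 500 like the one in the client log above. Model ids are the ones
// from this report; the base URL path (/v1) and the unused API key are assumptions.
class GpuToCpuFallbackSketch
{
    static async Task Main()
    {
        var options = new OpenAIClientOptions { Endpoint = new Uri("http://localhost:5272/v1") };
        var key = new ApiKeyCredential("unused"); // local service; the key content should not matter
        var gpuChat = new ChatClient("Phi-4-mini-gpu-int4-rtn-block-32", key, options);
        var cpuChat = new ChatClient("Phi-4-mini-cpu-int4-rtn-block-32-acc-level-4-onnx", key, options);

        try
        {
            ChatCompletion completion = await gpuChat.CompleteChatAsync(
                new ChatMessage[] { new UserChatMessage("test") });
            Console.WriteLine(completion.Content[0].Text);
        }
        catch (ClientResultException ex) when (ex.Status == 500)
        {
            // The 500 carries no hint that the GPU is missing, so fall back blindly to CPU.
            Console.WriteLine("GPU model returned 500 (possibly no GPU in this VM); retrying on CPU...");
            ChatCompletion completion = await cpuChat.CompleteChatAsync(
                new ChatMessage[] { new UserChatMessage("test") });
            Console.WriteLine(completion.Content[0].Text);
        }
    }
}
```

This only works around the symptom; a clearer error from the service would remove the guesswork.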

Service log:

Information: Microsoft.Neutron.OpenAI.Provider.OpenAIServiceProviderOnnx[0] 03:21:55.38 0 OpenAIService for type:OpenAIServiceProviderOnnx
Information: Microsoft.Neutron.Rpc.Service.JsonRpcService[0] 03:21:55.38 0 json-rpc-service running
Information: Microsoft.Neutron.Rpc.Service.JsonRpcService[2305] 03:21:55.38 accept_pipe_connections Accepting pipe connections pipeName:inference_agent
Information: Microsoft.Hosting.Lifetime[14] 03:21:55.39 ListeningOnAddress Now listening on: http://localhost:5272
Information: Microsoft.Hosting.Lifetime[0] 03:21:55.39 0 Application started. Press Ctrl+C to shut down.
Information: Microsoft.Hosting.Lifetime[0] 03:21:55.39 0 Hosting environment: Production
Information: Microsoft.Hosting.Lifetime[0] 03:21:55.39 0 Content root path: C:\Program Files\WindowsApps\Microsoft.FoundryLocal_0.2.9224.7622_x64__8wekyb3d8bbwe\
Information: Microsoft.Neutron.OpenAI.Provider.OpenAIServiceProviderOnnx[1400] 03:21:57.45 load_model_started Loading model:Phi-4-mini-gpu-int4-rtn-block-32
Information: Microsoft.Neutron.OpenAI.Provider.OpenAIServiceProviderOnnx[1401] 03:22:01.69 load_model_finished Finish loading model:Phi-4-mini-gpu-int4-rtn-block-32 elapsed time:00:00:04.2437755
Information: Microsoft.Neutron.OpenAI.Provider.OpenAIServiceProviderOnnx[0] 03:22:03.52 0 HandleChatCompletionAsStreamRequest -> model:Phi-4-mini-gpu-int4-rtn-block-32 MaxCompletionTokens:2048 maxTokens:(null) temperature:(null) topP:(null)
Error: Microsoft.AspNetCore.Server.Kestrel[13] 03:22:03.75 ApplicationError Connection id "0HNC1GU1FFBT0", Request id "0HNC1GU1FFBT0:00000001": An unhandled exception was thrown by the application.
message:D:\a\_work\1\s\onnxruntime\core\providers\webgpu\buffer_manager.cc:321 onnxruntime::webgpu::BufferManager::Download::::operator () status == wgpu::MapAsyncStatus::Success was false. Failed to download data from buffer: [Device] is lost.
stack:
   at Microsoft.ML.OnnxRuntimeGenAI.Result.VerifySuccess(IntPtr) + 0x54
   at Microsoft.Neutron.OpenAI.Provider.OpenAIServiceProviderOnnx.OnnxChatGenerator..ctor(OnnxLoadedModel, GeneratorParams, Sequences) + 0x63
   at Microsoft.Neutron.OpenAI.Provider.OpenAIServiceProviderOnnx.d__21.MoveNext() + 0xed7
--- End of stack trace from previous location ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() + 0x20
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task) + 0xb2
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task, ConfigureAwaitOptions) + 0x4b
   at Microsoft.Neutron.OpenAI.Provider.OpenAIServiceProviderOnnx.d__14.MoveNext() + 0xb8
--- End of stack trace from previous location ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() + 0x20
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task) + 0xb2
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task, ConfigureAwaitOptions) + 0x4b
   at Microsoft.Neutron.OpenAI.Provider.OpenAIServiceProviderBase`1.d__34.MoveNext() + 0x352
--- End of stack trace from previous location ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() + 0x20
   at System.Threading.Tasks.Sources.ManualResetValueTaskSourceCore`1.ThrowForFailedGetResult() + 0x13
   at System.Threading.Tasks.Sources.ManualResetValueTaskSourceCore`1.GetResult(Int16) + 0x2c
   at Microsoft.Neutron.OpenAI.OpenAIServiceWebApiExtensions.<>c__DisplayClass2_0.<b__0>d.MoveNext() + 0x2cb
--- End of stack trace from previous location ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() + 0x20
   at Microsoft.Neutron.OpenAI.OpenAIServiceWebApiExtensions.<>c__DisplayClass2_0.<b__0>d.MoveNext() + 0x3ec
--- End of stack trace from previous location ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() + 0x20
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task) + 0xb2
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task, ConfigureAwaitOptions) + 0x4b
   at Microsoft.AspNetCore.Http.Generated.F16C589DE9EC82483AA705851D2FE201CB4CB4AAF6561E8DE71B6A1891AD8D67F__GeneratedRouteBuilderExtensionsCore.<>c__DisplayClass11_0.<g__RequestHandler|4>d.MoveNext() + 0x4f8
--- End of stack trace from previous location ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() + 0x20
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task) + 0xb2
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task, ConfigureAwaitOptions) + 0x4b
   at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpProtocol.d__238`1.MoveNext() + 0x404

[The three retries below hit the same exception; their stack traces are identical to the one above and are not repeated.]

Information: Microsoft.Neutron.OpenAI.Provider.OpenAIServiceProviderOnnx[0] 03:22:03.76 0 HandleChatCompletionAsStreamRequest -> model:Phi-4-mini-gpu-int4-rtn-block-32 MaxCompletionTokens:2048 maxTokens:(null) temperature:(null) topP:(null)
Error: Microsoft.AspNetCore.Server.Kestrel[13] 03:22:03.77 ApplicationError Connection id "0HNC1GU1FFBT0", Request id "0HNC1GU1FFBT0:00000002": An unhandled exception was thrown by the application. message:D:\a\_work\1\s\onnxruntime\core\providers\webgpu\buffer_manager.cc:321 onnxruntime::webgpu::BufferManager::Download::::operator () status == wgpu::MapAsyncStatus::Success was false. Failed to download data from buffer: [Device] is lost.
Information: Microsoft.Neutron.OpenAI.Provider.OpenAIServiceProviderOnnx[0] 03:22:03.77 0 HandleChatCompletionAsStreamRequest -> model:Phi-4-mini-gpu-int4-rtn-block-32 MaxCompletionTokens:2048 maxTokens:(null) temperature:(null) topP:(null)
Error: Microsoft.AspNetCore.Server.Kestrel[13] 03:22:03.78 ApplicationError Connection id "0HNC1GU1FFBT0", Request id "0HNC1GU1FFBT0:00000003": An unhandled exception was thrown by the application. message:D:\a\_work\1\s\onnxruntime\core\providers\webgpu\buffer_manager.cc:321 onnxruntime::webgpu::BufferManager::Download::::operator () status == wgpu::MapAsyncStatus::Success was false. Failed to download data from buffer: [Device] is lost.
Information: Microsoft.Neutron.OpenAI.Provider.OpenAIServiceProviderOnnx[0] 03:22:03.78 0 HandleChatCompletionAsStreamRequest -> model:Phi-4-mini-gpu-int4-rtn-block-32 MaxCompletionTokens:2048 maxTokens:(null) temperature:(null) topP:(null)
Error: Microsoft.AspNetCore.Server.Kestrel[13] 03:22:03.79 ApplicationError Connection id "0HNC1GU1FFBT0", Request id "0HNC1GU1FFBT0:00000004": An unhandled exception was thrown by the application. message:D:\a\_work\1\s\onnxruntime\core\providers\webgpu\buffer_manager.cc:321 onnxruntime::webgpu::BufferManager::Download::::operator () status == wgpu::MapAsyncStatus::Success was false. Failed to download data from buffer: [Device] is lost.
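A pre-flight check could also catch this before a GPU model is loaded at all. Here is a heuristic sketch (again my own, not Foundry Local code) that enumerates display adapters via WMI; in a default Hyper-V VM the only adapter is typically "Microsoft Hyper-V Video", which has no WebGPU-capable device behind it, and the name filters below are illustrative assumptions rather than an exhaustive list:

```csharp
using System;
using System.Management; // NuGet package System.Management (Windows only)

// Heuristic sketch: list display adapters so a CLI or service could warn
// "no usable GPU" before trying to load a GPU model with the webgpu provider.
class GpuPreflightSketch
{
    static void Main()
    {
        bool realGpuFound = false;
        using var searcher = new ManagementObjectSearcher("SELECT Name FROM Win32_VideoController");
        foreach (ManagementBaseObject adapter in searcher.Get())
        {
            string name = adapter["Name"]?.ToString() ?? string.Empty;
            Console.WriteLine($"Display adapter: {name}");
            // Virtual/basic adapters that suggest no real GPU is exposed to this machine.
            bool virtualAdapter =
                name.Contains("Hyper-V", StringComparison.OrdinalIgnoreCase) ||
                name.Contains("Basic Display", StringComparison.OrdinalIgnoreCase) ||
                name.Contains("Basic Render", StringComparison.OrdinalIgnoreCase) ||
                name.Contains("Remote Display", StringComparison.OrdinalIgnoreCase);
            if (!virtualAdapter) realGpuFound = true;
        }

        Console.WriteLine(realGpuFound
            ? "A GPU appears to be present."
            : "No usable GPU detected; GPU models (webgpu execution provider) are likely to fail.");
    }
}
```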
