@dbutenhof
Collaborator

Summary

I was looking for a "dead simple" problem just to get my feet damp. Issue #205 stood out. (A comment on that issue includes some detailed analysis.)

Deserialization of some parameters like --data takes place during run rather than during Click option processing. Errors raise a validation exception, which the Click CLI infrastructure treats as an unexpected exception, so the user gets a full traceback that is rarely useful.

This change intercepts internal validation error exceptions and raises a Click BadParameter exception encapsulating the validation error text. This will be reported without traceback.

Details

This wraps the asyncio.run call that starts benchmarking (and runs the deserializers) in a try block that converts the deserializer's ValueError into click.BadParameter. Click then generates a usage message and suppresses the traceback, which would otherwise obscure the error text.
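For readers who want the shape of the change without opening the diff, here is a minimal sketch of the pattern. It is not the guidellm code: `run_benchmark`, the `--data` check, and the `"not-a-dataset"` trigger value are hypothetical stand-ins, and only `asyncio.run`, `click.command`, `click.option`, and `click.BadParameter` are real APIs.

```python
import asyncio

import click


async def run_benchmark(data: str) -> None:
    # Hypothetical stand-in for the real benchmark coroutine: it fails the
    # way a deserializer might when no input format matches.
    if data == "not-a-dataset":
        raise ValueError(f"Data deserialization failed for {data!r}")
    click.echo(f"benchmarking with {data}")


@click.command()
@click.option("--data", required=True, help="Dataset specification.")
def run(data: str) -> None:
    try:
        asyncio.run(run_benchmark(data))
    except ValueError as err:
        # Click reports BadParameter as a usage error (exit code 2)
        # and prints the message without a traceback.
        raise click.BadParameter(str(err)) from err
```

Raising with `from err` keeps the original exception chained, so it is still available to anyone who catches the `BadParameter` programmatically.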

dbutenho 14:54 badparam:fix/badparam guidellm mock-server --port 8004 &
dbutenho 11:02 badparam:fix/badparam guidellm benchmark --target http://localhost:8004/v1 --rate-type sweep --max-seconds 30 --model qwen3:4b --data "prompt_tokens=256,output_tokens=128"
Main |2026-01-22 11:03:07 -0500 ACCESS:   127.0.0.1:50704 GET http://localhost:8004/health                                                                                                                                200 50  0.1ms
✔ OpenAIHTTPBackend backend validated with model qwen3:4b
  {'target': 'http://localhost:8004', 'model': 'qwen3:4b', 'timeout': 60.0, 'http2': True, 'follow_redirects': True, 'verify': False, 'openai_paths': {'health': 'health', 'models': 'v1/models', 'text_completions': 'v1/completions',
  'chat_completions': 'v1/chat/completions', 'audio_transcriptions': 'v1/audio/transcriptions', 'audio_translations': 'v1/audio/translations'}, 'validate_backend': {'method': 'GET', 'url': 'http://localhost:8004/health'}}          
✔ Processor resolved
  Using model 'qwen3:4b' as processor                                                                                                                                                                                                  
Usage: guidellm benchmark run [OPTIONS]
Try 'guidellm benchmark run --help' for help.

Error: Invalid value: Data deserialization failed, likely because the input doesn't match any of the input formats. See the 15 error(s) that occurred while attempting to deserialize the data prompt_tokens=256,output_tokens=128:
  - Deserializer 'huggingface': (HFValidationError) Repo id must use alphanumeric chars, '-', '_' or '.'. The name cannot start or end with '-' or '.' and the maximum length is 96: 'prompt_tokens=256,output_tokens=128'.
  - Deserializer 'synthetic_text': (HFValidationError) Repo id must use alphanumeric chars, '-', '_' or '.'. The name cannot start or end with '-' or '.' and the maximum length is 96: 'qwen3:4b'.
  - Deserializer 'arrow_file': (DataNotSupportedError) Unsupported data for ArrowFileDatasetDeserializer, expected str or Path to a local .arrow file, got prompt_tokens=256,output_tokens=128
  - Deserializer 'csv_file': (DataNotSupportedError) Unsupported data for CSVFileDatasetDeserializer, expected str or Path to a valid local .csv file, got prompt_tokens=256,output_tokens=128
  - Deserializer 'db_file': (DataNotSupportedError) Unsupported data for DBFileDatasetDeserializer, expected str or Path to a local .db file, got prompt_tokens=256,output_tokens=128
  - Deserializer 'hdf5_file': (DataNotSupportedError) Unsupported data for HDF5FileDatasetDeserializer, expected str or Path to a local .hdf5 or .h5 file, got prompt_tokens=256,output_tokens=128
  - Deserializer 'in_memory_csv_str': (DataNotSupportedError) Unsupported data for InMemoryCsvDatasetDeserializer, expected CSV string, got <class 'str'>
  - Deserializer 'in_memory_dict': (DataNotSupportedError) Unsupported data for InMemoryDictDatasetDeserializer, expected dict[str, list], got prompt_tokens=256,output_tokens=128
  - Deserializer 'in_memory_dict_list': (DataNotSupportedError) Unsupported data for InMemoryDictListDatasetDeserializer, expected list of dicts, got prompt_tokens=256,output_tokens=128
  - Deserializer 'in_memory_item_list': (DataNotSupportedError) Unsupported data for InMemoryItemListDatasetDeserializer, expected list of primitive items, got prompt_tokens=256,output_tokens=128
  - Deserializer 'in_memory_json_str': (DataNotSupportedError) Unsupported data for InMemoryJsonStrDatasetDeserializer, expected JSON string with a list or dict of items, got prompt_tokens=256,output_tokens=128
  - Deserializer 'json_file': (DataNotSupportedError) Unsupported data for JSONFileDatasetDeserializer, expected str or Path to a local .json or .jsonl file, got prompt_tokens=256,output_tokens=128
  - Deserializer 'parquet_file': (DataNotSupportedError) Unsupported data for ParquetFileDatasetDeserializer, expected str or Path to a local .parquet file, got prompt_tokens=256,output_tokens=128
  - Deserializer 'tar_file': (DataNotSupportedError) Unsupported data for TarFileDatasetDeserializer, expected str or Path to a local .tar file, got prompt_tokens=256,output_tokens=128
  - Deserializer 'text_file': (DataNotSupportedError) Unsupported data for TextFileDatasetDeserializer, expected str or Path to a local .txt or .text file, got prompt_tokens=256,output_tokens=128

Test Plan

This only affects the output of a validation error in benchmark run startup, removing the traceback. While it's not impossible to imagine a CliRunner test for this case, the output analysis would be a bit tedious and it didn't seem worthwhile. (Feel free to tell me otherwise! Currently, test_main.py is rather light on content.)
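If a test does turn out to be wanted, something along these lines might keep the output analysis small. This is a sketch only: the `run` command below is a hypothetical stand-in that always fails validation, not the real `benchmark run`, and the test name is made up. It checks the exit code and a couple of substrings rather than the full output.

```python
import click
from click.testing import CliRunner


@click.command()
@click.option("--data", required=True)
def run(data: str) -> None:
    # Hypothetical stand-in for `benchmark run`: validation always fails.
    try:
        raise ValueError(f"Data deserialization failed for {data}")
    except ValueError as err:
        raise click.BadParameter(str(err)) from err


def test_validation_error_has_no_traceback() -> None:
    result = CliRunner().invoke(run, ["--data", "prompt_tokens=256"])
    # Usage errors exit with status 2 in Click.
    assert result.exit_code == 2
    assert "Data deserialization failed" in result.output
    assert "Traceback" not in result.output
```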

Related Issues

This is "inspired by" #205 but possibly doesn't actually resolve it.

  • Resolves #

  • "I certify that all code in this PR is my own, except as noted below."

Use of AI

  • Includes AI-assisted code completion
  • Includes code generated by an AI application
  • Includes AI-generated tests (NOTE: AI written tests should have a docstring that includes ## WRITTEN BY AI ##)

Deserialization of some parameters like `--data` takes place during `run`
rather than during option processing. It raises a `ValidationError`, but to
the Click CLI infrastructure this is an unexpected exception and causes a full
traceback which is not generally useful to the user.

This change intercepts `ValidationError` (and for completeness the standard
Python `ValueError`) and raises a Click `BadParameter` exception encapsulating
the validation error text. This will be reported without traceback.

Signed-off-by: David Butenhof <dbutenho@redhat.com>
)
)
)
except (ValidationError, ValueError) as err:
Collaborator

My biggest concern here is we lose debugging information for a large class of errors. ValueError is especially generic and could be emitted in any number of places. Maybe instead we define a custom error type for these runtime argument errors and catch that specific exception here?
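A rough sketch of what that could look like, assuming a new exception class. `StartupValidationError`, `run_benchmark`, and the `"bad"` / `"bug"` trigger values are all hypothetical names for illustration; nothing here exists in the codebase today.

```python
import click


class StartupValidationError(ValueError):
    """Hypothetical error type for bad CLI arguments detected at run time."""


def run_benchmark(data: str) -> None:
    # Hypothetical stand-in for the benchmark entry point.
    if data == "bad":
        # User-facing problem: the argument could not be interpreted.
        raise StartupValidationError(f"cannot deserialize {data!r}")
    if data == "bug":
        # Unrelated internal failure that should keep its traceback.
        raise ValueError("unexpected internal state")
    click.echo(f"benchmarking with {data}")


@click.command()
@click.option("--data", required=True)
def run(data: str) -> None:
    try:
        run_benchmark(data)
    except StartupValidationError as err:
        # Only the specific type is converted; a plain ValueError propagates.
        raise click.BadParameter(str(err)) from err
```

Subclassing `ValueError` is one possible choice here: existing `except ValueError` handlers would keep working, while the CLI boundary catches only the narrower type.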

Collaborator Author

Hmm... yeah, it actually looks like ValueError is used a lot more than I would have expected. And if any of them can occur after "initial startup" we wouldn't want to hide the traceback. So much for a "simple touch", but it was in any case an interesting excursion.

Yeah, we could use another exception for "static startup validation" errors. Perhaps a better solution would be to refactor the deserializers with a validation method that can be called during option parsing rather than let this sort of thing wait until run. And that'll require a lot more thought ...
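As a very rough illustration of the option-parsing idea, Click's `callback=` hook on an option runs during parsing and can raise `click.BadParameter` directly. The `VALIDATORS` registry and its suffix checks below are hypothetical and far simpler than what the real deserializers would need (Hugging Face repo ids, synthetic specs, and so on).

```python
import click

# Hypothetical registry: each entry is a cheap "could this deserializer
# accept the value?" check that needs no I/O.
VALIDATORS = {
    "json_file": lambda value: value.endswith((".json", ".jsonl")),
    "csv_file": lambda value: value.endswith(".csv"),
}


def validate_data(ctx: click.Context, param: click.Parameter, value: str) -> str:
    # Click calls this during option processing, before the command body runs.
    if not any(check(value) for check in VALIDATORS.values()):
        names = ", ".join(VALIDATORS)
        raise click.BadParameter(f"{value!r} matches none of: {names}")
    return value


@click.command()
@click.option("--data", required=True, callback=validate_data)
def run(data: str) -> None:
    click.echo(f"benchmarking with {data}")
```

Because the error is raised from the option's own callback, Click attaches the option name to the message, which a conversion done later in `run` cannot do without passing a parameter hint.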

@dbutenhof dbutenhof marked this pull request as draft January 22, 2026 22:11
@dbutenhof dbutenhof added this to the v0.7.0 milestone Jan 28, 2026