@CodersAcademy006

This PR adds kernel-based tests for device-side casting between common numeric and boolean dtypes, mirroring a subset of the CPU-side test_casting logic. The tests validate CUDA device semantics for:

  • Numeric widening and narrowing (e.g., int32 ↔ int64, float32 ↔ float64)
  • Float ↔ int (with explicit truncation semantics)
  • Boolean ↔ int

Only dtypes and conversions natively supported by CUDA are covered. The tests are skipped under cudasim to avoid mismatches between the simulator's Python semantics and device semantics. This work addresses issue #515 (Incomplete Test Coverage in Numba-CUDA).
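
To make the float ↔ int and boolean ↔ int expectations concrete, here is a minimal host-side sketch of the reference values such tests might compare against. It is not the PR's code: the helper names are hypothetical, it uses only the Python standard library, and it assumes finite, in-range inputs (out-of-range conversions are deliberately not modeled).

```python
import math


def expected_float_to_int(x: float) -> int:
    """Hypothetical reference: C-style conversion drops the fractional part,
    i.e. rounds toward zero (assumes x is finite and in range)."""
    return math.trunc(x)


def expected_int_to_bool(x: int) -> bool:
    """Hypothetical reference: any non-zero integer converts to True."""
    return x != 0


def expected_bool_to_int(b: bool) -> int:
    """Hypothetical reference: False -> 0, True -> 1."""
    return int(b)


samples = [-2.7, -0.5, 0.0, 0.5, 2.7]

# Truncation rounds toward zero; flooring rounds toward negative infinity.
# The two only disagree for negative non-integers.
truncated = [expected_float_to_int(x) for x in samples]
floored = [math.floor(x) for x in samples]

print(truncated)  # [-2, 0, 0, 0, 2]
print(floored)    # [-3, -1, 0, 0, 2]
```

Negative non-integer inputs are the interesting cases, since they are where truncation and flooring diverge.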

…IDIA#515)

This adds kernel-based tests for device-side casting between common numeric and boolean dtypes, mirroring a subset of CPU test_casting. Tests are skipped under cudasim. References: NVIDIA#515.
@copy-pr-bot

copy-pr-bot bot commented Jan 20, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@greptile-apps
Contributor

greptile-apps bot commented Jan 20, 2026

Greptile Summary

This PR adds a new comprehensive test suite for CUDA device-side casting operations, addressing issue #515 to improve test coverage for Numba-CUDA. The implementation validates casting between numeric types (int32, int64, float32, float64) and boolean conversions, with special attention to CUDA's C-style truncation semantics for float-to-int conversions.

The test suite is well-structured with:

  • A clean 1D kernel that performs implicit type casting during assignment
  • A helper method that handles device memory allocation, kernel invocation, and validation
  • Comprehensive test cases for widening, narrowing, and cross-type conversions
  • Proper use of @skip_on_cudasim to avoid simulator semantic mismatches
  • Efficient device memory management using cuda.device_array()
  • Clear documentation via comments explaining CUDA truncation behavior

Confidence Score: 5/5

  • This PR is safe to merge. The new test code is well-structured, follows established patterns, contains no syntax errors, and properly validates CUDA casting behavior.
  • Full confidence is warranted because: (1) the code is syntactically correct and follows Python/CUDA best practices; (2) the kernel is simple and correct, with proper boundary checking; (3) test coverage is comprehensive, with good dtype variety; (4) device memory management is efficient, using cuda.device_array(); (5) the decorator correctly skips the tests on cudasim to avoid false failures; (6) grid/block dimension calculations are correct; and (7) the PR directly addresses a tracked issue, #515 (Incomplete Test Coverage in Numba-CUDA).
  • No files require special attention
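
The "boundary checking" and "grid/block dimension calculations" mentioned above can be illustrated with a small pure-Python model of a 1D launch. This is only a sketch under stated assumptions: it does not use Numba, the function names and the block size of 4 are hypothetical, and the PR's actual kernel and launch parameters are not shown in this conversation.

```python
def blocks_for(n: int, threads_per_block: int) -> int:
    """Ceiling division: the smallest block count whose threads cover n elements."""
    return (n + threads_per_block - 1) // threads_per_block


def simulate_cast_launch(src, cast, threads_per_block=4):
    """Model a 1D launch in which each thread writes out[i] = cast(src[i])."""
    n = len(src)
    out = [None] * n
    blocks = blocks_for(n, threads_per_block)
    for block_idx in range(blocks):
        for thread_idx in range(threads_per_block):
            # Global 1D index of this (block, thread) pair.
            i = block_idx * threads_per_block + thread_idx
            # Boundary check: the last block may have threads past the end.
            if i < n:
                out[i] = cast(src[i])
    return out, blocks


# Five elements with four threads per block need two blocks;
# three threads of the second block fail the bounds check and do nothing.
result, blocks = simulate_cast_launch([1.9, -1.9, 3.0, 0.2, 7.5], int)
print(blocks)  # 2
print(result)  # [1, -1, 3, 0, 7]
```

Rounding the block count up is what makes the bounds check necessary: whenever the element count is not a multiple of the block size, some threads in the final block have no element to process.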

Important Files Changed

Filename: numba_cuda/numba/cuda/tests/test_casting.py
Overview: New comprehensive CUDA casting test suite that validates device-side type conversions across numeric and boolean dtypes. Tests properly use the CUDA kernel architecture with correct grid/block calculations, efficient device memory allocation via device_array, and comprehensive test coverage for int/float/bool conversions including proper handling of CUDA truncation semantics.

@greptile-apps greptile-apps bot left a comment

1 file reviewed, 1 comment

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

@greptile-apps greptile-apps bot left a comment

1 file reviewed, 1 comment

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

@greptile-apps greptile-apps bot left a comment

1 file reviewed, 2 comments

CodersAcademy006 and others added 2 commits January 21, 2026 11:31
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

@greptile-apps greptile-apps bot left a comment

1 file reviewed, 3 comments

CodersAcademy006 and others added 3 commits January 21, 2026 11:33
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

@greptile-apps greptile-apps bot left a comment

1 file reviewed, 1 comment

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

@greptile-apps greptile-apps bot left a comment

1 file reviewed, 1 comment

CodersAcademy006 and others added 2 commits January 21, 2026 11:36
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
- Use cuda.device_array() instead of zeros+to_device for efficiency
- Remove redundant cuda.synchronize() calls (copy_to_host is already synchronous)
- Eliminate unnecessary Host-to-Device memory transfer