Task05 Александр Исаенков #1043

Open
PlayingPeano wants to merge 1 commit into GPGPUCourse:task05 from PlayingPeano:task05

@PlayingPeano commented Feb 22, 2026

Local output

$ ./main_radix_sort 1
Found 3 GPUs in 0.0606105 sec (OpenCL: 0.0326639 sec, Vulkan: 0.0278155 sec)
Available devices:
  Device #0: API: Vulkan. iGPU. AMD Radeon Vega 8 Graphics (RADV RAVEN). Free memory: 1866/2970 Mb.
  Device #1: API: OpenCL. CPU. cpu-haswell-AMD Ryzen 5 3500U with Radeon Vega Mobile Gfx. AuthenticAMD. Total memory: 5146 Mb.
  Device #2: API: Vulkan. CPU. llvmpipe (LLVM 21.1.6, 256 bits). Free memory: 6862/6862 Mb.
Using device #1: API: OpenCL. CPU. cpu-haswell-AMD Ryzen 5 3500U with Radeon Vega Mobile Gfx. AuthenticAMD. Total memory: 5146 Mb.
Using OpenCL API...
n=100000000 max_value=2147483647
sorting on CPU...
CPU std::sort finished in 27.5634 sec
CPU std::sort effective RAM bandwidth: 0.0270306 GB/s (3.62799 uint millions/s)
Kernels compilation done in 0.0777101 seconds
Kernels compilation done in 0.0528955 seconds
GPU radix-sort times (in seconds) - 10 values (min=29.6156 10%=29.7186 median=35.9233 90%=38.1614 max=38.1614)
GPU radix-sort median effective VRAM bandwidth: 0.0207403 GB/s (2.78371 uint millions/s)
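
The bandwidth and throughput lines above appear to follow directly from n and the measured time. A minimal Python sketch, assuming (inferred from the printed numbers, not taken from the course source) that each element is a 4-byte uint counted once as read and once as written, and that "GB" in the log means 2^30 bytes. The function names are hypothetical:

```python
UINT_BYTES = 4   # assumption: 32-bit unsigned elements
GIB = 2 ** 30    # assumption: "GB" in the log is 2^30 bytes


def throughput_muint_per_s(n: int, seconds: float) -> float:
    """Millions of elements sorted per second."""
    return n / seconds / 1e6


def effective_bandwidth_gb_per_s(n: int, seconds: float) -> float:
    """Effective bandwidth, counting one read and one write per element."""
    return 2 * n * UINT_BYTES / seconds / GIB


n = 100_000_000
# Times copied from the local log above.
for label, seconds in [("CPU std::sort", 27.5634), ("radix-sort median", 35.9233)]:
    print(label,
          round(effective_bandwidth_gb_per_s(n, seconds), 7), "GB/s,",
          round(throughput_muint_per_s(n, seconds), 5), "uint millions/s")
```

Under these assumptions the results agree with the logged values up to the rounding of the printed times. Note that this run used device #1, the OpenCL CPU device, so the "GPU radix-sort" figures here were measured on the CPU.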

@GPUcourseBOT (Collaborator) commented

Test results for PR #1043

Test logs
=== STATUS: Programs executed successfully: main_radix_sort ===
=== main_radix_sort stdout (exit code: -11 (segfault after execution)) ===
Found 1 GPUs in 8.54984 sec (CUDA: 0.11282 sec, OpenCL: 0.707246 sec, Vulkan: 7.72972 sec)
Available devices:
Device #0: API: CUDA+OpenCL+Vulkan. GPU. Tesla T4 (CUDA 12020). Free memory: 14822/14930 Mb.
Using device #0: API: CUDA+OpenCL+Vulkan. GPU. Tesla T4 (CUDA 12020). Free memory: 14822/14930 Mb.
Using OpenCL API...
n=100000000 max_value=2147483647
sorting on CPU...
CPU std::sort finished in 10.6186 sec
CPU std::sort effective RAM bandwidth: 0.0701653 GB/s (9.41742 uint millions/s)
Kernels compilation done in 4.12498 seconds
Kernels compilation done in 0.151988 seconds
GPU radix-sort times (in seconds) - 10 values (min=0.610892 10%=0.611165 median=0.61337 90%=5.05289 max=5.05289)
GPU radix-sort median effective VRAM bandwidth: 1.2147 GB/s (163.034 uint millions/s)
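
The log above reports exit code -11 together with "segfault". On POSIX systems a negative return code conventionally means the process was terminated by the signal with that number (Python's `subprocess` reports it this way), and signal 11 is SIGSEGV. A small sketch of decoding such a code; how the test bot actually obtains its exit code is an assumption, and `describe_exit_code` is a hypothetical helper:

```python
import signal


def describe_exit_code(code: int) -> str:
    """Turn a subprocess-style return code into a readable description."""
    if code >= 0:
        return f"exited normally with status {code}"
    try:
        # A negative code -N means "terminated by signal N".
        name = signal.Signals(-code).name
    except ValueError:
        name = f"unknown signal {-code}"
    return f"terminated by {name}"


print(describe_exit_code(-11))
```

Since the sort results were printed before the crash, the fault most likely occurs during teardown (for example, releasing buffers or the device context), though the log alone does not confirm where.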
