k06a commented May 25, 2017

$ bin/gpuPlotGenerator.exe listPlatforms
-------------------------
GPU plot generator v4.0.3
-------------------------
Author:   Cryo
Bitcoin:  138gMBhCrNkbaiTCmUhP9HLU9xwn5QKZgD
Burst:    BURST-YA29-QCEW-QXC3-BKXDL
----
Platforms number: 1
----
Id:       0
Name:     Apple
Vendor:   Apple
Version:  OpenCL 1.2 (Apr  4 2017 19:07:42)
$ bin/gpuPlotGenerator.exe listDevices Apple
-------------------------
GPU plot generator v4.0.3
-------------------------
Author:   Cryo
Bitcoin:  138gMBhCrNkbaiTCmUhP9HLU9xwn5QKZgD
Burst:    BURST-YA29-QCEW-QXC3-BKXDL
----
Devices number: 2
----
Id:                          0
Type:                        CPU
Name:                        Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz
Vendor:                      Intel
Version:                     OpenCL 1.2 
Driver version:              1.1
Max clock frequency:         2200MHz
Max compute units:           8
Global memory size:          16GB 0MB 0KB
Max memory allocation size:  4GB 0MB 0KB
Max work group size:         1024
Local memory size:           32KB
Max work-item sizes:         (1024, 1, 1)
----
Id:                          1
Type:                        GPU
Name:                        Iris Pro
Vendor:                      Intel
Version:                     OpenCL 1.2 
Driver version:              1.2(Apr 22 2017 16:00:44)
Max clock frequency:         1200MHz
Max compute units:           40
Global memory size:          1GB 512MB 0KB
Max memory allocation size:  384MB 0KB
Max work group size:         512
Local memory size:           64KB
Max work-item sizes:         (512, 512, 512)
$ bin/gpuPlotGenerator.exe generate direct 123456_0_50000_5000 123456_50000_10000_2000
-------------------------
GPU plot generator v4.0.3
-------------------------
Author:   Cryo
Bitcoin:  138gMBhCrNkbaiTCmUhP9HLU9xwn5QKZgD
Burst:    BURST-YA29-QCEW-QXC3-BKXDL
----
Loading platforms...
Loading devices...
Loading devices configurations...
Initializing generation devices...
    [0] Device: Iris Pro (OpenCL 1.2 )
    [0] Device memory: 384MB
    [0] CPU memory: 384MB
Initializing generation contexts...
    [0] Path: 123456_0_50000_50000
    [0] Nonces: 0 to 49999 (12GB 212MB)
    [0] CPU memory: 1GB 226MB
    [1] Path: 123456_50000_10000_10000
    [1] Nonces: 50000 to 59999 (2GB 452MB)
    [1] CPU memory: 500MB
----
Devices number: 1
Plots files number: 2
Total nonces number: 60000
CPU memory: 2GB 86MB
----
Generating nonces...
0.00% (0/60000 remaining nonces), 0.00 nonces/minutes, ETA: 5w 6d 16h 0m 0s...Abort trap: 6
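
The sizes printed in parentheses in the log above follow from the Burst plot layout, in which each nonce occupies 4096 scoops of 64 bytes (256 KiB). A minimal Python sketch that reproduces them; the helper names are hypothetical, and the "GB/MB" formatting is only assumed to mirror what the tool prints:

```python
NONCE_SIZE = 4096 * 64  # bytes per nonce: 4096 scoops of 64 bytes = 256 KiB


def plot_size_bytes(nonces: int) -> int:
    """Size on disk of a plot holding the given number of nonces."""
    return nonces * NONCE_SIZE


def format_size(size_bytes: int) -> str:
    """Format a byte count as whole GiB plus leftover whole MiB (assumed log style)."""
    gib, rest = divmod(size_bytes, 1024 ** 3)
    mib = rest // 1024 ** 2
    return f"{gib}GB {mib}MB"


print(format_size(plot_size_bytes(50000)))  # first plot file in the log
print(format_size(plot_size_bytes(10000)))  # second plot file in the log
```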

k06a mentioned this pull request May 25, 2017
Author

k06a commented May 25, 2017

How can I diagnose the problem? I am not really familiar with GPGPU programming.

k06a force-pushed the feature/macos branch from 7a79b3e to 593dc1f May 25, 2017 10:07
Owner

bhamon commented May 25, 2017

I reintegrated the include correction from constants.h (see 1c3ab77).

As for macOS support, it can be achieved by symlinking the two paths and pointing the "OPENCL_INCLUDE" and "OPENCL_LIB" env vars at them. It doesn't seem like a good idea to ship platform-dependent modifications. I can include an explanation in the README.md if you want.

Author

k06a commented May 25, 2017

@bhamon how about these changes?

ifeq ($(shell uname),Darwin)
LD_FLAGS = -fPIC -L$(OPENCL_LIB) -framework OpenCL -m$(PLATFORM)
else
LD_FLAGS = -fPIC -L$(OPENCL_LIB) -lOpenCL -m$(PLATFORM)
endif

Author

k06a commented May 25, 2017

And can you help me with this?

Generating nonces...
0.00% (0/60000 remaining nonces), 0.00 nonces/minutes, ETA: 5w 6d 16h 0m 0s...Abort trap: 6

Author

k06a commented May 29, 2017

@bhamon can you suggest how to debug this issue, or at least how to get a stack trace?

Author

k06a commented May 29, 2017

Just tried:

$ lldb bin/gpuPlotGenerator.exe generate buffer xxxxxxxxxxxxxxx_0_32768_32768

Got:

Process 3519 launched: '/Users/k06a/gpuPlotGenerator/bin/gpuPlotGenerator.exe' (x86_64)
-------------------------
GPU plot generator v4.0.3
-------------------------
Author:   Cryo
Bitcoin:  138gMBhCrNkbaiTCmUhP9HLU9xwn5QKZgD
Burst:    BURST-YA29-QCEW-QXC3-BKXDL
----
Loading platforms...
Loading devices...
Loading devices configurations...
Initializing generation devices...
    [0] Device: Iris Pro (OpenCL 1.2 )
    [0] Device memory: 384MB
    [0] CPU memory: 384MB
Initializing generation contexts...
    [0] Path: xxxxxxxxxxxxxxx_0_32768_32768
    [0] Nonces: 0 to 32767 (8GB 0MB)
    [0] CPU memory: 8GB 0MB
----
Devices number: 1
Plots files number: 1
Total nonces number: 32768
CPU memory: 8GB 384MB
----
Generating nonces...
0.00% (0/32768 remaining nonces), 0.00 nonces/minutes, ETA: 3w 1d 18h 8m 0s...Process 3519 stopped
* thread #5, queue = 'opencl_runtime', stop reason = signal SIGABRT
    frame #0: 0x00007fffd1d99d42 libsystem_kernel.dylib`__pthread_kill + 10
libsystem_kernel.dylib`__pthread_kill:
->  0x7fffd1d99d42 <+10>: jae    0x7fffd1d99d4c            ; <+20>
    0x7fffd1d99d44 <+12>: movq   %rax, %rdi
    0x7fffd1d99d47 <+15>: jmp    0x7fffd1d92caf            ; cerror_nocancel
    0x7fffd1d99d4c <+20>: retq   

Author

k06a commented Jun 1, 2017

@bhamon here is the call stack:

Thread 2 Crashed:: Dispatch queue: opencl_runtime
0   libsystem_kernel.dylib        	0x00007fffd1d99d42 __pthread_kill + 10
1   libsystem_pthread.dylib       	0x00007fffd1e87457 pthread_kill + 90
2   libsystem_c.dylib             	0x00007fffd1cff420 abort + 129
3   libGPUSupportMercury.dylib    	0x00007fffca1bffbf gpusGenerateCrashLog + 158
4   com.apple.driver.AppleIntelHD5000GraphicsGLDriver	0x000000010400a09b gpusKillClientExt + 9
5   libGPUSupportMercury.dylib    	0x00007fffca1c0983 gpusQueueSubmitDataBuffers + 168
6   com.apple.driver.AppleIntelHD5000GraphicsGLDriver	0x0000000104055011 IntelCLCommandBuffer::getNew(GLDQueueRec*) + 31
7   com.apple.driver.AppleIntelHD5000GraphicsGLDriver	0x0000000104054f79 intelSubmitCLCommands(GLDQueueRec*, unsigned int) + 65
8   com.apple.driver.AppleIntelHD5000GraphicsGLDriver	0x000000010405b081 CHAL_INTEL::ChalContext::ChalFlush() + 83
9   com.apple.driver.AppleIntelHD5000GraphicsGLDriver	0x00000001040552a3 gldFinishQueue + 43
10  com.apple.opencl              	0x00007fffc08b9b37 0x7fffc08b8000 + 6967
11  com.apple.opencl              	0x00007fffc08ba000 0x7fffc08b8000 + 8192
12  com.apple.opencl              	0x00007fffc08d7cca 0x7fffc08b8000 + 130250
13  com.apple.opencl              	0x00007fffc08db29d 0x7fffc08b8000 + 144029
14  libdispatch.dylib             	0x00007fffd1c358fc _dispatch_client_callout + 8
15  libdispatch.dylib             	0x00007fffd1c36536 _dispatch_barrier_sync_f_invoke + 83
16  com.apple.opencl              	0x00007fffc08db11d 0x7fffc08b8000 + 143645
17  com.apple.opencl              	0x00007fffc08d6da6 0x7fffc08b8000 + 126374
18  com.apple.opencl              	0x00007fffc08cc1df clEnqueueReadBuffer + 813
19  gpuPlotGenerator.exe          	0x0000000102741a3b cryo::gpuPlotGenerator::GenerationDevice::bufferPlots() + 107
20  gpuPlotGenerator.exe          	0x00000001027328b5 cryo::gpuPlotGenerator::writeNonces(std::exception_ptr&, std::__1::mutex&, std::__1::condition_variable&, std::__1::list<std::__1::shared_ptr<cryo::gpuPlotGenerator::GenerationContext>, std::__1::allocator<std::__1::shared_ptr<cryo::gpuPlotGenerator::GenerationContext> > >&, std::__1::shared_ptr<cryo::gpuPlotGenerator::GenerationContext>&) + 293
21  gpuPlotGenerator.exe          	0x0000000102733d7b void* std::__1::__thread_proxy<std::__1::tuple<cryo::gpuPlotGenerator::CommandGenerate::execute(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&)::$_2, std::__1::shared_ptr<cryo::gpuPlotGenerator::GenerationContext> > >(void*) + 139
22  libsystem_pthread.dylib       	0x00007fffd1e8493b _pthread_body + 180
23  libsystem_pthread.dylib       	0x00007fffd1e84887 _pthread_start + 286
24  libsystem_pthread.dylib       	0x00007fffd1e8408d thread_start + 13

Author

k06a commented Jun 1, 2017

Maybe this is the reason: https://stackoverflow.com/a/43991502/440168

Owner

bhamon commented Jun 2, 2017

@k06a I'm ok with the change in the Makefile, I will push it soon.

About your problem, can you give me the content of your configuration file?
I suppose you are trying to use your primary graphics card as a generator (the one that drives your display). If that's the case, there is a high chance that the "hashesNumber" parameter needs to be lowered (you can try a value of "4" to begin with).
The "hashesNumber" parameter reflects the stress put on the graphics card. To prevent the system watchdog from suspending the generation process, I chunk the work into smaller pieces (ideally powers of 2).
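
The chunking described above can be sketched roughly as follows. This is an illustrative Python sketch, not the generator's actual code: it assumes that each nonce takes 8192 hashes (256 KiB of plot data at 32 bytes per hash) and that each GPU call computes at most "hashesNumber" of them. `hash_chunks` is a hypothetical helper.

```python
HASHES_PER_NONCE = 8192  # assumption: 256 KiB per nonce / 32 bytes per hash


def hash_chunks(hashes_number: int, total: int = HASHES_PER_NONCE):
    """Yield (offset, count) pairs, one per GPU call."""
    if not 1 <= hashes_number <= total:
        raise ValueError(f"hashesNumber must be between 1 and {total}")
    for offset in range(0, total, hashes_number):
        # The last chunk is shorter when hashes_number does not divide total.
        yield offset, min(hashes_number, total - offset)


print(len(list(hash_chunks(8192))))  # 1: a single long-running call
print(len(list(hash_chunks(4))))     # 2048: many short calls
```

Shorter calls hand control back to the driver more often, which is presumably why a small value avoids the watchdog abort on a GPU that also drives the display, at the price of many more kernel launches.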

Author

k06a commented Jun 2, 2017

@bhamon my configuration is mostly the recommended one:

0 1 1536 384 8192

My device is:

Intel Iris Pro 1536 MB
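
For reference, the five numbers in the configuration line above can be read as in the sketch below. The field names (platformId, deviceId, globalWorkSize, localWorkSize, hashesNumber) are an assumption about the project's device configuration layout, and `parse_device_line` is a hypothetical helper, not part of the tool.

```python
from typing import NamedTuple


class DeviceConfig(NamedTuple):
    # Field names are assumed, not taken from this thread.
    platform_id: int
    device_id: int
    global_work_size: int
    local_work_size: int
    hashes_number: int


def parse_device_line(line: str) -> DeviceConfig:
    """Parse one whitespace-separated device configuration line."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}")
    return DeviceConfig(*map(int, fields))


cfg = parse_device_line("0 1 1536 384 8192")
# Assuming one 256 KiB nonce per work item, the device buffer size in MiB is:
device_mib = cfg.global_work_size * 256 // 1024
print(cfg.hashes_number, device_mib)
```

Under that assumption a globalWorkSize of 1536 works out to 384 MiB, which appears to match the "Device memory: 384MB" line in the logs.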

Owner

bhamon commented Jun 2, 2017

@k06a Have you tried to change the 8192 to 4?

Author

k06a commented Jun 2, 2017

Just tried config:

0 1 1536 384 4

And got:

$ bin/gpuPlotGenerator.exe generate direct 18xxx_0_131072_65536
-------------------------
GPU plot generator v4.0.3
-------------------------
Author:   Cryo
Bitcoin:  138gMBhCrNkbaiTCmUhP9HLU9xwn5QKZgD
Burst:    BURST-YA29-QCEW-QXC3-BKXDL
----
Loading platforms...
Loading devices...
Loading devices configurations...
Initializing generation devices...
    [0] Device: Iris Pro (OpenCL 1.2 )
    [0] Device memory: 384MB
    [0] CPU memory: 384MB
Initializing generation contexts...
    [0] Path: 189xxxxxx_0_131072_131072
    [0] Nonces: 0 to 131071 (32GB 0MB)
    [0] CPU memory: 16GB 0MB
----
Devices number: 1
Plots files number: 1
Total nonces number: 131072
CPU memory: 16GB 384MB
----
Generating nonces...
9.38% (12288/131072 remaining nonces), 11170.91 nonces/minutes, ETA: 10m 38s...
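
The ETA in the last line can be checked by hand. The first number in parentheses looks like the count of nonces already generated (12288 is 9.375% of 131072), so this sketch divides the nonces still to do by the reported rate. How the tool actually derives the figure is an assumption here.

```python
total, done, rate = 131072, 12288, 11170.91  # rate in nonces per minute

remaining = total - done
eta_seconds = round(remaining / rate * 60)
minutes, seconds = divmod(eta_seconds, 60)
print(f"{remaining} nonces left, ETA: {minutes}m {seconds}s")
```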

Author

k06a commented Jun 2, 2017

Interesting fact: https://github.com/r-majere/mjminer runs at the same speed on my CPU when using the AVX2 instruction set:

Using AVX2 core.
Creating plots for nonces 0 to 131072 (34 GB) using 32768 MB memory and 8 threads
1.03% completed, 11505 nonces/minute, 0:11 left 

Author

k06a commented Jun 2, 2017

@bhamon why does your app suggest using 8192 instead of 4? :)

Owner

bhamon commented Jun 2, 2017

@k06a Good news, it works.

As for performance, OpenCL on your CPU's embedded GPU can't really go any faster than a well-optimized AVX2 implementation. The GPU plot generator mainly targets dedicated GPUs.

As for the auto-detection feature, I don't have any easy means of detecting whether the GPU is tied to your display or not. So by default I suggest 8192, and I added an entry to the FAQ (in README.md) to help solve this particular problem.

Author

k06a commented Jun 2, 2017

@bhamon are you sure you don't want to merge the include-related changes? Without them, OS X compilation is much harder (I am talking about hard-linking dirs and files).


gateway commented Jun 12, 2017

Did anyone make an OS X build? I have a hackintosh with an R290 and it would be great to use that to plot with.

Author

k06a commented Jun 13, 2017

@gateway this branch is fully compatible with macOS: https://github.com/k06a/gpuPlotGenerator/tree/feature/macos

It is partially merged into this repo. You can see all the changes on the third tab at the top of this page:
https://github.com/bhamon/gpuPlotGenerator/pull/17/files

Owner

bhamon commented Jun 13, 2017

@k06a I'm looking at a CMake integration, which would make it a lot more flexible to build on different OSs.

@gateway I don't own a Mac, but I'll borrow one to provide an OS X build for the next release ;)


gateway commented Jun 13, 2017

@bhamon let me know and I can beta test this! 🍻

Owner

bhamon commented Jun 17, 2017

@k06a @gateway The latest release (v4.1.0) embeds the new CMake build system. It also has native support for macOS (i.e. #include <OpenCL/cl.h>). I don't have time to test it for now. I'll provide macOS binaries asap. In the meantime, you can compile it from source.
