HPCC RandomAccess benchmark added. #286
Conversation
|
Hi Alex, good work. A couple questions:
|
Hi Brandon, thank you for your comments! I have fixed it. Could you please check the synchronization of delegates?
|
The test seems to work on small table sizes (and shows good performance), but as the size increases it segfaults. Any ideas on how to track it down?
|
Grappa::GlobalCompletionEvent randomaccess_gce;
template <Grappa::GlobalCompletionEvent * GCE = &randomaccess_gce >
You wouldn't necessarily need to make this a template parameter for run_random_access. Is there a reason to not just use randomaccess_gce directly?
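To make the two options in this comment concrete, here is a minimal sketch in plain C++. `CompletionEvent` is a hypothetical stand-in for `Grappa::GlobalCompletionEvent`, and `run_with_template`, `run_direct`, and `other_gce` are illustrative names that are not in the PR. It only shows what the pointer template parameter buys you (choosing a different event at compile time) compared with naming the global directly.

```cpp
// Hypothetical stand-in for Grappa::GlobalCompletionEvent (illustration only).
struct CompletionEvent {
    int enrolled = 0;
    void enroll() { ++enrolled; }
};

CompletionEvent randomaccess_gce;  // global event, as in the diff
CompletionEvent other_gce;         // a second event, only to show the difference

// Variant A: the event is a compile-time pointer parameter with a default.
template <CompletionEvent* GCE = &randomaccess_gce>
void run_with_template() {
    GCE->enroll();
}

// Variant B: refer to the global directly; simpler if only one event is ever used.
void run_direct() {
    randomaccess_gce.enroll();
}
```

Calling `run_with_template<>()` and `run_direct()` both touch `randomaccess_gce`; only `run_with_template<&other_gce>()` makes use of the extra flexibility. If no caller ever passes a different event, the direct form is probably enough.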
|
I can't tell at the moment why it's segfaulting. Can you enable backtraces or attach to it with GDB? (If you don't know how, I can explain or link you to the docs.)
|
Running in debug mode shows the following: |
|
Oh, for one thing, |
|
That's definitely a bug, thank you! I will check whether it has any influence on the segfault, although I think the segfault happens while the table is being allocated. The possible bug/problem is in
UPD: the issue is in the assignment of N: |
|
Ah, I did notice that but thought you were seeing the problem with scale = 24? |
|
Yes, but I multiply it by cores(). |
|
Now I am trying to run the HPCC RandomAccess test on large tables (2^28 words per core), and this causes problems with allocating memory for the table. Then I tried (as suggested) to increase I saw this same error when I tried to run Grappa without My experiments showed that I can't create more than roughly 25G of locale shared memory without violating the virtual memory limit. This seems strange to me, since the program has only just started executing and the heap should not be large... could there be a huge amount of static data? |
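For a rough sense of scale, the table footprint can be worked out directly. The sketch below assumes 8-byte table words (the usual HPCC RandomAccess word size; the PR's actual element type is not shown in this thread) and treats the number of cores per node as a hypothetical parameter, since it is not stated here.

```cpp
#include <cstdint>

// Bytes needed for one core's slice of the table:
// 2^log2_words words, each assumed to be a 64-bit (8-byte) value.
constexpr std::uint64_t table_bytes_per_core(unsigned log2_words) {
    return (std::uint64_t{1} << log2_words) * sizeof(std::uint64_t);
}

// Bytes needed on one node. cores_per_node is a hypothetical parameter;
// the thread does not say how many cores share a node.
constexpr std::uint64_t table_bytes_per_node(unsigned log2_words,
                                             unsigned cores_per_node) {
    return table_bytes_per_core(log2_words) * cores_per_node;
}

// 2^28 words * 8 bytes = 2^31 bytes = 2 GiB per core.
static_assert(table_bytes_per_core(28) == (std::uint64_t{1} << 31),
              "2^28 eight-byte words occupy 2 GiB");
```

Under these assumptions, 12 cores per node would already need 24 GiB for the table alone and 16 cores would need 32 GiB, so a ceiling of roughly 25G of locale shared memory can be reached by the table itself, without any large static data.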
|
Yeah, the memory allocation is a huge pain. We always meant to fix it for real and get rid of the need for the locale-shared-heap. That message actually told you to bump up I don't quite remember what happens when you increase the But if you increase |
|
You probably need both |
|
(the right choice for locale_shared_fraction will probably be between 0.7 and 0.85, depending on what else you need to allocate) |
|
Do you have the code of the RandomAccess benchmark isolated? I want to compile this benchmark outside of the HPCC suite.
Hi guys!
I want to compare different HPC runtimes on a set of benchmarks. As a first step, I have implemented HPCC RandomAccess on Grappa. It differs slightly from demo-gups* in that it is a bit closer to the original HPCC RandomAccess benchmark. If you are interested, you can include it in master.
Best,
Alex
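For readers who have not seen the benchmark, here is a serial sketch of the RandomAccess update loop in plain C++. It is not the Grappa code from this PR: there are no delegates or completion events, and the helper names (`next_random`, `random_access`, `make_table`) and the fixed seed of 1 are illustrative assumptions. It follows the usual HPCC scheme of a shift-and-XOR random stream with polynomial 0x7 and XOR updates into a power-of-two table.

```cpp
#include <cstdint>
#include <vector>

constexpr std::uint64_t POLY = 0x7;  // polynomial used by the HPCC random stream

// Advance the stream one step: shift left, XOR in POLY if the top bit was set.
inline std::uint64_t next_random(std::uint64_t ran) {
    return (ran << 1) ^ ((ran >> 63) ? POLY : 0);
}

// Build a table of 2^log2_size words initialised to table[i] = i.
std::vector<std::uint64_t> make_table(unsigned log2_size) {
    std::vector<std::uint64_t> table(std::uint64_t{1} << log2_size);
    for (std::uint64_t i = 0; i < table.size(); ++i) table[i] = i;
    return table;
}

// Apply `updates` XOR updates; table.size() must be a power of two.
void random_access(std::vector<std::uint64_t>& table, std::uint64_t updates,
                   std::uint64_t seed = 1) {
    const std::uint64_t mask = table.size() - 1;
    std::uint64_t ran = seed;
    for (std::uint64_t i = 0; i < updates; ++i) {
        ran = next_random(ran);
        table[ran & mask] ^= ran;  // the low bits of ran select the slot
    }
}
```

Because every update is an XOR, running the same update sequence a second time restores the initial table. That property gives a simple way to check a serial run, and it is one way a distributed version could be sanity-checked on small tables.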