@Spacechild1
Owner

@Spacechild1 Spacechild1 commented Feb 22, 2022

dummy PR for development

configure with --enable-parallel!

check block~-help.pd and clone-help.pd for help and usage examples.

[screenshot]

[screenshot]

@Spacechild1 Spacechild1 changed the base branch from master to develop February 22, 2022 12:10
@Spacechild1 Spacechild1 force-pushed the multi-threading branch 3 times, most recently from ce378da to c6f64bd Compare February 23, 2022 00:44
@Spacechild1 Spacechild1 force-pushed the multi-threading branch 3 times, most recently from 5f513e0 to f55c4b3 Compare March 31, 2022 11:18
@Spacechild1 Spacechild1 changed the base branch from develop to master March 31, 2022 11:37
@Spacechild1 Spacechild1 changed the base branch from master to develop March 31, 2022 11:37
@Spacechild1 Spacechild1 force-pushed the multi-threading branch 3 times, most recently from 3550101 to 73979e9 Compare September 17, 2022 13:16
* public API to manage the DSP thread pool
* private API to create DSP task queues and DSP tasks
* PD_DSPTHREADS define to enable/disable the thread pool at compile time
* add global DSP task queue
* reset and join DSP global task queue in dsp_tick()
* add "-threads" command line option to set number of DSP threads
* add thread local variable for the current DSP thread index
* if clocks are set/unset in the main thread (= 0), do as usual
* if clocks are set/unset in a DSP helper thread (> 0), cache the desired time point and put the clock on a lock-free stack
* at the end of dsp_tick() we take the list of deferred clocks and set/unset them for real.
* swapping/resizing an array doesn't require a DSP graph update
* in control objects, garrayref can be faster because the garray is cached
* in perform routines, garrayref allows thread-safe read or write access to garrays
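
The deferred-clock bullets above (cache the time point in a helper thread, push the clock on a lock-free stack, drain the list at the end of dsp_tick()) could be sketched roughly as follows. This is only an illustration using C11 atomics: `deferred_clock`, `clock_defer()` and `clock_takedeferred()` are hypothetical names, not the types or functions used in this branch, and the sketch assumes a clock is pushed at most once per tick.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a clock; NOT Pd's t_clock. */
typedef struct deferred_clock {
    double settime;               /* cached time point (a negative value could mean "unset") */
    struct deferred_clock *next;  /* link in the deferred stack */
} deferred_clock;

/* Head of the lock-free stack of deferred clocks. */
static _Atomic(deferred_clock *) deferred_head = NULL;

/* DSP helper thread: cache the desired time and push the clock.
 * Assumes the clock is not already on the stack. */
static void clock_defer(deferred_clock *c, double settime)
{
    assert(c != NULL);
    c->settime = settime;
    deferred_clock *old =
        atomic_load_explicit(&deferred_head, memory_order_relaxed);
    do {
        c->next = old; /* 'old' is refreshed by a failed compare-exchange */
    } while (!atomic_compare_exchange_weak_explicit(&deferred_head, &old, c,
        memory_order_release, memory_order_relaxed));
}

/* Main thread, end of the tick: detach the whole list with one exchange.
 * The caller then walks the list and sets/unsets each clock for real. */
static deferred_clock *clock_takedeferred(void)
{
    return atomic_exchange_explicit(&deferred_head, NULL,
        memory_order_acquire);
}
```

Because the consumer takes the entire list in a single exchange and never pops individual nodes, this pattern avoids the ABA problem that a general lock-free pop would have to deal with. Clocks come back in reverse push order.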
protect st_soundout with spinlocks
expands to CLASS_THREADSAFE if PD_PARALLEL is 1
* add a_numthreads to t_audiosettings
* add "threads" to preferences
* add "Audio threads" option in audio settings dialog
by default, the DSP graph uses the global signal context.
However, if a graph is processed in parallel, it needs its own signal context.
For this purpose, we can also temporarily push/pop a new signal context with signalcontext_push() and signalcontext_pop().
* add global DSP task queue
* parallel canvases use their own signal context
* vinlet makes a copy of the parent input signal (in the new context)
* voutlet uses double buffering: voutlet_dspprolog() writes last buffer to parent output signal, voutlet_dsp() writes input signal to buffer.
* if "join" is true, block~ has its own DSP task queue
* in ugen_done_graph(), we push the DSP task queue, reset it and finally join it. This will synchronize all parallel DSP tasks in subpatches or child abstractions.
* each DSP task object registers itself with every outer switch~ object. If a switch~ object changes its state, it notifies all its DSP task children, so they can in turn notify their DSP task queue.
* DSP tasks in switched-off parallel canvases are simply not scheduled.
* each DSP task queue maintains a count of switched-off DSP tasks, so that it won't lock up in case there is a switch~ *between* the queue and the task.
* if "parallel" is true, clone has its own signal context and DSP task queue.
* child abstractions are scheduled as DSP tasks and finally joined.
* the cloned abstractions don't need to know that they are being processed in parallel, we can carry out all necessary steps in clone_dsp().
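
The bookkeeping for switched-off tasks described above (a queue that counts disabled tasks so that joining does not wait for tasks that will never run) might look something like the sketch below. All names (`sketch_taskqueue`, `queue_notify()`, etc.) are made up for illustration and are not the private API of this branch.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical task queue; NOT the branch's real data structure. */
typedef struct {
    int total;            /* number of tasks registered with the queue */
    atomic_int disabled;  /* tasks currently switched off by a switch~ */
    atomic_int pending;   /* tasks that still have to finish this tick */
} sketch_taskqueue;

static void queue_init(sketch_taskqueue *q, int total)
{
    q->total = total;
    atomic_init(&q->disabled, 0);
    atomic_init(&q->pending, 0);
}

/* A task tells its queue that an outer switch~ turned it on or off. */
static void queue_notify(sketch_taskqueue *q, int on)
{
    atomic_fetch_add(&q->disabled, on ? -1 : 1);
}

/* Start of a tick: only the enabled tasks are expected to finish. */
static void queue_reset(sketch_taskqueue *q)
{
    int active = q->total - atomic_load(&q->disabled);
    assert(active >= 0);
    atomic_store(&q->pending, active);
}

/* Called by a task when it has finished its DSP work. */
static void queue_taskdone(sketch_taskqueue *q)
{
    atomic_fetch_sub(&q->pending, 1);
}

static int queue_isdone(sketch_taskqueue *q)
{
    return atomic_load(&q->pending) <= 0;
}

/* Join: wait until every enabled task has reported completion. */
static void queue_join(sketch_taskqueue *q)
{
    while (!queue_isdone(q)) { /* busy-wait; a real queue may also sleep */ }
}
```

Without the `disabled` count, `queue_join()` would wait forever for a task that sits behind a switched-off switch~ and is therefore never scheduled.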
dsptaskqueue_update() checks if the canvas sub tree contains non-thread-safe objects. Note that any non-thread-safe objects *outside* the tree are ignored. This function is called in ugen_done_graph() for canvases with block~ + "join" and also in ugen_start() for the toplevel queue.

dsptaskqueue_check() checks if the task queue is thread-safe, and if false, posts the first N non-thread-safe objects (only once per queue). This is called in ugen_done_graph() for canvases with block~ + "parallel".

Each parallel canvas checks if the enclosing joining canvas is thread-safe and prints an error message if false.

As an optimization, we traverse the complete object tree once in ugen_start() and mark every sub-tree (depth first) by setting gl_threadsafe. This speeds up subsequent calls to canvas_isthreadsafe() tremendously.
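
The one-pass, depth-first marking could be sketched as below. The `sketch_canvas` struct and `canvas_mark()` are hypothetical stand-ins: only the idea of caching a per-sub-tree flag (gl_threadsafe in the text above) is taken from the description, and the real canvas traversal in Pd looks different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical canvas node; NOT Pd's t_canvas/t_glist. */
typedef struct sketch_canvas {
    int own_safe;    /* 1 if all objects directly in this canvas are thread-safe */
    int threadsafe;  /* cached result for the whole sub-tree */
    struct sketch_canvas **children;
    size_t nchildren;
} sketch_canvas;

/* Depth-first traversal: mark every sub-tree and return its result.
 * All children are visited (no short-circuit) so every node gets marked. */
static int canvas_mark(sketch_canvas *x)
{
    assert(x != NULL);
    int safe = x->own_safe;
    for (size_t i = 0; i < x->nchildren; i++)
        if (!canvas_mark(x->children[i]))
            safe = 0;
    x->threadsafe = safe;
    return safe;
}

/* Later queries just read the cached flag in O(1). */
static int canvas_issafe(const sketch_canvas *x)
{
    return x->threadsafe;
}
```

A sub-tree is considered safe only if the canvas itself and all of its descendants are safe, so a single unsafe object taints every enclosing canvas but none of its siblings.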

clone is handled specially since it contains both the DSP task queue and the DSP tasks, so the check can be performed all inside clone_dsp().

Thread safety checks can be disabled with the "-nothreadsafe" command line option.
used in thread_physical_concurrency() and later for thread pinning
Instead of going to sleep every time the task queue is (temporarily) empty, the DSP helper threads only wake up once at the beginning of each DSP tick and then spin-wait until all tasks have finished.

Pro: minimize wake-up latency = more stable performance
Con: burning CPU cycles

Can be enabled/disabled with the "-spinwait" and "-nospinwait" command line options, respectively.
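
To illustrate the spin-wait idea, here is a rough sketch of what a helper thread might do within one tick: claim tasks while any are left, then busy-wait until all tasks (including those claimed by other threads) have finished. The names `sketch_tick`, `tick_init()` and `tick_help()` are invented for this example; the actual scheduling code in the branch is not shown in this conversation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef void (*sketch_taskfn)(void *);

typedef struct {
    sketch_taskfn fn;
    void *arg;
} sketch_task;

/* Hypothetical per-tick state shared by all DSP threads. */
typedef struct {
    sketch_task *tasks;
    int ntasks;
    atomic_int next;      /* index of the next unclaimed task */
    atomic_int finished;  /* number of tasks that have completed */
} sketch_tick;

static void tick_init(sketch_tick *t, sketch_task *tasks, int ntasks)
{
    assert(ntasks >= 0);
    t->tasks = tasks;
    t->ntasks = ntasks;
    atomic_init(&t->next, 0);
    atomic_init(&t->finished, 0);
}

/* Body of one helper thread for one tick (after it has been woken up). */
static void tick_help(sketch_tick *t)
{
    int i;
    /* 1) claim and run tasks as long as there are unclaimed ones */
    while ((i = atomic_fetch_add_explicit(&t->next, 1,
        memory_order_relaxed)) < t->ntasks)
    {
        t->tasks[i].fn(t->tasks[i].arg);
        atomic_fetch_add_explicit(&t->finished, 1, memory_order_release);
    }
    /* 2) spin instead of sleeping until every task has finished */
    while (atomic_load_explicit(&t->finished, memory_order_acquire)
        < t->ntasks)
    {
        /* busy-wait: burns CPU, but avoids wake-up latency */
    }
}

/* Tiny example task: increments an int counter. */
static void sketch_count(void *arg)
{
    ++*(int *)arg;
}
```

The second loop is where the trade-off listed above shows up: the thread stays hot and reacts immediately, at the cost of occupying a core for the rest of the tick.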
Allow pinning DSP threads to dedicated cores; useful for spin-waiting!

Can be enabled/disabled with the "-affinity" and "-noaffinity" command line options, respectively.
@Spacechild1 Spacechild1 force-pushed the multi-threading branch 3 times, most recently from 7f78314 to d35002f Compare September 17, 2022 15:48
@Spacechild1
Owner Author

@umlaeute What does the dist-check job do? It seems like I have to add s_spinlock.h, but I don't know where... "Normal" autotools builds work fine.

@umlaeute

umlaeute commented Sep 17, 2022 via email


@umlaeute umlaeute left a comment


actually the headers are not installed (hence the noinst_ prefix), but that is just as it should be (it's just that the commit message is a bit misleading) :-P

@Spacechild1
Owner Author

(it's just that the commit message is a bit misleading)

It was just for testing :-)

actually the headers are not installed (hence the noinst_ prefix)

Actually, s_spinlock.h should be installed (it's a public header), but s_sync.h should not. I hope I got it right...

Spacechild1 pushed a commit that referenced this pull request Oct 17, 2022
@danomatika

Can you add screenshots of the block and clone help patches?

@Spacechild1
Owner Author

@danomatika done :-)

#endif

/* override for parallel processing support */
#ifndef PD_PARALLEL

@danomatika danomatika Nov 7, 2022


Personally, I would call this PDPARALLEL without the underscore to better match the existing PDINSTANCE and PDTHREADS defines.

Owner Author

@Spacechild1 Spacechild1 Nov 7, 2022


Fair enough. On the other hand, there are quite a few constants/defines with a PD_ prefix:
PD_INTERNAL, PD_LONGINTTYPE, PD_FLOATSIZE, PD_BIGORSMALL, PD_<loglevel>, etc. Personally, I think PD_<name> is better style, but I don't care too much :-)

Owner Author


Generally, any suggestions for better naming are very much appreciated! (Naming things is hard...)

@Spacechild1 Spacechild1 changed the base branch from develop to master March 23, 2023 12:58
3 participants