@dzhang97
Looking at the timing breakdown, get_darkest_area appears to take the most time, about 67% of the total frame time.

When imageSkipSize and searchArea are equal, and searchArea is an exact multiple of internalSkipSize, the loops can be vectorized with numpy fairly easily. On my Pi 3, this made the function roughly 10x faster (~22 ms to ~2.3 ms), and overall frame time dropped from ~32 ms to ~13 ms. The gain may not scale the same way on newer Pis, but the numpy acceleration should still be a sizeable improvement.
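
For reference, here is a minimal sketch of the general idea, not the actual code in this PR. It assumes a 2D grayscale frame and that "darkest" means the lowest sum of sampled pixels per tile. The function name darkest_tile and its parameter names are hypothetical; search_area stands in for the shared imageSkipSize/searchArea value and internal_skip for internalSkipSize.

```python
import numpy as np


def darkest_tile(gray, search_area, internal_skip):
    """Return (x, y) of the top-left corner of the darkest tile.

    Hypothetical helper: assumes non-overlapping tiles (tile step equals
    tile size) and that search_area is a multiple of internal_skip.
    """
    if search_area % internal_skip != 0:
        raise ValueError("search_area must be a multiple of internal_skip")

    height, width = gray.shape
    rows, cols = height // search_area, width // search_area
    if rows == 0 or cols == 0:
        raise ValueError("image is smaller than one search area")

    # Drop the partial tiles at the right and bottom edges.
    cropped = gray[:rows * search_area, :cols * search_area]

    # Keep every internal_skip-th pixel. Because search_area is a multiple
    # of internal_skip, every tile keeps exactly n x n samples.
    sampled = cropped[::internal_skip, ::internal_skip]
    n = search_area // internal_skip

    # Split into (tile_row, sample_row, tile_col, sample_col) and sum the
    # samples of each tile in one vectorized call.
    tiles = sampled.reshape(rows, n, cols, n)
    sums = tiles.sum(axis=(1, 3), dtype=np.int64)

    # Locate the tile with the smallest sum.
    r, c = np.unravel_index(np.argmin(sums), sums.shape)
    return int(c) * search_area, int(r) * search_area
```

The reshape only works because the tiles do not overlap and each one contains a whole number of samples, which is where the parameter constraints above come from.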

I can clean up this PR if there is interest. The current implementation only works under the parameter constraints above, but I think it should be possible to generalize the design.

@NuncObdurat
Contributor

Thanks! I'll try to test this in the coming weeks.
