Hardware sqrt helper functions #37

@AngelTomkins

In #32 it was shown that the hardware sqrt operation can replace a software one in nds_renderer.c. The default libnds sqrt function has extra checks that are unnecessary, and it does not support async computation. libnds' sqrt checks that the hardware math unit is not busy before sending the value. According to my testing, the operation only takes ~30 main bus cycles, so unless two hardware operations run back to back, there is no reason to check the busy flag twice.

BlocksDS takes a simpler approach: it provides async functions for sending values to the hardware math coprocessor, so the CPU can do other work while the hardware math completes. In testing, I replaced the sqrt call referenced in #32 with an async send, followed by the (s > 0) check, and only then a wait for the hardware to finish the operation. This saves 10-20 microseconds per frame.

// Normalize the result
int s = (lights[i].nx * lights[i].nx + lights[i].ny * lights[i].ny + lights[i].nz * lights[i].nz) >> 8;

// Send the square root value to the hardware before comparing (s > 0); this saves 10-20 microseconds.
// devkitPro's libnds does not have helper functions for async hardware math. This should be
// put into a helper function.
REG_SQRTCNT = SQRT_64;
REG_SQRT_PARAM = (s64)s << 16;
if (s > 0) {
    while (REG_SQRTCNT & SQRT_BUSY);
    s = REG_SQRT_RESULT;
    lights[i].nx = (lights[i].nx << 16) / s;
    lights[i].ny = (lights[i].ny << 16) / s;
    lights[i].nz = (lights[i].nz << 16) / s;
}

If we do not switch to BlocksDS, I propose we at least put these functions in a header file for better interoperability with the hardware. My question is where this function should go so that code other than nds_renderer.c can use it if it proves useful later. Should it be a function or a preprocessor define: #define sqrt_asynch(x) ...?
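
To make the function-versus-macro question concrete, here is a minimal sketch of what such a header could look like, split into a "start" and a "wait" half. The names hw_sqrt64_start and hw_sqrt64_wait are hypothetical (they are not libnds or BlocksDS functions), and the sketch assumes the build defines ARM9 and that <nds.h> provides the same register names used in the snippet above. The non-ARM9 branch only contains stand-ins so the file compiles off-target; it does not emulate the hardware.

```c
#include <stdint.h>

#ifdef ARM9
#include <nds.h> /* REG_SQRTCNT, REG_SQRT_PARAM, REG_SQRT_RESULT, SQRT_64, SQRT_BUSY */
#else
/* Host-only stand-ins so the sketch compiles off-target.
 * They do NOT emulate the hardware: nothing computes a square root here. */
static volatile uint16_t REG_SQRTCNT;
static volatile int64_t  REG_SQRT_PARAM;
static volatile uint32_t REG_SQRT_RESULT;
#define SQRT_64   1
#define SQRT_BUSY (1 << 15)
#endif

/* Hypothetical helper: start a 64-bit square root and return immediately.
 * Assumption: no other square root is still in flight, so the busy check
 * before the write is skipped, as proposed in this issue. */
static inline void hw_sqrt64_start(int64_t value)
{
    REG_SQRTCNT = SQRT_64;
    REG_SQRT_PARAM = value;
}

/* Hypothetical helper: spin until the unit is idle, then read the result. */
static inline uint32_t hw_sqrt64_wait(void)
{
    while (REG_SQRTCNT & SQRT_BUSY)
        ;
    return REG_SQRT_RESULT;
}
```

With helpers like these, the snippet above would call hw_sqrt64_start((s64)s << 16) before the (s > 0) check and s = hw_sqrt64_wait() inside it. A static inline function in a header should compile to the same register writes as a macro, while keeping type checking on the argument and avoiding double evaluation, which is one argument for preferring it over #define sqrt_asynch(x).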
