Added documentation regarding bkpt and related registers #38
LoekHabets wants to merge 1 commit into maazl:master
Conversation
Hmm, direct scheduling has the major disadvantage that it may cause race conditions with other code accessing the hardware registers. Furthermore, polling hogs the ARM CPU. On the other hand, the mailbox interface of the firmware adds a considerable delay. Maybe another approach might work as well. I did not check whether a race-condition-free check for running code is possible. If it is, an IOCTL for polling (at a low rate) could be implemented in vcio2 for long-running QPU code. Unfortunately, it might give false negative results when another QPU program starts within the polling interval.
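For reference, a minimal sketch of what direct scheduling from user space amounts to, illustrating both problems: the two queue writes can race with any other code touching the V3D registers, and waiting for completion degenerates into a busy poll on the ARM. The offsets (SRQPC 0x430, SRQUA 0x434, SRQCS 0x43c) and the completed-count field in SRQCS bits 16..23 follow the public VideoCore IV documentation and the hello_fft example; the peripheral base address, the /dev/mem mapping and the absent error handling are simplifications for illustration only.

```c
/* Rough sketch of direct scheduling from user space. Nothing stops other
 * code (firmware, the vc4 kernel driver, another process) from touching
 * the same registers concurrently, and completion has to be busy-polled.
 * Offsets per the VideoCore IV docs / hello_fft; base address is Pi 1. */
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>

#define V3D_BASE   0x20c00000u   /* BCM2835; differs on later models        */
#define V3D_SRQPC  (0x430 / 4)   /* user program request: code address      */
#define V3D_SRQUA  (0x434 / 4)   /* user program request: uniforms address  */
#define V3D_SRQCS  (0x43c / 4)   /* user program request control/status     */

static volatile uint32_t *v3d;

static void qpu_run_direct(uint32_t code_bus_addr, uint32_t unif_bus_addr)
{
    /* Race window: another register user may queue a request between
     * these two writes or reset SRQCS underneath us. */
    v3d[V3D_SRQUA] = unif_bus_addr;
    v3d[V3D_SRQPC] = code_bus_addr;   /* writing SRQPC queues the program */
}

static void qpu_wait_direct(unsigned expected_done)
{
    /* Busy poll on the "programs completed" count in SRQCS bits 16..23
     * (the count may need resetting via SRQCS before queuing). This hogs
     * one ARM core for the whole run time of the QPU program. */
    while (((v3d[V3D_SRQCS] >> 16) & 0xff) < expected_done)
        ;
}

int main(void)
{
    int fd = open("/dev/mem", O_RDWR | O_SYNC);
    v3d = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE, MAP_SHARED, fd, V3D_BASE);
    /* ... allocate and fill GPU memory, then qpu_run_direct()/qpu_wait_direct() ... */
    return 0;
}
```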
Okay, in that case I suppose it is better to remove that paragraph until there is a safer solution, do you agree? I am out of my depth when it comes to writing Linux kernel drivers, but do you think it would be possible to handle direct register access via vcio2 and put a global mutex on the hardware registers? As far as I can tell, it is not necessary to poll the DBQITC register from the kernel driver, since the driver can handle the IRQ itself. Low-rate polling of the HLT register is fine, because it will never change state until it (or RUN) is written to anyway. We just need to know which QPUs got which threads.

Unfortunately, I have not yet found a way to influence, or even retrieve, which QPU a new thread is scheduled onto. Oddly enough, it appears that using the mailbox causes the QPUs to be scheduled in ascending order, while direct scheduling does so in descending order. Perhaps it prefers scheduling new threads onto unhalted QPUs first? I need to check this.

One point we are at risk of missing here is that by supporting the scheduling of new QPU threads while others are still running, we expose the fundamentally unsafe nature of the GPU's memory model: you must use the VPM to be able to write data back into RAM, and as far as I am aware it is not possible to allocate sections of VPM to individual QPUs in hardware beyond "all user programs get these 4 KiB". Different code may have different ways of handling VPM safety, so hybrid scheduling is risky business unless there is some standardized way to do this. I have been thinking about writing a shared .qinc file that contains some macros that "safe" programs can use, but that is beyond the scope of this discussion for now.
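To make the low-rate HLT polling idea a bit more concrete, here is a hedged sketch. The register names and offsets (DBQRUN 0x0e20, DBQHLT 0x0e24) are taken from the Linux vc4 driver headers; that a QPU halted by bkpt shows up as a readable bit in DBQHLT, and that setting the corresponding bit in DBQRUN resumes it, is my reading of the discussion above and is an assumption, not verified behaviour.

```c
/* Sketch: read back the HLT register to see which QPUs are currently
 * halted (e.g. after a bkpt) and resume them via RUN. Offsets per the
 * Linux vc4 driver headers; read-back/resume semantics are assumed. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define V3D_BASE    0x20c00000u  /* BCM2835 peripheral window (Pi 1)           */
#define V3D_DBQRUN  (0xe20 / 4)  /* write: 1 bits resume halted QPUs (assumed) */
#define V3D_DBQHLT  (0xe24 / 4)  /* read:  1 bits = halted QPUs (assumed)      */

int main(void)
{
    int fd = open("/dev/mem", O_RDWR | O_SYNC);
    volatile uint32_t *v3d =
        mmap(NULL, 0x1000, PROT_READ | PROT_WRITE, MAP_SHARED, fd, V3D_BASE);

    /* A halted QPU stays halted until RUN (or HLT) is written, so the state
     * is stable and a coarse polling interval cannot miss anything. */
    uint32_t halted;
    while ((halted = v3d[V3D_DBQHLT] & 0xfff) == 0)  /* 12 QPUs -> 12 bits */
        usleep(10000);                               /* 10 ms is plenty    */

    for (int q = 0; q < 12; q++)
        if (halted & (1u << q))
            printf("QPU %d is halted\n", q);

    v3d[V3D_DBQRUN] = halted;    /* let them continue past the breakpoint */
    return 0;
}
```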
I have done some research into the breakpoint instruction (bkpt) and some of the undocumented registers that are intended to be used with it. I figured I should put my findings in the Addendum.
I'd like to bring some extra attention to this paragraph:
I am not sure this should be included at all. After all, this is a Linux-specific problem; as a user-space program we do not have access to hardware interrupts. However, it also raises the question whether direct scheduling could be added to vcio2 in the future, and whether it would then be possible to bring IRQ functionality back while retaining the option to run programs asynchronously or with very long timeouts.
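To illustrate what that could look like, without claiming anything about the actual driver, here is a purely hypothetical sketch of an asynchronous, IRQ-backed ioctl interface for direct scheduling. None of these names, numbers or structures exist in vcio2 today (the device node name is also assumed); they only sketch the idea of queueing a job and sleeping until the completion IRQ instead of polling.

```c
/* Hypothetical vcio2 extension for asynchronous direct scheduling.
 * Everything here is illustrative only and does not exist in the driver. */
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/ioctl.h>

struct vcio2_qpu_job {            /* hypothetical */
    uint32_t num_qpus;            /* how many QPU threads to launch             */
    uint32_t control;             /* bus address of (uniforms, code) pairs      */
    uint32_t timeout_ms;          /* 0 = return immediately, wait separately    */
};

/* Hypothetical ioctl numbers, for illustration only. */
#define VCIO2_IOC_MAGIC          'v'
#define VCIO2_IOC_EXEC_QPU_ASYNC _IOW(VCIO2_IOC_MAGIC, 0x20, struct vcio2_qpu_job)
#define VCIO2_IOC_WAIT_QPU       _IOW(VCIO2_IOC_MAGIC, 0x21, uint32_t)

int main(void)
{
    int fd = open("/dev/vcio2", O_RDWR);          /* device node name assumed  */
    struct vcio2_qpu_job job = {
        .num_qpus   = 12,
        .control    = 0,          /* bus address of (uniforms, code) pairs      */
        .timeout_ms = 0,
    };
    ioctl(fd, VCIO2_IOC_EXEC_QPU_ASYNC, &job);    /* driver pokes SRQUA/SRQPC   */
    /* ... do other work on the ARM ... */
    uint32_t timeout = 5000;
    ioctl(fd, VCIO2_IOC_WAIT_QPU, &timeout);      /* sleeps until the QPU IRQ   */
    return 0;
}
```

The point of such an interface would be that all register access is serialized behind one lock inside the kernel, so user processes cannot race each other the way raw /dev/mem access can, while the completion IRQ removes the need for any polling on the ARM side.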