Conversation
Add an explicit check for the .kernel:<filename> section that contains the compressed kernel binary for a zImage. If it's not present then we're probably trying to kexec an ordinary binary and we emit a warning to indicate the user might be making poor life decisions. Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
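A rough sketch of what such a check could look like, assuming the section names have already been pulled out of the ELF section header string table. The helper name `has_kernel_section` and the flat array of names are hypothetical and not taken from kexec-lite:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Prefix of the section that holds the compressed kernel in a zImage. */
#define KERNEL_SECTION_PREFIX ".kernel:"

/*
 * Hypothetical helper: return true if any section name starts with
 * ".kernel:". The caller is assumed to have collected the names already.
 */
bool has_kernel_section(const char *const *names, size_t nr_names)
{
	size_t prefix_len = strlen(KERNEL_SECTION_PREFIX);

	for (size_t i = 0; i < nr_names; i++) {
		if (strncmp(names[i], KERNEL_SECTION_PREFIX, prefix_len) == 0)
			return true;
	}
	return false;
}
```

If this returns false the loader would print the warning and carry on, since the check is advisory rather than fatal.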
Add the automake dependency and add fedora package names. Spending more than half a second thinking about this stuff is too much. Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Currently for ABI v2 binaries we jump to the entry point in the ELF header. For a vmlinux this works because the entry point is 0xc000000000000000 and the upper bits of the address are ignored in real mode. For a zImage the entry point is 0x20000000 and as a result kexec will jump to 0x20000000 + load_address which typically results in dying in a fire. Fix this by turning the entry point into an offset into the PT_LOAD segment and jumping to that instead. This patch allows us to kexec into a zImage directly, which rules. Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
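The entry point translation can be sketched roughly as follows. This is not the code from the patch: `struct load_segment` is a hypothetical, trimmed-down stand-in for the `p_vaddr` and `p_memsz` fields of a PT_LOAD program header, and `entry_to_offset` is an illustrative name.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down view of a PT_LOAD program header. */
struct load_segment {
	uint64_t vaddr;	/* p_vaddr: link-time address of the segment */
	uint64_t memsz;	/* p_memsz: size of the segment in memory */
};

/*
 * Turn the ELF entry point into an offset within the PT_LOAD segment that
 * contains it. Returns 0 on success, -1 if no segment covers the entry.
 */
int entry_to_offset(const struct load_segment *segs, size_t nr_segs,
		    uint64_t entry, size_t *seg_idx, uint64_t *offset)
{
	for (size_t i = 0; i < nr_segs; i++) {
		if (entry >= segs[i].vaddr &&
		    entry - segs[i].vaddr < segs[i].memsz) {
			*seg_idx = i;
			*offset = entry - segs[i].vaddr;
			return 0;
		}
	}
	return -1;
}
```

With this approach a vmlinux linked at 0xc000000000000000 and a zImage linked at 0x20000000 both yield a small offset, and the jump target becomes the address the segment was actually loaded at plus that offset.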
|
Thanks Oliver, this looks good. I tried kexec'ing a zImage and a separate initrd, and the initrd got corrupted. Unfortunately the zImage code ignores the reserved fields in the fdt: And kexec-lite places the initrd right after the kernel: I added 16MB of padding between the kernel and initrd, and the corruption went away. It would be good to fix the zImage wrapper to understand the reserved entries, but a few ideas to handle existing kernels:
I also note that kexec-lite is setting r7 (ima_size) to 0, which I presume means we initialize the memory allocator to (-_end). |
|
The wrapper always starts allocating from _end up until ima_size. Passing a zero IMA size results in the heap size calculation underflowing to something large so everything works out just fine. Cool... Anyway the zImage does emit some warnings due to passing an ima_size of zero, but the OPAL console backend has been broken for a while so I never saw them. Fixing that results in: These warnings only really matter on actual ePAPR platforms where the IMA is a real thing, but we can squash them easily enough by passing mem_top from the kexec memory map as the IMA size. Anyway, for fixing the corruption I think placing the initrd before the zImage is probably safer. The only time the wrapper does any large (multi-MB) allocations is when it detects that the uncompressed vmlinux will overlap with the initrd, in which case it allocates a new buffer and moves the initrd into it. I'm a little surprised we haven't run into the problem of the uncompressed vmlinux overlapping with the zImage text, which is a fatal error. At a guess it's because kexec-lite starts allocating space for the loaded segments at linux,kernel-end and the new vmlinux is similarly sized to the old one. I'm not sure what the right fix is here since we fundamentally don't know how big the new kernel is unless we decompress it inside of kexec. In skiboot we just load the zImage at 0x20000000 and assume that's enough for the new kernel. In that situation we don't need to deal with potentially large initrds though so that's probably not a good idea here. |
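To illustrate the underflow mentioned at the start of this comment, here is a minimal model of the heap size calculation. It assumes the wrapper computes the heap as `ima_size - _end` in a 64-bit unsigned type; the function name and the exact types are assumptions and the real wrapper code may differ:

```c
#include <stdint.h>

/*
 * Hypothetical model of the wrapper's heap sizing: the heap is assumed to
 * span from _end up to ima_size. With unsigned arithmetic the subtraction
 * wraps modulo 2^64 whenever ima_size is smaller than _end.
 */
uint64_t heap_size(uint64_t ima_size, uint64_t end)
{
	return ima_size - end;
}
```

With `ima_size` of zero the result is `-_end` interpreted as an unsigned value, which is a very large number, so the allocator never runs out of space even though the input was bogus.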
The ePAPR entry ABI expects to be passed the amount of accessible memory (aka the initial mapped area) in r7. The IMA concept only really makes sense on BookE parts which always have the MMU enabled, but don't necessarily have all of memory mapped. On PowerNV (where kexec-lite is mainly used) we can always access all of memory in real mode. The zImage will emit warnings if the dtb or initrd are outside the RMA so we should pass a sane value for the IMA size from the trampoline. This patch makes the trampoline pass the mem_top that was calculated when building the kexec memory map as the IMA size. Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Currently we reserve the memory from 0...linux,kernel-end in the kexec memory map which prevents the current kernel's text from being overwritten when we assemble the kexec segment buffers at kexec time. This reservation is only really needed when loading a crashkernel since it wants to preserve the existing kernel text to create a crashdump. For a generic vmlinux the reservation is mostly pointless since most kernels will copy themselves down to zero in early boot anyway. When a zImage is loaded we do need to keep the start of memory reserved because the zImage will extract the vmlinux to zero on ePAPR platforms. By luck reserving up until linux,kernel-end will usually reserve enough space for the new kernel's text. This does however break down when the new kernel is significantly larger than the previous one (e.g. it contains a large builtin initramfs). This patch replaces the 0...linux,kernel-end reservation with a fixed 256MB reservation, which ought to be enough for anybody. Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
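As an illustration of the fixed reservation, the sketch below clips the low 256MB out of a free memory region before any kexec segment is placed in it. `struct region`, `KERNEL_RESERVE_SIZE` and `clip_low_reservation` are hypothetical names and do not reflect how kexec-lite actually represents its memory map:

```c
#include <stdint.h>

/* Fixed reservation at the start of memory for the new kernel (256MB). */
#define KERNEL_RESERVE_SIZE (256ULL * 1024 * 1024)

/* Hypothetical free-memory region covering [start, end). */
struct region {
	uint64_t start;
	uint64_t end;
};

/*
 * Remove the reserved low range from a free region. Returns 1 if some
 * usable space remains, 0 if the region lies entirely inside the
 * reservation and must not be used for kexec segments.
 */
int clip_low_reservation(struct region *r)
{
	if (r->end <= KERNEL_RESERVE_SIZE)
		return 0;
	if (r->start < KERNEL_RESERVE_SIZE)
		r->start = KERNEL_RESERVE_SIZE;
	return 1;
}
```

Unlike a reservation that ends at linux,kernel-end, the bound here does not depend on the size of the currently running kernel.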
This property is generated by Linux at boot time and shouldn't be included in the dtb. Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
The zImage uses the memory immediately after it as a heap area. This can result in data corruption if we load another kexec segment after the zImage (e.g. initrd) so add some padding to the kernel to prevent this situation. Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
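The effect of the padding can be sketched as a calculation of where the next segment is allowed to start. The 4MB heap size matches the figure mentioned elsewhere in this thread, while the 64KB alignment and all names below are assumptions made for the example:

```c
#include <stdint.h>

/* Heap space left after the zImage (4MB, per the discussion in this thread). */
#define ZIMAGE_HEAP_PAD (4ULL * 1024 * 1024)

/* Assumed segment alignment for this sketch (64KB). */
#define SEGMENT_ALIGN 0x10000ULL

/*
 * Hypothetical helper: lowest address at which the segment following the
 * kernel (e.g. the initrd) may be placed, leaving the zImage its heap.
 */
uint64_t next_segment_start(uint64_t kernel_start, uint64_t kernel_size)
{
	uint64_t end = kernel_start + kernel_size + ZIMAGE_HEAP_PAD;

	/* Round up to the next alignment boundary. */
	return (end + SEGMENT_ALIGN - 1) & ~(SEGMENT_ALIGN - 1);
}
```

Any allocation the wrapper makes from its heap then lands in the padding rather than in the initrd.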
|
Fixed the IMA problem and added some padding to the kernel image so the zImage will have a bit of heap space if it needs it. I tested it on a barreleye and it only allocated ~400kB so a 4MB heap should be plenty. I found another problem with how we use linux,kernel-end too. Currently we reserve from 0...linux,kernel-end in the kexec memory map, but there's no real reason to do so for a vmlinux since it will copy itself over the previous kernel anyway. When loading a zImage we do need to reserve space at zero because that's where it will decompress the image to. However, if the new vmlinux happens to be bigger than the old one, reserving up to linux,kernel-end isn't enough and the new kernel will overlap with the zImage, which causes it to abort. I've fixed this by just reserving the first 256MB for the new kernel, hopefully that'll be enough. |
Fixes the entry point calculation so you can kexec into a zImage directly