
System Call

KeePromise edited this page Apr 26, 2021 · 1 revision

1.Design

System calls are implemented with the syscall and sysretq instructions. After syscall enters the kernel, the handler keeps running on the stack of the user address space; the kernel does not switch stacks manually. The handlers are written in C, so the design relies on the C calling convention: arguments are passed in the RDI, RSI, RDX, RCX, R8 and R9 registers, and RAX holds the return value. By wrapping syscall and sysretq around that convention, a system call looks like an ordinary C function call to its caller.

2.Implementation

1.SystemCallTabel

The kernel maintains a SystemCallTabel that stores a function pointer for each system call. Every system call handler takes 5 parameters, but the user layer passes 6: the first is the system call number, which indexes the handler in SystemCallTabel, and the remaining 5 are the handler's own parameters.

// The system call table supports at most 256 system calls
struct SystemCallTabel{
    unsigned long (* fun[256])(unsigned long ,unsigned long,unsigned long,unsigned long,unsigned long);
};
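
As a rough illustration of how such a table can be filled and indexed, the following user-space sketch mirrors the layout above. It is an ordinary C program, not kernel code, and the names demo_table, table_init, table_add, table_dispatch, default_handler and add_two are hypothetical, not taken from KePOS. Returning all ones for an unknown number and rejecting out-of-range numbers are choices made for this sketch only.

```c
#include <assert.h>
#include <stddef.h>

#define SYSCALL_MAX 256  /* same capacity as the table in the text */

/* Every handler takes five word-sized arguments and returns one word. */
typedef unsigned long (*syscall_fn)(unsigned long, unsigned long,
                                    unsigned long, unsigned long,
                                    unsigned long);

struct SystemCallTabel {
    syscall_fn fun[SYSCALL_MAX];
};

struct SystemCallTabel demo_table;  /* hypothetical demo instance */

/* Hypothetical default handler: reports "not implemented" as all ones. */
unsigned long default_handler(unsigned long a, unsigned long b,
                              unsigned long c, unsigned long d,
                              unsigned long e)
{
    (void)a; (void)b; (void)c; (void)d; (void)e;
    return (unsigned long)-1;
}

/* Hypothetical handler used only to show a successful dispatch. */
unsigned long add_two(unsigned long a, unsigned long b,
                      unsigned long c, unsigned long d,
                      unsigned long e)
{
    (void)c; (void)d; (void)e;
    return a + b;
}

/* Point every slot at the default handler so no entry is left NULL. */
void table_init(void)
{
    for (size_t i = 0; i < SYSCALL_MAX; i++)
        demo_table.fun[i] = default_handler;
}

/* Install a handler under the given system call number. */
void table_add(syscall_fn fn, unsigned long num)
{
    assert(num < SYSCALL_MAX);
    demo_table.fun[num] = fn;
}

/* The first argument selects the handler; the other five are forwarded. */
unsigned long table_dispatch(unsigned long vector,
                             unsigned long a, unsigned long b,
                             unsigned long c, unsigned long d,
                             unsigned long e)
{
    if (vector >= SYSCALL_MAX)  /* never index past the end of the table */
        return default_handler(a, b, c, d, e);
    return demo_table.fun[vector](a, b, c, d, e);
}
```

After table_init() and table_add(add_two, 5), a call such as table_dispatch(5, 40, 2, 0, 0, 0) reaches add_two, while any other number falls through to default_handler.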

2.User mode part

This is the declaration of the user-mode entry function. When another function calls it, the 6 arguments are placed in the RDI, RSI, RDX, RCX, R8 and R9 registers.

// User-mode entry function declaration
extern unsigned long systemIn(unsigned long vector,unsigned long vir1,unsigned long vir2,unsigned long vir3,unsigned long vir4,unsigned long vir5);

The definition of systemIn copies the value of rcx into r10. The syscall instruction saves the address of the next instruction (the system call return address) in rcx, so the argument passed in rcx must be moved to r10 before syscall executes.

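__asm__ (
"systemIn: \n\t"
    "pushq %r10 \n\t"          /* preserve the caller's r10 */
    "movq %rcx , %r10 \n\t"    /* syscall overwrites rcx, so keep the 4th argument in r10 */
    "syscall \n\t"             /* enter the kernel; the return address is saved in rcx */
    "popq %r10 \n\t"           /* restore r10 */
    "retq \n\t"                /* the return value is already in rax */
);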

3.Kernel mode part

When user mode executes the syscall instruction, execution jumps to the kernel's systemIn entry point. The address of systemIn must therefore be registered in the LSTAR MSR (0xc0000082).

initSystemCall sets up the MSRs that the syscall and sysret instructions depend on. addSysCall(sys_no,i) registers a handler in SystemCallTabel; after writing the MSRs, initSystemCall points every entry of the table at the default handler sys_no.

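// Default handler for every system call number that has not been registered
unsigned long sys_no(unsigned long vir1,unsigned long vir2,unsigned long vir3,unsigned long vir4,unsigned long vir5){
    color_printk(RED,WHITE,"no syscall \n");
    return 0;
}
void initSystemCall(){
    // STAR[47:32]=0x0008: selector base loaded by syscall; STAR[63:48]=0x0013: selector base used by sysret
    unsigned long STAR = 0x0013000800000000;
    unsigned long LSTAR = (unsigned long )systemIn;  // 64-bit syscall entry point
    unsigned long i ;
    wrmsr(0xc0000081,STAR);   // STAR
    wrmsr(0xc0000082,LSTAR);  // LSTAR
    wrmsr(0xc0000084,0x0);    // SFMASK: no RFLAGS bits are cleared on entry

    for(i=0;i<256;i++){
        addSysCall((unsigned long)sys_no,i);
    }
}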
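
To make the STAR constant easier to read, the sketch below rebuilds it from its two selector fields and derives the selectors that syscall and a 64-bit sysret load from it. The helper names (make_star, syscall_cs, syscall_ss, sysret_cs, sysret_ss, split_msr_value) are hypothetical and only illustrate the bit layout; nothing here executes wrmsr, so it runs as a normal user-space program.

```c
#include <assert.h>
#include <stdint.h>

/* MSR numbers as used in initSystemCall(). */
#define MSR_STAR   0xC0000081u
#define MSR_LSTAR  0xC0000082u
#define MSR_SFMASK 0xC0000084u

/* Build a STAR value: bits 47:32 hold the SYSCALL selector base and
 * bits 63:48 hold the SYSRET selector base. The low 32 bits (the
 * legacy-mode entry point) are left zero here. */
uint64_t make_star(uint16_t syscall_base, uint16_t sysret_base)
{
    return ((uint64_t)sysret_base << 48) | ((uint64_t)syscall_base << 32);
}

/* Selectors loaded by SYSCALL: CS = base, SS = base + 8. */
uint16_t syscall_cs(uint64_t star) { return (uint16_t)(star >> 32); }
uint16_t syscall_ss(uint64_t star) { return (uint16_t)((star >> 32) + 8); }

/* Selectors loaded by a 64-bit SYSRET: CS = base + 16, SS = base + 8. */
uint16_t sysret_cs(uint64_t star) { return (uint16_t)((star >> 48) + 16); }
uint16_t sysret_ss(uint64_t star) { return (uint16_t)((star >> 48) + 8); }

/* WRMSR takes the 64-bit value split across EDX (high) and EAX (low). */
void split_msr_value(uint64_t value, uint32_t *edx, uint32_t *eax)
{
    assert(edx && eax);
    *edx = (uint32_t)(value >> 32);
    *eax = (uint32_t)value;
}
```

With make_star(0x0008, 0x0013) the result is 0x0013000800000000, the value written above. Under this layout syscall loads CS=0x08 and SS=0x10, and sysretq returns with CS=0x23 and SS=0x1B.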

The kernel-mode systemIn function:

After entering the kernel, systemIn first pushes rcx (the return address saved by syscall) onto the stack, then moves the argument held in r10 back into rcx, and finally calls the C function systemCall.

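// Entry function for system calls after entering the kernel
extern unsigned long systemIn();
__asm__ (
"systemIn: \n\t"
    "pushq %rcx \n\t"                      /* save the user return address written by syscall */
    "movq %r10 , %rcx \n\t"                /* put the 4th C argument back in rcx */
    "leaq systemCall(%rip) , %rax \n\t"    /* load the address of the C dispatcher */
    "callq *%rax \n\t"                     /* systemCall(rdi,rsi,rdx,rcx,r8,r9); result in rax */
    "popq %rcx \n\t"                       /* reload the return address for sysretq */
    "sysretq    \n\t"                      /* return to user mode at rcx (RFLAGS is reloaded from r11) */
);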

The systemCall function uses the system call number in RDI to select a handler from SystemCallTabel and passes it the five arguments that the user placed in RSI, RDX, RCX, R8 and R9.

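unsigned long systemCall(unsigned long vector,unsigned long vir1,unsigned long vir2,unsigned long vir3,unsigned long vir4,unsigned long vir5){
    // Out-of-range numbers go to the default handler instead of indexing past the 256-entry table
    if(vector >= 256) return sys_no(vir1,vir2,vir3,vir4,vir5);
    return systemCallTabel.fun[vector](vir1,vir2,vir3,vir4,vir5);
}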

When systemCall returns, the return address saved on the top of the stack is popped into rcx, and sysretq returns to user mode with the return value in rax. This completes a system call that is transparent to the calling user-mode function.
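
The rcx/r10 handoff can be traced with a small user-space model. This is only a sketch: struct regs, model_handler and model_round_trip are hypothetical names, the stack is a two-slot array, and r11/RFLAGS and privilege changes are not modelled. The handler simply sums its five arguments so the result is easy to check.

```c
#include <assert.h>

/* Only the registers that matter for the round trip are modelled. */
struct regs {
    unsigned long rax, rdi, rsi, rdx, rcx, r8, r9, r10, rip;
};

/* Hypothetical stand-in for the C dispatcher: sums its five arguments. */
unsigned long model_handler(unsigned long vector,
                            unsigned long a1, unsigned long a2,
                            unsigned long a3, unsigned long a4,
                            unsigned long a5)
{
    (void)vector;
    return a1 + a2 + a3 + a4 + a5;
}

/* Walk one system call through the user stub, SYSCALL, the kernel stub
 * and SYSRETQ. return_rip is the address SYSCALL would record in rcx. */
void model_round_trip(struct regs *r, unsigned long return_rip)
{
    unsigned long stack[2];  /* the shared user stack */
    int sp = 0;

    assert(r);

    /* user stub: pushq %r10 ; movq %rcx,%r10 */
    stack[sp++] = r->r10;
    r->r10 = r->rcx;

    /* syscall: the hardware stores the return address in rcx */
    r->rcx = return_rip;

    /* kernel stub: pushq %rcx ; movq %r10,%rcx ; callq systemCall */
    stack[sp++] = r->rcx;
    r->rcx = r->r10;
    r->rax = model_handler(r->rdi, r->rsi, r->rdx, r->rcx, r->r8, r->r9);

    /* kernel stub: popq %rcx ; sysretq resumes at the address in rcx */
    r->rcx = stack[--sp];
    r->rip = r->rcx;

    /* user stub: popq %r10 ; retq with the result still in rax */
    r->r10 = stack[--sp];
}
```

Starting with rdi=5, rsi=1, rdx=2, rcx=3, r8=4 and r9=5, the model ends with rax holding the sum 15, rip equal to the recorded return address, and r10 restored to its original value.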

3.Example

The drawOneBlock library function serves as an example of a complete system call. Its purpose is to draw a block on the screen.

First, write the system call handler in kernel mode.

// Draw a block in the window; the system call number is 5
// x, y: coordinates
// xlength, ylength: width and height
// color: color of the block
unsigned long sys_showBlock(long x,long y,long xlength,long ylength,int color){
    reetrantlock();
    struct Window * mainWindow = taskManage.running->prev->window;
    if(x+xlength>mainWindow->xlength) xlength = mainWindow->xlength - x;
    if(y+ylength+20>mainWindow->ylength) ylength = mainWindow->ylength - y - 20;
    x = mainWindow->x+x;
    y = mainWindow->y+y+20;
    showOneBlock(x,y,xlength,ylength,mainWindow->high,color,mainWindow);
    reetrantUnLock();
    return 0;
}
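
The clipping and translation arithmetic in sys_showBlock can be checked on its own. The sketch below isolates it in a pure function; struct rect, clip_block and TOP_OFFSET are hypothetical names, and reading the constant 20 as a title-bar height is an assumption, since the source only shows the number.

```c
#include <assert.h>

/* The constant 20 from sys_showBlock; assumed here to be a title-bar height. */
#define TOP_OFFSET 20

struct rect { long x, y, xlength, ylength; };

/* Clip a block given in window-relative coordinates against the window
 * size, then translate it to screen coordinates, following the same
 * arithmetic as sys_showBlock. */
struct rect clip_block(struct rect win, struct rect block)
{
    assert(win.xlength >= 0 && win.ylength >= 0);

    /* Shrink the block if it would extend past the right or bottom edge. */
    if (block.x + block.xlength > win.xlength)
        block.xlength = win.xlength - block.x;
    if (block.y + block.ylength + TOP_OFFSET > win.ylength)
        block.ylength = win.ylength - block.y - TOP_OFFSET;

    /* Move from window-relative to screen coordinates. */
    block.x = win.x + block.x;
    block.y = win.y + block.y + TOP_OFFSET;
    return block;
}
```

For a window at (100, 50) of size 200 by 150, a 50 by 50 block at (180, 120) is clipped to 20 by 10 and placed at (280, 190). With this arithmetic an origin outside the window yields a negative length, a case the sketch does not handle.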

Use addSysCall(unsigned long sysfun, unsigned long num) to add the function above to SystemCallTabel with system call number 5.

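// Add a system call function
void addSysCall(unsigned long sysfun,unsigned long num){
    systemCallTabel.fun[num] = (unsigned long (*)(unsigned long,unsigned long,unsigned long,unsigned long,unsigned long))sysfun;
}
// Register sys_showBlock as system call number 5
addSysCall((unsigned long)sys_showBlock,5);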

Then write the user-mode library function drawOneBlock, which has no return value.

void drawOneBlock(long x,long y,long xlength,long ylength,int color){
    // Call the user-mode systemIn function
    systemIn(5,x,y,xlength,ylength,color);
}

4.Reference

AMD64 Architecture Programmer’s Manual, Volume 2: System Programming

5.Code

KeeProMise/KePOS: Design and implement your own operating system (github.com)
