ma5ter/pvm

Portable Virtual Machine for Microcontrollers

Overview

PVM (Portable Virtual Machine) is an ultra-lightweight virtual machine designed to run on microcontrollers (MCUs) for small automation tasks. It occupies only 1.5 kB of ROM on a typical 32-bit ARM-based MCU. It executes bytecode generated by the MPC compiler, enabling efficient and portable code execution across different MCU architectures.

Features

  • Lightweight: Optimized for resource-constrained environments, including low-range MCUs. Needs only 1.5 kB of ROM and a few bytes of RAM.
  • Portable: Designed to run on various MCU architectures. Needs no heap or other dynamic memory management routines, and does not depend on the standard libraries.
  • Efficient: Minimizes memory usage with an 8-bit opcode size and compact constant loading. A literal from 0 to 127 loads in a single byte; a 32-bit integer with the MSB set takes 2 to 6 bytes, depending on constant optimization.
  • Extensible: Supports built-in functions to extend functionality and communicate with peripherals.
  • Safe: Includes error handling for stack underflow/overflow, invalid function indices, and program counter overrun.

Architecture

The library provides a single opcode executor that accepts only a reference (pointer) to a virtual machine instance.

It also provides supplementary routines for basic checking of the binary executable structure and a function to reset a virtual machine instance.

PVM Executable Structure

PVM allows executing binary scripts compiled by the MPC compiler, which have a defined structure. The format of a PVM executable is presented in the table below:

| Field | Size in bytes | Description |
|---|---|---|
| vm_version | 1 | The version code of the PVM required to run this executable. |
| size | 2 | The total size of the executable in bytes, excluding fixed-size fields. |
| functions_count | 1 | The number of functions defined in the executable. |
| constants_count | 1 | The number of constants defined in the executable. |
| main_variables_count | 1 | The number of main variables used by the executable. |
| functions | Variable | The array of function descriptors. Each function is described by a structure. |
| constants | Variable | The array of constants used by the executable. |
| code | Variable | The bytecode of the executable, containing the instructions to be executed. |

Field sizes may easily be extended in future versions of the executable format, raising the limits on the number of corresponding elements.

Detailed Field Descriptions

The version code of the PVM required to run this executable (vm_version).

This field indicates the minimum version of the PVM required to execute this binary. It ensures that the executable is compatible with the version of the PVM running on the MCU. If the PVM version is lower than the specified minimum version, the executable will not be loaded.

The total size of the executable in bytes (size).

This field specifies the total size of all variable fields in the executable in bytes excluding fixed size fields. It includes the size of functions description table, constants, and code sections. This value is also used to verify the integrity of the executable during loading.
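
The version and size checks described above can be sketched as follows. This is only an illustration, not the library's own loader: the `sketch_` names are hypothetical, the byte order of `size` is assumed to be little-endian, and the rule that the image length must equal the six fixed bytes plus `size` is one possible reading of the integrity check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-size header fields, in the order given in the format table. */
typedef struct {
    uint8_t  vm_version;
    uint16_t size;                  /* bytes that follow the fixed fields */
    uint8_t  functions_count;
    uint8_t  constants_count;
    uint8_t  main_variables_count;
} sketch_header_t;

enum { SKETCH_HEADER_BYTES = 6 };   /* 1 + 2 + 1 + 1 + 1 */

/* Returns 1 and fills *out when the image looks consistent, 0 otherwise.
 * Assumption: `size` is stored little-endian. */
int sketch_parse_header(const uint8_t *image, size_t image_len,
                        uint8_t supported_version, sketch_header_t *out) {
    assert(image != NULL && out != NULL);
    if (image_len < SKETCH_HEADER_BYTES) {
        return 0;                   /* too short to hold the fixed fields */
    }
    out->vm_version = image[0];
    out->size = (uint16_t)(image[1] | (image[2] << 8));
    out->functions_count = image[3];
    out->constants_count = image[4];
    out->main_variables_count = image[5];

    if (out->vm_version > supported_version) {
        return 0;                   /* executable needs a newer PVM */
    }
    if (image_len != (size_t)SKETCH_HEADER_BYTES + out->size) {
        return 0;                   /* variable part is truncated or padded */
    }
    return 1;
}
```

A loader built along these lines would reject an image before the executor ever touches it, which keeps the check out of the opcode hot path.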

The number of functions defined in the executable (functions_count).

This field specifies the number of functions defined in the executable. Each function is described by a structure that includes its address, argument size, variable size, return size, and flags indicating if the function is variadic or a system library function. The compiler sorts them by usage and removes unused user and system built-in functions from this table.

The number of constants defined in the executable (constants_count).

This field specifies the number of constants defined in the executable. Constants are used to store fixed values that do not change during the execution of the program. The compiler collects them while walking through the compiled module, then sorts them by frequency of use so that the most used ones can be accessed efficiently. They are stored in an array following the functions section.

The number of main variables used by the executable (main_variables_count).

This field specifies the number of main function variables used by the executable. Main variables are not global variables and are not accessible through the global keyword. They are initialized to 0 at the start of the program and persist until the program terminates.

The array of function descriptors (functions).

This variable-sized field is an array of function definitions. Each function is described by a function descriptor structure. The format of a function descriptor is presented in the table below:

| Field | Size | Description |
|---|---|---|
| address | 2 bytes | The address of the function in the code section. This address points to the starting bytecode of the function. |
| arguments_count | 1 byte | The number of arguments passed to the function. This value indicates the number of bytes required to store the arguments on the data stack. |
| variables_count | 1 byte | The number of local variables used by the function. This value indicates the number of bytes required to store the local variables on the data stack. |
| returns_count | 6 bits | The number of return values from the function. This value indicates the number of bytes required to store the return values on the data stack. |
| is_variadic | 1 bit | A flag indicating if the function accepts a variable number of arguments. If set, the function can accept additional arguments beyond the specified arguments_count. |
| is_built_in | 1 bit | A flag indicating if the function is a system library function. System library functions are built-in functions provided by the PVM and are not part of the user-defined code. |

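
As a rough illustration of the descriptor layout, the sketch below unpacks one five-byte descriptor. The names are hypothetical, and two details are assumptions that the text does not state: the address is taken to be little-endian, and the last byte is taken to hold returns_count in its low six bits, is_variadic in bit 6, and is_built_in in bit 7.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Unpacked view of one function descriptor (hypothetical type). */
typedef struct {
    uint16_t address;
    uint8_t  arguments_count;
    uint8_t  variables_count;
    uint8_t  returns_count;   /* 6-bit field, 0..63 */
    bool     is_variadic;
    bool     is_built_in;
} sketch_function_t;

/* Assumed packing: 2-byte little-endian address, two count bytes, then one
 * byte holding returns_count (bits 0-5), is_variadic (bit 6) and
 * is_built_in (bit 7). */
sketch_function_t sketch_unpack_function(const uint8_t raw[5]) {
    assert(raw != NULL);
    sketch_function_t f;
    f.address = (uint16_t)(raw[0] | (raw[1] << 8));
    f.arguments_count = raw[2];
    f.variables_count = raw[3];
    f.returns_count = (uint8_t)(raw[4] & 0x3F);
    f.is_variadic = ((raw[4] >> 6) & 1) != 0;
    f.is_built_in = ((raw[4] >> 7) & 1) != 0;
    return f;
}
```

Packing the return count and both flags into one byte keeps each descriptor at five bytes, which matters when the whole table lives in flash.
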
The array of constants used by the executable (constants).

This variable-sized field is an array of constants used by the executable. Constants are fixed values that do not change during the execution of the program. They are stored in an array following the functions section and are accessed using the LDC instruction.

The bytecode of the executable (code).

This variable-sized field contains the bytecode of the executable. The bytecode consists of a series of 8-bit instructions that are executed by the PVM. The instructions are designed to be compact and efficient, with a focus on minimizing resource usage. The code section follows the constants section in the executable.

PVM Instance

The PVM instance maintains the state of the virtual machine, including several critical components that ensure the proper execution and management of the bytecode. Below is a detailed explanation of each field within the PVM instance:

Timer and Timeout

Timer

This field holds the current time in milliseconds. It is used to track the elapsed time for sleep instructions. The timer is set using the now_ms function, which returns the current time.

Timeout

This field specifies the duration for which the PVM should sleep. When a sleep instruction (SLP) is executed, the timeout value is set, and the PVM enters a sleep state until the specified duration has elapsed. The combination of the timer and timeout fields allows the PVM to handle delays accurately.
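
One way the timer and timeout pair might be used is sketched below. The struct and function names are hypothetical, and the 32-bit field width is an assumption. The point of the sketch is the comparison: subtracting timestamps as unsigned values keeps the elapsed time correct even when the millisecond counter wraps around.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two sleep-related instance fields. */
typedef struct {
    uint32_t timer;    /* timestamp captured when the sleep began, in ms */
    uint32_t timeout;  /* requested sleep duration, in ms */
} sketch_sleep_t;

/* Records the start time and duration, as a sleep instruction might. */
void sketch_sleep_start(sketch_sleep_t *s, uint32_t now, uint32_t duration_ms) {
    assert(s != NULL);
    s->timer = now;
    s->timeout = duration_ms;
}

/* True once at least `timeout` milliseconds have passed since the start.
 * Unsigned subtraction wraps modulo 2^32, so the result stays correct
 * across a counter rollover. */
bool sketch_sleep_done(const sketch_sleep_t *s, uint32_t now) {
    assert(s != NULL);
    return (uint32_t)(now - s->timer) >= s->timeout;
}
```

Comparing `now >= timer + timeout` instead would misbehave near the rollover point, which a device with long uptime will eventually reach.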

Data Stack

The data stack is a crucial component of the PVM instance, used for storing temporary data during the execution of instructions. It is implemented as an array of pvm_data_t with a fixed size defined by PVM_DATA_STACK_SIZE during compile time.

Call Stack

The call stack is used to manage function calls and returns. It is implemented as an array of structures, each containing the return address, the start of the variables in the data stack, the size of the arguments, and the function index. The call stack size is defined by PVM_CALL_STACK_SIZE during compile time.

The call stack allows the PVM to keep track of the execution context for each function call, enabling nested function calls and proper return handling. The stack top pointer keeps track of the current position in the call stack.

Program Counter (PC)

The program counter (PC) is a register that points to the address of the next instruction to be executed. It is incremented automatically as instructions are fetched and executed.

Data Top

This field is a pointer that keeps track of the current position in the data stack. It is incremented when data is pushed onto the stack and decremented when data is popped from the stack.
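
The push and pop behaviour, together with the overflow and underflow checks mentioned under Features, might look roughly like this. Everything here is a stand-in: the `sketch_` types only mirror the names used in this document, the stack size is arbitrary, and the top is modelled as an index rather than a raw pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_DATA_STACK_SIZE 4          /* stand-in for PVM_DATA_STACK_SIZE */

typedef int32_t sketch_data_t;            /* stand-in for pvm_data_t */

typedef enum {
    SKETCH_NO_ERROR,
    SKETCH_DATA_STACK_UNDERFLOW,
    SKETCH_DATA_STACK_OVERFLOW
} sketch_errno_t;

typedef struct {
    sketch_data_t data[SKETCH_DATA_STACK_SIZE];
    uint8_t data_top;                     /* index of the next free slot */
} sketch_stack_t;

/* Pushes a value, refusing to write past the fixed-size array. */
sketch_errno_t sketch_push(sketch_stack_t *s, sketch_data_t value) {
    assert(s != NULL);
    if (s->data_top >= SKETCH_DATA_STACK_SIZE) {
        return SKETCH_DATA_STACK_OVERFLOW;
    }
    s->data[s->data_top++] = value;
    return SKETCH_NO_ERROR;
}

/* Pops the most recently pushed value, refusing to read an empty stack. */
sketch_errno_t sketch_pop(sketch_stack_t *s, sketch_data_t *out) {
    assert(s != NULL && out != NULL);
    if (s->data_top == 0) {
        return SKETCH_DATA_STACK_UNDERFLOW;
    }
    *out = s->data[--s->data_top];
    return SKETCH_NO_ERROR;
}
```

Because the array size is fixed at compile time, both checks are a single comparison and need no dynamic memory.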

Call Top

This field is a pointer that keeps track of the current position in the call stack. It is incremented when a function is called and decremented when a function returns.

Persistent Data

The persistent data section of the PVM instance includes two fields: binding and exe.

Binding

This field is used to store binding-specific data that persists across resets of the PVM instance. It is a user-defined field that can hold context-specific information, such as the ID of the dedicated MCU output this virtual machine is tied to.

Executable Pointer

This field is a pointer to the PVM executable structure described above.

The persistent data section ensures that the PVM instance can be reset without losing the executable and binding information, allowing for consistent execution across resets.
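
A reset that preserves the persistent section could be sketched as below. The struct is a hypothetical, simplified stand-in for the real instance type (the call stack is omitted and the field types are guesses); only the `persist.exe` member name is taken from the Usage section. The sketch assumes the persistent section is the last member, so everything before it can be cleared in one call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified, hypothetical layout of a VM instance. */
typedef struct {
    uint32_t timer;
    uint32_t timeout;
    int32_t  data_stack[8];
    uint16_t pc;
    uint8_t  data_top;
    uint8_t  call_top;
    struct {
        int         binding;   /* user-defined context, e.g. an output ID */
        const void *exe;       /* pointer to the executable image */
    } persist;                 /* must stay the last member for this sketch */
} sketch_vm_t;

/* Clears all volatile state and leaves the persistent section untouched. */
void sketch_reset(sketch_vm_t *vm) {
    assert(vm != NULL);
    memset(vm, 0, offsetof(sketch_vm_t, persist));
}
```

With this arrangement the same instance can be restarted after an error without reattaching the executable or the binding.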

Built-in Functions

PVM supports built-in functions to extend its functionality. These functions are implemented in C and can be called directly by the VM. See the Usage section for more information.

Error Handling

PVM provides comprehensive error handling to ensure robust operation. See the Usage section for more information.

Getting Started

Prerequisites

  • An MCU with a C compiler.
  • The MPC compiler for compiling source code into PVM bytecode.

Installation

  1. Clone the Repository:

    git clone https://github.com/your-repo/pvm.git
    cd pvm

    Alternatively, add the repository as a Git submodule.

  2. Include PVM in Your Project:

    • Add the PVM header file (pvm.h) to your project or include the CMakeLists.txt file.
    • Implement the necessary built-in functions and the now_ms function.

  3. Compile Your Script:

    • Compile your script(s) with the MPC compiler to generate the PVM bytecode.
    • Include the generated bytecode in your MCU firmware or implement a mechanism of dynamic uploading and storage.

Usage

Initialization

Initialize the PVM instance and load the executable:

#include "pvm.h"

pvm_t vm;
const pvm_exe_t *exe = ...; // Load your executable here

void init_pvm() {
    vm.persist.exe = exe;
    pvm_reset(&vm);
}

Execution

Execute the instructions in the PVM:

pvm_errno_t err = pvm_op(&vm);
if (err != PVM_NO_ERROR) {
    // Handle error
}

Error Handling

PVM reports runtime errors through the pvm_errno enum. It defines the error codes that PVM functions can return, which are used for error handling and debugging. Each enum value is explained below:

  1. PVM_NO_ERROR: Indicates that no error has occurred. This is the default success code.

  2. PVM_MAIN_RETURN: Indicates that the main function has returned. This is typically used to signal the end of the main program execution.

  3. PVM_CALL_STACK_UNDERFLOW (PVM_MAIN_RETURN): Indicates that the call stack has underflowed. This means that an attempt was made to pop from an empty call stack. This error code is aliased to PVM_MAIN_RETURN.

  4. PVM_CALL_STACK_OVERFLOW: Indicates that the call stack has overflowed. This means that an attempt was made to push more frames onto the call stack than it can hold.

  5. PVM_DATA_STACK_UNDERFLOW: Indicates that the data stack has underflowed. This means that an attempt was made to pop from an empty data stack.

  6. PVM_DATA_STACK_OVERFLOW: Indicates that the data stack has overflowed. This means that an attempt was made to push more values onto the data stack than it can hold.

  7. PVM_ARG_OUT_OF_STACK: Indicates that there is not enough stack space to hold the arguments for a function call. This means that an attempt was made to call a function and push more arguments onto the stack than it can hold.

  8. PVM_VAR_OUT_OF_STACK: Indicates that there is not enough stack space to hold the variables for a function call. This means that an attempt was made to call a function and push more variables onto the stack than it can hold.

  9. PVM_RETURN_OUT_OF_STACK: Indicates that there is not enough stack space to hold the return values for a function call. This means that an attempt was made to call a function and push more return values onto the stack than it can hold.

  10. PVM_DATA_STACK_SMASHED: Indicates that the data stack has been corrupted. This means that the stack has been overwritten or otherwise tampered with, leading to an inconsistent state.

  11. PVM_PC_OVERRUN: Indicates that the program counter (PC) has overrun. This means that the PC has exceeded the bounds of the executable code, typically due to an invalid jump or call instruction.

  12. PVM_EXE_NO_FUNCTION: Indicates that a function index is out of bounds. This means that an attempt was made to call a function that does not exist in the executable's function table.

  13. PVM_BUILTIN_NO_FUNCTION: Indicates that a built-in function index is out of bounds. This means that an attempt was made to call a built-in function that does not exist in the built-in function table.

  14. PVM_NO_VARIABLE: Indicates that a variable index is out of bounds. This means that an attempt was made to access a variable that does not exist in the current scope.

  15. PVM_NO_CONSTANT: Indicates that a constant index is out of bounds. This means that an attempt was made to access a constant that does not exist in the executable's constant table.

  16. PVM_VARIADIC_SIZE: Indicates that the size of variadic arguments is incorrect. This means that the number of variadic arguments passed to a function does not match the expected number.

These error codes help in identifying and handling various runtime errors that can occur during the execution of the PVM. They provide a way to diagnose issues and ensure the robust operation of the virtual machine.

Built-in Functions

Implement the built-in functions required by your application:

void my_builtin_function(pvm_t *vm, pvm_data_t arguments[], pvm_data_stack_t args_size) {
    // Implement your built-in function here
}

const pvm_builtins_t pvm_builtins[] = {
    { my_builtin_function },
    // Add more built-in functions here
};

const size_t pvm_builtins_size = sizeof(pvm_builtins) / sizeof(pvm_builtins[0]);

Arguments

Arguments are passed to a built-in function as a pointer to an array. The number of arguments is passed in the args_size parameter.

Variadic Functions

Variadic functions in PVM can accept a variable number of arguments. The number of variadic arguments is specified by pushing a value onto the data stack before calling the function; the MPC compiler does this automatically. This value is then added to the fixed number of arguments to determine the total number of arguments passed to the function.

Example

Consider a variadic function print that can accept any number of arguments:

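#include <stdio.h>

void print(pvm_t *vm, pvm_data_t arguments[], pvm_data_stack_t args_size) {
    (void)vm; // not needed by this function
    for (pvm_data_stack_t i = 0; i < args_size; i++) {
        // Print each argument; the cast keeps the format specifier valid
        // whichever integer type pvm_data_t is configured as
        printf("%ld ", (long)arguments[i]);
    }
    printf("\n");
}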
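
Seen from the VM side, the argument counting described above might work roughly as follows. This is a guess at the mechanism rather than the library's code: the function name, the index-based stack top, and the `-1` failure value are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t sketch_data_t;   /* stand-in for pvm_data_t */

/* Pops the variadic count that the compiler pushed last, adds the fixed
 * argument count, and returns the total. Returns -1 when the stack is
 * empty or does not hold that many values. */
int sketch_total_args(const sketch_data_t stack[], uint8_t *data_top,
                      uint8_t fixed_args) {
    assert(stack != NULL && data_top != NULL);
    if (*data_top == 0) {
        return -1;               /* no count on the stack */
    }
    *data_top -= 1;              /* pop the variadic count */
    sketch_data_t extra = stack[*data_top];
    if (extra < 0 || fixed_args + extra > *data_top) {
        return -1;               /* count disagrees with the stack contents */
    }
    return fixed_args + (int)extra;
}
```

In the example listing later in this document, `print` has no fixed arguments and the compiler emits `PSH 7` before the call, so a routine like this would report seven arguments.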

Return value

If a built-in function returns a single value or a tuple of values, it should write them into the same arguments array before returning.

Example

Consider a function get_date that returns a tuple of values:

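#include <time.h>

void get_date(pvm_t *vm, pvm_data_t arguments[], pvm_data_stack_t args_size) {
    (void)vm;
    (void)args_size;

    // Read the current local time from the C library
    time_t t;
    time(&t);
    const struct tm *tm = localtime(&t);

    // Overwrite the argument slots with the returned tuple
    arguments[0] = tm->tm_year + 1900; // year
    arguments[1] = tm->tm_mon + 1;     // month, 1-12
    arguments[2] = tm->tm_mday;        // day of the month
}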
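
The in-place return convention can be pictured from the caller's side with the sketch below. It is an assumption about how a VM could consume the results, not the actual PVM call path: the types and names are hypothetical, the real built-in signature also takes a `pvm_t *vm` parameter, and the hard-coded date simply mirrors the values in the debug trace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t sketch_data_t;   /* stand-in for pvm_data_t */
typedef void (*sketch_builtin_t)(sketch_data_t arguments[], uint8_t args_size);

/* Hypothetical built-in: consumes one argument, returns three values. */
void sketch_get_date(sketch_data_t arguments[], uint8_t args_size) {
    (void)args_size;
    arguments[0] = 2025;  /* year */
    arguments[1] = 4;     /* month */
    arguments[2] = 4;     /* day of the month */
}

/* Calls `fn` on the top `args` stack slots and returns the new stack top,
 * or -1 when the arguments or the return values do not fit. */
int sketch_call_builtin(sketch_builtin_t fn, sketch_data_t stack[],
                        uint8_t capacity, uint8_t data_top,
                        uint8_t args, uint8_t returns) {
    assert(fn != NULL && stack != NULL);
    if (args > data_top) {
        return -1;                        /* not enough arguments pushed */
    }
    uint8_t base = (uint8_t)(data_top - args);
    if (base + returns > capacity) {
        return -1;                        /* results would overrun the stack */
    }
    fn(&stack[base], args);               /* results overwrite the arguments */
    return base + returns;                /* results now occupy the top slots */
}
```

Reusing the argument slots for results means a call never needs a separate return buffer; the stack only has to be large enough for whichever of the two is bigger.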

Debugging

For simple per-opcode debugging, define a custom header file with static functions (or macros) that print debugging information, and pass its path through the PVM_DEBUG definition when configuring with CMake. For example, using the header included in the samples folder:

cmake -DPVM_DEBUG='samples/debug.h' ..

This generates the code needed to print the PC, the instruction, and a stack dump for each executed opcode. For the example code below, the output looks like this:

PC:0 PSH 0 → {0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
PC:1 STV [2] 0 ← {0, 0, 0, 0, 0, 0, 0, 0, 0}
PC:2 PSH 0 → {0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
PC:3 LDC [0] 18000 → {18000, 0, 0, 0, 0, 0, 0, 0, 0, 0}
PC:4 CAL <*5> (1) = {4, 4, 2025, 0, 0, 0, 0, 0, 0, 0, 0, 0}
PC:5 STV [5] 4 ← {4, 2025, 0, 0, 0, 4, 0, 0, 0, 0, 0}
PC:6 STV [4] 4 ← {2025, 0, 0, 0, 4, 4, 0, 0, 0, 0}
PC:7 STV [3] 2025 ← {0, 0, 0, 4, 4, 2025, 0, 0, 0}
...

Complete Usage Example

A simple usage example can be found in the samples folder of the project.

Bytecode Format

The bytecode format consists of a series of instructions, each represented by a single byte. The instructions are designed to be compact and efficient, with a focus on minimizing resource usage.

Instruction Set

  • PSH: Push a constant literal value onto the data stack.
  • PSC: Push a constant complement value onto the data stack, appending 5 bits to the existing value.
  • LDC: Load a constant from the constant array using index.
  • LDV: Load a variable from the data stack using variable index.
  • STV: Store a value in a variable on the data stack using variable index.
  • CAL: Call a function by index.
  • RET: Return from a function.
  • JMP: Jump to a specific offset. This instruction can hold an offset of up to 18 instructions forward.
  • JMB: Jump back to a specific offset.
  • SLP: Sleep for a specified duration.
  • ADD, SUB, MUL, DIV: Arithmetic operations.
  • AND, IOR, XOR: Logical operations.
  • NEG, INV, INC, DEC: Unary operations.
  • BZE, BNZ, BEQ, BNE, BGT, BLT, BGE, BLE: Branching operations.
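
To make the PSH and PSC pair concrete, here is a small encoder and decoder for literals. The bit patterns are inferred from the example listing in this document, where `1F 88` loads 1000, and are an assumption rather than a specification: PSH is taken to be a byte with the top bit clear carrying a 7-bit literal, and PSC a byte of the form `100xxxxx` that shifts the top of the stack left by five bits and ORs in its payload. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Emits one PSH followed by as many PSC bytes as the value needs.
 * Returns the number of bytes written (1 to 6). */
size_t sketch_encode_literal(uint32_t value, uint8_t out[6]) {
    assert(out != NULL);
    /* Count the 5-bit groups that must follow the leading 7-bit chunk. */
    size_t groups = 0;
    while (groups < 5 && (value >> (5 * groups)) > 0x7F) {
        groups++;
    }
    size_t n = 0;
    out[n++] = (uint8_t)(value >> (5 * groups));          /* PSH: 0xxxxxxx */
    while (groups-- > 0) {
        out[n++] = (uint8_t)(0x80 | ((value >> (5 * groups)) & 0x1F)); /* PSC */
    }
    return n;
}

/* Replays a PSH/PSC sequence and returns the value left on the stack top. */
uint32_t sketch_decode_literal(const uint8_t *code, size_t len) {
    assert(code != NULL && len >= 1);
    uint32_t top = code[0] & 0x7F;                        /* PSH */
    for (size_t i = 1; i < len; i++) {
        top = (top << 5) | (code[i] & 0x1F);              /* PSC */
    }
    return top;
}
```

Under these assumptions a value up to 127 costs one byte and a full 32-bit value with the MSB set costs six (7 + 5 × 5 = 32 bits), which matches the range quoted under Features. The two-byte lower bound there corresponds to the other route, a PSH of the constant index followed by LDC.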

Example Code

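Below is the compiler listing for an example Python program that demonstrates the use of bytecode, constants, and control flow. Each bytecode line shows its address, opcode byte, and mnemonic, followed by the source statement it was compiled from: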

FUNCTIONS: 5, CONSTANTS: 1
FUNCTIONS DESCRIPTORS (5)
	ADDRESS: None; ARGUMENTS: 1; VARIABLES: 0; RETURNS: 0; func output(action) <0>; 2 usage(s)
	ADDRESS: None; ARGUMENTS: 0; VARIABLES: 0; RETURNS: 0; func print() <1>; 1 usage(s)
	ADDRESS: None; ARGUMENTS: 1; VARIABLES: 0; RETURNS: 3; func get_realtime(timezone_offset) <2>; 1 usage(s)
	ADDRESS: None; ARGUMENTS: 1; VARIABLES: 0; RETURNS: 3; func get_date(timezone_offset) <3>; 1 usage(s)
	ADDRESS: None; ARGUMENTS: 1; VARIABLES: 0; RETURNS: 1; func get_weekday(timezone_offset) <4>; 1 usage(s)
CONSTANTS (1)
	VALUE: 18000; int = 18000 <0>; 3 usage(s)
CODE:
 func main()
00000  00    PSH 0                          ;     default_state = Action.ACTION_OFF
00001  F2    STV var default_state <2>      ;     default_state = Action.ACTION_OFF
 label_1:                                   ;     while True:
00002  00    PSH const TIMEZONE_OFFSET = 18000 <0> ;  year, month, date = get_date(TIMEZONE_OFFSET)
00003  B6    LDC                            ;         year, month, date = get_date(TIMEZONE_OFFSET)
00004  D3    CAL func get_date(timezone_offset) <3> ; year, month, date = get_date(TIMEZONE_OFFSET)
00005  F5    STV var date <5>               ;         year, month, date = get_date(TIMEZONE_OFFSET)
00006  F4    STV var month <4>              ;         year, month, date = get_date(TIMEZONE_OFFSET)
00007  F3    STV var year <3>               ;         year, month, date = get_date(TIMEZONE_OFFSET)
00008  00    PSH const TIMEZONE_OFFSET = 18000 <0> ;  hour, minute, second = get_realtime(TIMEZONE_OFFSET)
00009  B6    LDC                            ;         hour, minute, second = get_realtime(TIMEZONE_OFFSET)
00010  D2    CAL func get_realtime(timezone_offset) <2> ; hour, minute, second = get_realtime(TIMEZONE_OFFSET)
00011  F8    STV var second <8>             ;         hour, minute, second = get_realtime(TIMEZONE_OFFSET)
00012  F7    STV var minute <7>             ;         hour, minute, second = get_realtime(TIMEZONE_OFFSET)
00013  F6    STV var hour <6>               ;         hour, minute, second = get_realtime(TIMEZONE_OFFSET)
00014  00    PSH const TIMEZONE_OFFSET = 18000 <0> ;  weekday = get_weekday(TIMEZONE_OFFSET)
00015  B6    LDC                            ;         weekday = get_weekday(TIMEZONE_OFFSET)
00016  D4    CAL func get_weekday(timezone_offset) <4> ; weekday = get_weekday(TIMEZONE_OFFSET)
00017  F0    STV var weekday <0>            ;         weekday = get_weekday(TIMEZONE_OFFSET)
00018  E3    LDV var year <3>               ;         print(year, month, date, hour, minute, second, weekday)
00019  E4    LDV var month <4>              ;         print(year, month, date, hour, minute, second, weekday)
00020  E5    LDV var date <5>               ;         print(year, month, date, hour, minute, second, weekday)
00021  E6    LDV var hour <6>               ;         print(year, month, date, hour, minute, second, weekday)
00022  E7    LDV var minute <7>             ;         print(year, month, date, hour, minute, second, weekday)
00023  E8    LDV var second <8>             ;         print(year, month, date, hour, minute, second, weekday)
00024  E0    LDV var weekday <0>            ;         print(year, month, date, hour, minute, second, weekday)
00025  07    PSH 7                          ;         <variadic args count>
00026  D1    CAL func print() <1>           ;         print(year, month, date, hour, minute, second, weekday)
00027  00    PSH 0                          ;         state = Action.ACTION_OFF
00028  F1    STV var state <1>              ;         state = Action.ACTION_OFF
00029  01    PSH 1                          ;         if weekday >= 1:
00030  E0    LDV var weekday <0>            ;         if weekday >= 1:
00031  05    PSH 5                          ;         if weekday >= 1:
00032  A5    BLT label_2                    ;         if weekday >= 1:
00033  05    PSH 5                          ;             if weekday <= 5:
00034  E0    LDV var weekday <0>            ;             if weekday <= 5:
00035  01    PSH 1                          ;             if weekday <= 5:
00036  A4    BGT label_2                    ;             if weekday <= 5:
00037  01    PSH 1                          ;                 output(Action.ACTION_ON)
00038  D0    CAL func output(action) <0>    ;                 output(Action.ACTION_ON)
 label_2:
00039  E1    LDV var state <1>              ;         if default_state != state:
00040  E2    LDV var default_state <2>      ;         if default_state != state:
00041  03    PSH 3                          ;         if default_state != state:
00042  A2    BEQ label_3                    ;         if default_state != state:
00043  E1    LDV var state <1>              ;             default_state = state
00044  F2    STV var default_state <2>      ;             default_state = state
00045  E1    LDV var state <1>              ;             output(state)
00046  D0    CAL func output(action) <0>    ;             output(state)
 label_3:
00047  1F    PSH 31                         ;         sleep(1000)
00048  88    PSC 8                          ;         sleep(1000)
00049  B4    SLP                            ;         sleep(1000)
00050  31    PSH 49                         ;     while True:
00051  B7    JMB label_1                    ;     while True:

License

This Portable Virtual Machine for Microcontrollers (pvm) is licensed under the LGPL License. See the LICENSE file for more details.

Contributing

Contributions are welcome! Please open an issue or submit a pull request.

Author

ma5ter
