rarsm · ladyslipper-glade · Jan 27, 2026
diff --git a/docs/Assembler-Directives.md b/docs/Assembler-Directives.md
@@ -0,0 +1,23 @@
+The currently supported directives are:
+
+| Name | Description|
+|------|------------|
+|.align|Align next data item on specified byte boundary (0=byte, 1=half, 2=word, 3=double)|
+|.ascii|Store the string in the Data segment but do not add null terminator|
+|.asciz|Store the string in the Data segment and add null terminator|
+|.byte|Store the listed value(s) as 8 bit bytes|
+|.data|Subsequent items stored in Data segment at next available address|
+|.double|Store the listed value(s) as double precision floating point|
+|.end_macro|End macro definition.  See .macro|
+|.eqv|Substitute second operand for first. First operand is symbol, second operand is expression (like #define)|
+|.extern|Declare the listed label and byte length to be a global data field|
+|.float|Store the listed value(s) as single precision floating point|
+|.globl|Declare the listed label(s) as global to enable referencing from other files|
+|.half|Store the listed value(s) as 16 bit halfwords on halfword boundary|
+|.include|Insert the contents of the specified file.  Put filename in quotes.|
+|.macro|Begin macro definition.  See .end_macro|
+|.section|Allows specifying sections without .text or .data directives. Included for gcc compatibility|
+|.space|Reserve the next specified number of bytes in Data segment|
+|.string|Alias for .asciz|
+|.text|Subsequent items (instructions) stored in Text segment at next available address|
+|.word|Store the listed value(s) as 32 bit words on word boundary|
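+
+As a rough illustration of how several of these directives combine, here is a minimal sketch. The label names (`msg`, `nums`, `buf`) and the `BUF_LEN` symbol are made up for this example, and the exit call number 93 is taken from the Hello World tutorial.
+
+```assembly
+.eqv BUF_LEN 16          # hypothetical symbol: BUF_LEN is replaced by 16 wherever it appears
+
+.data                    # everything below goes in the Data segment
+msg:  .asciz "hi\n"      # 3 characters plus a null terminator
+      .align 2           # align the next item on a word boundary
+nums: .word 1, 2, 3      # three 32 bit words
+buf:  .space 16          # reserve 16 bytes
+
+.text                    # everything below goes in the Text segment
+.globl main              # make main visible to other files
+main:
+  la t0, nums            # t0 = address of nums
+  lw t1, 4(t0)           # t1 = second word of nums (2)
+  li t2, BUF_LEN         # after substitution: li t2, 16
+  li a0, 0               # exit code 0
+  li a7, 93              # exit system call
+  ecall
+```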
diff --git a/docs/Calling-Convention.md b/docs/Calling-Convention.md
@@ -0,0 +1,67 @@
+The last part of hello world that has yet to be explained is how to make a system call.
+
+## System calls
+
+System calls are a way to tell the Operating System (OS) that your program would like to do something such as print text to the screen, play a sound or open a file. To do that we generally have to cross the [kernel-user boundary](https://blog.codinghorror.com/understanding-user-and-kernel-mode/). 
+
+The way to actually cross the boundary is with system calls. By executing the `ecall` instruction we transition to kernel mode and then the OS can handle our request and then return control to the program. 
+
+Note: On Windows you don't use system calls directly; instead you call a function which makes the system call for you. However, RARS tries to match Linux's behavior regarding system calls.
+
+If you want to issue a system call, first look it up in the [supported system call list](Environment-Calls.md) to find its system call number.
+
+That number must be placed in register `a7`/`x17` so the OS knows what you are asking for. If the system call takes inputs, they go in `a0`-`a6`. With that in place, executing `ecall` makes the OS perform the call and, if there is an output, put it in `a0`.
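+
+The steps above can be sketched as follows. This is only an illustration: the `msg` label and its text are made up, the call numbers 64 (write) and 93 (exit) are the ones used in the Hello World tutorial, and whether a given call produces an output is documented in the system call list.
+
+```assembly
+.data
+msg: .string "hi\n"  # hypothetical label and text
+
+.text
+main:
+  li a0, 1     # first input: file descriptor 1 (standard output)
+  la a1, msg   # second input: address of the text to write
+  li a2, 3     # third input: number of bytes to write
+  li a7, 64    # system call number for write
+  ecall        # hand control to the OS
+  mv t0, a0    # if the call produced an output, it is now in a0; keep a copy in t0
+
+  li a0, 0     # input: exit code 0
+  li a7, 93    # system call number for exit
+  ecall
+```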
+
+## Function calls
+
+Function calls are similar to system calls, but we don't need to cross the kernel boundary. Instead, we save our current location and jump to the beginning of a function; when that function is done it will jump back to that saved location.
+
+A simple function might look like:
+```
+add_one: # has a C declaration of: int add_one(int);
+  addi a0, a0, 1
+  jalr zero, ra, 0 # Alternatively pseudo-op ret
+
+main:
+  li a0, 2
+  jal ra, add_one # Alternatively pseudo-op "jal add_one" or "call add_one"
+  jal ra, add_one
+  # a0 should now be 4
+```
+
+The new instructions are `jal` and `jalr`. They stand for "Jump And Link" and "Jump And Link Register" respectively.
+
+`jal` works by saving the address of the instruction that follows it (the return address) into its register argument and jumping to the label argument. `jalr` is similar, but it jumps to the address held in its second argument plus an immediate offset.
+
+So `jal ra, add_one` saves the address of the next instruction into `ra` and jumps to the `add_one` function. The body of the function is executed and then `jalr zero, ra, 0` jumps back to the saved location. Because its destination register is `zero`, it does not save a return address of its own.
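+
+Because `jalr` takes its target from a register, it can also be used to make a call, not only to return. A minimal sketch, assuming the same `add_one` function as above and using the exit call (93) from the Hello World tutorial so the program stops cleanly:
+
+```assembly
+.text
+main:
+  li   a0, 2
+  la   t0, add_one   # t0 = address of add_one
+  jalr ra, t0, 0     # ra = address of the next instruction; jump to t0 + 0
+  # a0 should now be 3
+  li   a7, 93        # exit, using a0 as the exit code
+  ecall
+
+add_one:
+  addi a0, a0, 1
+  jalr zero, ra, 0   # return to the address saved in ra
+```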
+
+## Register Names
+
+While registers can be used for pretty much anything, there is a standard on how to use them so code written by different people will work together. Table 18.2 from the RISC-V standard:
+
+| Register | ABI name | Description | Saver |
+|----------|----------|-------------|-------|
+|x0|zero|Hard-wired zero||
+|x1|ra|Return address|Caller|
+|x2|sp|Stack pointer|Callee|
+|x3|gp|Global pointer||
+|x4|tp|Thread pointer||
+|x5–7|t0–2|Temporaries|Caller|
+|x8|s0/fp|Saved register/frame pointer|Callee|
+|x9|s1|Saved register|Callee|
+|x10–11|a0–1|Function arguments/return values|Caller|
+|x12–17|a2–7|Function arguments|Caller|
+|x18–27|s2–11|Saved registers|Callee|
+|x28–31|t3–6|Temporaries|Caller|
+
+The saver column says who should save a register to memory when a function is called. If it's Caller saved and you want to keep the register's value, you need to save it before you call a function. If it's Callee saved, your function needs to save it before overwriting it and restore it before returning. Some examples of proper calling convention will be shown in future tutorials.
+
+`zero`, `gp` and `tp` don't have a saver because they are intended to stay the same across function calls.
+
+## Using the stack
+
+The stack pointer provides a way for functions to store data while letting called functions use registers or to store extra data that can't fit in registers.
+
+The general idea is that functions can move the pointer to reserve space in memory to write and then when the function is ready to return move the pointer back where it started.
+
+More precisely, the register `x2` / `sp` represents an available pointer aligned to at least a word boundary (more in some situations); this means that `sw zero, 0(sp)` would not write over any data in the stack.
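+
+As a rough sketch of how the stack and the saver rules fit together, the function below saves `ra` (Caller saved, and overwritten by its own `jal`) and `s0` (Callee saved) before using them. The name `inc_then_add` is made up for this example, the 16 byte reservation is an assumption chosen to keep `sp` generously aligned, and `sw`/`lw` assume 32 bit mode.
+
+```assembly
+.text
+main:
+  li   a0, 2
+  li   a1, 10
+  jal  ra, inc_then_add  # a0 should now be 13
+  li   a7, 93            # exit, using a0 as the exit code
+  ecall
+
+inc_then_add:            # hypothetical: int inc_then_add(int a, int b) { return add_one(a) + b; }
+  addi sp, sp, -16       # reserve 16 bytes on the stack
+  sw   ra, 12(sp)        # the jal below overwrites ra, so save it first
+  sw   s0, 8(sp)         # s0 is Callee saved: save it before overwriting it
+  mv   s0, a1            # keep b somewhere add_one must preserve
+  jal  ra, add_one       # a0 = a + 1
+  add  a0, a0, s0        # a0 = (a + 1) + b
+  lw   s0, 8(sp)         # restore s0
+  lw   ra, 12(sp)        # restore the return address
+  addi sp, sp, 16        # move the stack pointer back where it started
+  jalr zero, ra, 0
+
+add_one:
+  addi a0, a0, 1
+  jalr zero, ra, 0
+```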
diff --git a/docs/Creating-Hello-World.md b/docs/Creating-Hello-World.md
@@ -0,0 +1,71 @@
+
+The most basic hello world that could be made in C is something like:
+```C
+int main(){
+    puts("Hello World!\n");
+    return 0;
+}
+```
+
+We don't have access to the C standard library, though, so let us change puts into a call to `write(fd,buffer,len)`
+which directly maps to a Linux system call. Let us also lift the definition of the string into a global variable because assembly language doesn't allow strings as arguments.
+
+This leaves us with some directly translatable C code:
+```C
+char* str = "Hello World!\n";
+int main(){
+    write(1,str,13);
+    return 0;
+}
+```
+
+The first line can be translated into assembly as:
+```assembly
+.data # Tell the assembler we are defining data not code
+str:  # Label this position in memory so it can be referred to in our code 
+.string "Hello World!\n" # Copy the string "Hello World!\n" into memory 
+```
+
+To start defining the main function we will use the code:
+
+```assembly
+.text # Tell the assembler that we are writing code (text) now 
+main: # Make a label to say where our program should start from
+```
+
+The body of main is a little harder to directly translate because you have to set up each of the arguments to the system call one by one. In total `write(1,str,13)` will take 5 instructions:
+
+```assembly
+li a0, 1   # li means to Load Immediate and we want to load the value 1 into register a0
+la a1, str # la is similar to li, but works for loading addresses
+li a2, 13  # like the first line, but with 13. This is the final argument to the system call
+li a7, 64  # a7 is what determines which system call we are calling and we want to call write (64)
+ecall      # actually issue the call
+```
+
+`return 0` is going to need to be changed a little before we can translate it. To exit cleanly we will need to use the `exit` system call.  
+
+```assembly
+li a0, 0  # The exit code we will be returning is 0
+li a7, 93 # Again we need to indicate what system call we are making and this time we are calling exit(93)
+ecall 
+```
+Putting all of those snippets together we get the code:
+```assembly
+.data # Tell the assembler we are defining data not code
+str:  # Label this position in memory so it can be referred to in our code 
+  .string "Hello World!\n" # Copy the string "Hello World!\n" into memory 
+
+.text # Tell the assembler that we are writing code (text) now 
+main: # Make a label to say where our program should start from
+
+  li a0, 1   # li means to Load Immediate and we want to load the value 1 into register a0
+  la a1, str # la is similar to li, but works for loading addresses
+  li a2, 13  # like the first line, but with 13. This is the final argument to the system call
+  li a7, 64  # a7 is what determines which system call we are calling and we want to call write (64)
+  ecall      # actually issue the call
+
+  li a0, 0   # The exit code we will be returning is 0
+  li a7, 93  # Again we need to indicate what system call we are making and this time we are calling exit(93)
+  ecall 
+```
diff --git a/docs/Differences-Between-32-and-64-bit-Modes.md b/docs/Differences-Between-32-and-64-bit-Modes.md
@@ -0,0 +1,52 @@
+Unfortunately, 64 bit RISC-V is not directly compatible with 32 bit RISC-V. The semantics of `add` and other arithmetic instructions change to work on the whole 64 bit register rather than just the 32 bits that the 32 bit version would operate on. The "w" suffix instructions mimic the behavior of the 32 bit versions: they operate on the lower 32 bits and sign-extend the 32 bit result into the upper 32 bits.
+
+Shifts are also a little different. Register shifts use one more bit of the shift amount (the low 6 bits rather than the low 5), so they can shift by up to 63 bits instead of 31. Immediate shifts encode the additional bit as well.
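+
+A small sketch of both differences, meant for 64 bit mode only. The register choices are arbitrary; the values in the comments follow from the descriptions above.
+
+```assembly
+li    t0, 0x7FFFFFFF  # largest positive 32 bit signed value
+addi  t1, t0, 1       # 64 bit add: t1 = 0x0000000080000000
+addiw t2, t0, 1       # 32 bit add: t2 = 0xFFFFFFFF80000000 (wrapped, then sign-extended)
+
+li    t3, 1
+li    t4, 33
+sll   t5, t3, t4      # uses the low 6 bits of t4: shifts by 33, t5 = 0x0000000200000000
+sllw  t6, t3, t4      # uses the low 5 bits of t4: shifts by 1, t6 = 2
+```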
+
+### 64 bit only instructions
+
+| Example Usage | Description |
+|---------------|-------------|
+|addiw t1,t2,-100|Addition immediate: set t1 to (t2 plus signed 12-bit immediate) using only the lower 32 bits|
+|addw t1,t2,t3|Addition: set t1 to (t2 plus t3) using only the lower 32 bits|
+|divuw t1,t2,t3|Division: set t1 to the result of t2/t3 using unsigned division limited to 32 bits|
+|divw t1,t2,t3|Division: set t1 to the result of t2/t3 using only the lower 32 bits|
+|fcvt.d.l f1, t1, dyn|Convert double from long: Assigns the value of t1 to f1|
+|fcvt.d.lu f1, t1, dyn|Convert double from unsigned long: Assigns the value of t1 to f1|
+|fcvt.l.d t1, f1, dyn|Convert 64 bit integer from double: Assigns the value of f1 (rounded) to t1|
+|fcvt.l.s t1, f1, dyn|Convert 64 bit integer from float: Assigns the value of f1 (rounded) to t1|
+|fcvt.lu.d t1, f1, dyn|Convert unsigned 64 bit integer from double: Assigns the value of f1 (rounded) to t1|
+|fcvt.lu.s t1, f1, dyn|Convert unsigned 64 bit integer from float: Assigns the value of f1 (rounded) to t1|
+|fcvt.s.l f1, t1, dyn|Convert float from long: Assigns the value of t1 to f1|
+|fcvt.s.lu f1, t1, dyn|Convert float from unsigned long: Assigns the value of t1 to f1|
+|fmv.d.x f1, t1|Move double: move bits representing a double from a 64 bit integer register|
+|fmv.x.d t1, f1|Move double: move bits representing a double to a 64 bit integer register|
+|ld t1, -100(t2)|Set t1 to contents of effective memory double word address|
+|lwu t1, -100(t2)|Set t1 to contents of effective memory word address without sign-extension|
+|mulw t1,t2,t3|Multiplication: set t1 to the lower 32 bits of t2*t3 using only the lower 32 bits of the input|
+|remuw t1,t2,t3|Remainder: set t1 to the remainder of t2/t3 using unsigned division limited to 32 bits|
+|remw t1,t2,t3|Remainder: set t1 to the remainder of t2/t3 using only the lower 32 bits|
+|sd t1, -100(t2)|Store double word : Store contents of t1 into effective memory double word address|
+|slli t1,t2,33|Shift left logical : Set t1 to result of shifting t2 left by number of bits specified by immediate|
+|slliw t1,t2,10|Shift left logical (32 bit): Set t1 to result of shifting t2 left by number of bits specified by immediate|
+|sllw t1,t2,t3|Shift left logical (32 bit): Set t1 to result of shifting t2 left by number of bits specified by value in low-order 5 bits of t3|
+|srai t1,t2,33|Shift right arithmetic : Set t1 to result of sign-extended shifting t2 right by number of bits specified by immediate|
+|sraiw t1,t2,10|Shift right arithmetic (32 bit): Set t1 to result of sign-extended shifting t2 right by number of bits specified by immediate|
+|sraw t1,t2,t3|Shift right arithmetic (32 bit): Set t1 to result of sign-extended shifting t2 right by number of bits specified by value in low-order 5 bits of t3|
+|srli t1,t2,33|Shift right logical : Set t1 to result of shifting t2 right by number of bits specified by immediate|
+|srliw t1,t2,10|Shift right logical (32 bit): Set t1 to result of shifting t2 right by number of bits specified by immediate|
+|srlw t1,t2,t3|Shift right logical (32 bit): Set t1 to result of shifting t2 right by number of bits specified by value in low-order 5 bits of t3|
+|subw t1,t2,t3|Subtraction: set t1 to (t2 minus t3) using only the lower 32 bits|
+
+### 64 bit only pseudo-instructions
+
+| Example Usage | Description |
+|---------------|-------------|
+|fcvt.d.l  f1, t1         |Convert double from signed 64 bit integer: Assigns the value of t1 to f1|
+|fcvt.d.lu f1, t1         |Convert double from unsigned 64 bit integer: Assigns the value of t1 to f1|
+|fcvt.l.d  t1, f1         |Convert signed 64 bit integer from double: Assigns the value of f1 (rounded) to t1|
+|fcvt.l.s  t1, f1         |Convert signed 64 bit integer from float: Assigns the value of f1 (rounded) to t1|
+|fcvt.lu.d t1, f1         |Convert unsigned 64 bit integer from double: Assigns the value of f1 (rounded) to t1|
+|fcvt.lu.s t1, f1         |Convert unsigned 64 bit integer from float: Assigns the value of f1 (rounded) to t1|
+|fcvt.s.l  f1, t1         |Convert float from signed 64 bit integer: Assigns the value of t1 to f1|
+|fcvt.s.lu f1, t1         |Convert float from unsigned 64 bit integer: Assigns the value of t1 to f1|
+|li t1,1000000000000000 |Load Immediate : Set t1 to 64-bit immediate|