Latte72R/LaCC

LaCC is a minimalist C compiler that implements only the core language features you need to get simple C programs running.

Supported Features

1. Data Types

  • Primitive: int, char, void, unsigned, long, long long, short, _Bool
  • Derived: pointer types (T*), arrays (T[])
  • Composite: structures (struct), unions (union), enumerations (enum)
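As a rough illustration of how these types combine, the sketch below builds a small tagged value from an enum, a union, and a struct, and walks an array of them through a pointer. The names Kind, Payload, Value, sum_values, and demo_types are hypothetical and do not come from the LaCC sources.

```c
/* Hypothetical tagged value built from the composite types listed above. */
enum Kind { KIND_INT, KIND_CHAR };

union Payload {
  int i;
  char c;
};

struct Value {
  enum Kind kind;
  union Payload data;
};

/* Add up the payloads; a char payload contributes its character code. */
long sum_values(struct Value *vals, int n) {
  long total = 0;
  int i;
  for (i = 0; i < n; i = i + 1) {
    if (vals[i].kind == KIND_INT)
      total = total + vals[i].data.i;
    else
      total = total + vals[i].data.c;
  }
  return total;
}

/* Fill a local array of three values and sum them: 40 + 2 + 'A' (65). */
long demo_types(void) {
  struct Value vals[3];
  vals[0].kind = KIND_INT;
  vals[0].data.i = 40;
  vals[1].kind = KIND_INT;
  vals[1].data.i = 2;
  vals[2].kind = KIND_CHAR;
  vals[2].data.c = 'A';
  return sum_values(vals, 3);
}
```

The example sticks to plain assignments such as i = i + 1 because the operator list in this README does not mention increment or compound-assignment operators.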

2. Functions

  • Definition: specify parameter and return types
  • Declaration & Invocation: declare and call functions; use return to send back a value
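A minimal sketch of declaration, definition, and invocation working together: two mutually recursive functions need a declaration before the first call. The function names is_even and is_odd are illustrative only.

```c
/* Declarations: each function is called before its definition appears. */
int is_even(unsigned n);
int is_odd(unsigned n);

/* Definitions: parameter and return types are spelled out explicitly. */
int is_even(unsigned n) {
  if (n == 0)
    return 1;
  return is_odd(n - 1);
}

int is_odd(unsigned n) {
  if (n == 0)
    return 0;
  return is_even(n - 1);
}
```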

3. Global & Local Variables

Both global and local (stack) variable declarations are supported.
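The difference can be sketched with a counter, where the global keeps its value between calls and the local is recreated on each call. The names call_count and next_id are hypothetical.

```c
int call_count; /* global: starts at zero and persists across calls */

int next_id(void) {
  int id; /* local: lives on the stack for this call only */
  call_count = call_count + 1;
  id = 100 + call_count;
  return id;
}
```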

4. Control Flow

  • Conditional Branching
    • if (condition) { … }
    • else { … }
  • Loops
    • for (init; condition; step) { … }
      You can omit any or all of the three components of a for loop (initialization, condition, and step).
    • while (condition) { … }
    • do { … } while (condition);
  • Loop Control
    • break exits a loop
    • continue skips to the next iteration
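The constructs above can be sketched together in three small functions. All function names are illustrative and the code is not taken from the LaCC examples directory.

```c
/* Count primes in [2, limit) by trial division. */
int count_primes(int limit) {
  int count = 0;
  int n;
  for (n = 2; n < limit; n = n + 1) {
    int d = 2;
    int is_prime = 1;
    while (d * d <= n) {
      if (n % d == 0) {
        is_prime = 0;
        break; /* leave the while loop as soon as a divisor is found */
      }
      d = d + 1;
    }
    if (!is_prime)
      continue; /* skip the increment below and move to the next n */
    count = count + 1;
  }
  return count;
}

/* Number of decimal digits; do/while runs once even when n is 0. */
int digit_count(int n) {
  int digits = 0;
  do {
    digits = digits + 1;
    n = n / 10;
  } while (n != 0);
  return digits;
}

/* A for loop with all three components omitted; break ends it. */
int first_power_of_two_at_least(int n) {
  int p = 1;
  for (;;) {
    if (p >= n)
      break;
    p = p * 2;
  }
  return p;
}
```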

5. Operators

  • Arithmetic: +, -, *, /, %
  • Relational: ==, !=, <, <=, >, >=
  • Logical: &&, ||, !
  • Bitwise: &, |, ^, ~, <<, >>
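A short sketch of the bitwise, relational, and logical operators in use. The helper names (popcount, bit_mask, set_bit, clear_bit, toggle_bit, in_range) are hypothetical. The mask is built from an unsigned variable rather than a suffixed literal, since this README does not say whether integer suffixes such as 1u are accepted.

```c
/* Count set bits by testing the lowest bit and shifting right. */
int popcount(unsigned x) {
  int n = 0;
  while (x != 0) {
    n = n + (x & 1);
    x = x >> 1;
  }
  return n;
}

/* Mask with only bit k set. */
unsigned bit_mask(int k) {
  unsigned one = 1;
  return one << k;
}

unsigned set_bit(unsigned x, int k) { return x | bit_mask(k); }
unsigned clear_bit(unsigned x, int k) { return x & ~bit_mask(k); }
unsigned toggle_bit(unsigned x, int k) { return x ^ bit_mask(k); }

/* 1 when lo <= v <= hi, combining relational and logical operators. */
int in_range(int v, int lo, int hi) { return lo <= v && v <= hi; }
```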

6. Others

  • Include directive
    LaCC can process #include statements with double quotes (e.g., #include "foo.h") and with angle brackets (e.g., #include <bar.h>).
    Quoted includes first look relative to the including file, while angle brackets search configured include paths.
    /usr/include and /usr/include/x86_64-linux-gnu are added automatically, but many system headers still rely on unsupported language features.

  • Preprocessor macros
    Object-like and function-like #define directives (including # stringizing and ## token pasting) are expanded during tokenization.
    Diagnostics that originate inside macro expansions point back to the original file/line so complex headers remain debuggable.

  • Preprocessor conditionals
    Conditional compilation with #if, #ifdef, #ifndef, #elif, #else, and #endif is supported, along with #undef for removing macro definitions.

  • Built-in predefined macros
    __LACC__, __x86_64__, __LP64__, and a handful of compatibility aliases are predefined so that typical Unix headers can detect the environment.

  • Initializer lists for arrays and structs (with limitations)

    1. Array initialization with a list of integer constants:
      int arr[3] = {3, 6, 2};
    2. String literal initialization for character arrays:
      char str[15] = "Hello, World!\n";
    3. Designated initializers for structs/unions:
      struct AB v = {.a = 1, .b = 2};

    Initializer expressions must currently be integer constants (or string literals for char[]). Nested array initializers beyond a single level are not supported.

  • Extern declarations
    LaCC supports external variable declarations with basic types, pointers, and arrays.

  • Typedef support
    LaCC supports the typedef keyword for creating type aliases.

  • Type qualifiers & storage-class specifiers: const, volatile, static, and pointer-only qualifiers seen in system headers (restrict, _Nullable, _Nonnull, __strong, etc.) are parsed and safely ignored.

  • goto statement and labels
    LaCC supports goto statements and label definitions, allowing for non-linear control flow.

  • Struct and Union member access
    Both dot notation (.) for direct access and arrow notation (->) for pointer access are supported.

  • Struct/union bit-fields
    C-style bit-field declarations (with optional zero-width separators and unnamed fields) are parsed and contribute to layout checks. Code-gen still treats them as opaque aggregates.

  • Binary and Hexadecimal numbers: 0b001011 or 0xFF2A

  • Inline Assembly passthrough
    __asm__ and its variants with volatile/clobber lists are recognized and skipped, allowing headers that embed inline assembly to preprocess successfully (no assembly is emitted by LaCC).

  • Comment support

    // single-line comment
    /* multi-line
       comment */
  • switch statements and case / default labels
    LaCC supports switch statements with case labels for branching based on the value of an expression.

  • Explicit type casting
    LaCC allows explicit type casting between compatible pointer types.

  • Ternary conditional operator
    LaCC supports the ternary conditional operator (?:) for inline conditional expressions.
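Several of the items in this list can be sketched in one small file: object-like and function-like macros, # and ##, a preprocessor conditional, typedef, the three initializer forms, -> access, goto, switch, and ?:. Every identifier below (BOARD_SIZE, MAX, STR, FIELD, SCALE, Pair, and the function names) is hypothetical. The initializers use only integer constants and a string literal, in line with the stated limitation.

```c
#define BOARD_SIZE 3                      /* object-like macro */
#define MAX(a, b) ((a) > (b) ? (a) : (b)) /* function-like macro using ?: */
#define STR(x) #x                         /* stringizing */
#define FIELD(n) field_##n                /* token pasting */

#ifndef SCALE /* conditional compilation: supply a default */
#define SCALE 10
#endif

struct Pair {
  int a;
  int b;
};
typedef struct Pair Pair;

/* Largest element of a brace-initialized array, times SCALE. */
int scaled_max(void) {
  int arr[BOARD_SIZE] = {3, 6, 2};
  int best = arr[0];
  int i;
  for (i = 1; i < BOARD_SIZE; i = i + 1)
    best = MAX(best, arr[i]);
  return best * SCALE;
}

/* Designated initializer, then . and -> member access. */
int pair_sum(void) {
  Pair v = {.a = 1, .b = 2};
  Pair *p = &v;
  int field_total = p->a + v.b;
  return FIELD(total); /* expands to field_total */
}

/* Length of a char array initialized from a string literal, via goto. */
int name_length(void) {
  char name[8] = STR(LaCC); /* expands to "LaCC" */
  int n = 0;
scan:
  if (name[n] != 0) {
    n = n + 1;
    goto scan;
  }
  return n;
}

/* switch with stacked case labels and a default (non-leap year). */
int days_in_month(int month) {
  switch (month) {
  case 2:
    return 28;
  case 4:
  case 6:
  case 9:
  case 11:
    return 30;
  default:
    return 31;
  }
}
```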

Note on float/double support

To keep system headers parsable, float and double are recognized only for parsing purposes.

  • Arithmetic, comparisons, and code generation for floating-point types are not implemented yet.
  • sizeof(float) is treated as 4 and sizeof(double) as 8 (LP64). long double is also simplified to 8 bytes.
  • Header typedefs like typedef float _Float32; and typedef double _Float64; parse successfully, but do not rely on using these types in expressions for now.

Full floating-point semantics (dedicated TY_FLOAT / TY_DOUBLE, arithmetic, and ABI handling) may be added later. Until then, avoid expressions that require floating-point operations.
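One common way to avoid floating-point operations is integer fixed-point arithmetic. The sketch below is a generic workaround, not something LaCC provides. It assumes a Q16.16 format stored in a 64-bit long (LP64), and the fx_ helper names are hypothetical.

```c
/* Hypothetical Q16.16 fixed-point helpers: real value = raw / 65536. */
long fx_from_int(long n) { return n * 65536; }
long fx_mul(long a, long b) { return a * b / 65536; }
long fx_div(long a, long b) { return a * 65536 / b; }
long fx_to_int(long a) { return a / 65536; }

/* 1.5 * 2.5 = 3.75, returned in hundredths (375) to stay integer-only. */
long demo_fixed_point(void) {
  long one_and_half = fx_from_int(3) / 2; /* 1.5 */
  long two_and_half = fx_from_int(5) / 2; /* 2.5 */
  long product = fx_mul(one_and_half, two_and_half);
  return product * 100 / 65536;
}
```

The intermediate products are not checked for overflow, so this only suits values well inside the 64-bit range.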

Unsupported Constructs

LaCC does not support the following:

  • Floating-point operations: while float and double are recognized for parsing and sizeof, arithmetic and codegen are not implemented
  • Deeply nested array initializers (initializer lists nested beyond a single level)
  • Inline assembly (__asm__ blocks are recognized and skipped, but no assembly is emitted)
  • Variadic functions (va_list, va_start, and va_arg are not supported)
  • Nested functions (functions defined within other functions)
  • Variable Length Arrays (VLAs)
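Two of these gaps can often be worked around with supported features. The following is a sketch under assumptions, not a recommendation from the LaCC author: pass an array and a count instead of variadic arguments, and use a fixed-capacity buffer instead of a VLA. MAX_ITEMS and the function names are hypothetical.

```c
#define MAX_ITEMS 16 /* hypothetical fixed capacity standing in for a VLA */

/* Instead of a variadic sum(count, ...): take the values as an array. */
int sum_ints(int *values, int count) {
  int total = 0;
  int i;
  for (i = 0; i < count; i = i + 1)
    total = total + values[i];
  return total;
}

/* Instead of `int squares[n];`: use a fixed-size buffer and clamp n. */
int sum_first_squares(int n) {
  int squares[MAX_ITEMS];
  int i;
  if (n > MAX_ITEMS)
    n = MAX_ITEMS;
  for (i = 0; i < n; i = i + 1)
    squares[i] = (i + 1) * (i + 1);
  return sum_ints(squares, n);
}
```

Clamping silently truncates oversized requests. Real code might prefer to report an error instead.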

Limitations

Single‐Unit Compilation

LaCC only handles one .c file at a time — there's no support for separate compilation or linking multiple translation units.

Optimizations

LaCC does not emit assembly directly from AST nodes. Instead, it first generates MIR (an intermediate representation), and most of its optimizations run as passes over that MIR.

Optimization behavior depends on the optimization level:

  • -O0

    • Generates MIR, then runs normal register allocation and instruction emission.
    • Does not run the main optimization passes (inline expansion / mem2reg / CFG cleanup).
    • Still applies parser-side constant folding and local simplifications in the emitter where possible (for example, immediate-form instruction selection).
  • -O1

    • Includes all -O0 behavior plus staged MIR-level optimization passes.
    • Typical passes include:
      • Conditional inlining of small static inline functions
      • Copy propagation / DCE
      • Compare/branch fusion (for cmp+setcc+jz-style patterns)
      • Unreachable block and unreferenced label pruning
      • mem2reg (promotable locals to registers)
      • CFG-based dead store elimination
      • VReg compaction
      • Constant folding at the emission stage

-O is currently equivalent to -O1.
There is no separate -O2 (or higher) optimization pipeline yet.

To compare optimization results (assembly line counts), use:

make asmcmp

For MIR / register-allocation debugging output, use:

LACC_DUMP_MIR=1 ./build/lacc -O1 -S foo.c -o foo.s
LACC_DUMP_RA=1 ./build/lacc -O1 -S foo.c -o foo.s
LACC_DUMP_RA=1 LACC_DUMP_RA_FN=main ./build/lacc -O1 -S foo.c -o foo.s
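A small input such as the following could be saved as foo.c and compiled at -O0 and -O1 with the commands above to compare the generated assembly. The comments name the passes that might plausibly apply to each construct. That mapping is an assumption based on the pass list in this section and has not been checked against LaCC's actual output. The function names are hypothetical.

```c
/* Small helper: a possible candidate for conditional inlining at -O1. */
static inline int square(int x) { return x * x; }

int sum_of_squares_up_to(int n) {
  int total = 99; /* overwritten before any read: a possible dead store */
  int i;
  total = 0;
  for (i = 1; i <= n; i = i + 1) {
    int s = square(i); /* scalar local: a possible mem2reg candidate */
    total = total + s; /* s is a plain copy that propagation may remove */
  }
  return total;
}
```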

Getting Started with LaCC

1. Clone the repository and enter it

git clone https://github.com/Latte72R/LaCC
cd LaCC

After that, you have a few make targets to build and test your compiler:

2. Build the self-hosted compiler

make selfhost

Here, a bootstrap compiler named bootstrap is used to recompile the compiler source itself, producing a self-hosted compiler named lacc.
This ensures that your compiler can correctly compile its own code.

3. Run specific files with the self-hosted compiler

make run FILE=./examples/lifegame.c
make run FILE=./examples/rotate.c

This command compiles and runs the specified C file using the self-hosted compiler lacc.

4. Run unit tests with the self-hosted compiler

make unittest

Passing all tests confirms that your self-hosted compiler behaves as expected.
The unit tests are located in the tests/unittest.c file.

5. Run warning tests with the self-hosted compiler

make warntest

This command runs warning tests to ensure that the compiler correctly identifies and reports warnings. The warning tests are located in the tests/warntest.c file.

6. Run error tests with the self-hosted compiler

make errortest

This command runs error tests to ensure that the compiler correctly identifies and reports errors. The error tests are located in the tests/errortest.sh file.

7. Clean up build artifacts

make clean

Removes the generated binaries and assembly files created during the build process.

8. Show help

make help

Displays a list of available make targets and their descriptions.

About the Author

LaCC is designed and maintained by student engineer Latte72!

Latte's C Compiler
