LLVM JIT sample code

1. Introduction

I wrote and ran a sample LLVM JIT program, referring to [1] and the official LLVM documentation.

2. Prerequisites

clang: version 5.0.1
llvm: version 5.0.1
* I installed the above software according to [1].
implementation language: C

3. Sample Code

This program
(i) creates an LLVM IR (Intermediate Representation) program that multiplies two values, and
(ii) calls the IR program from (i) from the main function.

#include <llvm-c/Core.h>
#include <llvm-c/ExecutionEngine.h>
#include <llvm-c/Target.h>
#include <llvm-c/Analysis.h>
#include <llvm-c/BitWriter.h>

#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char const *argv[]) {
    LLVMModuleRef mod = LLVMModuleCreateWithName("test_module");

    LLVMTypeRef param_types[] = { LLVMInt32Type(), LLVMInt32Type() };
    LLVMTypeRef ret_type = LLVMFunctionType(LLVMInt32Type(), param_types, 2, 0);
    LLVMValueRef mul = LLVMAddFunction(mod, "mul", ret_type);

    LLVMBasicBlockRef entry = LLVMAppendBasicBlock(mul, "entry");

    LLVMBuilderRef builder = LLVMCreateBuilder();
    LLVMPositionBuilderAtEnd(builder, entry);
    LLVMValueRef lhs = LLVMGetParam(mul, 0);
    LLVMValueRef rhs = LLVMGetParam(mul, 1);
    LLVMValueRef tmp = LLVMBuildMul(builder, lhs, rhs, "tmp");
    LLVMBuildRet(builder, tmp);

    char *error = NULL;
    LLVMVerifyModule(mod, LLVMAbortProcessAction, &error);
    LLVMDisposeMessage(error);

    LLVMExecutionEngineRef engine;
    error = NULL;
    LLVMLinkInMCJIT();
    LLVMInitializeNativeTarget();
    LLVMInitializeNativeAsmPrinter();
    LLVMCreateExecutionEngineForModule(&engine, mod, &error);

    if (argc < 3) {
        fprintf(stderr, "usage: %s x y\n", argv[0]);
        exit(-1);
    }
    long long x = strtoll(argv[1], NULL, 10);
    long long y = strtoll(argv[2], NULL, 10);
    
    int (*mul_func)(int, int) = (int (*)(int, int))LLVMGetFunctionAddress(engine, "mul");
    printf("%d\n", mul_func(x, y));

    LLVMWriteBitcodeToFile(mod, "mul.bc");
    LLVMDisposeModule(mod);
}

4. Commentary on the Sample Code

4.1 Creating LLVM IR Program

(1) Including header files

LLVM is implemented in C++ and provides a C wrapper interface called LLVM-C. The sample includes the following LLVM-C headers.

#include <llvm-c/Core.h>
#include <llvm-c/ExecutionEngine.h>
#include <llvm-c/Target.h>
#include <llvm-c/Analysis.h>
#include <llvm-c/BitWriter.h>

(2) Creating a Module
Modules represent the top-level structure in an LLVM program[10]. The LLVMModuleCreateWithName function creates a new, empty module in the global context[10]. Internally, it creates an llvm::Module[16].

LLVMModuleRef mod = LLVMModuleCreateWithName("test_module");

(3) Adding a Function to the Module
The LLVMAddFunction function adds a function with the given name and function type to the module. The function type is created by the LLVMFunctionType function, which takes a return type, an array of parameter types, the number of parameters, and a flag indicating whether the function takes a variable number of arguments.

LLVMTypeRef param_types[] = { LLVMInt32Type(), LLVMInt32Type() };
LLVMTypeRef ret_type = LLVMFunctionType(LLVMInt32Type(), param_types, 2, 0);

The following is the call to LLVMAddFunction. Note that the variable named ret_type holds the whole function type, not only the return type.

LLVMValueRef mul = LLVMAddFunction(mod, "mul", ret_type);

(4) Appending a Basic Block to the Function

A basic block represents a single-entry, single-exit section of code. Basic blocks contain a list of instructions which form the body of the block. LLVMAppendBasicBlock appends a basic block to the end of a function using the global context[12].

LLVMBasicBlockRef entry = LLVMAppendBasicBlock(mul, "entry");

(5) Adding IR Instructions to the Basic Block

An instruction builder represents a point within a basic block and is the exclusive means of building instructions using the C interface[11]. LLVMCreateBuilder calls the IRBuilder constructor[11]. LLVMPositionBuilderAtEnd calls the SetInsertPoint method of the llvm::IRBuilderBase class, which specifies that created instructions should be appended to the end of the specified block[13].

LLVMBuilderRef builder = LLVMCreateBuilder();
LLVMPositionBuilderAtEnd(builder, entry);

The following code adds instructions that fetch the two parameters of the mul function, multiply them, and return the product. LLVMGetParam calls llvm::Function::arg_begin, which in turn calls CheckLazyArguments and Function::BuildLazyArguments to build the argument list on first use[14]. LLVMBuildMul[11] calls the CreateMul method, which creates a multiply instruction[15].

LLVMValueRef lhs = LLVMGetParam(mul, 0);
LLVMValueRef rhs = LLVMGetParam(mul, 1);
LLVMValueRef tmp = LLVMBuildMul(builder, lhs, rhs, "tmp");
LLVMBuildRet(builder, tmp);

4.2 JIT Engine Creation

MCJIT is the second generation of the LLVM JIT API[4] (the latest one is ORC). LLVMCreateExecutionEngineForModule[16] creates a JIT engine for the module by calling EngineBuilder::create, which creates an instance of the MCJIT engine[5]. This assumes that MCJIT has been linked in, so LLVMLinkInMCJIT should be called before LLVMCreateExecutionEngineForModule. LLVMLinkInMCJIT ensures that the MCJIT component is linked into the program, so that MCJIT::createJIT is registered as the engine factory. Code emission also requires the target machine and the asm printer to be set up[18], so LLVMInitializeNativeTarget and LLVMInitializeNativeAsmPrinter should likewise be called before LLVMCreateExecutionEngineForModule.

LLVMExecutionEngineRef engine;
error = NULL;
LLVMLinkInMCJIT();
LLVMInitializeNativeTarget();
LLVMInitializeNativeAsmPrinter();
LLVMCreateExecutionEngineForModule(&engine, mod, &error);

4.3 Running IR program using JIT engine

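LLVMGetFunctionAddress calls llvm::ExecutionEngine::getFunctionAddress, which returns the address of the specified function. This may involve code generation. The address is returned as an integer (uint64_t), so the sample casts it to a function pointer before calling it.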

int (*mul_func)(int, int) = (int (*)(int, int))LLVMGetFunctionAddress(engine, "mul");
printf("%d\n", mul_func(x, y));
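
The sample reads x and y with strtoll into long long variables and then passes them to a function that takes int, so out-of-range or malformed input is silently converted. The following is a minimal sketch of a stricter parser using only the C standard library. parse_i32 is a hypothetical helper name and is not part of the original program.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parses a base-10 integer that must fit in a
 * 32-bit signed int, matching the i32 parameters of the JIT-compiled
 * mul function. Returns 0 on success and -1 on malformed or
 * out-of-range input; *out is written only on success. */
int parse_i32(const char *s, int32_t *out) {
    char *end = NULL;

    errno = 0;
    long long v = strtoll(s, &end, 10);

    /* Reject overflow, empty input, and trailing garbage. */
    if (errno != 0 || end == s || *end != '\0') {
        return -1;
    }
    /* Reject values that do not fit in 32 bits. */
    if (v < INT32_MIN || v > INT32_MAX) {
        return -1;
    }
    *out = (int32_t)v;
    return 0;
}
```

In main, parse_i32(argv[1], &x) and parse_i32(argv[2], &y) could replace the two strtoll calls, printing the usage message when either call returns -1.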

4.4 Releasing resources

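LLVMDisposeModule releases a module created by LLVMModuleCreateWithName. Note that LLVMCreateExecutionEngineForModule hands ownership of the module to the engine, so a program that keeps running after this point should call LLVMDisposeExecutionEngine(engine) rather than disposing the module directly, and should also call LLVMDisposeBuilder(builder). This sample exits right after the call below.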

LLVMDisposeModule(mod);

5. Build and Run

clang -I. -g `llvm-config --cflags` -c ./mul.c -o ./mul.o
clang++ -o ./mul `llvm-config --cxxflags --ldflags --libs core executionengine mcjit interpreter analysis native bitwriter --system-libs` mul.o
./mul 2 3
# 6

6. References

[1] How to get started with the LLVM C API
https://www.pauladamsmith.com/blog/2015/01/how-to-get-started-with-llvm-c-api.html

[2] How to find operators accelarated by JIT in PostgreSQL
https://osmanthus.work/?p=158

[3] Getting Started/Tutorials, Kaleidoscope: Adding JIT and Optimizer Support
https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/LangImpl04.html#adding-a-jit-compiler

[4] Getting Started/Tutorials, Building a JIT: Starting out with KaleidoscopeJIT
https://releases.llvm.org/5.0.1/docs/tutorial/BuildingAJIT1.html

[5] LLVM User Guides, MCJIT Design and Implementation
https://releases.llvm.org/5.0.1/docs/MCJITDesignAndImplementation.html

[6] LLVM document, Kaleidoscope: Adding JIT and Optimizer Support, Adding a JIT Compiler
https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/LangImpl04.html#adding-a-jit-compiler

[7] LLVM References, Doxygen generated documentation
https://llvm.org/doxygen/

[8] LLVM References, Doxygen generated documentation, LLVM-C: C interface to LLVM
https://llvm.org/doxygen/group__LLVMC.html

[9] LLVM References, Doxygen generated documentation, LLVM-C, Core, Type
https://llvm.org/doxygen/group__LLVMCCoreType.html

[10] LLVM References, Doxygen generated documentation, LLVM-C, Core, Module
https://llvm.org/doxygen/group__LLVMCCoreModule.html

[11] LLVM References, Doxygen generated documentation, LLVM-C, Instruction Builders
https://llvm.org/doxygen/group__LLVMCCoreInstructionBuilder.html

[12] LLVM References, Doxygen generated documentation, LLVM-C, Core, Basic Block
https://llvm.org/doxygen/group__LLVMCCoreValueBasicBlock.html

[13] LLVM References, Doxygen generated documentation, llvm::IRBuilderBase Class Reference
https://llvm.org/doxygen/classllvm_1_1IRBuilderBase.html

[14] LLVM References, Doxygen generated documentation, Function.cpp
https://llvm.org/doxygen/Function_8cpp_source.html

[15] LLVM References, Doxygen generated documentation, Reassociate.cpp File Reference
https://llvm.org/doxygen/Reassociate_8cpp.html

[16] LLVM References, Doxygen generated documentation, LLVM-C, Execution Engine
https://llvm.org/doxygen/group__LLVMCExecutionEngine.html

[17] LLVM References, Doxygen generated documentation, ExecutionEngine.cpp
https://llvm.org/doxygen/ExecutionEngine_8cpp_source.html

[18] The LLVM Target-Independent Code Generator
https://releases.llvm.org/5.0.1/docs/CodeGenerator.html

[19] llvm::ExecutionEngine Class Reference
https://llvm.org/doxygen/classllvm_1_1ExecutionEngine.html

Published by ktke109
