# Differences to Boquist's PhD thesis RISC backend

- support for multiple types (float, int, etc.)
- LLVM is typed ; values must have proper types i.e.
- no register pinning ; instead pass registers (e.g. heap pointer) as local parameters
- rely on the LLVM register allocator

Boquist's RISC backend only supports the int type ; there are problematic expressions such as `unit (A Int | B Float)`.

IDEA:

- GRIN must be monomorphic in the LLVM type system
- Vectorisation is an efficient mapping to product types

```
node      = tag + simple values + node pointers
con data  = simple values + node pointers

slim layout:
  tag + con data

fat layout:
  tag + con1 data + ... + conN data
```

IDEA:

- support array literals
- support struct literals

HINT:

- fetch, store, update must be layout monomorphic
- the only supported cast is pointer cast

## Legalisation

Implement new GRIN transformations:

- llvm monomorphisation ; split polymorphic (in the LLVM type system) GRIN node components
- node layout calculation

It is possible to put monomorphisation into the vectorisation transformation, as it already relies on the HPT result.

## Notes

- CodeGen supports monomorphic GRIN ; monomorphic GRIN validation pass
- new transformation: grin-monomorphisation turns variadic typed GRIN into monomorphic typed GRIN (using the HPT result)
- GRIN type system
  - node
  - heap location
  - simple type
  - tag
- GRIN type system for LLVM codegen
  - node TAG
  - heap location
  - simple type LLVM TYPE
  - tag N
- new transformation: llvm-monomorphisation makes GRIN monomorphic in the LLVM type system (using the HPT result)
- GRIN transformations must not lose information. If the information is only stored in the TAG info table, then the TAG info table must be part of the GRIN language, e.g. as types.

LLVM bitcast experiments

- convert i16 to
- convert i8 to

# LLVM codegen without analysis

It is possible to compile to LLVM from high level GRIN without analysis. However, a universal value representation is required where every GRIN register is mapped to a vector of universal values.
Basically, an interpreter can be generated for the input source code. If the source language can provide type information, then the value representation can be more efficient.

# Node representation

The Heap-Points-To (HPT) analysis calculates a type set (the set of possible value types) for every GRIN variable and heap location. The HPT result records the type set of each variable and heap location, e.g.

```haskell
Heap
  1 -> {CInt[{T_Int64}]}
  2 -> {CInt[{T_Int64}]}
  3 -> {CInt[{T_Int64}]
       ,Fadd[{1},{2}]}
Env
  a -> {1,2,3}
  b -> {T_Int64}
  c -> {CInt[{T_Int64}]
       ,Fadd[{1},{2}]}
```

Each type set describes the possible values that a variable or heap location can hold. A type set is a disjoint union of value types: at any given time a GRIN value holds exactly one of the alternatives, never several at once.

GRIN value types:

- simple type
- node
- location

Currently in GRIN only the following value type combinations can form a valid type set:

- `{simple type}` - singleton type set of a simple type
- `{node+}` - type set of one or more node types
- `{location+}` - type set of one or more location types

Due to the disjoint property of the GRIN values, they can be represented as tagged unions.

## Tag construction

Besides node types, type sets must have tags to mark their current content.
The type set tags can be constructed the following way:

- `{simple type}` - singleton set, no tag is needed
- `{node+}` - node tags can be reused
- `{location+}` - location values are raw pointers, so the abstract location index can be used as the tag; a location becomes the tagged union value `{location, pointer}`
- `{location}` - singleton set, no tag is needed

## Operations

Type set tagged union operations:

- `pack (value :: type :: type set) = (tagged union :: type set)`
- `unpack (tagged union :: type set) (tag/witness :: type :: type set) = (value :: type :: type set)`

Node operations:

- `build (tag :: node tag) (values :: [type]) = (node :: type)`
- `project (node :: type) (elemIndex :: Int) = (element :: type)`

## TODO

- prune dead variables from HPTResult before converting to TypeEnv
- hash cons TypeEnv to get rid of duplicate types
- use better variable names in the generated LLVM IR
- remove special heap pointer handling from codegen ; expose it in GRIN via a transformation ; heap pointer should be a parameter and return value of `store`.
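
The valid type set shapes and the `pack` / `unpack` / `build` / `project` operations listed under Operations can be illustrated with a small Haskell model. This is only a sketch under assumptions: all names (`Ty`, `Value`, `Tagged`, `classify`, `T_Float`, ...) are hypothetical and not taken from the GRIN code base, node field types are omitted for brevity, and `pack` does not check that the value actually has the claimed type.

```haskell
-- Hypothetical model of GRIN type sets as tagged unions (illustration only).

data SimpleType = T_Int64 | T_Float   -- T_Float is an assumed extra simple type
  deriving (Eq, Show)

-- One member of a type set (node field types are left out for brevity).
data Ty
  = TySimple SimpleType
  | TyNode String      -- node tag, e.g. "CInt"
  | TyLocation Int     -- abstract heap location index
  deriving (Eq, Show)

type TypeSet = [Ty]

-- The three shapes a valid type set may have.
data Shape = SimpleShape | NodeShape | LocationShape
  deriving (Eq, Show)

isNode, isLoc :: Ty -> Bool
isNode (TyNode _) = True
isNode _          = False
isLoc (TyLocation _) = True
isLoc _              = False

-- Accept only {simple type}, {node+} and {location+}; reject everything else.
classify :: TypeSet -> Maybe Shape
classify [TySimple _] = Just SimpleShape
classify ts
  | not (null ts) && all isNode ts = Just NodeShape
  | not (null ts) && all isLoc ts  = Just LocationShape
  | otherwise                      = Nothing

data Value
  = VInt Int
  | VFloat Double
  | VNode String [Value]   -- node tag plus field values
  | VPtr Int               -- raw pointer, modelled as a plain address
  deriving (Eq, Show)

-- A tagged union value: the member type doubles as the tag/witness.
data Tagged = Tagged { tag :: Ty, payload :: Value }
  deriving (Eq, Show)

-- pack: inject a value into a type set; fails if the type is not a member.
pack :: TypeSet -> Ty -> Value -> Maybe Tagged
pack ts ty v
  | ty `elem` ts = Just (Tagged ty v)
  | otherwise    = Nothing

-- unpack: extract the value only if the witness matches the stored tag.
unpack :: Tagged -> Ty -> Maybe Value
unpack (Tagged ty v) witness
  | ty == witness = Just v
  | otherwise     = Nothing

-- build: construct a node from a tag and its field values.
build :: String -> [Value] -> Value
build = VNode

-- project: select a node field by index; fails on non-nodes and bad indices.
project :: Value -> Int -> Maybe Value
project (VNode _ vs) i
  | i >= 0 && i < length vs = Just (vs !! i)
project _ _ = Nothing
```

In this model a `{location+}` value such as `pack [TyLocation 1, TyLocation 2] (TyLocation 2) (VPtr addr)` corresponds to the `{location, pointer}` pair described under Tag construction: the abstract location index is the tag and the raw pointer is the payload.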