grin/GRIN-LLVM-CodeGen.md
Csaba Hruska d0e7cc2a02 add todo
2018-02-23 12:26:59 +01:00

# Differences to Boquist's PhD thesis RISC backend
- support for multiple types (float, int, etc.)
- LLVM is typed ; values must have proper types i.e.
- no register pinning ; instead pass registers (e.g. heap pointer) as local parameters
- rely on LLVM register allocator

Boquist's RISC backend only supports the int type ; expressions that mix types are problematic, e.g. `unit (A Int | B Float)`

IDEA:
- GRIN must be monomorphic in the LLVM type system
- Vectorisation is an efficient mapping to product types
```
node = tag + simple values + node pointers
con data = simple values + node pointers
slim layout: tag + con data
fat layout: tag + con1 data + ... + conN data
```
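
The two layouts above can be modelled as a small sketch. All names here (`Field`, `Slot`, `slimLayout`, `fatLayout`) are hypothetical and are not taken from the GRIN code base; the sketch only assumes that a layout is a flat list of slots starting with the tag.

```haskell
-- Hypothetical model of node layouts; not the actual GRIN implementation.

-- A node component is either a simple value or a pointer to another node.
data Field = SimpleInt | SimpleFloat | NodePtr
  deriving (Eq, Show)

type Tag = String

-- "con data": the components of a single constructor.
type ConData = [Field]

-- A layout is a flat sequence of slots.
data Slot = TagSlot | FieldSlot Field
  deriving (Eq, Show)

-- slim layout: tag + con data (one constructor only)
slimLayout :: ConData -> [Slot]
slimLayout con = TagSlot : map FieldSlot con

-- fat layout: tag + con1 data + ... + conN data
-- Every constructor gets its own slots, so each slot has a single type.
fatLayout :: [(Tag, ConData)] -> [Slot]
fatLayout cons = TagSlot : concatMap (map FieldSlot . snd) cons
```

Under this model, a value such as `A Int | B Float` gets the fat layout `[TagSlot, FieldSlot SimpleInt, FieldSlot SimpleFloat]`, where no slot has to change its type depending on the tag.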
IDEA:
- support array literals
- support struct literals
HINT:
- fetch, store, update must be layout monomorphic
- the only supported cast is pointer cast
## Legalisation
Implement new GRIN transformations:
- llvm monomorphisation ; split polymorphic (in LLVM type system) GRIN node components
- node layout calculation
It is possible to put monomorphisation into the vectorisation transformation, as it already relies on the HPT result.
## Notes
- CodeGen supports monomorphic GRIN ; monomorphic GRIN validation pass
- new transformation: grin-monomorphisation
  turns variadic typed GRIN into monomorphic type GRIN (using HPT result)
- GRIN type system
  - node
  - heap location
  - simple type
  - tag
- GRIN type system for LLVM codegen
  - node TAG
  - heap location
  - simple type LLVM TYPE
  - tag N
- new transformation: llvm-monomorphisation
  makes GRIN monomorphic in LLVM type system (using HPT result)
- GRIN transformations must not lose information. If the information is only stored in the TAG info table then the TAG info table must be part of the GRIN language. e.g. as types

LLVM bitcast experiments:
- convert i16 to <i8,i8>
- convert i8 to <i2,i2,i2,i2>
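
As a rough illustration of what such a conversion computes, the bit splitting can be modelled in plain Haskell. This is only a sketch: the function names are hypothetical, and the lane order (lowest bits first) is an assumption that corresponds to a little-endian target.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Word (Word16, Word8)

-- Split a 16-bit value into two 8-bit lanes.
-- Assumption: lane 0 holds the least significant bits (little-endian order).
splitI16 :: Word16 -> (Word8, Word8)
splitI16 w = (fromIntegral (w .&. 0xFF), fromIntegral (w `shiftR` 8))

-- Split an 8-bit value into four 2-bit lanes, lowest bits first.
-- Each lane is stored in a Word8 because Haskell has no 2-bit integer type.
splitI8 :: Word8 -> [Word8]
splitI8 w = [ (w `shiftR` s) .&. 3 | s <- [0, 2, 4, 6] ]
```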
# LLVM codegen without analysis
It is possible to compile high-level GRIN to LLVM without any analysis.
However, this requires a universal value representation, where every GRIN register is mapped to a vector of universal values.
In effect, an interpreter is generated for the input source code.
If the source language can provide type information, then the value representation can be more efficient.
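
A universal value representation could look roughly like the following sketch. The type `UValue` and its constructors are hypothetical and only illustrate the idea that every register slot can hold any kind of value, so every read has to be checked at run time.

```haskell
import Data.Int (Int64)

-- Hypothetical universal value: one slot can hold any kind of GRIN value.
data UValue
  = UTag Int       -- node tag
  | UInt Int64     -- integer literal
  | UFloat Double  -- floating point literal
  | ULoc Int       -- heap location
  | UUnit          -- unit value
  deriving (Eq, Show)

-- Every GRIN register is a vector of universal values.
type Register = [UValue]

-- A node is stored as its tag followed by its arguments.
nodeToRegister :: Int -> [UValue] -> Register
nodeToRegister tag args = UTag tag : args

-- Without analysis results nothing is known statically,
-- so reading the tag is a dynamic check that may fail.
registerTag :: Register -> Maybe Int
registerTag (UTag t : _) = Just t
registerTag _            = Nothing
```

The dynamic check in `registerTag` is the interpreter-like overhead mentioned above; with type information the check and the uniform slot type can be avoided.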
# Node representation
The Heap-Points-To (HPT) analysis calculates a type set (the set of possible value types) for every GRIN variable and heap location.
The result of the HPT analysis lists these type sets, e.g.
```haskell
Heap
1 -> {CInt[{T_Int64}]}
2 -> {CInt[{T_Int64}]}
3 -> {CInt[{T_Int64}]
,Fadd[{1},{2}]}
Env
a -> {1,2,3}
b -> {T_Int64}
c -> {CInt[{T_Int64}]
,Fadd[{1},{2}]}
```
Each type set describes the possible values that a variable or heap location can hold.
A type set is a disjoint union of value types: the variable stores exactly one of them at a time.
A GRIN value can never hold several of these alternatives at once.
GRIN value types:
- simple type
- node
- location
Currently in GRIN only the following value type combinations can form a valid type set:
- `{simple type}` - singleton type set of a simple type
- `{node+}` - type set of one or more node types
- `{location+}` - type set of one or more location types
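
The validity rule above can be written down as a small check. The `ValueType` model and the function `validTypeSet` are hypothetical names chosen for this sketch; nodes are reduced to their tag name and locations to their abstract index.

```haskell
-- Hypothetical, simplified model of GRIN value types.
data ValueType
  = SimpleT String   -- simple type, e.g. "T_Int64"
  | NodeT String     -- node, identified by its tag, e.g. "CInt"
  | LocationT Int    -- abstract heap location index
  deriving (Eq, Show)

type TypeSet = [ValueType]

isNode :: ValueType -> Bool
isNode (NodeT _) = True
isNode _         = False

isLocation :: ValueType -> Bool
isLocation (LocationT _) = True
isLocation _             = False

-- A type set is valid if it is {simple type}, {node+} or {location+}.
validTypeSet :: TypeSet -> Bool
validTypeSet []          = False
validTypeSet [SimpleT _] = True
validTypeSet ts          = all isNode ts || all isLocation ts
```

For the HPT result shown earlier, `a -> {1,2,3}` corresponds to three `LocationT` values, `b` to a single `SimpleT`, and `c` to two `NodeT` values; all three pass the check.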
Due to the disjoint property of the GRIN values, they can be represented as tagged unions.
## Tag construction
Besides node types, type sets must have tags to mark their current content.
The type set tags can be constructed in the following way:
- `{simple type}` - singleton set, no tag is needed
- `{node+}` - node tags can be reused
- `{location+}` - location values are raw pointers, so the abstract location index can be used as the tag;
  the location becomes a tagged union value `{location, pointer}`
- `{location}` - singleton set, no tag is needed
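
The tag construction rules can be sketched as a function from a type set to a tag scheme. `TagScheme` and `tagScheme` are hypothetical names, and the value type model is the same simplification as assumed elsewhere in these notes (nodes as tag names, locations as abstract indices).

```haskell
-- Hypothetical, simplified model of GRIN value types.
data ValueType
  = SimpleT String   -- simple type, e.g. "T_Int64"
  | NodeT String     -- node, identified by its tag
  | LocationT Int    -- abstract heap location index
  deriving (Eq, Show)

-- How the content of a type set is marked at run time.
data TagScheme
  = NoTag               -- singleton set, no tag is needed
  | NodeTags [String]   -- the node tags are reused
  | LocationTags [Int]  -- abstract location indices are used as tags
  deriving (Eq, Show)

nodeTag :: ValueType -> Maybe String
nodeTag (NodeT t) = Just t
nodeTag _         = Nothing

locIndex :: ValueType -> Maybe Int
locIndex (LocationT l) = Just l
locIndex _             = Nothing

-- Returns Nothing for type sets that are not valid in GRIN.
tagScheme :: [ValueType] -> Maybe TagScheme
tagScheme [SimpleT _]   = Just NoTag
tagScheme [LocationT _] = Just NoTag
tagScheme ts
  | null ts                           = Nothing
  | Just tags <- traverse nodeTag ts  = Just (NodeTags tags)
  | Just locs <- traverse locIndex ts = Just (LocationTags locs)
  | otherwise                         = Nothing
```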
## Operations
Type set tagged union operations:
- `pack (value :: type :: type set) = (tagged union :: type set)`
- `unpack (tagged union :: type set) (tag/witness :: type :: type set) = (value :: type :: type set)`
Node operations:
- `build (tag :: node tag) (values :: [type]) = (node :: type)`
- `project (node :: type) (elemIndex :: Int) = (element :: type)`
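
A value-level sketch of these four operations is given below. It is untyped with respect to type sets: the signatures above track the type set statically, while this model checks the witness at run time. All names (`Value`, `Witness`, `Tagged`) are hypothetical.

```haskell
-- Hypothetical run-time model of GRIN values.
data Value
  = VInt Int
  | VFloat Double
  | VNode String [Value]  -- node tag and components
  | VLoc Int              -- abstract heap location index
  deriving (Eq, Show)

-- The tag/witness that says which member of the type set is stored.
data Witness = WSimple | WNode String | WLoc Int
  deriving (Eq, Show)

-- A tagged union: the witness together with the stored value.
data Tagged = Tagged Witness Value
  deriving (Eq, Show)

witnessOf :: Value -> Witness
witnessOf (VNode t _) = WNode t
witnessOf (VLoc l)    = WLoc l
witnessOf _           = WSimple

-- pack: wrap a value together with its witness.
pack :: Value -> Tagged
pack v = Tagged (witnessOf v) v

-- unpack: succeeds only if the expected witness matches the stored one.
unpack :: Tagged -> Witness -> Maybe Value
unpack (Tagged w v) expected
  | w == expected = Just v
  | otherwise     = Nothing

-- build: construct a node from a tag and its component values.
build :: String -> [Value] -> Value
build = VNode

-- project: select a component of a node by index.
project :: Value -> Int -> Maybe Value
project (VNode _ vs) i
  | i >= 0 && i < length vs = Just (vs !! i)
project _ _ = Nothing
```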
## TODO
- prune dead variables from HPTResult before converting to TypeEnv
- hash-cons TypeEnv to get rid of duplicate types
- use better variable names in the generated LLVM IR
- remove special heap pointer handling from codegen ; expose it in GRIN via a transformation ; heap pointer should be a parameter and return value of `store`.