Initial FFI documentation

2024-11-24 04:43:25 +03:00 · 2020-03-02 14:29:15 +00:00 · 2020-03-02 14:29:15 +00:00 · 2893e6f9e8
commit 2893e6f9e8
parent 63847ca69f
1 changed files with 374 additions and 1 deletions
--- a/docs/ffi/ffi.rst
+++ b/docs/ffi/ffi.rst
@ -2,4 +2,377 @@
 Foreign Function Interface
 **************************

-[TODO]
+Idris 2 is designed to support multiple code generators. The default target is
+Chez Scheme, with a Racket code generator also supported. However, the
+intention is, as with Idris 1, to support multiple targets on multiple platforms,
+including e.g. JavaScript, JVM, .NET, and others yet to be invented.
+This makes the design of a foreign function interface (FFI), which calls
+functions in other languages, a little challenging, since ideally it will
+support all possible targets!
+
+To this end, the Idris 2 FFI aims to be flexible and adaptable, while still
+supporting most common requirements without too much need for "glue" code in
+the foreign language.
+Foreign functions are declared with the ``%foreign`` directive, which takes the
+following general form:
+
+.. code-block:: idris
+
+    %foreign [specifiers]
+    name : t
+
+The specifier is an Idris ``String`` says in which language the foreign
+function is written, what it's called, and where to find it. There may be more
+than one specifier, and a code generator is free to choose specifier it
+understands. In general, a specifier has the form "Language:name,library". For
+example, in C:
+
+.. code-block:: idris
+
+    %foreign "C:puts,libc"
+    puts : String -> PrimIO Int
+
+It is up to specific code generators to decide how to locate the function and
+the library. In this document, we will assume the default Chez Scheme code
+generator (the examples also work with the Racket code generator) and that the
+foreign language is C.
+
+Example
+-------
+
+As a running example, we are going to work with a small C file. Save the
+following content to a file ``smallc.c``
+
+::
+
+    #include <stdio.h>
+
+    int add(int x, int y) {
+        return x+y;
+    }
+
+    int addWithMessage(char* msg, int x, int y) {
+        printf("%s: %d + %d = %d\n", msg, x, y, x+y);
+        return x+y;
+    }
+
+Then, compile it to a shared library with::
+
+    cc -shared smallc.c -o libsmall.so
+
+We can now write an Idris program which calls each of these. First, we'll
+write a small program which uses ``add`` to add two integers:
+
+.. code-block:: idris
+
+    %foreign "C:add,libsmall"
+    add : Int -> Int -> Int
+  
+    main : IO ()
+    main = printLn (add 70 24)
+
+The ``%foreign`` declaration states that ``add`` is written in C, with the
+name ``add`` in the library ``libsmall``. As long as the run time is able
+to locate ``libsmall.so`` (in practice it looks in the current directory and
+the system library paths) we can run this at the REPL:
+
+::
+
+    Main> :exec main
+    94
+
+Note that it is the programmers responsibility to make sure that the
+Idris function and C function have corresponding types. There is no way for
+the machine to check this! If you get it wrong, you will get unpredictable
+behaviour.
+
+Since ``add`` has no side effects, we've given it a return type of ``Int``.
+But what if the function has some effect on the outside world, like
+``addWithMessage``? In this case, we use ``PrimIO Int`` to say that it
+return a primitive IO action:
+
+.. code-block:: idris
+
+    %foreign "C:addWithMessage,libsmall"
+    prim_addWithMessage : String -> Int -> Int -> PrimIO Int
+
+Internally, ``PrimIO Int`` is a function which takes the current (linear)
+state of the world, and returns an ``Int`` with an updated state of the world.
+We can convert this into an ``IO`` action using ``primIO``:
+
+.. code-block:: idris
+
+    primIO : PrimIO a -> IO a
+
+So, we can extend our program as follows:
+
+.. code-block:: idris
+
+  addWithMessage : String -> Int -> Int -> IO Int
+  addWithMessage s x y = primIO $ prim_addWithMessage s x y
+  
+  main : IO ()
+  main
+      = do printLn (add 70 24)
+           primIO $ prim_addWithMessage "Sum" 70 24
+           pure ()
+
+It is up to the programmer to declare which functions are pure, and which have
+side effects, via ``PrimIO``. Executing this gives:
+
+::
+
+    Main> :exec main
+    94
+    Sum: 70 + 24 = 94
+
+We have seen two specifiers for foreign functions:
+
+.. code-block:: idris
+
+    %foreign "C:add,libsmall"
+    %foreign "C:addWithMessage,libsmall"
+
+These both have the same form: ``"C:[name],libsmall"`` so instead of writing
+the concrete ``String``, we write a function to compute the specifier, and
+use that instead:
+
+.. code-block:: idris
+
+    libsmall : String -> String
+    libsmall fn = "C:" ++ fn ++ ",libsmall"
+
+    %foreign (libsmall "add")
+    add : Int -> Int -> Int
+
+    %foreign (libsmall "addWithMessage")
+    prim_addWithMessage : String -> Int -> Int -> PrimIO Int
+
+Primitive FFI Types
+-------------------
+
+The types which can be passed to and returned from foreign functions are
+restricted to those which it is reasonable to assume any back end can handle.
+In practice, this means most primitive types, and a limited selection of
+others.  Argument types can be any of the following primitives:
+
+* ``Int``
+* ``Char``
+* ``Double`` (as ``double`` in C)
+* ``String`` (as ``char*`` in C)
+* ``Ptr t`` and ``AnyPtr`` (both as ``void*`` in C)
+
+Return types can be any of the above, plus:
+
+* ``()``
+* ``PrimIO t``, where ``t`` is a valid return type other than a ``PrimIO``.
+
+Additionally, foreign functions can take *callbacks*, and take and return
+C ``struct`` pointers.
+
+Callbacks
+---------
+
+It is often useful in C for a function to take a *callback*, that is a function
+which is called after doing some work. For example, we can write a function
+which takes callback that takes a ``char*`` and an ``int`` and returns a
+``int*``, in C, as follows (added to ``smallc.c`` above):
+
+::
+
+    char* applyFn(char* x, int y, StringFn f) {
+        printf("Applying callback to %s %d\n", x, y);
+        return f(x, y);
+    }
+
+Then, we can access this from Idris by declaring it as a ``%foreign``
+function and wrapping it in ``IO``, with the C function calling the Idris
+function as the callback:
+
+.. code-block:: idris
+
+    %foreign (libsmall "applyFn")
+    prim_applyFn : String -> Int -> (String -> Int -> String) -> PrimIO String
+    
+    applyFn : String -> Int -> (String -> Int -> String) -> IO String
+    applyFn c i f = primIO $ prim_applyFn c i f
+
+For example, we can try this as follows:
+
+.. code-block:: idris
+
+    pluralise : String -> Int -> String
+    pluralise str x
+        = show x ++ " " ++
+                 if x == 1
+                    then str
+                    else str ++ "s"
+    
+    main : IO ()
+    main
+        = do str1 <- applyFn "Biscuit" 10 pluralise
+             putStrLn str1
+             str2 <- applyFn "Tree" 1 pluralise
+             putStrLn str2
+
+As a variant, the callback could have a side effect:
+
+.. code-block:: idris
+
+    %foreign (libsmall "applyFn")
+    prim_applyFnIO : String -> Int -> (String -> Int -> PrimIO String) ->
+                     PrimIO String
+  
+This is a little more fiddly to lift to an ``IO`` function, due to the callback,
+but we can do so using ``toPrim : IO a -> PrimIO a``:
+  
+.. code-block:: idris
+
+    applyFnIO : String -> Int -> (String -> Int -> IO String) -> IO String
+    applyFnIO c i f = primIO $ prim_applyFnIO c i (\s, i => toPrim $ f s i)
+  
+For example, the above ``pluralise`` example, but printing a message in the
+callback:
+
+.. code-block:: idris
+
+    pluralise : String -> Int -> IO String
+    pluralise str x
+        = do putStrLn "Pluralising"
+             pure $ show x ++ " " ++
+                    if x == 1
+                       then str
+                       else str ++ "s"
+    
+    main : IO ()
+    main
+        = do str1 <- applyFnIO "Biscuit" 10 pluralise
+             putStrLn str1
+             str2 <- applyFnIO "Tree" 1 pluralise
+             putStrLn str2
+
+Structs
+-------
+
+Many C APIs pass around more complex data structures, as a ``struct``.
+We do not aim to be completely general in the C types we support, because
+this will make it harder to write code which is portable across multiple
+back ends. However, it is still often useful to be able to access a ``struct``
+directly. For example, add the following to the top of ``smallc.c``, and
+rebuild ``libsmall.so``:
+
+::
+
+    #include <stdlib.h>
+
+    typedef struct {
+        int x;
+        int y;
+    } point;
+
+    point* mkPoint(int x, int y) {
+        point* pt = malloc(sizeof(point));
+        pt->x = x;
+        pt->y = y;
+        return pt;
+    }
+
+    void freePoint(point* pt) {
+        free(pt);
+    }
+
+We can define a type for accessing ``point`` in Idris by importing
+``System.FFI`` and using the ``Struct`` type, as follows:
+
+.. code-block:: idris
+
+    Point : Type
+    Point = Struct "point" [("x", Int), ("y", Int)]
+    
+    %foreign (libsmall "mkPoint")
+    mkPoint : Int -> Int -> Point
+    
+    %foreign (libsmall "freePoint")
+    prim_freePoint : Point -> PrimIO ()
+    
+    freePoint : Point -> IO ()
+    freePoint p = primIO $ prim_freePoint p
+
+The ``Point`` type in Idris now corresponds to ``point*`` in C. Fields can
+be read and written using the following, also from ``System.FFI``:
+
+.. code-block:: idris
+
+    getField : Struct s fs -> (n : String) ->
+               FieldType n ty fs => ty
+    setField : Struct s fs -> (n : String) ->
+               FieldType n ty fs => ty -> IO ()
+
+Notice that fields are accessed by name, and must be available in the
+struct, given the constraint ``FieldType n ty fs``, which states that the
+field named ``n`` has type ``ty`` in the structure fields ``fs``.
+So, we can display a ``Point`` as follows by accessing the fields directly:
+
+.. code-block:: idris
+
+    showPoint : Point -> String
+    showPoint pt
+        = let x : Int = getField pt "x"
+              y : Int = getField pt "y" in
+              show (x, y)
+
+And, as a complete example, we can initialise, update, display and
+delete a ``Point`` as follows:
+
+.. code-block:: idris
+
+    main : IO ()
+    main = do let pt = mkPoint 20 30
+              setField pt "x" (the Int 40)
+              putStrLn $ showPoint pt
+              freePoint pt
+
+The field types of a ``Struct`` can be any of the following:
+
+* ``Int``
+* ``Char``
+* ``Double`` (``double`` in C)
+* ``Ptr a`` or ``AnyPtr`` (``void*`` in C)
+* Another ``Struct``, which is a pointer to a ``struct`` in C
+
+Note that this doesn't include ``String`` or function types! This is primarily
+because these aren't directly supported by the Chez back end. However, you can
+use another pointer type and convert. For example, assuming you have, in C:
+
+::
+
+    typedef struct {
+        char* name;
+        point* pt;
+    } namedpoint;
+
+You can represent this in Idris as:
+
+::
+
+    NamedPoint : Type
+    NamedPoint 
+        = Struct "namedpoint" 
+                   [("name", Ptr String),
+                   ("pt", Point)]
+
+That is, using a ``Ptr String`` instead of a ``String`` directly. Then you
+can convert between a ``void*`` and a ``char*`` in C:
+
+::
+
+    char* getString(void *p) {
+        return (char*)p;
+    }
+
+...and use this to convert to a ``String`` in Idris:
+
+.. code-block:: idris
+
+    %foreign (pfn "getString")
+    getStr : Ptr String -> String