Fixes in code comments and Haddock documentation (no code)
parent 34b56af22f
commit 0c2776b164
@@ -4,9 +4,9 @@
 {- |
 
 Module : GHC.Packing
-Copyright : (c) Jost Berthold, 2010-2014,
+Copyright : (c) Jost Berthold, 2010-2015,
 License : BSD3
-Maintainer : jb.diku@gmail.com
+Maintainer : jost.berthold@gmail.com
 Stability : experimental
 Portability : no (depends on GHC internals)
 
@@ -31,11 +31,12 @@ heap data:
 The routine will throw a 'PackException' if an error occurs inside the
 C code which accesses the Haskell heap (see @'PackException'@).
 In presence of concurrent threads, another thread might be evaluating
-data /referred to/ by the data to be serialised. It would be nice to
-/block/ the calling thread in this case, but this is not possible in
-the library version (see <#background Background Information> below).
-'trySerialize' variant will instead signal the condition as
-'PackException' 'P_BLACKHOLE'.
+data /referred to/ by the data to be serialised. In this case, the calling
+thread will /block/ on the ongoing evaluation and continue when evaluated
+data is available.
+Internally, there is a 'PackException' 'P_BLACKHOLE' to signal the
+condition, but it is hidden inside the core library
+(see <#background Background Information> below).
 
 The inverse operation to serialisation is
 
@@ -191,27 +192,29 @@ and better usability.
 The original primitive @'serialize'@ is modified and now returns error
 codes, leading to the following type (again paraphrasing):
 
-> serialize# :: a -> IO ( Int# , ByteArray# )
+> trySerialize# :: a -> IO ( Int# , ByteArray# )
 
 where the @Int#@ encodes potential error conditions returned by the runtime.
 
-A second primitive operation has been defined, which considers the presence
-of concurrent evaluations of the serialised data by other threads:
+A second primitive operation has been defined, which uses a pre-allocated
+@ByteArray#@
 
-> trySerialize# :: a -> IO ( Int# , ByteArray# )
+> trySerializeWith# :: a -> ByteArray# -> IO ( Int# , ByteArray# )
 
-Further to returning error codes, this primitive operation will not block
+Further to returning error codes, the newer primitive operation do not block
 the calling thread when the serialisation encounters a blackhole in the
-heap. While blocking is a perfectly acceptable behaviour (making packing
-behave analogous to evaluation wrt. concurrency), the @'trySerialize'@
-variant allows one to explicitly control it and avoid becoming unresponsive.
+heap.
+It would be possible to observe the existence of blackholes from Haskell by
+the return code of these primitive operation. This could - in theory - be
+used to explicitly control and avoid blocking (avoiding unresponsive behaviour).
 In practice, however, making blackholes observable from Haskell is
-certainly undesirable. Therefore, the primitive operation will return
-the address of the blackhole. This makes it possible to encode blocking on the blackhole at the Haskell level (see code in the @GHC.Packing.Core@ module).
+certainly undesirable. The primitive operations return the address of the
+blackhole, and the caller will block on this blackhole at
+the Haskell level (see code in the @GHC.Packing.Core@ module).
 
 The Haskell layer and its types protect the interface function @'deserialize'@
 from being applied to grossly wrong data (by checking a fingerprint of the
-executable and the expected type), but deserialisation is fragile by nature
+executable and the expected type), but deserialisation is still rather fragile
 (unpacking code pointers and data).
 The primitive operation in the runtime system will only detect grossly wrong
 formats, and the primitive will return error code @'P_GARBLED'@ when data

@@ -5,9 +5,9 @@
 {-|
 
 Module : GHC.Packing
-Copyright : (c) Jost Berthold, 2010-2014,
+Copyright : (c) Jost Berthold, 2010-2015,
 License : BSD3
-Maintainer : jb.diku@gmail.com
+Maintainer : jost.berthold@gmail.com
 Stability : experimental
 Portability : no (depends on GHC internals)
 
@@ -41,13 +41,13 @@ import Control.Exception(throw)
 trySerialize :: a -> IO (Serialized a) -- throws PackException (RTS)
 trySerialize x = trySerializeWith x defaultBufSize
 
--- | A default buffer size, used when using the old API
+-- | default buffer size used by trySerialize
 defaultBufSize :: Int
 defaultBufSize = 10 * 2^20 -- 10 MB
 
 -- | Extended interface function: Allocates a buffer of given size (in
 -- bytes), serialises data into it, then truncates the buffer to the
--- actually required size before returning it (as @'Serialized' a@)
+-- required size before returning it (as @'Serialized' a@)
 trySerializeWith :: a -> Int -> IO (Serialized a) -- using instance PrimMonad IO
 trySerializeWith dat bufsize
 = do buf <- newByteArray bufsize

@@ -3,9 +3,9 @@
 {-|
 
 Module : GHC.Packing.PackException
-Copyright : (c) Jost Berthold, 2010-2014,
+Copyright : (c) Jost Berthold, 2010-2015,
 License : BSD3
-Maintainer : jb.diku@gmail.com
+Maintainer : jost.berthold@gmail.com
 Stability : experimental
 Portability : no (depends on GHC internals)
 
@@ -13,19 +13,20 @@ Exception type for packman library, using magic constants #include'd
 from a C header file shared with the foreign primitive operation code.
 
 'PackException's can occur at Haskell level or in the foreign primop.
-The Haskell-level exceptions all occur when reading in
-'GHC.Packing.Serialised' data, and are:
 
-* 'P_BinaryMismatch': the serialised data have been produced by a
+All Haskell-level exceptions are cases of invalid data when /reading/
+and /deserialising/ 'GHC.Packing.Serialised' data:
+
+* 'P_BinaryMismatch': serialised data were produced by a
 different executable (must be the same binary).
-* 'P_TypeMismatch': the serialised data have the wrong type
+* 'P_TypeMismatch': serialised data have the wrong type
 * 'P_ParseError': serialised data could not be parsed (from binary or
 text format)
 
-The other exceptions are return codes of the foreign primitive
-operation, and indicate errors at the C level. Most of them occur when
-serialising data; the exception is 'P_GARBLED' which indicates corrupt
-serialised data.
+The exceptions caused by the foreign primops (return codes)
+indicate errors at the C level. Most of them can occur when
+serialising data; the exception is 'P_GARBLED' which indicates that
+serialised data is garbled.
 
 -}
 
@@ -50,20 +51,21 @@ data PackException =
 P_SUCCESS -- ^ no error, ==0.
 -- Internal code, should never be seen by users.
 | P_BLACKHOLE -- ^ RTS: packing hit a blackhole.
--- Used internally, should probably not be seen by users.
+-- Used internally, not passed to users.
 | P_NOBUFFER -- ^ RTS: buffer too small
 | P_CANNOTPACK -- ^ RTS: contains closure which cannot be packed (MVar, TVar)
 | P_UNSUPPORTED -- ^ RTS: contains unsupported closure type (implementation missing)
 | P_IMPOSSIBLE -- ^ RTS: impossible case (stack frame, message,...RTS bug!)
 | P_GARBLED -- ^ RTS: corrupted data for deserialisation
+
 -- Error codes from inside Haskell
 | P_ParseError -- ^ Haskell: Packet data could not be parsed
 | P_BinaryMismatch -- ^ Haskell: Executable binaries do not match
 | P_TypeMismatch -- ^ Haskell: Packet data encodes unexpected type
 deriving (Eq, Ord, Typeable)
 
--- | decode an 'Int#' to a @'PackException'@. Magic constants are read
--- from file /cbits/Errors.h/.
+-- | decodes an 'Int#' to a @'PackException'@. Magic constants are read
+-- from file /cbits///Errors.h/.
 decodeEx :: Int## -> PackException
 decodeEx #{const P_SUCCESS}## = P_SUCCESS -- unexpected
 decodeEx #{const P_BLACKHOLE}## = P_BLACKHOLE
@@ -92,7 +94,7 @@ instance Show PackException where
 
 instance Exception PackException
 
--- | internally used: checks if the given code indicates 'P_BLACKHOLE'
+-- | internal: checks if the given code indicates 'P_BLACKHOLE'
 isBHExc :: Int## -> Bool
 isBHExc #{const P_BLACKHOLE}## = True
 isBHExc e## = False

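Review note: `P_NOBUFFER` is documented here as "buffer too small", and `trySerializeWith` takes the buffer size from the caller. One way a caller might react is to retry with a larger buffer. The sketch below shows only that control flow and assumes nothing about the library beyond what the diff shows: `PackError`, `packWith` and `packGrowing` are hypothetical stand-ins so that the example runs with base alone.

```haskell
import Control.Exception (Exception, throwIO, try)

-- Stand-in for the library's exception type; only two cases are modelled.
data PackError = NoBuffer | CannotPack
  deriving (Eq, Show)

instance Exception PackError

-- Hypothetical serialiser: succeeds only if the payload fits the buffer
-- and then reports the number of bytes used.
packWith :: String -> Int -> IO Int
packWith payload bufSize
  | length payload > bufSize = throwIO NoBuffer
  | otherwise                = return (length payload)

-- Retry with a doubled buffer on NoBuffer, up to a size limit.
-- Any other outcome (success or a different error) is returned unchanged.
packGrowing :: Int -> Int -> String -> IO (Either PackError Int)
packGrowing bufSize maxSize payload = do
  r <- try (packWith payload bufSize)
  case r of
    Left NoBuffer
      | bufSize * 2 <= maxSize -> packGrowing (bufSize * 2) maxSize payload
    _ -> return r

main :: IO ()
main = do
  packGrowing 4 64 "hello, packman" >>= print  -- fits once the buffer reaches 16
  packGrowing 4 8  "hello, packman" >>= print  -- size limit reached first
```

With the real library the same shape would presumably wrap `trySerializeWith` and match on `P_NOBUFFER`; that mapping is an assumption, not something this commit shows.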
@@ -3,9 +3,9 @@
 {-|
 
 Module : GHC.Packing.Type
-Copyright : (c) Jost Berthold, 2010-2014,
+Copyright : (c) Jost Berthold, 2010-2015,
 License : BSD3
-Maintainer : Jost Berthold <jb.diku@gmail.com>
+Maintainer : Jost Berthold <jost.berthold@gmail.com>
 Stability : experimental
 Portability : no (depends on GHC internals)
 
@@ -13,8 +13,8 @@ Portability : no (depends on GHC internals)
 
 The data type @'Serialized' a@ includes a phantom type @a@ to ensure
 type safety within one and the same program run. Type @a@ can be
-polymorphic (at compile time, that is) when @Serialized a@ is not used
-apart from being argument to @deserialize@.
+polymorphic (at compile time, that is) when @'Serialized' a@ is not used
+apart from being argument to @'deserialize'@.
 
 The @Show@, @Read@, and @Binary@ instances of @Serialized a@ require an
 additional @Typeable@ context (which requires @a@ to be monomorphic)
@@ -67,7 +67,7 @@ import Control.Exception(throw)
 import GHC.Packing.PackException
 
 -- | The type of Serialized data. Phantom type 'a' ensures that we
--- unpack the expected type do not unpack rubbish.
+-- unpack data as the expected type.
 data Serialized a = Serialized { packetData :: ByteArray# }
 
 {- $ShowReadBinary
@@ -75,14 +75,14 @@ data Serialized a = Serialized { packetData :: ByteArray# }
 The power of evaluation-orthogonal serialisation is that one can
 /externalise/ partially evaluated data (containing thunks), for
 instance write it to disk or send it over a network.
 
 Therefore, the module defines a 'Binary' instance for 'Serialized a',
 as well as instances for 'Read' and 'Show'@ which satisfy
-@ read . show == id :: 'Serialized' a -> 'Serialized' a@.
-
+> read . show == id :: 'Serialized' a -> 'Serialized' a
+
 The phantom type is enough to ensure type-correctness when serialised
 data remain in one single program run. However, when data from
-previous runs are read in from an external source, their type needs to
+previous runs are read from an external source, their type needs to
 be checked at runtime. Type information must be stored together with
 the (binary) serialisation data.
 
@@ -90,7 +90,7 @@ The serialised data contain pointers to static data in the generating
 program (top-level functions and constants) and very likely to
 additional library code. Therefore, the /exact same binary/ must be
 used when reading in serialised data from an external source. A hash
-of the executable is therefore included in the representation as well.
+of the executable is included in the representation to ensure this.
 
 -}
 
@@ -121,11 +121,11 @@ showWArray arr = unlines [ show i ++ ":" ++ unwords (map showH row)
 where (first,rest) = splitAt 4 xs
 
 -----------------------------------------------
--- | Reads the format generated by the (@'Show'@) instance, checks
+-- | Reads the format generated by the 'Show' instance, checks
 -- hash values for executable and type and parses exactly as much as
 -- the included data size announces.
 instance Typeable a => Read (Serialized a)
--- using ReadP parser (base-4.x), eats
+-- using ReadP parser (base-4.x)
 where readsPrec _ input
 = case parseP input of
 [] -> throw P_ParseError -- no parse
@@ -138,13 +138,14 @@ instance Typeable a => Read (Serialized a)
 other-> throw P_ParseError
 -- ambiguous parse for packet
 
--- | Packet Parser: read header with size and type, then iterate over
--- array values, reading several hex words in one row, separated by
--- tab and space. Packet size needed to avoid returning a prefix.
+-- | Packet Parser, reads the format generated by the @Read@ instance.
 -- Could also consume other formats of the array (not implemented).
 -- Returns: (data size in words, type fingerprint, array values)
 parseP :: ReadS (Int, FP, [TargetWord])
 parseP = readP_to_S $
+-- read header with size and type, then iterate over array values,
+-- reading several hex words in one row, separated by
+-- tab and space. Packet size needed to avoid returning a prefix.
 do string "Serialization Packet, size "
 sz_str <- munch1 isDigit
 let sz = read sz_str::Int
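Review note: the comment in this hunk says the packet size is needed to avoid returning a prefix. The sketch below illustrates that point with `ReadP` from base. It is a simplified stand-in, not the library's parser: the packet layout (a header followed by hex words, with no fingerprints) and the names `packetP`, `hexWord` and `prefixesP` are assumptions made for the example.

```haskell
import Data.Char (isDigit, isHexDigit)
import Data.Word (Word64)
import Numeric (readHex)
import Text.ParserCombinators.ReadP

-- Simplified packet: the header announces how many hex words follow.
-- (The real format also carries fingerprints; they are left out here.)
packetP :: ReadP (Int, [Word64])
packetP = do
  _ <- string "Serialization Packet, size "
  szStr <- munch1 isDigit
  skipSpaces
  let sz = read szStr :: Int
  -- Reading exactly 'sz' words leaves a single possible parse.
  ws <- count sz (hexWord <* skipSpaces)
  return (sz, ws)

-- One hexadecimal word, e.g. "ff" or "10".
hexWord :: ReadP Word64
hexWord = do
  digits <- munch1 isHexDigit
  case readHex digits of
    [(w, "")] -> return w
    _         -> pfail

-- Contrast: without a size, 'many' offers every prefix as a result.
prefixesP :: ReadP [Word64]
prefixesP = many (hexWord <* skipSpaces)

main :: IO ()
main = do
  print (readP_to_S packetP "Serialization Packet, size 3\nff 10 a")
  print (length (readP_to_S prefixesP "ff 10 a"))
```

The first call yields exactly one parse, `[((3,[255,16,10]),"")]`. The second reports 4 candidate parses (zero to three words), which is the ambiguity the size field removes.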
@@ -222,7 +223,7 @@ instance Typeable a => Binary (Serialized a) where
 -- fields, to be able to /read/ fingerprints
 data FP = FP Word64 Word64 deriving (Read, Show, Eq)
 
--- | comparing 'FP's
+-- | checks whether the type of the given expression matches the given Fingerprint
 matches :: Typeable a => a -> FP -> Bool
 matches x (FP c1 c2) = f1 == c1 && f2 == c2
 where (GHC.Fingerprint.Fingerprint f1 f2) = typeRepFingerprint (typeOf x)
@@ -233,11 +234,11 @@ typeRepFingerprint typeRep = ghcFP
 where TypeRep ghcFP _ _ = typeRep
 #endif
 
--- | creating an 'FP' from a GHC 'Fingerprint'
+-- | creates an 'FP' from a GHC 'Fingerprint'
 toFP :: GHC.Fingerprint.Fingerprint -> FP
 toFP (GHC.Fingerprint.Fingerprint f1 f2) = FP f1 f2
 
--- | creating a type fingerprint
+-- | returns the type fingerprint of an expression
 typeFP :: Typeable a => a -> FP
 typeFP = toFP . typeRepFingerprint . typeOf
 

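Review note: the reworded comments describe `matches`, `toFP` and `typeFP` as a runtime type check based on fingerprints. A self-contained sketch of that idea, using only base, is given below. It mirrors the definitions shown in the diff but is not the module itself: it assumes a base version in which `Data.Typeable` exports `typeRepFingerprint` (the module carries its own fallback under `#if` for older versions), and `matches` is written via `typeFP` for brevity.

```haskell
import Data.Typeable (Typeable, typeOf, typeRepFingerprint)
import Data.Word (Word64)
import GHC.Fingerprint (Fingerprint (..))

-- A fingerprint with derived Read/Show, so it can be stored as text.
data FP = FP Word64 Word64 deriving (Read, Show, Eq)

toFP :: Fingerprint -> FP
toFP (Fingerprint f1 f2) = FP f1 f2

-- Fingerprint of the type of an expression (the value is not inspected).
typeFP :: Typeable a => a -> FP
typeFP = toFP . typeRepFingerprint . typeOf

-- Does the type of the expression match a stored fingerprint?
matches :: Typeable a => a -> FP -> Bool
matches x fp = typeFP x == fp

main :: IO ()
main = do
  let stored = typeFP ([] :: [Int])      -- as if written out with the data
  print (matches [1, 2, 3 :: Int] stored) -- same type: accepted
  print (matches "text" stored)           -- different type: rejected
  print (read (show stored) == stored)    -- survives a text round trip
```

The three lines print `True`, `False` and `True`. Only the type takes part in the comparison, so an empty list and a populated list of the same type are indistinguishable to the check.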
LICENSE
@@ -1,5 +1,5 @@
 The packman serialisation library for GHC
-Copyright (c) 2014, Jost Berthold <jb.diku@gmail.com>
+Copyright (c) 2014-15, Jost Berthold <jost.berthold@gmail.com>
 All rights reserved.
 
 Redistribution and use in source and binary forms, with or without

README.md
@@ -1,18 +1,17 @@
-packman
+Packman
 =======
 
 Evaluation-orthogonal serialisation of Haskell data, as a library
 
-
-In brief, this is the packing code of Eden and GpH, ripped out of the runtime system and rewritten to make it thread-safe and return exception codes when it fails.
-
-Most of this work was already there when I presented at HIW last year, [see slides from HIW 2013](http://www.haskell.org/wikiupload/2/28/HIW2013PackingAPI.pdf) but all code was in the runtime system then.
-
-The basic idea was described earlier, in a [2010 IFL paper](http://www.diku.dk/~berthold/papers/mainIFL10-withCopyright.pdf)
-
 ### Haskell API
 
-A Haskell API around this (C-implemented) functionality provides the following API:
+This package provides Haskell data serialisation independent of evaluation,
+by accessing the Haskell heap using foreign primitive operations.
+Any Haskell data structure (with a few limitations) can be serialised and
+later deserialised in the same or a new run of the same program (that means,
+the same executable file).
+
+A Haskell API around the C-implemented core provides the following API:
 
 ```
 trySerializeWith :: a -> Int -> IO (Serialized a) -- Int is maximum buffer size to use
@@ -20,11 +19,14 @@ trySerialize :: a -> IO (Serialized a) -- uses default (maximum) buff
 deserialize :: Serialized a -> IO a
 ```
 
-Note that this serialisation is orthogonal to evaluation: the argument is serialised **in its current state of evaluation**, it might be entirely unevaluated (a thunk) or only partially evaluated (containing thunks).
+Note that this serialisation is orthogonal to evaluation: the argument is
+serialised **in its current state of evaluation**, it might be entirely
+unevaluated (a thunk) or only partially evaluated (containing thunks).
 
-The `Serialized a` type is an opaque representation of serialised Haskell data (it contains a `ByteArray`).
-`Serialized a` provides instances for `Show` and `Read` which satisfy `read . show == id`, and a `Binary` instance.
-For these instances, types are checked dynamically type-safe, therefore the `Typeable` context.
+The `Serialized a` type is an opaque representation of serialised Haskell data (it
+contains a `ByteArray`). `Serialized a` provides instances for `Show` and `Read`
+which satisfy `read . show == id`, and a `Binary` instance. For these instances,
+types are checked dynamically type-safe, therefore the `Typeable` context.
 
 ### Advantages
 
@@ -39,10 +41,29 @@ The ugly solution in the library: the API signals such conditions as exceptions.
 
 Another limitation is that serialised data **can only be used by the very same binary**. This is however common for many approaches to distributed programming using functional languages.
 
-If you find this library useful, I (Jost Berthold) would be happy to hear from you.
+If you find this library useful, I would be happy to hear from you. Patches are welcome.
 
-Acknowledgements:
 -----------------
 
-Phil Trinder suggested to separate serialisation from other functionality of the parallel runtime system in 2009.
+#### Reading material
 
+In brief, this is the packing code of Eden and GpH, ripped out of the runtime
+system and rewritten to make it thread-safe and return exception codes when
+it fails.
+
+Most of what is provided by the library was already there when I
+presented at HIW in 2013,
+[see slides from HIW 2013](http://www.haskell.org/wikiupload/2/28/HIW2013PackingAPI.pdf)
+but all code was in the runtime system then.
+
+The basic idea was described earlier, in a
+[2010 IFL paper](http://www.diku.dk/~berthold/papers/mainIFL10-withCopyright.pdf),
+including a study of possible applications, especially checkpointing
+and memoisation.
+
+#### Acknowledgements
+
+The idea to separate serialisation from other functionality of the parallel runtime system was suggested by Phil Trinder in 2009.
+Hans-Wolfgang Loidl introduced me to the GUM packing code, worked with me on the parallel runtime system for a long time, and always provided valuable feedback.
+Kevin Hammond is the original author of the packing code used by packman and the Eden RTS. It has been rewritten a few times and improved by a number of people (including Phil Trinder and Hans-Wolfgang Loidl).
+Michael Budde and Åsbjørn Jøkladal assembled the first cabalised library version as a student project in our course "Topics in programming languages" 2014 (where the topic was parallel functional programming).

@@ -1,6 +1,6 @@
 module Main where
 
-import GHC.Packing -- Data.Serialize.Packman
+import GHC.Packing
 import Control.Exception
 
 data Foo = A | B | C | D deriving Show

@@ -1,6 +1,3 @@
-{-
-Some tests to
--}
 -- module TestSerialisation(tests)
 -- where
 

@@ -11,7 +11,7 @@ category: Serialization, Data, GHC
 license: BSD3
 license-file: LICENSE
 author: Michael Budde, Ásbjørn V. Jøkladal, Jost Berthold
-maintainer: jb.diku@gmail.com
+maintainer: jost.berthold@gmail.com
 build-type: Simple
 cabal-version: >= 1.20
 tested-with: GHC==7.8.2, GHC==7.8.3, GHC==7.10.2