2021-04-01 23:17:50 +03:00
|
|
|
.. highlight:: haskell
|
|
|
|
|
|
|
|
Getting Started
|
|
|
|
===============
|
|
|
|
|
|
|
|
In this section, we'll take a look at using Rel8 to work with a small database
|
|
|
|
for Haskell packages. We'll take a look at idiomatic usage of Rel8, mapping
|
|
|
|
tables to Haskell, and then look at writing some simple queries.
|
|
|
|
|
|
|
|
Before we get started, we'll be using the following language extensions and
|
|
|
|
imports throughout this guide::
|
|
|
|
|
|
|
|
{-# language -XBlockArguments #-}
|
|
|
|
{-# language -XDeriveAnyClass #-}
|
|
|
|
{-# language -XDeriveGeneric #-}
|
|
|
|
{-# language -XDerivingStrategies #-}
|
|
|
|
{-# language -XDerivingVia #-}
|
|
|
|
{-# language -XDuplicateRecordFields #-}
|
|
|
|
{-# language -XGeneralizedNewtypeDeriving #-}
|
|
|
|
{-# language -XOverloadedStrings #-}
|
|
|
|
{-# language -XStandaloneDeriving #-}
|
|
|
|
{-# language -XTypeApplications #-}
|
|
|
|
{-# language -XTypeFamilies #-}
|
|
|
|
|
|
|
|
import Prelude
|
|
|
|
|
|
|
|
The Example Schema
|
|
|
|
------------------
|
|
|
|
|
|
|
|
Before we start writing any Haskell, let's take a look at the schema we'll work
|
|
|
|
with. The `author` table has three columns:
|
|
|
|
|
|
|
|
+-----------------+-------------+----------+
|
|
|
|
| Column Name | Type | Nullable |
|
|
|
|
+=================+=============+==========+
|
|
|
|
| ``author_id`` | ``integer`` | not null |
|
|
|
|
+-----------------+-------------+----------+
|
|
|
|
| ``name`` | ``text`` | not null |
|
|
|
|
+-----------------+-------------+----------+
|
|
|
|
| ``url`` | ``text`` | |
|
|
|
|
+-----------------+-------------+----------+
|
|
|
|
|
|
|
|
and the `project` table has two:
|
|
|
|
|
|
|
|
+-----------------+-------------+----------+
|
|
|
|
| Column Name | Type | Nullable |
|
|
|
|
+=================+=============+==========+
|
|
|
|
| ``author_id`` | ``integer`` | not null |
|
|
|
|
+-----------------+-------------+----------+
|
|
|
|
| ``name`` | ``text`` | not null |
|
|
|
|
+-----------------+-------------+----------+
|
|
|
|
|
|
|
|
A ``project`` always has an ``author``, but not all ``author``\s have projects.
|
|
|
|
Each ``author`` has a name and (maybe) an associated website, and each project
|
|
|
|
has a name.
|
|
|
|
|
|
|
|
Mapping Schemas to Haskell
|
|
|
|
--------------------------
|
|
|
|
|
|
|
|
Now that we've seen our schema, we can begin writing a mapping in Rel8. The
|
|
|
|
idiomatic way to map a table is to use a record that is parameterised what Rel8
|
|
|
|
calls an *interpretation functor*, and to define each field with ``Column``.
|
|
|
|
For this type to be usable with Rel8 we need it to be an instance of
|
|
|
|
``Rel8able``, which can be derived with a combination of ``DeriveAnyClass`` and
|
|
|
|
``DeriveGeneric`` language extensions.
|
|
|
|
|
|
|
|
Following these steps for ``author``, we have::
|
|
|
|
|
|
|
|
data Author f = Author
|
|
|
|
{ authorId :: Column f Int64
|
|
|
|
, name :: Column f Text
|
|
|
|
, url :: Column f (Maybe Text)
|
|
|
|
}
|
|
|
|
deriving stock (Generic)
|
|
|
|
deriving anyclass (Rel8able)
|
|
|
|
|
|
|
|
This is a perfectly reasonable definition, but cautious readers might notice a
|
|
|
|
problem - in particular, with the type of the ``authorId`` field. While
|
|
|
|
``Int64`` is correct, it's not the best type. If we had other identifier types
|
|
|
|
in our project, it would be too easy to accidentally mix them up and create
|
|
|
|
nonsensical joins. As Haskell programmers, we often solve this problem by
|
|
|
|
creating ``newtype`` wrappers, and we can also use this technique with Rel8::
|
|
|
|
|
|
|
|
newtype AuthorId = AuthorId { toInt64 :: Int64 }
|
|
|
|
deriving newtype (DBEq, DBType, Eq, Show)
|
|
|
|
|
|
|
|
Now we can write our final schema mapping. First, the ``author`` table::
|
|
|
|
|
|
|
|
data Author f = Author
|
|
|
|
{ authorId :: Column f AuthorId
|
|
|
|
, authorName :: Column f Text
|
|
|
|
, authorUrl :: Column f (Maybe Text)
|
|
|
|
}
|
|
|
|
deriving stock (Generic)
|
|
|
|
deriving anyclass (Rel8able)
|
|
|
|
|
|
|
|
And similarly, the ``project`` table::
|
|
|
|
|
|
|
|
data Project f = Project
|
|
|
|
{ projectAuthorId :: Column f AuthorId
|
|
|
|
, projectName :: Column f Text
|
|
|
|
}
|
|
|
|
deriving stock (Generic)
|
|
|
|
deriving anyclass (Rel8able
|
|
|
|
|
|
|
|
To show query results in this documentation, we'll also need ``Show`` instances:
|
|
|
|
Unfortunately these definitions look a bit scary, but they are essentially just
|
|
|
|
``deriving (Show)``::
|
|
|
|
|
|
|
|
deriving stock instance f ~ Identity => Show (Author f)
|
|
|
|
deriving stock instance f ~ Identity => Show (Project f)
|
|
|
|
|
|
|
|
These data types describe the structural mapping of the tables, but we also
|
|
|
|
need to specify a ``TableSchema`` for each table. A ``TableSchema`` contains
|
|
|
|
the name of the table and the name of all columns in the table, which will
|
|
|
|
ultimately allow us to ``SELECT`` and ``INSERT`` rows for these tables.
|
|
|
|
|
|
|
|
To define a ``TableSchema``, we just need to fill construct appropriate
|
|
|
|
``TableSchema`` values. When it comes to the ``tableColumns`` field, we
|
|
|
|
construct values of our data types above, and set each field to the name of the
|
|
|
|
column that it maps to.
|
|
|
|
|
|
|
|
First, ``authorSchema`` describes the column names of the ``author`` table when
|
|
|
|
associated with the ``Author`` type::
|
|
|
|
|
|
|
|
authorSchema :: TableSchema (Author Name)
|
|
|
|
authorSchema = TableSchema
|
|
|
|
{ tableName = "author"
|
|
|
|
, tableSchema = Nothing
|
|
|
|
, tableColumns = Author
|
|
|
|
{ authorId = "author_id"
|
|
|
|
, authorName = "name"
|
|
|
|
, authorUrl = "url"
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
And likewise for ``project`` and ``Project``::
|
|
|
|
|
|
|
|
projectSchema :: TableSchema (Project Name)
|
|
|
|
projectSchema = TableSchema
|
|
|
|
{ tableName = "project"
|
|
|
|
, tableSchema = Nothing
|
|
|
|
, tableColumns = Project
|
|
|
|
{ projectAuthorId = "author_id"
|
|
|
|
, projectName = "name"
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
.. note::
|
|
|
|
|
|
|
|
You might be wondering why this information isn't in the definitions of
|
|
|
|
``Author`` and ``Project`` above. Rel8 decouples ``TableSchema`` from the data
|
|
|
|
types themselves, as not all tables you define will necessarily have a schema.
|
|
|
|
For example, Rel8 allows you to define helper types to simplify the types of
|
|
|
|
queries - these tables only exist at query time, but there is no corresponding
|
|
|
|
base table. We'll see more on this idea later!
|
|
|
|
|
|
|
|
With these table definitions, we can now start writing some queries!
|
|
|
|
|
|
|
|
Writing Queries
|
|
|
|
---------------
|
|
|
|
|
|
|
|
Simple Queries
|
|
|
|
~~~~~~~~~~~~~~
|
|
|
|
|
|
|
|
First, we'll take a look at ``SELECT`` statements - usually the bulk of most
|
|
|
|
database heavy applications.
|
|
|
|
|
|
|
|
In Rel8, ``SELECT`` statements are built using the ``Query`` monad. You can
|
|
|
|
think of this monad like the ordinary ``[]`` (List) monad - but this isn't
|
|
|
|
required knowledge.
|
|
|
|
|
|
|
|
To start, we'll look at one of the simplest queries possible - a basic ``SELECT
|
|
|
|
* FROM`` statement. To select all rows from a table, we use ``each``, and
|
|
|
|
supply a ``TableSchema``. So to select all ``project`` rows, we can write::
|
|
|
|
|
|
|
|
>>> :t each projectSchema
|
|
|
|
each projectSchema :: Query (Project Expr)
|
|
|
|
|
|
|
|
Notice that ``each`` gives us a ``Query`` that yields ``Project Expr`` rows. To
|
|
|
|
see what this means, let's have a look at a single field of a ``Project Expr``::
|
|
|
|
|
|
|
|
>>> let aProjectExpr = undefined :: Project Expr
|
|
|
|
>>> :t projectAuthorId aProjectExpr
|
|
|
|
projectAuthorId aProjectExpr :: Expr AuthorId
|
|
|
|
|
|
|
|
Recall we defined ``projectAuthorId`` as ``Column f AuthorId``. Now we have
|
|
|
|
``f`` is ``Expr``, and ``Column Expr AuthorId`` reduces to ``Expr AuthorId``.
|
|
|
|
We'll see more about ``Expr`` soon, but you can think of ``Expr a`` as "SQL
|
|
|
|
expressions of type ``a``\".
|
|
|
|
|
|
|
|
To execute this ``Query``, we pass it to ``select``::
|
|
|
|
|
|
|
|
>>> :t select c (each projectSchema)
|
|
|
|
select c (each projectSchema) :: MonadIO m => m [Project Identity]
|
|
|
|
|
|
|
|
When we ``select`` things containing ``Expr``s, Rel8 builds a new response
|
|
|
|
table with the ``Identity`` interpretation. This means you'll get back plain
|
|
|
|
Haskell values. Studying ``projectAuthorId`` again, we have::
|
|
|
|
|
|
|
|
>>> let aProjectIdentity = undefined :: Project Identity
|
|
|
|
>>> :t projectAuthorId aProjectIdentity
|
|
|
|
projectAuthorId aProjectIdentity :: AuthorId
|
|
|
|
|
|
|
|
Here ``Column Identity AuthorId`` reduces to just ``AuthorId``, with no
|
|
|
|
wrappping type at all.
|
|
|
|
|
|
|
|
Putting this all together, we can run our first query::
|
|
|
|
|
|
|
|
>>> select c (each projectSchema) >>= mapM_ print
|
|
|
|
Project {projectAuthorId = 1, projectName = "rel8"}
|
|
|
|
Project {projectAuthorId = 2, projectName = "aeson"}
|
|
|
|
Project {projectAuthorId = 2, projectName = "text"}
|
|
|
|
|
|
|
|
We now know that ``each`` is the equivalent of a ``SELECT *`` query, but
|
|
|
|
sometimes we're only interested in a subset of the columns of a table. To
|
|
|
|
restrict the returned columns, we can specify a projection by using ``Query``\s
|
|
|
|
``Functor`` instance::
|
|
|
|
|
|
|
|
>>> select c $ projectName <$> each projectSchema
|
|
|
|
["rel8","aeson","text"]
|
|
|
|
|
|
|
|
Joins
|
|
|
|
~~~~~
|
|
|
|
|
|
|
|
Another common operation in relational databases is to take the ``JOIN`` of
|
|
|
|
multiple tables. Rel8 doesn't have a specific join operation, but we can
|
|
|
|
recover the functionality of a join by selecting all rows of two tables, and
|
|
|
|
then using ``where_`` to filter them.
|
|
|
|
|
|
|
|
To see how this works, first let's look at taking the product of two tables.
|
|
|
|
We can do this by simply calling ``each`` twice, and then returning a tuple of
|
|
|
|
their results::
|
|
|
|
|
|
|
|
>>> :{
|
|
|
|
mapM_ print =<< select c do
|
|
|
|
author <- each authorSchema
|
|
|
|
project <- each projectSchema
|
|
|
|
return (projectName project, authorName author)
|
|
|
|
:}
|
|
|
|
("rel8","Ollie")
|
|
|
|
("rel8","Bryan O'Sullivan")
|
|
|
|
("rel8","Emily Pillmore")
|
|
|
|
("aeson","Ollie")
|
|
|
|
("aeson","Bryan O'Sullivan")
|
|
|
|
("aeson","Emily Pillmore")
|
|
|
|
("text","Ollie")
|
|
|
|
("text","Bryan O'Sullivan")
|
|
|
|
("text","Emily Pillmore")
|
|
|
|
|
|
|
|
This isn't quite right, though, as we have ended up pairing up the wrong
|
|
|
|
projects and authors. To fix this, we can use ``where_`` to restrict the
|
|
|
|
returned rows. We could write::
|
|
|
|
|
|
|
|
select c $ do
|
|
|
|
author <- each authorSchema
|
|
|
|
project <- each projectSchema
|
|
|
|
where_ $ projectAuthorId project ==. authorId author
|
|
|
|
return (project, author)
|
|
|
|
|
|
|
|
but doing this every time you need a join can obscure the meaning of the
|
|
|
|
query you're writing. A good practice is to introduce specialised functions
|
|
|
|
for the particular joins in your database. In our case, this would be::
|
|
|
|
|
|
|
|
projectsForAuthor :: Author Expr -> Query (Project Expr)
|
|
|
|
projectsForAuthor a = each projectSchema >>= filter \p ->
|
|
|
|
projectAuthorId p ==. authorId a
|
|
|
|
|
|
|
|
Our final query is then::
|
|
|
|
|
|
|
|
>>> :{
|
|
|
|
mapM_ print =<< select c do
|
|
|
|
author <- each authorSchema
|
|
|
|
project <- projectsForAuthor author
|
|
|
|
return (projectName project, authorName author)
|
|
|
|
:}
|
|
|
|
("rel8","Ollie")
|
|
|
|
("aeson","Bryan O'Sullivan")
|
|
|
|
("text","Bryan O'Sullivan")
|
|
|
|
|
|
|
|
Left Joins
|
|
|
|
~~~~~~~~~~
|
|
|
|
|
2021-04-02 13:42:27 +03:00
|
|
|
Rel8 is also capable of performing ``LEFT JOIN``\s. To perform ``LEFT JOIN``\s,
|
2021-04-01 23:17:50 +03:00
|
|
|
we follow the same approach as before, but use the ``optional`` query
|
|
|
|
transformer to allow for the possibility of the join to fail.
|
|
|
|
|
|
|
|
In our test database, we can see that there's another author who we haven't
|
|
|
|
seen yet::
|
|
|
|
|
|
|
|
>>> select c $ authorName <$> each authorSchema
|
|
|
|
["Ollie","Bryan O'Sullivan","Emily Pillmore"]
|
|
|
|
|
|
|
|
Emily wasn't returned in our earlier query because - in our database - she
|
|
|
|
doesn't have any registered projects. We can account for this partiality in our
|
|
|
|
original query by wrapping the ``projectsForAuthor`` call with ``optional``::
|
|
|
|
|
|
|
|
>>> :{
|
|
|
|
mapM_ print =<< select c do
|
|
|
|
author <- each authorSchema
|
|
|
|
mproject <- optional $ projectsForAuthor author
|
|
|
|
return (authorName author, projectName <$> mproject)
|
|
|
|
:}
|
|
|
|
("Ollie",Just "rel8")
|
|
|
|
("Bryan O'Sullivan",Just "aeson")
|
|
|
|
("Bryan O'Sullivan",Just "text")
|
|
|
|
("Emily Pillmore",Nothing)
|
|
|
|
|
|
|
|
|
|
|
|
Aggregation
|
|
|
|
~~~~~~~~~~~
|
|
|
|
|
|
|
|
Aggregations are operations like ``sum`` and ``count`` - operations that reduce
|
|
|
|
multiple rows to single values. To perform aggregations in Rel8, we can use the
|
|
|
|
``aggregate`` function, which takes a ``Query`` of aggregated expressions, runs
|
|
|
|
the aggregation, and returns aggregated rows.
|
|
|
|
|
|
|
|
To start, let's look at a simple aggregation that tells us how many projects
|
|
|
|
exist::
|
|
|
|
|
|
|
|
>>> error "TODO"
|
|
|
|
|
|
|
|
Rel8 is also capable of aggregating multiple rows into a single row by
|
|
|
|
concatenating all rows as a list. This aggregation allows us to break free of
|
|
|
|
the row-orientated nature of SQL and write queries that return tree-like
|
|
|
|
structures. Earlier we saw an example of returning authors with their projects,
|
|
|
|
but the query didn't do a great job of describing the one-to-many relationship
|
|
|
|
between authors and their projects.
|
|
|
|
|
|
|
|
Let's look again at a query that returns authors and their projects, and
|
|
|
|
focus on the /type/ of that query::
|
|
|
|
|
|
|
|
projectsForAuthor a = each projectSchema >>= filter \p ->
|
|
|
|
projectAuthorId p ==. authorId a
|
|
|
|
|
|
|
|
let authorsAndProjects = do
|
|
|
|
author <- each authorSchema
|
|
|
|
project <- projectsForAuthor author
|
|
|
|
return (author, project)
|
|
|
|
where
|
|
|
|
|
|
|
|
>>> :t select c authorsAndProjects
|
|
|
|
select c authorsAndProjects
|
|
|
|
:: MonadIO m => m [(Author Identity, Project Identity)]
|
|
|
|
|
|
|
|
|
|
|
|
Our query gives us a single list of pairs of authors and projects. However,
|
|
|
|
with our domain knowledge of the schema, this isn't a great type - what we'd
|
|
|
|
rather have is a list of pairs of authors and /lists/ of projects. That is,
|
|
|
|
what we'd like is::
|
|
|
|
|
|
|
|
[(Author Identity, [Project Identity])]
|
|
|
|
|
|
|
|
This would be a much better type! Rel8 can produce a query with this type by
|
|
|
|
simply wrapping the call to ``projectsForAuthor`` with either ``some`` or
|
|
|
|
``many``. Here we'll use ``many``, which allows for the possibility of an
|
|
|
|
author to have no projects::
|
|
|
|
|
|
|
|
>>> :{
|
|
|
|
mapM_ print =<< select c do
|
|
|
|
author <- each authorSchema
|
|
|
|
projectNames <- many $ projectName <$> projectsForAuthor author
|
|
|
|
return (authorName author, projectNames)
|
|
|
|
:}
|
|
|
|
("Ollie",["rel8"])
|
|
|
|
("Bryan O'Sullivan",["aeson","text"])
|
|
|
|
("Emily Pillmore",[])
|