.. highlight:: haskell

Getting Started
===============

In this article, we'll take a tour through the basics of Rel8. We'll learn how
to define base tables, write simple queries, and execute these queries against a
real database.

Required language extensions and imports
----------------------------------------

::

  {-# LANGUAGE Arrows, DataKinds, DeriveGeneric, FlexibleInstances,
               MultiParamTypeClasses, OverloadedStrings #-}

  import Control.Applicative
  import Control.Arrow
  import qualified Opaleye as O  -- the examples below refer to O.Query
  import Rel8

To use Rel8, you will need a few language extensions:

* ``Arrows`` is necessary to use ``proc`` notation, which plays the role for
  arrows that ``do`` notation plays for monads. As with Opaleye, Rel8 uses
  arrows to guarantee queries are valid.

* ``DataKinds`` is used to promote values to the type level when defining
  table/column metadata.

* ``DeriveGeneric`` is used to derive functions from schema
  information.

The other extensions can be considered a "necessary evil" to provide the type
system extensions needed by Rel8.

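As a rough illustration of ``proc`` notation on its own, independent of Rel8,
the following sketch combines two arrows that share one input. The name
``addA`` is made up for this example, and plain functions stand in for
queries::

  {-# LANGUAGE Arrows #-}

  import Control.Arrow

  -- Feed the same input to two arrows and add their results.
  -- 'addA' is a hypothetical helper, not part of Rel8 or Opaleye.
  addA :: Arrow a => a b Int -> a b Int -> a b Int
  addA f g = proc x -> do
    y <- f -< x
    z <- g -< x
    returnA -< y + z

Because plain functions are arrows, ``addA (+ 1) (* 2) 3`` evaluates to ``10``.
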
Defining base tables
--------------------

To query a database of existing tables, we need to let Rel8 know
about these tables, and the schema for each table. This is done by defining a
Haskell *record* for each table in the database. Each field of these records
should have a type of the form ``C f name hasDefault t``. Let's see how that
looks with an example table::

  data Part f = Part
    { partId     :: C f "PID" 'HasDefault Int
    , partName   :: C f "PName" 'NoDefault String
    , partColor  :: C f "Color" 'NoDefault String
    , partWeight :: C f "Weight" 'NoDefault Double
    , partCity   :: C f "City" 'NoDefault String
    } deriving (Generic)

  instance BaseTable Part where tableName = "part"
  instance Table (Part Expr) (Part QueryResult)

The ``Part`` table has 5 columns, each defined with the ``C f ..`` pattern. For
each column, we are specifying:

1. The column name.
2. Whether this column has a default value when inserting new rows. In
   this case ``partId`` does, as this is an auto-incremented primary key managed
   by the database.
3. The type of the column.

After defining the table, we finally need to make instances of ``BaseTable`` and
``Table`` so Rel8 can query this table. Because we used ``deriving (Generic)``,
we only need to write ``instance BaseTable Part where tableName = "part"``. The
``Table`` instance demonstrates that a ``Part Expr`` value can be selected from
the database as ``Part QueryResult``.

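One way to picture how a single record can describe both expressions and query
results is a type family that picks the field type from ``f``. The sketch below
is a simplified model and not Rel8's actual definition: it defines its own toy
``C``, ``Expr`` and ``QueryResult``, uses a made-up ``PartModel`` record, and
ignores the column name and default flag entirely::

  {-# LANGUAGE DataKinds, KindSignatures, TypeFamilies #-}

  import Data.Kind (Type)
  import GHC.TypeLits (Symbol)

  data Default = HasDefault | NoDefault

  -- Toy stand-ins (assumptions for this sketch): an expression is just
  -- SQL text, and 'QueryResult' is only ever used as a type-level tag.
  newtype Expr a = Expr String
  data QueryResult a

  -- Choose the field type from the interpretation 'f'.
  type family C (f :: Type -> Type) (name :: Symbol)
                (def :: Default) (a :: Type) :: Type where
    C Expr        name def a = Expr a
    C QueryResult name def a = a

  -- Hypothetical record, shaped like the 'Part' table above.
  data PartModel f = PartModel
    { pmId   :: C f "PID" 'HasDefault Int
    , pmCity :: C f "City" 'NoDefault String
    }

  -- With f = QueryResult the fields are ordinary Haskell values.
  examplePart :: PartModel QueryResult
  examplePart = PartModel { pmId = 1, pmCity = "London" }

With ``f = Expr`` the same record would instead hold ``Expr Int`` and
``Expr String`` fields.
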
Querying tables
---------------

With tables defined, we are now ready to write some queries. All ``BaseTable``\ s
give rise to a query - the query of all rows in that table::

  allParts :: O.Query (Part Expr)
  allParts = queryTable

Notice the type of ``allParts`` specifies that we're working with ``Part Expr``.
This means that the contents of the ``Part`` record will contain expressions -
one for each column in the table. As ``O.Query`` is a ``Functor``, we can derive
a new query for all part cities in the database::

  allPartCities :: O.Query (Expr String)
  allPartCities = partCity <$> allParts

Now we have a query containing one column - expressions of type ``String``.

``WHERE`` clauses
-----------------

Usually when we are querying a database, we are querying for subsets of
information. In SQL, we apply predicates using ``WHERE`` - and Rel8 supports
this too, in two forms.

We use ``filterQuery`` as we would use ``filter``::

  londonParts :: O.Query (Part Expr)
  londonParts = filterQuery (\p -> partCity p ==. "London") allParts

``filterQuery`` takes a function from rows in a query to a predicate. In this
case we can use ``==.`` to compare two expressions for equality. On the left,
``partCity p :: Expr String``, and on the right ``"London" :: Expr String``
(the literal string ``London``).

Alternatively, we can use ``where_`` with arrow notation, which is like
using ``guard`` with ``MonadPlus``::

  heavyParts :: O.Query (Part Expr)
  heavyParts = proc _ -> do
    part <- queryTable -< ()
    where_ -< partWeight part >. 5
    returnA -< part

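To make the ``guard`` analogy concrete, here is a sketch that uses ordinary
lists in place of queries. Nothing in it is Rel8 API: ``PartRow`` and
``heavyRows`` are names invented for this example::

  import Control.Monad (guard)

  -- A plain-Haskell stand-in for a row of the part table (hypothetical).
  data PartRow = PartRow
    { rowName   :: String
    , rowWeight :: Double
    } deriving (Eq, Show)

  -- The list monad plays the role of the query: 'guard' discards
  -- rows that fail the predicate, much as 'where_' does above.
  heavyRows :: [PartRow] -> [PartRow]
  heavyRows rows = do
    row <- rows
    guard (rowWeight row > 5)
    return row

Applied to a nut weighing 12 and a screw weighing 3, ``heavyRows`` keeps only
the nut.
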
Joining Queries
---------------

Rel8 supports joining multiple queries into one, in a few different ways.

Products and Inner Joins
^^^^^^^^^^^^^^^^^^^^^^^^

We can take the product of two queries - each row of the first query paired with
each row of the second query - by sequencing queries inside an ``O.Query``. Let's
introduce another table::

  data Supplier f = Supplier
    { supplierId     :: C f "SID" 'HasDefault Int
    , supplierName   :: C f "SName" 'NoDefault String
    , supplierStatus :: C f "Status" 'NoDefault Int
    , supplierCity   :: C f "City" 'NoDefault String
    } deriving (Generic)

  instance BaseTable Supplier where tableName = "supplier"
  instance Table (Supplier Expr) (Supplier QueryResult)

We can take the product of all parts paired against all suppliers by simply
selecting from both tables and returning a tuple::

  allPartsAndSuppliers :: O.Query (Part Expr, Supplier Expr)
  allPartsAndSuppliers = proc _ -> do
    part <- queryTable -< ()
    supplier <- queryTable -< ()
    returnA -< (part, supplier)

We could write this a little more succinctly using the ``Applicative``
instance for ``O.Query``, as ``<*>`` corresponds to products::

  allPartsAndSuppliers2 :: O.Query (Part Expr, Supplier Expr)
  allPartsAndSuppliers2 = liftA2 (,) queryTable queryTable

In both queries, we've used ``queryTable`` to select the necessary rows.
``queryTable`` is overloaded: the type of rows being selected determines
which table it queries.

We can combine products with the techniques we've seen to produce
the inner join of two tables. For example, here is a query to pair up each part
with all suppliers in the same city::

  partsAndSuppliers :: O.Query (Part Expr, Supplier Expr)
  partsAndSuppliers =
    filterQuery
      (\(part, supplier) -> partCity part ==. supplierCity supplier)
      allPartsAndSuppliers

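The same idea can be sketched with ordinary lists, where ``liftA2 (,)`` builds
the product and ``filter`` keeps the matching pairs. This models only the
semantics, not how Rel8 generates SQL, and the tuple layout and the name
``innerJoinByCity`` are assumptions made for the example::

  import Control.Applicative (liftA2)

  -- (name, city) pairs stand in for rows of each table (hypothetical).
  type PartRow     = (String, String)
  type SupplierRow = (String, String)

  -- Product of both lists, then keep pairs whose cities agree.
  innerJoinByCity :: [PartRow] -> [SupplierRow] -> [(PartRow, SupplierRow)]
  innerJoinByCity parts suppliers =
    filter (\(part, supplier) -> snd part == snd supplier)
           (liftA2 (,) parts suppliers)

A part in a city with no supplier produces no pair at all, which is exactly the
limitation the next section addresses.
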
Left Joins
^^^^^^^^^^

The previous query gave us parts with *at least one* supplier in the same city.
If a part has no suppliers in the same city, it will be omitted from the
results. But what if we needed this information? In SQL we can capture this with
a ``LEFT JOIN``, and Rel8 supports this.

Left joins can be introduced with ``leftJoin``, which takes two queries, or
using arrow notation with ``leftJoinA``. Let's look at the latter, as it is
often more concise::

  partsAndSuppliersLJ :: O.Query (Part Expr, MaybeTable (Supplier Expr))
  partsAndSuppliersLJ = proc _ -> do
    part <- queryTable -< ()
    maybeSupplier
      <- leftJoinA queryTable
      -< \supplier -> partCity part ==. supplierCity supplier
    returnA -< (part, maybeSupplier)

This is a little different to anything we've seen so far, so let's break it
down. ``leftJoinA`` takes as its first argument the query to join in. In this
case we use ``queryTable`` to select all supplier rows. ``LEFT JOIN``\ s also
require a predicate, and we supply this as *input* to ``leftJoinA``. The input
is itself a function, a function from rows in the to-be-joined table to
booleans. Notice that in this predicate, we are free to refer to tables and
columns already in the query (as ``partCity part`` is not part of the supplier
table).

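As a sketch of what a left join computes, the following models both tables as
plain lists and uses ``Maybe`` where Rel8 uses ``MaybeTable``. The row types
and the name ``leftJoinByCity`` are hypothetical, and the code says nothing
about how ``leftJoinA`` is implemented::

  -- (name, city) pairs stand in for rows of each table (hypothetical).
  type PartRow     = (String, String)
  type SupplierRow = (String, String)

  -- Every part appears at least once. A part with no supplier in
  -- its city is paired with Nothing, mirroring an all-NULL row.
  leftJoinByCity :: [PartRow] -> [SupplierRow] -> [(PartRow, Maybe SupplierRow)]
  leftJoinByCity parts suppliers = concatMap joinPart parts
    where
      joinPart part =
        case filter (\supplier -> snd supplier == snd part) suppliers of
          []      -> [(part, Nothing)]
          matches -> [(part, Just supplier) | supplier <- matches]
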
Left joins themselves can be filtered, as they are just another query. However,
the results of a left join are wrapped in ``MaybeTable``, which indicates that
*all* of the columns in this table might be ``null``, if the join failed to
match any rows. We can use this information with our ``partsAndSuppliersLJ``
query to find parts where there are no suppliers in the same city::

  partsWithoutSuppliersInCity :: O.Query (Part Expr)
  partsWithoutSuppliersInCity = proc _ -> do
    (part, maybeSupplier) <- partsAndSuppliersLJ -< ()
    where_ -< isNull (supplierId $? maybeSupplier)
    returnA -< part

.. note::

   This type of query is what is known as an *antijoin*. A more efficient way to
   write the above is by using the ``notExists`` function. For more information,
   see :ref:`antijoins` in :doc:`concepts`.

We are filtering our query for suppliers where the id is null. Ordinarily this
would be a type error - we declared that ``supplierId`` contains ``Int``, rather
than ``Maybe Int``. However, because suppliers come from a left join, when we
project out from ``MaybeTable`` *all* columns become nullable. It may help to
think of ``($?)`` as having the type::

  ($?) :: (a -> Expr b) -> MaybeTable a -> Expr (Maybe b)

though in Rel8 we're a little bit more general.

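A list-based sketch of the same antijoin, assuming the left join has already
produced ``Maybe`` values for the supplier side. Projecting a field out of a
``Maybe`` row with ``fmap`` plays the role of ``($?)`` here; the row types and
the name ``partsWithoutSupplier`` are hypothetical::

  import Data.Maybe (isNothing)

  -- Stand-ins for rows: parts are (name, city), suppliers are (id, city).
  type PartRow     = (String, String)
  type SupplierRow = (Int, String)

  -- Keep the parts whose joined supplier id is missing. 'fmap fst'
  -- turns a Maybe row into a Maybe column, so the id becomes nullable.
  partsWithoutSupplier :: [(PartRow, Maybe SupplierRow)] -> [PartRow]
  partsWithoutSupplier joined =
    [ part
    | (part, maybeSupplier) <- joined
    , isNothing (fmap fst maybeSupplier)
    ]
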
Aggregation
-----------

To aggregate a series of rows, use the ``aggregate`` query transform.
``aggregate`` takes a ``Query`` that returns any ``AggregateTable`` as a result.
``AggregateTable``\ s are like ``Table``\ s, except that all expressions describe a
way to aggregate data. While tuples are instances of ``AggregateTable``, it's
recommended to introduce new data types to represent aggregations for clarity.

As an example of aggregation, let's start with a table modelling all users in
our application::

  import Data.Int (Int64)
  import Data.Text (Text)
  import Data.Time (UTCTime)

  data User f = User
    { userId           :: C f "id" 'HasDefault Int64
    , userLastLoggedIn :: C f "last_logged_in_at" 'NoDefault UTCTime
    , userType         :: C f "user_type" 'NoDefault Text
    } deriving (Generic)

  instance Table (User Expr) (User QueryResult)
  instance BaseTable User where tableName = "users"

We would like to aggregate over this table, grouped by user type, learning how
many users we have and the latest login time in that group. First, let's
introduce a record to be able to refer to this data::

  data UserInfo f = UserInfo
    { userCount   :: Anon f Int64
    , latestLogin :: Anon f UTCTime
    , uType       :: Anon f Text
    } deriving (Generic)

  instance AggregateTable (UserInfo Aggregate) (UserInfo Expr)
  instance Table (UserInfo Expr) (UserInfo QueryResult)

This record is defined in a similar pattern to tables we've seen before,
but this time we're using the ``Anon`` decorator, rather than ``C``. ``Anon``
can be used for tables that aren't base tables, and means we don't have to
provide metadata about the column name and whether it has a default
value. In this case, ``UserInfo`` doesn't model a base table, it models a
derived table.

Also, notice that we declared a new type class instance that we haven't seen yet.
``UserInfo`` will be used with ``Aggregate`` expressions, and the
``AggregateTable`` instance states we can aggregate the ``UserInfo`` data type.

With this, aggregation can be written as a concise query::

  userInfo :: O.Query (UserInfo Expr)
  userInfo = aggregate $ proc _ -> do
    user <- queryTable -< ()
    returnA -< UserInfo { userCount = count (userId user)
                        , latestLogin = max (userLastLoggedIn user)
                        , uType = groupBy (userType user)
                        }

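To see what this aggregation computes, here is a sketch over plain lists using
``Data.Map`` from the ``containers`` package. It is a model of the result only:
the row type, the use of a plain number for the login time, and the name
``userInfoByType`` are all assumptions for this example::

  import qualified Data.Map.Strict as Map

  -- (user type, last login as a plain number) stands in for a user row.
  type UserRow = (String, Int)

  -- For each user type, count the users and keep the latest login.
  userInfoByType :: [UserRow] -> Map.Map String (Int, Int)
  userInfoByType rows =
    Map.fromListWith combine
      [ (userType, (1, lastLogin)) | (userType, lastLogin) <- rows ]
    where
      combine (countA, loginA) (countB, loginB) =
        (countA + countB, max loginA loginB)

The map key corresponds to ``groupBy``, the first component of each value to
``count``, and the second to the maximum login time.
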
Running Queries
---------------

So far we've written various queries, but we haven't actually seen how to
perform any IO with them. Rel8 gives you entry points into the main ways of
interacting with a relational database - ``DELETE``, ``INSERT``, ``SELECT`` and
``UPDATE``. ``SELECT`` is arguably the most common type of query, so we'll begin
with that.

You can run any query that returns results using the ``select`` function from
``Rel8.IO``. ``select`` needs to be given a ``QueryRunner``, which is a type of
function for actually performing the IO. There are two default query runners,
``stream`` and ``streamCursor``. It's beyond the scope of this tutorial to
discuss the difference; curious readers are encouraged to check the API
documentation. ``stream`` is often enough, so let's look at a program that
queries the ``part`` table from earlier.

Select
^^^^^^

::

  import Database.PostgreSQL.Simple
  import Control.Monad.Trans.Resource (runResourceT)
  import Rel8.IO
  import qualified Streaming.Prelude as Stream

  selectAllParts :: IO [Part QueryResult]
  selectAllParts = do
    databaseConnection <- connect defaultConnectInfo
    runResourceT . Stream.toList_ $
      select (stream databaseConnection) allParts

We use ``select`` with a ``stream`` ``QueryRunner`` built from our
``databaseConnection``. This returns a ``Stream`` of results - in this case we
immediately flatten that stream into a concrete list with ``toList_``. Finally,
we need to deal with resource handling on that query, which can be done with
``runResourceT``.

Data Modification
^^^^^^^^^^^^^^^^^

Data modification queries are queries that use ``DELETE``, ``INSERT`` or
``UPDATE``, and Rel8 gives two interfaces to these queries - one that
runs the query, and another that runs the query and returns a ``Stream`` of
results (the ``Returning`` family of functions).

For ``update``, we specify a database connection, a predicate to select rows to
update, and a function that transforms each row. The following will change the
colour of part 5 to red::

  update databaseConnection
    (\part -> partId part ==. lit 5)
    (\part -> part { partColor = lit "red" })

For ``insert``, we have some extra syntax for fields that can contain default
values. Recall that we marked ``partId`` as having a default value::

  partId :: C f "PID" 'HasDefault Int

This means the database can provide a default value for this column when we
insert rows (usually automatically incrementing a sequence)::

  insert databaseConnection
    [Part { partId = InsertDefault
          , partName = lit "New part"
          , partColor = lit "Gold"
          , partWeight = lit 3.14
          , partCity = lit "London"
          }]

Using ``insertReturning`` you can immediately see what these default values
are.

Finally, there is ``delete``, which requires only a predicate to choose which
rows should be deleted::

  delete databaseConnection (\p -> partId p >=. 10)