* Bloodhound

#+CAPTION: Bloodhound (dog)
[[./bloodhound.jpg]]

* Elasticsearch client and query DSL for Haskell

** Why?

Because you're tired of obnoxious errors like [[http://i.imgur.com/FKtZYIP.png][this]] and want types to guide your use of the API.

** Stability

Bloodhound is alpha at the moment. The library works fine, but I don't want to mislead anyone into thinking the API is final or stable. I wouldn't call the library "complete" or representative of everything you can do in Elasticsearch, but compared to clients in other languages, the story so far is good.

* Examples

** Index Operations

*** Create Index

#+BEGIN_SRC haskell

-- Formatted for use in ghci, so there are "let"s in front of the decls.

-- if you see :{ and :}, they're so you can copy-paste
-- the multi-line examples into your ghci REPL.

:set -XDeriveGeneric
import Database.Bloodhound
import Data.Aeson
import Data.Either (Either(..))
import Data.Maybe (fromJust)
import Data.Time.Calendar (Day(..))
import Data.Time.Clock (secondsToDiffTime, UTCTime(..))
import Data.Text (Text)
import GHC.Generics (Generic)
import Network.HTTP.Conduit
import qualified Network.HTTP.Types.Status as NHTS

-- no trailing slashes in servers, library handles building the path.
let testServer = (Server "http://localhost:9200")
let testIndex = IndexName "twitter"
let testMapping = MappingName "tweet"

-- defaultIndexSettings is exported by Database.Bloodhound as well
let defaultIndexSettings = IndexSettings (ShardCount 3) (ReplicaCount 2)

-- createIndex returns IO Reply

-- response :: Reply, Reply is a synonym for Network.HTTP.Conduit.Response
response <- createIndex testServer defaultIndexSettings testIndex

#+END_SRC

*** Delete Index

#+BEGIN_SRC haskell

-- response :: Reply
response <- deleteIndex testServer testIndex

-- print response if it was a success
Response {responseStatus = Status {statusCode = 200, statusMessage = "OK"}
        , responseVersion = HTTP/1.1
        , responseHeaders = [("Content-Type", "application/json; charset=UTF-8")
                           , ("Content-Length", "21")]
        , responseBody = "{\"acknowledged\":true}"
        , responseCookieJar = CJ {expose = []}
        , responseClose' = ResponseClose}

-- if the index to be deleted didn't exist anyway
Response {responseStatus = Status {statusCode = 404, statusMessage = "Not Found"}
        , responseVersion = HTTP/1.1
        , responseHeaders = [("Content-Type", "application/json; charset=UTF-8")
                           , ("Content-Length","65")]
        , responseBody = "{\"error\":\"IndexMissingException[[twitter] missing]\",\"status\":404}"
        , responseCookieJar = CJ {expose = []}
        , responseClose' = ResponseClose}

#+END_SRC
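
The two replies above differ only in their status, so a small helper can tell them apart. This is a minimal sketch, not part of Bloodhound: =Outcome= and =classify= are hypothetical names made up for illustration. Only =statusCode=, =statusIsSuccessful= and the =status200=, =status404= and =status500= constants come from http-types (the =NHTS= import at the top of the examples).

#+BEGIN_SRC haskell

import qualified Network.HTTP.Types.Status as NHTS

-- Hypothetical result type for this sketch; not exported by Bloodhound.
data Outcome = Succeeded | Missing | Failed Int deriving (Eq, Show)

-- Classify an HTTP status along the lines of the two replies shown above.
classify :: NHTS.Status -> Outcome
classify status
  | NHTS.statusIsSuccessful status = Succeeded
  | NHTS.statusCode status == 404  = Missing
  | otherwise                      = Failed (NHTS.statusCode status)

main :: IO ()
main = do
  print (classify NHTS.status200)
  print (classify NHTS.status404)
  print (classify NHTS.status500)

#+END_SRC

With a reply in hand, this would be used as =classify (responseStatus response)=.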

*** Refresh Index

**** Note, you *have* to do this if you expect to read what you just wrote

#+BEGIN_SRC haskell

resp <- refreshIndex testServer testIndex

-- print resp on success
Response {responseStatus = Status {statusCode = 200, statusMessage = "OK"}
        , responseVersion = HTTP/1.1
        , responseHeaders = [("Content-Type", "application/json; charset=UTF-8")
                           , ("Content-Length","50")]
        , responseBody = "{\"_shards\":{\"total\":10,\"successful\":5,\"failed\":0}}"
        , responseCookieJar = CJ {expose = []}
        , responseClose' = ResponseClose}

#+END_SRC

** Mapping Operations

*** Create Mapping

#+BEGIN_SRC haskell

-- don't forget imports and the like at the top.

data TweetMapping = TweetMapping deriving (Eq, Show)

-- I know writing the JSON manually sucks.
-- I don't have a proper data type for Mappings yet.
-- Let me know if this is something you need.

:{
instance ToJSON TweetMapping where
  toJSON TweetMapping =
    object ["tweet" .=
      object ["properties" .=
        object ["location" .=
          object ["type" .= ("geo_point" :: Text)]]]]
:}

resp <- createMapping testServer testIndex testMapping TweetMapping

#+END_SRC

*** Delete Mapping

#+BEGIN_SRC haskell

resp <- deleteMapping testServer testIndex testMapping

#+END_SRC

** Document Operations

*** Indexing Documents

#+BEGIN_SRC haskell

-- don't forget the imports and derive generic setting for ghci
-- at the beginning of the examples.

:{
data Location = Location { lat :: Double
                         , lon :: Double } deriving (Eq, Generic, Show)

data Tweet = Tweet { user     :: Text
                   , postDate :: UTCTime
                   , message  :: Text
                   , age      :: Int
                   , location :: Location } deriving (Eq, Generic, Show)

exampleTweet = Tweet { user     = "bitemyapp"
                     , postDate = UTCTime
                                  (ModifiedJulianDay 55000)
                                  (secondsToDiffTime 10)
                     , message  = "Use haskell!"
                     , age      = 10000
                     , location = Location 40.12 (-71.34) }

-- automagic (generic) derivation of instances because we're lazy.
instance ToJSON   Tweet
instance FromJSON Tweet
instance ToJSON   Location
instance FromJSON Location
:}

-- Should be able to toJSON and encode the data structures like this:
-- λ> toJSON $ Location 10.0 10.0
-- Object fromList [("lat",Number 10.0),("lon",Number 10.0)]
-- λ> encode $ Location 10.0 10.0
-- "{\"lat\":10,\"lon\":10}"

resp <- indexDocument testServer testIndex testMapping exampleTweet (DocId "1")

-- print resp on success
Response {responseStatus =
  Status {statusCode = 200, statusMessage = "OK"}
    , responseVersion = HTTP/1.1, responseHeaders = 
    [("Content-Type","application/json; charset=UTF-8"),
     ("Content-Length","75")]
    , responseBody = "{\"_index\":\"twitter\",\"_type\":\"tweet\",\"_id\":\"1\",\"_version\":2,\"created\":false}"
    , responseCookieJar = CJ {expose = []}, responseClose' = ResponseClose}

#+END_SRC

*** Deleting Documents

#+BEGIN_SRC haskell

resp <- deleteDocument testServer testIndex testMapping (DocId "1")

#+END_SRC

*** Getting Documents

#+BEGIN_SRC haskell

-- n.b., you'll need the earlier imports. responseBody is from http-conduit

resp <- getDocument testServer testIndex testMapping (DocId "1")

-- responseBody :: Response body -> body
let body = responseBody resp

-- You have two options: use decode and get Maybe (EsResult Tweet),
-- or use eitherDecode and get Either String (EsResult Tweet).

let maybeResult = decode body :: Maybe (EsResult Tweet)
-- the explicit typing is so Aeson knows how to parse the JSON.

-- use eitherDecode if you want to know why something failed to parse.
-- (string errors, sadly)
let eitherResult = eitherDecode body :: Either String (EsResult Tweet)

-- print eitherResult should look like:
Right (EsResult {_index = "twitter"
               , _type = "tweet"
               , _id = "1"
               , _version = 2
               , found = Just True
               , _source = Tweet {user = "bitemyapp"
                                , postDate = 2009-06-18 00:00:10 UTC
                                , message = "Use haskell!"
                                , age = 10000
                                , location = Location {lat = 40.12, lon = -71.34}}})

-- _source in EsResult is parametric, we dispatch the type by passing in what we expect (Tweet) as a parameter to EsResult.

-- use the _source record accessor to get at your document
λ> fmap _source eitherResult
Right (Tweet {user = "bitemyapp"
            , postDate = 2009-06-18 00:00:10 UTC
            , message = "Use haskell!"
            , age = 10000
            , location = Location {lat = 40.12, lon = -71.34}})

#+END_SRC
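
To see the difference between the two decoding functions without a running server, here is a small standalone sketch. It reuses the shape of the =Location= type from the indexing example. The two JSON bodies are hand-written stand-ins for =responseBody=, not real Elasticsearch output, and =goodBody= and =badBody= are names made up for illustration.

#+BEGIN_SRC haskell

{-# LANGUAGE DeriveGeneric     #-}
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (FromJSON, decode, eitherDecode)
import Data.ByteString.Lazy (ByteString)
import GHC.Generics (Generic)

-- Same shape as the Location type from the indexing example.
data Location = Location { lat :: Double
                         , lon :: Double } deriving (Eq, Generic, Show)

instance FromJSON Location

-- Hand-written body that parses cleanly.
goodBody :: ByteString
goodBody = "{\"lat\":40.12,\"lon\":-71.34}"

-- Hand-written body with the "lon" field missing.
badBody :: ByteString
badBody = "{\"lat\":40.12}"

main :: IO ()
main = do
  -- decode only tells you whether parsing worked.
  print (decode goodBody :: Maybe Location)
  print (decode badBody :: Maybe Location)
  -- eitherDecode also carries a String describing the failure.
  print (eitherDecode badBody :: Either String Location)

#+END_SRC

The second =print= gives =Nothing=, while the third gives a =Left= with a parse error message, which is usually what you want while debugging a mapping.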

** Search

*** Querying

**** Term Query

#+BEGIN_SRC haskell

-- exported by the Client module, just defaults some stuff.
-- mkSearch :: Maybe Query -> Maybe Filter -> Search
-- mkSearch query filter = Search query filter Nothing False 0 10

let query = TermQuery (Term "user" "bitemyapp") Nothing

-- AND'ing the identity filter with itself and then tacking it onto a query
-- search should be a no-op. I include it for the sake of example.
-- <||> (or/plus) should make it into a search that returns everything.

let filter = IdentityFilter <&&> IdentityFilter

-- constructing the search object the searchByIndex function dispatches on.
let search = mkSearch (Just query) (Just filter)

-- you can also searchByType and specify the mapping name.
reply <- searchByIndex testServer testIndex search

let result = eitherDecode (responseBody reply) :: Either String (SearchResult Tweet)

λ> fmap (hits . searchHits) result
Right [Hit {hitIndex = IndexName "twitter"
          , hitType = MappingName "tweet"
          , hitDocId = DocId "1"
          , hitScore = 0.30685282
          , hitSource = Tweet {user = "bitemyapp"
                             , postDate = 2009-06-18 00:00:10 UTC
                             , message = "Use haskell!"
                             , age = 10000
                             , location = Location {lat = 40.12, lon = -71.34}}}]

#+END_SRC

*** Sorting

#+BEGIN_SRC haskell

#+END_SRC

*** Filtering

**** And, Not, and Or filters

Filters form a monoid and seminearring.

#+BEGIN_SRC haskell

instance Monoid Filter where
  mempty = IdentityFilter
  mappend a b = AndFilter [a, b] defaultCache

instance Seminearring Filter where
  a <||> b = OrFilter [a, b] defaultCache

-- AndFilter and OrFilter take [Filter] as an argument.

-- This will return everything, because IdentityFilter returns everything
OrFilter [IdentityFilter, someOtherFilter] False

-- This will return exactly what someOtherFilter returns
AndFilter [IdentityFilter, someOtherFilter] False

-- Thanks to the seminearring and monoid, the above can be expressed as:

-- "and"
IdentityFilter <&&> someOtherFilter

-- "or"
IdentityFilter <||> someOtherFilter

-- Also there is a NotFilter, it only accepts a single filter, not a list.

NotFilter someOtherFilter False

#+END_SRC
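
The claims about IdentityFilter can be checked against a toy model in which a filter is just a predicate over documents. This is a sketch under stated assumptions, not Bloodhound's implementation: the =Filter= type below drops the cache argument, and =Pred= and =matches= are hypothetical names that exist only in this example.

#+BEGIN_SRC haskell

-- Toy stand-in for Bloodhound's Filter; the real constructors also take
-- a cache argument, which is left out here.
data Filter doc
  = IdentityFilter
  | Pred (doc -> Bool)        -- hypothetical leaf filter for this sketch
  | AndFilter [Filter doc]
  | OrFilter  [Filter doc]
  | NotFilter (Filter doc)

(<&&>), (<||>) :: Filter doc -> Filter doc -> Filter doc
a <&&> b = AndFilter [a, b]
a <||> b = OrFilter  [a, b]

-- Decide whether a single document passes the filter.
matches :: Filter doc -> doc -> Bool
matches IdentityFilter _ = True
matches (Pred p)       d = p d
matches (AndFilter fs) d = all (`matches` d) fs
matches (OrFilter fs)  d = any (`matches` d) fs
matches (NotFilter f)  d = not (matches f d)

main :: IO ()
main = do
  let ages            = [5, 50, 500] :: [Int]
      someOtherFilter = Pred (> 10)
      keep f          = filter (matches f) ages
  print (keep (IdentityFilter <&&> someOtherFilter))  -- [50,500]
  print (keep (IdentityFilter <||> someOtherFilter))  -- [5,50,500]

#+END_SRC

In this model "and" with IdentityFilter leaves the other filter's results unchanged, and "or" with IdentityFilter lets every document through.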

**** Identity Filter

#+BEGIN_SRC haskell

-- AND'ing two IdentityFilters
let queryFilter = IdentityFilter <&&> IdentityFilter

let search = mkSearch Nothing (Just queryFilter)

reply <- searchByType testServer testIndex testMapping search

#+END_SRC

**** Boolean Filter

Similar to boolean queries.

#+BEGIN_SRC haskell

-- Will return only items whose "user" field contains the term "bitemyapp"
let queryFilter = BoolFilter (MustMatch (Term "user" "bitemyapp") False)

-- Will return only items whose "user" field does not contain the term "bitemyapp"
let queryFilter = BoolFilter (MustNotMatch (Term "user" "bitemyapp") False)

-- The clause (query) should appear in the matching document.
-- In a boolean query with no must clauses, one or more should
-- clauses must match a document. The minimum number of should
-- clauses to match can be set using the minimum_should_match parameter.
let queryFilter = BoolFilter (ShouldMatch [(Term "user" "bitemyapp")] False)

#+END_SRC

**** Exists Filter

#+BEGIN_SRC haskell

-- Will filter for documents that have the field "user"
let existsFilter = ExistsFilter (FieldName "user")

#+END_SRC

**** Geo BoundingBox Filter

#+BEGIN_SRC haskell

-- topLeft and bottomRight
let box = GeoBoundingBox (LatLon 40.73 (-74.1)) (LatLon 40.10 (-71.12))

let constraint = GeoBoundingBoxConstraint (FieldName "tweet.location") box False

-- second argument is GeoFilterType, memory or indexed.
let geoFilter = GeoBoundingBoxFilter constraint GeoFilterMemory

#+END_SRC

**** Geo Distance Filter

#+BEGIN_SRC haskell

let geoPoint = GeoPoint (FieldName "tweet.location") (LatLon 40.12 (-71.34))

-- coefficient and units
let distance = Distance 10.0 Miles

-- GeoFilterType or NoOptimizeBbox
let optimizeBbox = OptimizeGeoFilterType GeoFilterMemory

-- SloppyArc is the usual/default optimization in Elasticsearch today
-- but pre-1.0 versions will need to pick Arc or Plane.

let geoFilter = GeoDistanceFilter geoPoint distance SloppyArc optimizeBbox False

#+END_SRC

**** Geo Distance Range Filter

Think of a donut and you won't be far off.

#+BEGIN_SRC haskell

let geoPoint = GeoPoint (FieldName "tweet.location") (LatLon 40.12 (-71.34))

let distanceRange = DistanceRange (Distance 0.0 Miles) (Distance 10.0 Miles)

let geoFilter = GeoDistanceRangeFilter geoPoint distanceRange

#+END_SRC

**** Geo Polygon Filter

#+BEGIN_SRC haskell

-- Corners listed in order around the perimeter of a rectangle.
let points = [LatLon 40.0 (-70.00),
              LatLon 40.0 (-72.00),
              LatLon 41.0 (-72.00),
              LatLon 41.0 (-70.00)]

let geoFilter = GeoPolygonFilter (FieldName "tweet.location") points

#+END_SRC

**** Document IDs filter

#+BEGIN_SRC haskell

-- takes a mapping name and a list of DocIds
IdsFilter (MappingName "tweet") [DocId "1"]

#+END_SRC

**** Range Filter

***** Full Range

#+BEGIN_SRC haskell

-- RangeFilter :: FieldName
--                -> Either HalfRange Range
--                -> RangeExecution
--                -> Cache -> Filter

let filter = RangeFilter (FieldName "age")
             (Right (RangeLtGt (LessThan 100000.0) (GreaterThan 1000.0)))
             RangeExecutionIndex False

#+END_SRC

***** Half Range

#+BEGIN_SRC haskell

let filter = RangeFilter (FieldName "age")
             (Left (HalfRangeLt (LessThan 100000.0)))
             RangeExecutionIndex False

#+END_SRC

**** Regexp Filter

#+BEGIN_SRC haskell

-- RegexpFilter
--   :: FieldName
--      -> Regexp
--      -> RegexpFlags
--      -> CacheName
--      -> Cache
--      -> CacheKey
--      -> Filter
let filter = RegexpFilter (FieldName "user") (Regexp "bite.*app")
             RegexpAll (CacheName "test") False (CacheKey "key")

-- RegexpFlags can be RegexpAll, Complement, Interval, Intersection,
-- AnyString, or a combination of two of those options.

#+END_SRC

* Possible future functionality

** Node discovery and failover

Might require TCP support.

** Support for TCP access to Elasticsearch

Pretend to be a transport client?

** Bulk cluster-join merge

Might require making a lucene index on disk with the appropriate format.

** GeoShapeFilter

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-geo-shape-filter.html

** Geohash cell filter

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-geohash-cell-filter.html

** HasChild Filter

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-has-child-filter.html

** HasParent Filter

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-has-parent-filter.html

** Indices Filter

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-indices-filter.html

** Query Filter

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-filter.html

** Script based sorting

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-sort.html#_script_based_sorting

** Collapsing redundantly nested and/or structures

The Seminearring instance, if deeply nested, can produce redundant nested structure. Depending on how this affects ES performance, reducing this structure might be valuable.
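
As a rough illustration of the kind of reduction meant here, the sketch below merges directly nested "and" lists (and "or" lists) into one. It is not Bloodhound code: the =Filter= type is a toy without the cache argument, and =Leaf= and =collapse= are hypothetical names. With the real constructors, merging would presumably only be safe when the nested filters agree on their cache setting.

#+BEGIN_SRC haskell

-- Toy stand-in for Bloodhound's Filter without the cache argument.
data Filter
  = IdentityFilter
  | Leaf String               -- hypothetical placeholder for any other filter
  | AndFilter [Filter]
  | OrFilter  [Filter]
  deriving (Eq, Show)

(<&&>), (<||>) :: Filter -> Filter -> Filter
a <&&> b = AndFilter [a, b]
a <||> b = OrFilter  [a, b]

-- Merge an AndFilter nested directly inside an AndFilter into the parent
-- list, and likewise for OrFilter. Mixed nesting is left alone.
collapse :: Filter -> Filter
collapse (AndFilter fs) = AndFilter (concatMap (unAnd . collapse) fs)
  where unAnd (AndFilter gs) = gs
        unAnd g              = [g]
collapse (OrFilter fs) = OrFilter (concatMap (unOr . collapse) fs)
  where unOr (OrFilter gs) = gs
        unOr g             = [g]
collapse f = f

main :: IO ()
main = do
  let nested = (Leaf "a" <&&> Leaf "b") <&&> (Leaf "c" <&&> Leaf "d")
  print nested
  -- AndFilter [Leaf "a",Leaf "b",Leaf "c",Leaf "d"]
  print (collapse nested)

#+END_SRC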

** Runtime checking for cycles in data structures

Check for n > 1 occurrences of the same node in a DFS:

http://hackage.haskell.org/package/stable-maps-0.0.5/docs/System-Mem-StableName-Dynamic.html

http://hackage.haskell.org/package/stable-maps-0.0.5/docs/System-Mem-StableName-Dynamic-Map.html
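
A minimal sketch of what such a check could look like, assuming a toy =Filter= type without the cache argument. It uses =System.Mem.StableName= from base rather than the stable-maps modules linked above, and =children= and =hasCycle= are hypothetical names. It counts a repeat only along the current DFS path, so a sub-filter that is merely shared between two branches is not reported. Equal StableNames guarantee the same object, but the reverse is not guaranteed, so a check like this could in principle miss a cycle.

#+BEGIN_SRC haskell

import Control.Exception (evaluate)
import System.Mem.StableName (StableName, makeStableName)

-- Toy stand-in for Bloodhound's Filter without the cache argument.
data Filter
  = IdentityFilter
  | AndFilter [Filter]
  | OrFilter  [Filter]
  | NotFilter Filter

children :: Filter -> [Filter]
children IdentityFilter = []
children (AndFilter fs) = fs
children (OrFilter fs)  = fs
children (NotFilter f)  = [f]

-- Depth-first search that reports a cycle when a node's StableName
-- already appears among its ancestors on the current path.
hasCycle :: Filter -> IO Bool
hasCycle = go []
  where
    go :: [StableName Filter] -> Filter -> IO Bool
    go path node = do
      forced <- evaluate node            -- name the value, not the thunk
      name   <- makeStableName forced
      if name `elem` path
        then return True
        else anyM (go (name : path)) (children forced)

    -- Short-circuiting monadic version of 'any'.
    anyM _ []       = return False
    anyM p (x : xs) = do
      hit <- p x
      if hit then return True else anyM p xs

-- A filter that contains itself.
cyclic :: Filter
cyclic = AndFilter [IdentityFilter, cyclic]

main :: IO ()
main = do
  hasCycle (NotFilter IdentityFilter) >>= print
  hasCycle cyclic >>= print

#+END_SRC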

* Photo Origin

Photo from HA! Designs: https://www.flickr.com/photos/hadesigns/