server: compress response bodies with libdeflate, using level 6 for larger ones

The goal is to save on data transfer costs; libdeflate is much faster than zlib for larger inputs and at higher compression levels.

A few notes, based on the last month of data:

- 95% of response bodies > 20kB compress below 32% (with zlib level 1)
- The 10% of responses > 20kB account for 75% of egress traffic to clients
- libdeflate at level 6 is comparable in performance to zlib level 1, and twice as fast as zlib level 6
- We expect compressing 20kB+ response bodies at level 6 to reduce data transfer to clients by 25% or so (although this is difficult to predict accurately)

The new libdeflate bindings used here also need review: https://github.com/hasura/libdeflate-hs

PR-URL: https://github.com/hasura/graphql-engine-mono/pull/10341
GitOrigin-RevId: bc7b19e0024e442d85ac0b34995610edbab13bd6
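The egress estimate in the notes can be sanity-checked with a little arithmetic. The sketch below is not part of the commit. The function names are hypothetical, and it assumes that bodies of 20kB or less contribute the same number of bytes as before, since they stay at level 1. Under that assumption, if large bodies carry 75% of egress, a 25% overall reduction requires the level 6 output for those bodies to be about a third smaller than the level 1 output.

```haskell
-- Back-of-the-envelope check of the egress estimate (illustrative only).
-- Assumption, not from the commit: bodies <= 20kB produce the same number of
-- bytes as before, so only the large bodies change the total.

-- | Overall egress reduction, given the share of egress that comes from
-- large bodies and the fractional size reduction achieved on those bodies.
overallReduction :: Double -> Double -> Double
overallReduction largeShare largeShrink = largeShare * largeShrink

-- | Inverse of 'overallReduction': the fractional shrink needed on large
-- bodies to reach a target overall reduction.
requiredShrink :: Double -> Double -> Double
requiredShrink largeShare target = target / largeShare
```

In GHCi, `requiredShrink 0.75 0.25` evaluates to roughly 0.33, which is the improvement level 6 would need to deliver on 20kB+ bodies for the 25% figure to hold.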
2024-12-13 19:33:55 +03:00 · 2023-09-27 04:49:13 -04:00 · 2023-09-27 04:49:13 -04:00 · 5f8820c2cf
commit 5f8820c2cf
parent f0430ef2c7
5 changed files with 36 additions and 9 deletions
--- a/cabal.project
+++ b/cabal.project
@@ -127,3 +127,8 @@ source-repository-package
  tag: c1aa7b3991e669e4c6a977712b495d40a54cf819
  subdir: yaml

+source-repository-package
+  type: git
+  location: https://github.com/hasura/libdeflate-hs.git
+  tag: e6f020a1a24d07516d753fbb6f30758774f76372
+
--- a/cabal.project.freeze
+++ b/cabal.project.freeze
@@ -215,6 +215,7 @@ constraints: any.Cabal ==3.8.1.0,
             any.lens-aeson ==1.2.2,
             any.lens-family ==2.1.2,
             any.lens-family-core ==2.1.2,
+             any.libdeflate-hs ==0.1.0.0,
             any.libyaml ==0.1.2,
             any.lifted-async ==0.10.2.3,
             any.lifted-base ==0.2.3.12,
--- a/server/graphql-engine.cabal
+++ b/server/graphql-engine.cabal
@@ -307,6 +307,7 @@ common lib-depends
                     , incremental
                     , kan-extensions
                     , kriti-lang
+                     , libdeflate-hs
                     , lifted-base
                     , monad-control
                     , monad-loops
--- a/server/src-lib/Hasura/Server/Compression.hs
+++ b/server/src-lib/Hasura/Server/Compression.hs
@@ -12,7 +12,8 @@ module Hasura.Server.Compression
  )
 where

-import Codec.Compression.GZip qualified as GZ
+import Codec.Compression.LibDeflate.GZip qualified as GZ
+import Data.ByteString qualified as BS
 import Data.ByteString.Lazy qualified as BL
 import Data.Set qualified as Set
 import Data.Text qualified as T
@@ -53,10 +54,10 @@ compressResponse reqHeaders unCompressedResp
  | acceptedEncodings == Set.fromList [identityEncoding, Just CTGZip] =
      if shouldSkipCompression unCompressedResp
        then notCompressed
-        else (compressFast CTGZip unCompressedResp, Just CTGZip)
+        else (compressSmart CTGZip unCompressedResp, Just CTGZip)
  -- we MUST gzip:
  | acceptedEncodings == Set.fromList [Just CTGZip] =
-      (compressFast CTGZip unCompressedResp, Just CTGZip)
+      (compressSmart CTGZip unCompressedResp, Just CTGZip)
  -- we must ONLY return an uncompressed response:
  | acceptedEncodings == Set.fromList [identityEncoding] =
      notCompressed
@@ -68,14 +69,21 @@ compressResponse reqHeaders unCompressedResp
    acceptedEncodings = getAcceptedEncodings reqHeaders
    notCompressed = (unCompressedResp, identityEncoding)

-- | Compress the bytestring preferring speed over compression ratio
+-- | Compress the lazy bytestring preferring speed over compression ratio
 compressFast :: CompressionType -> BL.ByteString -> BL.ByteString
 compressFast = \case
-  CTGZip -> GZ.compressWith gzipCompressionParams
+  -- See Note [Compression ratios]
+  CTGZip -> BL.fromStrict . flip GZ.gzipCompressSimple 1 . BL.toStrict
+
+-- | Compress the lazy bytestring choosing a compression ratio based on size of input
+compressSmart :: CompressionType -> BL.ByteString -> BL.ByteString
+compressSmart CTGZip inpLBS
+  -- See Note [Compression ratios]
+  | BS.length inpBS > 20000 = gz 6
+  | otherwise = gz 1
  where
-    gzipCompressionParams =
-      -- See Note [Compression ratios]
-      GZ.defaultCompressParams {GZ.compressLevel = GZ.compressionLevel 1}
+    inpBS = BL.toStrict inpLBS
+    gz = BL.fromStrict . GZ.gzipCompressSimple inpBS

 -- | Assuming we have the option to compress or not (i.e. client accepts
 -- identity AND gzip), should we skip compression?
@@ -175,7 +183,7 @@ I didn't test higher compression levels much, but `gzip -4` for the most part
 resulted in less than 10% smaller output on random json, and ~30% on our highly
 compressible benchmark output.

-UPDATE (12/5):
+UPDATE (12/5/2022):
 ~~~~~~~~~~~~~

 Some recent data on compression ratios for graphql responsed (here as:
@@ -199,4 +207,15 @@ Aggregate across responses where uncompressed > 17K bytes (90th percentile):
    p50:    0.172
    min:    0.005

+UPDATE (9/26/2023):
+~~~~~~~~~~~~~
+
+In last month...
+
+- 95% of response bodies > 20kB compress below 32%
+- The 10% of responses > 20kB comprise 75% egress traffic to clients
+- libdeflate at level 6 is comparable in performance to zlib level 1, and twice as fast as zlib level 6
+- We expect compressing 20kB+ response bodies at level 6 to reduce data transfer
+  to clients by 25% or so (although this is difficult to predict accurately)
+
 -}
--- a/server/src-test/Hasura/Server/CompressionSpec.hs
+++ b/server/src-test/Hasura/Server/CompressionSpec.hs
@@ -1,5 +1,6 @@
 module Hasura.Server.CompressionSpec (spec) where

+-- reference implementation:
 import Codec.Compression.GZip qualified as GZ
 import Data.ByteString.Lazy qualified as BL
 import Data.Set qualified as Set