Merge branch 'master' of github.com:timoramsauer/WeasyPrint into HEAD

2024-09-11 20:47:56 +03:00 · 2023-04-24 15:29:58 +02:00 · 2023-04-24 15:29:58 +02:00 · b505d56199
commit b505d56199
parent e9edc43f64 0ff8692741
31 changed files with 1005 additions and 647 deletions
--- a/.github/workflows/test_samples.yml
+++ b/.github/workflows/test_samples.yml
@ -43,7 +43,7 @@ jobs:
      - name: Ticket
        run: python -m weasyprint weasyprint-samples/ticket/ticket.html ${{env.REPORTS_FOLDER}}/ticket.pdf
      - name: Archive generated PDFs
-        uses: actions/upload-artifact@v2
+        uses: actions/upload-artifact@v3
        with:
          name: generated-documents
          path: ${{env.REPORTS_FOLDER}}
--- a/.github/workflows/tests.yml
+++ b/.github/workflows/tests.yml
@ -8,11 +8,12 @@ jobs:
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
-        python-version: ['3.7', '3.8', '3.9', '3.10', '3.11', 'pypy-3.8']
-        exclude:
-          # Wheels missing for this configuration
-          - os: macos-latest
-            python-version: pypy-3.8
+        python-version: ['3.11']
+        include:
+          - os: ubuntu-latest
+            python-version: '3.7'
+          - os: ubuntu-latest
+            python-version: 'pypy-3.8'
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
--- a/docs/api_reference.rst
+++ b/docs/api_reference.rst
@ -57,6 +57,7 @@ Python API
 .. autoclass:: CSS(input, **kwargs)
 .. autoclass:: Attachment(input, **kwargs)
 .. autofunction:: default_url_fetcher
+.. autodata:: DEFAULT_OPTIONS

 .. module:: weasyprint.document
 .. autoclass:: Document
@ -645,6 +646,8 @@ supported.
 The ``attr()`` functional notation is allowed in the ``content`` and
 ``string-set`` properties.

+The ``calc()`` function is **not** supported.
+
 Viewport-percentage lengths (``vw``, ``vh``, ``vmin``, ``vmax``) are **not**
 supported.

--- a/docs/changelog.rst
+++ b/docs/changelog.rst
@ -2,6 +2,115 @@ Changelog
 =========


+Version 59.0b1
+--------------
+
+Released on 2023-04-14.
+
+**This version is experimental, don't use it in production. If you find bugs,
+please report them!**
+
+Command-line API:
+
+* The ``--optimize-size`` option and its short equivalent ``-O`` have been
+  deprecated. To activate or deactivate different size optimizations, you can
+  now use:
+
+  * ``--uncompressed-pdf``,
+  * ``--optimize-images``,
+  * ``--full-fonts``,
+  * ``--hinting``,
+  * ``--dpi <resolution>``, and
+  * ``--jpeg-quality <quality>``.
+
+* A new ``--cache-folder <folder>`` option has been added to store temporary
+  data in the given folder on disk instead of keeping it in memory.
+
+Python API:
+
+* Global rendering options are now given in ``**options`` instead of dedicated
+  parameters, with slightly different names. This means that the signatures of
+  ``HTML.render()``, ``HTML.write_pdf()`` and ``Document.write_pdf()`` have
+  changed. Here are the steps to port your Python code to v59.0:
+
+  1. Use named parameters for these functions, not positional parameters.
+  2. Rename some of the parameters:
+
+     * ``image_cache`` becomes ``cache`` (see below),
+     * ``identifier`` becomes ``pdf_identifier``,
+     * ``variant`` becomes ``pdf_variant``,
+     * ``version`` becomes ``pdf_version``,
+     * ``forms`` becomes ``pdf_forms``.
+
+* The ``optimize_size`` parameter of ``HTML.render()``, ``HTML.write_pdf()``
+  and ``Document()`` has been removed and will be ignored. You can now use the
+  ``uncompressed_pdf``, ``full_fonts``, ``hinting``, ``dpi`` and
+  ``jpeg_quality`` parameters that are included in ``**options``.
+
+* The ``cache`` parameter can be included in ``**options`` to replace
+  ``image_cache``. If it is a dictionary, this dictionary will be used to store
+  temporary data in memory, and can even be shared between multiple documents.
+  If it’s a folder Path or string, WeasyPrint stores temporary data in the
+  given folder on disk instead of keeping it in memory.
+
+New features:
+
+* `#1853 <https://github.com/Kozea/WeasyPrint/pull/1853>`_,
+  `#1854 <https://github.com/Kozea/WeasyPrint/issues/1854>`_:
+  Reduce PDF size, with financial support from Code & Co.
+* `#1824 <https://github.com/Kozea/WeasyPrint/issues/1824>`_,
+  `#1829 <https://github.com/Kozea/WeasyPrint/pull/1829>`_:
+  Reduce memory use for images
+* `#1858 <https://github.com/Kozea/WeasyPrint/issues/1858>`_:
+  Add an option to keep hinting information in embedded fonts
+
+Bug fixes:
+
+* `#1855 <https://github.com/Kozea/WeasyPrint/issues/1855>`_:
+  Fix position of emojis in justified text
+* `#1852 <https://github.com/Kozea/WeasyPrint/issues/1852>`_:
+  Don’t crash when line can be split before trailing spaces
+* `#1843 <https://github.com/Kozea/WeasyPrint/issues/1843>`_:
+  Fix syntax of dates in metadata
+* `#1827 <https://github.com/Kozea/WeasyPrint/issues/1827>`_,
+  `#1832 <https://github.com/Kozea/WeasyPrint/pull/1832>`_:
+  Fix word-spacing problems with nested tags
+
+Documentation:
+
+* `#1841 <https://github.com/Kozea/WeasyPrint/issues/1841>`_:
+  Add a paragraph about unsupported calc() function
+
+Contributors:
+
+* Guillaume Ayoub
+* Lucie Anglade
+* Alex Ch
+* whi_ne
+* Jonas Castro
+
+Backers and sponsors:
+
+* Castedo Ellerman
+* Kobalt
+* Spacinov
+* Grip Angebotssoftware
+* Crisp BV
+* Manuel Barkhau
+* SimonSoft
+* Menutech
+* KontextWork
+* NCC Group
+* René Fritz
+* Moritz Mahringer
+* Yanal-Yvez Fargialla
+* Piotr Horzycki
+* Healthchecks.io
+* TrainingSparkle
+* Hammerbacher
+* Synapsium
+
+
 Version 58.1
 ------------
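The porting steps in the 59.0b1 changelog entry can be sketched as a small keyword translation. This is a minimal sketch, not part of WeasyPrint: ``port_options``, ``RENAMED`` and ``REMOVED`` are hypothetical names, and the mapping is copied from the rename list in the changelog.

```python
# Hypothetical helper, not part of WeasyPrint: translate pre-59.0 keyword
# arguments into the option names listed in the 59.0b1 changelog entry.
RENAMED = {
    'image_cache': 'cache',
    'identifier': 'pdf_identifier',
    'variant': 'pdf_variant',
    'version': 'pdf_version',
    'forms': 'pdf_forms',
}
REMOVED = {'optimize_size'}


def port_options(old_kwargs):
    """Return a new dict that uses the v59.0 option names."""
    new_options = {}
    for name, value in old_kwargs.items():
        if name in REMOVED:
            # No direct equivalent: pick uncompressed_pdf, full_fonts,
            # hinting, dpi or jpeg_quality by hand instead.
            continue
        new_options[RENAMED.get(name, name)] = value
    return new_options


old = {'image_cache': {}, 'forms': True, 'optimize_size': ('fonts',)}
new = port_options(old)
print(new)  # {'cache': {}, 'pdf_forms': True}
```

The resulting dictionary is meant to be passed as named arguments, for example ``HTML(...).write_pdf('out.pdf', **new)``, which matches step 1 of the changelog (named rather than positional parameters).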

--- a/docs/first_steps.rst
+++ b/docs/first_steps.rst
@ -11,7 +11,7 @@ WeasyPrint |version| depends on:

 * Python_ ≥ 3.7.0
 * Pango_ ≥ 1.44.0
-* pydyf_ ≥ 0.5.0
+* pydyf_ ≥ 0.6.0
 * CFFI_ ≥ 0.6
 * html5lib_ ≥ 1.1
 * tinycss2_ ≥ 1.0.0
@ -513,7 +513,8 @@ WeasyPrint provides two options to deal with images: ``optimize_size`` and

 ``optimize_size`` can enable size optimization for images, but also for fonts.
 When enabled, the generated PDF will include smaller images and fonts, but the
-rendering time may be slightly increased.
+rendering time may be slightly increased. The whole structure of the PDF can be
+compressed too.

 .. code-block:: python

@ -523,7 +524,7 @@ rendering time may be slightly increased.

    # Full size optimization, slower, but generated PDF is smaller
    HTML('https://example.org/').write_pdf(
-        'example.pdf', optimize_size=('fonts', 'images'))
+        'example.pdf', optimize_size=('fonts', 'images', 'hinting', 'pdf'))

 ``image_cache`` gives the possibility to use a cache for images, avoiding to
 download, parse and optimize them each time they are used.
@ -539,6 +540,11 @@ time when you render a lot of documents that use the same images.
        HTML(f'https://example.org/?id={i}').write_pdf(
            f'example-{i}.pdf', image_cache=cache)

+It’s also possible to cache images on disk instead of keeping them in memory.
+The ``--cache-folder`` CLI option can be used to define the folder used to
+store temporary images. You can also provide this folder path as a string for
+``image_cache``.
+

 Logging
 ~~~~~~~
--- a/pyproject.toml
+++ b/pyproject.toml
@ -12,7 +12,7 @@ requires-python = '>=3.7'
 readme = {file = 'README.rst', content-type = 'text/x-rst'}
 license = {file = 'LICENSE'}
 dependencies = [
-  'pydyf >=0.5.0',
+  'pydyf >=0.6.0',
  'cffi >=0.6',
  'html5lib >=1.1',
  'tinycss2 >=1.0.0',
--- a/tests/conftest.py
+++ b/tests/conftest.py
@ -73,14 +73,10 @@ def document_write_png(self, target=None, resolution=96, antialiasing=1,
            shutil.copyfileobj(png, fd)


-def html_write_png(self, target=None, stylesheets=None, resolution=96,
-                   presentational_hints=False, optimize_size=('fonts',),
-                   font_config=None, counter_style=None, image_cache=None):
-    return self.render(
-        stylesheets, presentational_hints=presentational_hints,
-        optimize_size=optimize_size, font_config=font_config,
-        counter_style=counter_style, image_cache=image_cache).write_png(
-            target, resolution)
+def html_write_png(self, target=None, font_config=None, counter_style=None,
+                   resolution=96, **options):
+    document = self.render(font_config, counter_style, **options)
+    return document.write_png(target, resolution)


 Document.write_png = document_write_png
--- a/tests/test_api.py
+++ b/tests/test_api.py
@ -6,6 +6,7 @@ import os
 import sys
 import unicodedata
 import zlib
+from functools import partial
 from pathlib import Path
 from urllib.parse import urljoin, uses_relative

@ -78,11 +79,8 @@ def _check_doc1(html, has_base_url=True):
 def _run(args, stdin=b''):
    stdin = io.BytesIO(stdin)
    stdout = io.BytesIO()
-    try:
-        __main__.HTML = FakeHTML
-        __main__.main(args.split(), stdin=stdin, stdout=stdout)
-    finally:
-        __main__.HTML = HTML
+    HTML = partial(FakeHTML, force_uncompressed_pdf=False)
+    __main__.main(args.split(), stdin=stdin, stdout=stdout, HTML=HTML)
    return stdout.getvalue()


@ -303,11 +301,12 @@ def test_command_line_render(tmpdir):
        tmpdir.join(name).write_binary(pattern_bytes)

    # Reference
-    html_obj = FakeHTML(string=combined, base_url='dummy.html')
+    html_obj = FakeHTML(
+        string=combined, base_url='dummy.html', force_uncompressed_pdf=False)
    pdf_bytes = html_obj.write_pdf()
    rotated_pdf_bytes = FakeHTML(
        string=combined, base_url='dummy.html',
-        media_type='screen').write_pdf()
+        media_type='screen', force_uncompressed_pdf=False).write_pdf()

    tmpdir.join('no_css.html').write_binary(html)
    tmpdir.join('combined.html').write_binary(combined)
@ -360,35 +359,34 @@ def test_command_line_render(tmpdir):

    os.environ['SOURCE_DATE_EPOCH'] = '0'
    _run('not_optimized.html out15.pdf')
-    _run('not_optimized.html out16.pdf -O images')
-    _run('not_optimized.html out17.pdf -O fonts')
-    _run('not_optimized.html out18.pdf -O fonts -O images')
-    _run('not_optimized.html out19.pdf -O all')
-    _run('not_optimized.html out20.pdf -O none')
-    _run('not_optimized.html out21.pdf -O none -O all')
-    _run('not_optimized.html out22.pdf -O all -O none')
+    _run('not_optimized.html out16.pdf --optimize-images')
+    _run('not_optimized.html out17.pdf --optimize-images -j 10')
+    _run('not_optimized.html out18.pdf --optimize-images -j 10 -D 1')
+    _run('not_optimized.html out19.pdf --hinting')
+    _run('not_optimized.html out20.pdf --full-fonts')
+    _run('not_optimized.html out21.pdf --full-fonts --uncompressed-pdf')
+    _run(f'not_optimized.html out22.pdf -c {tmpdir}')
    assert (
+        len(tmpdir.join('out18.pdf').read_binary()) <
+        len(tmpdir.join('out17.pdf').read_binary()) <
        len(tmpdir.join('out16.pdf').read_binary()) <
        len(tmpdir.join('out15.pdf').read_binary()) <
-        len(tmpdir.join('out20.pdf').read_binary()))
+        len(tmpdir.join('out19.pdf').read_binary()) <
+        len(tmpdir.join('out20.pdf').read_binary()) <
+        len(tmpdir.join('out21.pdf').read_binary()))
    assert len({
        tmpdir.join(f'out{i}.pdf').read_binary()
-        for i in (16, 18, 19, 21)}) == 1
-    assert len({
-        tmpdir.join(f'out{i}.pdf').read_binary()
-        for i in (15, 17)}) == 1
-    assert len({
-        tmpdir.join(f'out{i}.pdf').read_binary()
-        for i in (20, 22)}) == 1
+        for i in (15, 22)}) == 1
    os.environ.pop('SOURCE_DATE_EPOCH')

-    stdout = _run('combined.html -')
+    stdout = _run('combined.html --uncompressed-pdf -')
    assert stdout.count(b'attachment') == 0
-    stdout = _run('combined.html -')
+    stdout = _run('combined.html --uncompressed-pdf -')
    assert stdout.count(b'attachment') == 0
-    stdout = _run('-a pattern.png combined.html -')
+    stdout = _run('-a pattern.png --uncompressed-pdf combined.html -')
    assert stdout.count(b'attachment') == 1
-    stdout = _run('-a style.css -a pattern.png combined.html -')
+    stdout = _run(
+        '-a style.css -a pattern.png --uncompressed-pdf combined.html -')
    assert stdout.count(b'attachment') == 2

    os.mkdir('subdirectory')
@ -423,42 +421,59 @@ def test_command_line_render(tmpdir):
    (4, '2.0'),
 ))
 def test_pdfa(version, pdf_version):
-    stdout = _run(f'--pdf-variant=pdf/a-{version}b - -', b'test')
+    stdout = _run(
+        f'--pdf-variant=pdf/a-{version}b --uncompressed-pdf - -', b'test')
    assert f'PDF-{pdf_version}'.encode() in stdout
    assert f'part="{version}"'.encode() in stdout


+@pytest.mark.parametrize('version, pdf_version', (
+    (1, '1.4'),
+    (2, '1.7'),
+    (3, '1.7'),
+    (4, '2.0'),
+))
+def test_pdfa_compressed(version, pdf_version):
+    _run(f'--pdf-variant=pdf/a-{version}b - -', b'test')
+
+
 def test_pdfua():
-    stdout = _run('--pdf-variant=pdf/ua-1 - -', b'test')
+    stdout = _run('--pdf-variant=pdf/ua-1 --uncompressed-pdf - -', b'test')
    assert b'part="1"' in stdout


+def test_pdfua_compressed():
+    _run('--pdf-variant=pdf/ua-1 - -', b'test')
+
+
 def test_pdf_identifier():
-    stdout = _run('--pdf-identifier=abc - -', b'test')
+    stdout = _run('--pdf-identifier=abc --uncompressed-pdf - -', b'test')
    assert b'abc' in stdout


 def test_pdf_version():
-    stdout = _run('--pdf-version=1.4 - -', b'test')
+    stdout = _run('--pdf-version=1.4 --uncompressed-pdf - -', b'test')
    assert b'PDF-1.4' in stdout


 def test_pdf_custom_metadata():
-    stdout = _run('--custom-metadata - -', b'<meta name=key content=value />')
+    stdout = _run(
+        '--custom-metadata --uncompressed-pdf - -',
+        b'<meta name=key content=value />')
    assert b'/key' in stdout
    assert b'value' in stdout


 def test_bad_pdf_custom_metadata():
    stdout = _run(
-        '--custom-metadata - -',
+        '--custom-metadata --uncompressed-pdf - -',
        '<meta name=é content=value />'.encode('latin1'))
    assert b'value' not in stdout


 def test_partial_pdf_custom_metadata():
    stdout = _run(
-        '--custom-metadata - -',
+        '--custom-metadata --uncompressed-pdf - -',
        '<meta name=a.b/céd0 content=value />'.encode('latin1'))
    assert b'/abcd0' in stdout
    assert b'value' in stdout
@ -470,10 +485,10 @@ def test_partial_pdf_custom_metadata():
    (b'<textarea></textarea>', b'/Tx'),
 ))
 def test_pdf_inputs(html, field):
-    stdout = _run('--pdf-forms - -', html)
+    stdout = _run('--pdf-forms --uncompressed-pdf - -', html)
    assert b'AcroForm' in stdout
    assert field in stdout
-    stdout = _run('- -', html)
+    stdout = _run('--uncompressed-pdf - -', html)
    assert b'AcroForm' not in stdout


@ -484,8 +499,10 @@ def test_pdf_inputs(html, field):
 ))
 def test_appearance(css, with_forms, without_forms):
    html = f'<input style="{css}">'.encode()
-    assert (b'AcroForm' in _run('--pdf-forms - -', html)) is with_forms
-    assert (b'AcroForm' in _run('- -', html)) is without_forms
+    assert with_forms is (
+        b'AcroForm' in _run('--pdf-forms --uncompressed-pdf - -', html))
+    assert without_forms is (
+        b'AcroForm' in _run('--uncompressed-pdf - -', html))


 def test_reproducible():
@ -541,20 +558,20 @@ def test_low_level_api(assert_pixels_equal):
    assert pdf_bytes.startswith(b'%PDF')

    png_bytes = html.write_png(stylesheets=[css])
-    document = html.render([css])
+    document = html.render(stylesheets=[css])
    page, = document.pages
    assert page.width == 8
    assert page.height == 8
    assert document.write_png() == png_bytes
    assert document.copy([page]).write_png() == png_bytes

-    document = html.render([css])
+    document = html.render(stylesheets=[css])
    page, = document.pages
    assert (page.width, page.height) == (8, 8)
    png_bytes = document.write_png(resolution=192)
    check_png_pattern(assert_pixels_equal, png_bytes, x2=True)

-    document = html.render([css])
+    document = html.render(stylesheets=[css])
    page, = document.pages
    assert (page.width, page.height) == (8, 8)
    # A resolution that is not multiple of 96:
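In this file, ``_run`` stops swapping ``__main__.HTML`` inside a ``try``/``finally`` and instead hands ``main`` a pre-configured class through ``functools.partial``. The sketch below shows that pattern in isolation; ``StubHTML`` and this ``main`` are stand-ins written for illustration, not the WeasyPrint objects.

```python
from functools import partial


class StubHTML:
    """Stand-in for the test suite's FakeHTML (hypothetical)."""
    def __init__(self, source, force_uncompressed_pdf=True):
        self.source = source
        self.force_uncompressed_pdf = force_uncompressed_pdf


def main(args, HTML):
    """Stand-in entry point that receives the class it must instantiate."""
    return HTML(args[0])


# Pre-bind the keyword so the callee can keep calling HTML(source) unchanged.
html_factory = partial(StubHTML, force_uncompressed_pdf=False)
document = main(['input.html'], HTML=html_factory)
print(document.force_uncompressed_pdf)  # False
```

Passing the factory as an argument leaves no global state to restore, which is presumably why the ``try``/``finally`` block could be dropped.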
--- a/tests/test_pdf.py
+++ b/tests/test_pdf.py
@ -26,7 +26,7 @@ RIGHT = round(210 * 72 / 25.4, 6)
 def test_page_size_zoom(zoom):
    pdf = FakeHTML(string='<style>@page{size:3in 4in').write_pdf(zoom=zoom)
    width, height = int(216 * zoom), int(288 * zoom)
-    assert f'/MediaBox [ 0 0 {width} {height} ]'.encode() in pdf
+    assert f'/MediaBox [0 0 {width} {height}]'.encode() in pdf


@assert_no_logs
@ -57,7 +57,7 @@ def test_bookmarks_2():
@assert_no_logs
 def test_bookmarks_3():
    pdf = FakeHTML(string='<h1>a nbsp…</h1>').write_pdf()
-    assert re.findall(b'/Title <(.*)>', pdf) == [
+    assert re.findall(b'/Title <(\\w*)>', pdf) == [
        b'feff006100a0006e0062007300702026']


@ -327,11 +327,11 @@ def test_links():
    ''', base_url=resource_filename('<inline HTML>')).write_pdf()

    uris = re.findall(b'/URI \\((.*)\\)', pdf)
-    types = re.findall(b'/S (.*)', pdf)
-    subtypes = re.findall(b'/Subtype (.*)', pdf)
+    types = re.findall(b'/S (/\\w*)', pdf)
+    subtypes = re.findall(b'/Subtype (/\\w*)', pdf)
    rects = [
        [float(number) for number in match.split()] for match in re.findall(
-            b'/Rect \\[ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+ [\\d\\.]+) \\]', pdf)]
+            b'/Rect \\[([\\d\\.]+ [\\d\\.]+ [\\d\\.]+ [\\d\\.]+)\\]', pdf)]

    # 30pt wide (like the image), 20pt high (like line-height)
    assert uris.pop(0) == b'https://weasyprint.org'
@ -349,7 +349,7 @@ def test_links():
    assert subtypes.pop(0) == b'/Link'
    assert b'/Dest (lipsum)' in pdf
    link = re.search(
-        b'\\(lipsum\\) \\[ \\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+) ]',
+        b'\\(lipsum\\) \\[\\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+)]',
        pdf).group(1)
    assert [float(number) for number in link.split()] == [0, TOP, 0]
    assert rects.pop(0) == [10, TOP - 100, 10 + 32, TOP - 100 - 20]
@ -362,7 +362,7 @@ def test_links():
    assert subtypes.pop(0) == b'/Link'
    assert b'/Dest (hello)' in pdf
    link = re.search(
-        b'\\(hello\\) \\[ \\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+) ]',
+        b'\\(hello\\) \\[\\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+)]',
        pdf).group(1)
    assert [float(number) for number in link.split()] == [0, TOP - 200, 0]
    assert rects.pop(0) == [0, TOP, RIGHT, TOP - 30]
@ -387,7 +387,7 @@ def test_relative_links_no_height():
        string='<a href="../lipsum" style="display: block"></a>a',
        base_url='https://weasyprint.org/foo/bar/').write_pdf()
    assert b'/S /URI\n/URI (https://weasyprint.org/foo/lipsum)'
-    assert f'/Rect [ 0 {TOP} {RIGHT} {TOP} ]'.encode() in pdf
+    assert f'/Rect [0 {TOP} {RIGHT} {TOP}]'.encode() in pdf


@assert_no_logs
@ -397,7 +397,7 @@ def test_relative_links_missing_base():
        string='<a href="../lipsum" style="display: block"></a>a',
        base_url=None).write_pdf()
    assert b'/S /URI\n/URI (../lipsum)'
-    assert f'/Rect [ 0 {TOP} {RIGHT} {TOP} ]'.encode() in pdf
+    assert f'/Rect [0 {TOP} {RIGHT} {TOP}]'.encode() in pdf


@assert_no_logs
@ -421,11 +421,11 @@ def test_relative_links_internal():
        base_url=None).write_pdf()
    assert b'/Dest (lipsum)' in pdf
    link = re.search(
-        b'\\(lipsum\\) \\[ \\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+) ]',
+        b'\\(lipsum\\) \\[\\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+)]',
        pdf).group(1)
    assert [float(number) for number in link.split()] == [0, TOP, 0]
    rect = re.search(
-        b'/Rect \\[ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+ [\\d\\.]+) \\]',
+        b'/Rect \\[([\\d\\.]+ [\\d\\.]+ [\\d\\.]+ [\\d\\.]+)\\]',
        pdf).group(1)
    assert [float(number) for number in rect.split()] == [0, TOP, RIGHT, TOP]

@ -437,11 +437,11 @@ def test_relative_links_anchors():
        base_url=None).write_pdf()
    assert b'/Dest (lipsum)' in pdf
    link = re.search(
-        b'\\(lipsum\\) \\[ \\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+) ]',
+        b'\\(lipsum\\) \\[\\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+)]',
        pdf).group(1)
    assert [float(number) for number in link.split()] == [0, TOP, 0]
    rect = re.search(
-        b'/Rect \\[ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+ [\\d\\.]+) \\]',
+        b'/Rect \\[([\\d\\.]+ [\\d\\.]+ [\\d\\.]+ [\\d\\.]+)\\]',
        pdf).group(1)
    assert [float(number) for number in rect.split()] == [0, TOP, RIGHT, TOP]

@ -474,11 +474,11 @@ def test_missing_links():
    assert b'/Dest (lipsum)' in pdf
    assert len(logs) == 1
    link = re.search(
-        b'\\(lipsum\\) \\[ \\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+) ]',
+        b'\\(lipsum\\) \\[\\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+)]',
        pdf).group(1)
    assert [float(number) for number in link.split()] == [0, TOP - 15, 0]
    rect = re.search(
-        b'/Rect \\[ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+ [\\d\\.]+) \\]',
+        b'/Rect \\[([\\d\\.]+ [\\d\\.]+ [\\d\\.]+ [\\d\\.]+)\\]',
        pdf).group(1)
    assert [float(number) for number in rect.split()] == [
        0, TOP, RIGHT, TOP - 15]
@ -495,8 +495,8 @@ def test_anchor_multiple_pages():
        <a href="#lipsum"></a>
      </div>
    ''', base_url=None).write_pdf()
-    first_page, = re.findall(b'/Kids \\[ (\\d+) 0 R', pdf)
-    assert b'/Names [ (lipsum) [ ' + first_page in pdf
+    first_page, = re.findall(b'/Kids \\[(\\d+) 0 R', pdf)
+    assert b'/Names [(lipsum) [' + first_page in pdf


@assert_no_logs
@ -537,8 +537,7 @@ def test_embed_images_from_pages():
        string='<img src="not-optimized.jpg">').render().pages
    document = Document(
        (page1, page2), metadata=DocumentMetadata(),
-        font_config=FontConfiguration(), url_fetcher=None,
-        optimize_size=()).write_pdf()
+        font_config=FontConfiguration(), url_fetcher=None).write_pdf()
    assert document.count(b'/Filter /DCTDecode') == 2


@ -562,8 +561,8 @@ def test_document_info():
        b'006600740065007200a00061006c006c>') in pdf
    assert b'/Keywords (html, css, pdf)' in pdf
    assert b'/Subject <feff0042006c0061006820260020>' in pdf
-    assert b'/CreationDate (20110421230000Z)' in pdf
-    assert b"/ModDate (20130721234600+01'00)" in pdf
+    assert b'/CreationDate (D:20110421230000Z)' in pdf
+    assert b"/ModDate (D:20130721234600+01'00)" in pdf


@assert_no_logs
@ -717,6 +716,6 @@ def test_bleed(style, media, bleed, trim):
      <style>@page { %s }</style>
      <body>test
    ''' % style).write_pdf()
-    assert '/MediaBox [ {} {} {} {} ]'.format(*media).encode() in pdf
-    assert '/BleedBox [ {} {} {} {} ]'.format(*bleed).encode() in pdf
-    assert '/TrimBox [ {} {} {} {} ]'.format(*trim).encode() in pdf
+    assert '/MediaBox [{} {} {} {}]'.format(*media).encode() in pdf
+    assert '/BleedBox [{} {} {} {}]'.format(*bleed).encode() in pdf
+    assert '/TrimBox [{} {} {} {}]'.format(*trim).encode() in pdf
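The updated assertions in this file expect arrays without padding spaces (``[0 0 w h]``) and capture names with ``/\w*`` rather than ``.*``. The sketch below runs the new patterns against hand-written sample bytes. The sample only imitates a compact layout in which several entries share one line; it is an assumption for illustration, not real WeasyPrint output.

```python
import re

# Hand-written sample imitating a compact layout; not real WeasyPrint output.
pdf = (
    b'<</Type /Annot /Subtype /Link /Rect [10 20 42.5 0]'
    b' /A <</S /URI /URI (https://weasyprint.org)>>>>')

rect_pattern = b'/Rect \\[([\\d\\.]+ [\\d\\.]+ [\\d\\.]+ [\\d\\.]+)\\]'
rects = [
    [float(number) for number in match.split()]
    for match in re.findall(rect_pattern, pdf)]
subtypes = re.findall(b'/Subtype (/\\w*)', pdf)
types = re.findall(b'/S (/\\w*)', pdf)

# The previous greedy pattern swallows everything up to the end of the line.
greedy_types = re.findall(b'/S (.*)', pdf)

print(rects)     # [[10.0, 20.0, 42.5, 0.0]]
print(subtypes)  # [b'/Link']
print(types)     # [b'/URI']
```

With several entries on one line, the greedy ``.*`` capture no longer isolates the name, while ``/\w*`` stops at the first character that is not part of it.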
--- a/tests/test_text.py
+++ b/tests/test_text.py
@ -458,8 +458,16 @@ def test_text_align_justify_no_break_between_children():
    assert span_3.position_x == 5 * 16  # (3 + 1) characters + 1 space


+@pytest.mark.parametrize('text', (
+    'Lorem ipsum dolor<em>sit amet</em>',
+    'Lorem ipsum <em>dolorsit</em> amet',
+    'Lorem ipsum <em></em>dolorsit amet',
+    'Lorem ipsum<em> </em>dolorsit amet',
+    'Lorem ipsum<em> dolorsit</em> amet',
+    'Lorem ipsum <em>dolorsit </em>amet',
+))
@assert_no_logs
-def test_word_spacing():
+def test_word_spacing(text):
    # keep the empty <style> as a regression test: element.text is None
    # (Not a string.)
    page, = render_pages('''
@ -470,15 +478,14 @@ def test_word_spacing():
    line, = body.children
    strong_1, = line.children

-    # TODO: Pango gives only half of word-spacing to a space at the end
-    # of a TextBox. Is this what we want?
    page, = render_pages('''
      <style>strong { word-spacing: 11px }</style>
-      <body><strong>Lorem ipsum dolor<em>sit amet</em></strong>''')
+      <body><strong>%s</strong>''' % text)
    html, = page.children
    body, = html.children
    line, = body.children
    strong_2, = line.children
+
    assert strong_2.width - strong_1.width == 33


@ -1018,6 +1025,19 @@ def test_overflow_wrap_trailing_space(wrap, text, body_width, expected_width):
    assert td.width == expected_width


+def test_line_break_before_trailing_space():
+    # Test regression: https://github.com/Kozea/WeasyPrint/issues/1852
+    page, = render_pages('''
+        <p style="display: inline-block">test\u2028 </p>a
+        <p style="display: inline-block">test\u2028</p>a
+    ''')
+    html, = page.children
+    body, = html.children
+    line, = body.children
+    p1, space1, p2, space2 = line.children
+    assert p1.width == p2.width
+
+
 def white_space_lines(width, space):
    page, = render_pages('''
      <style>
--- a/tests/testing_utils.py
+++ b/tests/testing_utils.py
@ -8,7 +8,7 @@ import sys
 import threading
 import wsgiref.simple_server

-from weasyprint import CSS, HTML, images
+from weasyprint import CSS, DEFAULT_OPTIONS, HTML, images
 from weasyprint.css import get_all_computed_styles
 from weasyprint.css.counters import CounterStyle
 from weasyprint.css.targets import TargetCollector
@ -29,30 +29,40 @@ TEST_UA_STYLESHEET = CSS(filename=os.path.join(
    os.path.dirname(__file__), '..', 'weasyprint', 'css', 'tests_ua.css'
 ))

-PROPER_CHILDREN = dict((key, tuple(map(tuple, value))) for key, value in {
+PROPER_CHILDREN = {
    # Children can be of *any* type in *one* of the lists.
-    boxes.BlockContainerBox: [[boxes.BlockLevelBox], [boxes.LineBox]],
-    boxes.LineBox: [[boxes.InlineLevelBox]],
-    boxes.InlineBox: [[boxes.InlineLevelBox]],
-    boxes.TableBox: [[boxes.TableCaptionBox,
-                      boxes.TableColumnGroupBox, boxes.TableColumnBox,
-                      boxes.TableRowGroupBox, boxes.TableRowBox]],
-    boxes.InlineTableBox: [[boxes.TableCaptionBox,
-                            boxes.TableColumnGroupBox, boxes.TableColumnBox,
-                            boxes.TableRowGroupBox, boxes.TableRowBox]],
-    boxes.TableColumnGroupBox: [[boxes.TableColumnBox]],
-    boxes.TableRowGroupBox: [[boxes.TableRowBox]],
-    boxes.TableRowBox: [[boxes.TableCellBox]],
-}.items())
+    boxes.BlockContainerBox: ((boxes.BlockLevelBox,), (boxes.LineBox,)),
+    boxes.LineBox: ((boxes.InlineLevelBox,),),
+    boxes.InlineBox: ((boxes.InlineLevelBox,),),
+    boxes.TableBox: ((
+        boxes.TableCaptionBox, boxes.TableColumnGroupBox, boxes.TableColumnBox,
+        boxes.TableRowGroupBox, boxes.TableRowBox),),
+    boxes.InlineTableBox: ((
+        boxes.TableCaptionBox, boxes.TableColumnGroupBox, boxes.TableColumnBox,
+        boxes.TableRowGroupBox, boxes.TableRowBox),),
+    boxes.TableColumnGroupBox: ((boxes.TableColumnBox,),),
+    boxes.TableRowGroupBox: ((boxes.TableRowBox,),),
+    boxes.TableRowBox: ((boxes.TableCellBox,),),
+}


 class FakeHTML(HTML):
    """Like weasyprint.HTML, but with a lighter UA stylesheet."""
+    def __init__(self, *args, force_uncompressed_pdf=True, **kwargs):
+        super().__init__(*args, **kwargs)
+        self._force_uncompressed_pdf = force_uncompressed_pdf
+
    def _ua_stylesheets(self, forms=False):
        return [
            TEST_UA_STYLESHEET if stylesheet == HTML5_UA_STYLESHEET
            else stylesheet for stylesheet in super()._ua_stylesheets(forms)]

+    def write_pdf(self, target=None, zoom=1, finisher=None, **options):
+        # Override function to force the generation of uncompressed PDFs
+        if self._force_uncompressed_pdf:
+            options['uncompressed_pdf'] = True
+        return super().write_pdf(target, zoom, finisher, **options)
+

 def resource_filename(basename):
    """Return the absolute path of the resource called ``basename``."""
@ -182,7 +192,7 @@ def _parse_base(html_content, base_url=BASE_URL):
    style_for = get_all_computed_styles(document, counter_style=counter_style)
    get_image_from_uri = functools.partial(
        images.get_image_from_uri, cache={}, url_fetcher=document.url_fetcher,
-        optimize_size=())
+        options=DEFAULT_OPTIONS)
    target_collector = TargetCollector()
    footnotes = []
    return (
--- a/weasyprint/__init__.py
+++ b/weasyprint/__init__.py
@ -15,11 +15,73 @@ import cssselect2
 import html5lib
 import tinycss2

-VERSION = __version__ = '58.1'
+VERSION = __version__ = '59.0b1'
+
+#: Default values for command-line and Python API options. See
+#: :func:`__main__.main` to learn more about options specific to the
+#: command line.
+#:
+#: :param list stylesheets:
+#:     An optional list of user stylesheets. The list can include
+#:     :class:`CSS` objects, filenames, URLs, or file-like
+#:     objects. (See :ref:`Stylesheet Origins`.)
+#: :param str media_type:
+#:     Media type to use for @media.
+#: :param list attachments:
+#:     A list of additional file attachments for the generated PDF
+#:     document or :obj:`None`. The list's elements are
+#:     :class:`Attachment` objects, filenames, URLs or file-like objects.
+#: :param bytes pdf_identifier:
+#:     A bytestring used as PDF file identifier.
+#: :param str pdf_variant:
+#:     A PDF variant name.
+#: :param str pdf_version:
+#:     A PDF version number.
+#: :param bool pdf_forms:
+#:     Whether PDF forms have to be included.
+#: :param bool uncompressed_pdf:
+#:     Whether PDF content should be left uncompressed.
+#: :param bool custom_metadata:
+#:     Whether custom HTML metadata should be stored in the generated PDF.
+#: :param bool presentational_hints:
+#:     Whether HTML presentational hints are followed.
+#: :param bool optimize_images:
+#:     Whether size of embedded images should be optimized, with no quality
+#:     loss.
+#: :param int jpeg_quality:
+#:     JPEG quality, from 0 (worst) to 95 (best).
+#: :param int dpi:
+#:     Maximum resolution of images embedded in the PDF.
+#: :param bool full_fonts:
+#:     Whether unmodified font files should be embedded when possible.
+#: :param bool hinting:
+#:     Whether hinting information should be kept in embedded fonts.
+#: :type cache: :obj:`dict`, :class:`pathlib.Path` or :obj:`str`
+#: :param cache:
+#:     A dictionary used to cache images in memory, or a folder path where
+#:     images are temporarily stored.
+DEFAULT_OPTIONS = {
+    'stylesheets': None,
+    'media_type': 'print',
+    'attachments': None,
+    'pdf_identifier': None,
+    'pdf_variant': None,
+    'pdf_version': None,
+    'pdf_forms': None,
+    'uncompressed_pdf': False,
+    'custom_metadata': False,
+    'presentational_hints': False,
+    'optimize_images': False,
+    'jpeg_quality': None,
+    'dpi': None,
+    'full_fonts': False,
+    'hinting': False,
+    'cache': None,
+}

 __all__ = [
-    'HTML', 'CSS', 'Attachment', 'Document', 'Page', 'default_url_fetcher',
-    'VERSION', '__version__']
+    'HTML', 'CSS', 'DEFAULT_OPTIONS', 'Attachment', 'Document', 'Page',
+    'default_url_fetcher', 'VERSION', '__version__']


 # Import after setting the version, as the version is used in other modules
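One way a caller-supplied ``**options`` mapping could be combined with ``DEFAULT_OPTIONS`` is a plain dictionary merge. This is a sketch under assumptions: ``resolve_options`` is a hypothetical helper, the dictionary is trimmed to three of the keys added in this commit, and the hunks shown here do not confirm how WeasyPrint itself applies the defaults.

```python
# Copy of the defaults added in this commit, trimmed to three keys.
DEFAULT_OPTIONS = {
    'media_type': 'print',
    'uncompressed_pdf': False,
    'cache': None,
}


def resolve_options(**options):
    """Hypothetical helper: fill missing options with their defaults."""
    # Build a new dict so the shared defaults are never mutated.
    return {**DEFAULT_OPTIONS, **options}


resolved = resolve_options(uncompressed_pdf=True)
print(resolved['uncompressed_pdf'])         # True
print(resolved['media_type'])               # print
print(DEFAULT_OPTIONS['uncompressed_pdf'])  # False
```

Merging into a fresh dictionary keeps the module-level defaults unchanged between calls, which matters when one process renders many documents.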
@ -55,12 +117,15 @@ class HTML:
    Alternatively, use **one** named argument so that no guessing is involved:

    :type filename: str or pathlib.Path
-    :param filename: A filename, relative to the current directory, or
-        absolute.
-    :param str url: An absolute, fully qualified URL.
+    :param filename:
+        A filename, relative to the current directory, or absolute.
+    :param str url:
+        An absolute, fully qualified URL.
    :type file_obj: :term:`file object`
-    :param file_obj: Any object with a ``read`` method.
-    :param str string: A string of HTML source.
+    :param file_obj:
+        Any object with a ``read`` method.
+    :param str string:
+        A string of HTML source.

    Specifying multiple inputs is an error:
    ``HTML(filename="foo.html", url="localhost://bar.html")``
@ -68,20 +133,22 @@ class HTML:

    You can also pass optional named arguments:

-    :param str encoding: Force the source character encoding.
-    :param str base_url: The base used to resolve relative URLs
-        (e.g. in ``<img src="../foo.png">``). If not provided, try to use
-        the input filename, URL, or ``name`` attribute of :term:`file objects
-        <file object>`.
-    :type url_fetcher: :term:`function`
-    :param url_fetcher: A function or other callable
-        with the same signature as :func:`default_url_fetcher` called to
-        fetch external resources such as stylesheets and images.
-        (See :ref:`URL Fetchers`.)
-    :param str media_type: The media type to use for ``@media``.
-        Defaults to ``'print'``. **Note:** In some cases like
-        ``HTML(string=foo)`` relative URLs will be invalid if ``base_url``
-        is not provided.
+    :param str encoding:
+        Force the source character encoding.
+    :param str base_url:
+        The base used to resolve relative URLs (e.g. in
+        ``<img src="../foo.png">``). If not provided, try to use the input
+        filename, URL, or ``name`` attribute of
+        :term:`file objects <file object>`.
+    :type url_fetcher: :term:`callable`
+    :param url_fetcher:
+        A function or other callable with the same signature as
+        :func:`default_url_fetcher` called to fetch external resources such as
+        stylesheets and images. (See :ref:`URL Fetchers`.)
+    :param str media_type:
+        The media type to use for ``@media``. Defaults to ``'print'``.
+        **Note:** In some cases like ``HTML(string=foo)`` relative URLs will be
+        invalid if ``base_url`` is not provided.

    """
    def __init__(self, guess=None, filename=None, url=None, file_obj=None,
@ -119,42 +186,32 @@ class HTML:
    def _ph_stylesheets(self):
        return [HTML5_PH_STYLESHEET]

-    def render(self, stylesheets=None, presentational_hints=False,
-               optimize_size=('fonts',), font_config=None, counter_style=None,
-               image_cache=None, forms=False):
+    def render(self, font_config=None, counter_style=None, **options):
        """Lay out and paginate the document, but do not (yet) export it.

        This returns a :class:`document.Document` object which provides
        access to individual pages and various meta-data.
        See :meth:`write_pdf` to get a PDF directly.

-        :param list stylesheets:
-            An optional list of user stylesheets. List elements are
-            :class:`CSS` objects, filenames, URLs, or file
-            objects. (See :ref:`Stylesheet Origins`.)
-        :param bool presentational_hints:
-            Whether HTML presentational hints are followed.
-        :param tuple optimize_size:
-            Optimize size of generated PDF. Can contain "images" and "fonts".
        :type font_config: :class:`text.fonts.FontConfiguration`
-        :param font_config: A font configuration handling ``@font-face`` rules.
+        :param font_config:
+            A font configuration handling ``@font-face`` rules.
        :type counter_style: :class:`css.counters.CounterStyle`
-        :param counter_style: A dictionary storing ``@counter-style`` rules.
-        :param dict image_cache: A dictionary used to cache images.
-        :param bool forms: Whether PDF forms have to be included.
+        :param counter_style:
+            A dictionary storing ``@counter-style`` rules.
+        :param options:
+            Named options overriding the :data:`DEFAULT_OPTIONS` values.
+            Options that are not given keep their default values.
        :returns: A :class:`document.Document` object.

        """
-        return Document._render(
-            self, stylesheets, presentational_hints, optimize_size,
-            font_config, counter_style, image_cache, forms)
+        new_options = DEFAULT_OPTIONS.copy()
+        new_options.update(options)
+        options = new_options
+        return Document._render(self, font_config, counter_style, options)

-    def write_pdf(self, target=None, stylesheets=None, zoom=1,
-                  attachments=None, finisher=None, presentational_hints=False,
-                  optimize_size=('fonts',), font_config=None,
-                  counter_style=None, image_cache=None, identifier=None,
-                  variant=None, version=None, forms=False,
-                  custom_metadata=False):
+    def write_pdf(self, target=None, zoom=1, finisher=None,
+                  font_config=None, counter_style=None, **options):
        """Render the document to a PDF file.

        This is a shortcut for calling :meth:`render`, then
@ -165,49 +222,37 @@ class HTML:
        :param target:
            A filename where the PDF file is generated, a file object, or
            :obj:`None`.
-        :param list stylesheets:
-            An optional list of user stylesheets. The list's elements
-            are :class:`CSS` objects, filenames, URLs, or file-like
-            objects. (See :ref:`Stylesheet Origins`.)
        :param float zoom:
            The zoom factor in PDF units per CSS units.  **Warning**:
            All CSS units are affected, including physical units like
            ``cm`` and named sizes like ``A4``.  For values other than
            1, the physical CSS units will thus be "wrong".
-        :param list attachments: A list of additional file attachments for the
-            generated PDF document or :obj:`None`. The list's elements are
-            :class:`Attachment` objects, filenames, URLs or file-like objects.
-        :param finisher: A finisher function, that accepts the document and a
-            :class:`pydyf.PDF` object as parameters, can be passed to perform
+        :type finisher: :term:`callable`
+        :param finisher:
+            A finisher function or callable that accepts the document and a
+            :class:`pydyf.PDF` object as parameters. Can be passed to perform
            post-processing on the PDF right before the trailer is written.
-        :param bool presentational_hints: Whether HTML presentational hints are
-            followed.
-        :param tuple optimize_size:
-            Optimize size of generated PDF. Can contain "images" and "fonts".
        :type font_config: :class:`text.fonts.FontConfiguration`
-        :param font_config: A font configuration handling ``@font-face`` rules.
+        :param font_config:
+            A font configuration handling ``@font-face`` rules.
        :type counter_style: :class:`css.counters.CounterStyle`
-        :param counter_style: A dictionary storing ``@counter-style`` rules.
-        :param dict image_cache: A dictionary used to cache images.
-        :param bytes identifier: A bytestring used as PDF file identifier.
-        :param str variant: A PDF variant name.
-        :param str version: A PDF version number.
-        :param bool forms: Whether PDF forms have to be included.
-        :param bool custom_metadata: Whether custom HTML metadata should be
-            stored in the generated PDF.
+        :param counter_style:
+            A dictionary storing ``@counter-style`` rules.
+        :param options:
+            Named options overriding the :data:`DEFAULT_OPTIONS` values.
+            Options that are not given keep their default values.
        :returns:
            The PDF as :obj:`bytes` if ``target`` is not provided or
            :obj:`None`, otherwise :obj:`None` (the PDF is written to
            ``target``).

        """
+        new_options = DEFAULT_OPTIONS.copy()
+        new_options.update(options)
+        options = new_options
        return (
-            self.render(
-                stylesheets, presentational_hints, optimize_size, font_config,
-                counter_style, image_cache, forms)
-            .write_pdf(
-                target, zoom, attachments, finisher, identifier, variant,
-                version, custom_metadata))
+            self.render(font_config, counter_style, **options)
+            .write_pdf(target, zoom, finisher, **options))


 class CSS:
@ -263,8 +308,9 @@ class Attachment:
    supported. An optional description can be provided with the ``description``
    argument.

-    :param description: A description of the attachment to be included in the
-        PDF document. May be :obj:`None`.
+    :param description:
+        A description of the attachment to be included in the PDF document.
+        May be :obj:`None`.

    """
    def __init__(self, guess=None, filename=None, url=None, file_obj=None,
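
Note on the `**options` handling above: `render()` and `write_pdf()` both overlay the caller's keyword arguments on a copy of `DEFAULT_OPTIONS`. The following is a minimal standalone sketch of that pattern, not the library code itself. The dictionary is an illustrative subset of the one in the diff, and `merge_options` is a hypothetical helper name introduced only for this example.

```python
# Illustrative subset of the DEFAULT_OPTIONS dict added in this commit.
DEFAULT_OPTIONS = {
    'stylesheets': None,
    'pdf_forms': None,
    'uncompressed_pdf': False,
    'optimize_images': False,
    'cache': None,
}


def merge_options(**options):
    """Hypothetical helper: overlay caller options on a copy of the defaults."""
    merged = DEFAULT_OPTIONS.copy()  # shallow copy, module-level dict untouched
    merged.update(options)           # caller values win over defaults
    return merged


options = merge_options(pdf_forms=True, uncompressed_pdf=True)
print(options['pdf_forms'], options['optimize_images'])
```

Judging from the new signatures, a call would then look like `HTML('page.html').write_pdf('page.pdf', pdf_forms=True, uncompressed_pdf=True)`. Because the merge is a plain `dict.update`, a misspelled option name is not rejected at this stage; it simply ends up as an extra key in the merged dictionary.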
--- a/weasyprint/main.py
+++ b/weasyprint/main.py
@ -4,10 +4,11 @@ import argparse
 import logging
 import platform
 import sys
+from warnings import warn

 import pydyf

-from . import HTML, LOGGER, __version__
+from . import DEFAULT_OPTIONS, HTML, LOGGER, __version__
 from .pdf import VARIANTS
 from .text.ffi import pango

@ -27,148 +28,125 @@ class PrintInfo(argparse.Action):
        sys.exit()


-def main(argv=None, stdout=None, stdin=None):
+class Parser(argparse.ArgumentParser):
+    def __init__(self, *args, **kwargs):
+        self._arguments = {}
+        super().__init__(*args, **kwargs)
+
+    def add_argument(self, *args, **kwargs):
+        super().add_argument(*args, **kwargs)
+        key = args[-1].lstrip('-')
+        kwargs['flags'] = args
+        kwargs['positional'] = args[-1][0] != '-'
+        self._arguments[key] = kwargs
+
+    @property
+    def docstring(self):
+        self._arguments['help'] = self._arguments.pop('help')
+        data = []
+        for key, args in self._arguments.items():
+            data.append('.. option:: ')
+            action = args.get('action', 'store')
+            for flag in args['flags']:
+                data.append(flag)
+                if not args['positional'] and action in ('store', 'append'):
+                    data.append(f' <{key}>')
+                data.append(', ')
+            data[-1] = '\n\n'
+            data.append(f'  {args["help"][0].upper()}{args["help"][1:]}.\n\n')
+            if 'choices' in args:
+                choices = ", ".join(args['choices'])
+                data.append(f'  Possible choices: {choices}.\n\n')
+            if action == 'append':
+                data.append('  This option can be passed multiple times.\n\n')
+        return ''.join(data)
+
+
+PARSER = Parser(
+    prog='weasyprint', description='Render web pages to PDF.')
+PARSER.add_argument(
+    'input', help='URL or filename of the HTML input, or - for stdin')
+PARSER.add_argument(
+    'output', help='filename where output is written, or - for stdout')
+PARSER.add_argument(
+    '-e', '--encoding', help='force the input character encoding')
+PARSER.add_argument(
+    '-s', '--stylesheet', action='append', dest='stylesheets',
+    help='URL or filename for a user CSS stylesheet')
+PARSER.add_argument(
+    '-m', '--media-type',
+    help='media type to use for @media, defaults to print')
+PARSER.add_argument(
+    '-u', '--base-url',
+    help='base for relative URLs in the HTML input, defaults to the '
+    'input’s own filename or URL or the current directory for stdin')
+PARSER.add_argument(
+    '-a', '--attachment', action='append', dest='attachments',
+    help='URL or filename of a file to attach to the PDF document')
+PARSER.add_argument('--pdf-identifier', help='PDF file identifier')
+PARSER.add_argument(
+    '--pdf-variant', choices=VARIANTS, help='PDF variant to generate')
+PARSER.add_argument('--pdf-version', help='PDF version number')
+PARSER.add_argument(
+    '--pdf-forms', action='store_true', help='include PDF forms')
+PARSER.add_argument(
+    '--uncompressed-pdf', action='store_true',
+    help='do not compress PDF content, mainly for debugging purposes')
+PARSER.add_argument(
+    '--custom-metadata', action='store_true',
+    help='include custom HTML meta tags in PDF metadata')
+PARSER.add_argument(
+    '-p', '--presentational-hints', action='store_true',
+    help='follow HTML presentational hints')
+PARSER.add_argument(
+    '--optimize-images', action='store_true',
+    help='optimize size of embedded images with no quality loss')
+PARSER.add_argument(
+    '-j', '--jpeg-quality', type=int,
+    help='JPEG quality between 0 (worst) and 95 (best)')
+PARSER.add_argument(
+    '--full-fonts', action='store_true',
+    help='embed unmodified font files when possible')
+PARSER.add_argument(
+    '--hinting', action='store_true',
+    help='keep hinting information in embedded fonts')
+PARSER.add_argument(
+    '-c', '--cache-folder', dest='cache',
+    help='store cache on disk instead of memory, folder is '
+    'created if needed and cleaned after the PDF is generated')
+PARSER.add_argument(
+    '-D', '--dpi', type=int,
+    help='set maximum resolution of images embedded in the PDF')
+PARSER.add_argument(
+    '-v', '--verbose', action='store_true',
+    help='show warnings and information messages')
+PARSER.add_argument(
+    '-d', '--debug', action='store_true', help='show debugging messages')
+PARSER.add_argument(
+    '-q', '--quiet', action='store_true', help='hide logging messages')
+PARSER.add_argument(
+    '--version', action='version',
+    version=f'WeasyPrint version {__version__}',
+    help='print WeasyPrint’s version number and exit')
+PARSER.add_argument(
+    '-i', '--info', action=PrintInfo, nargs=0,
+    help='print system information and exit')
+PARSER.add_argument(
+    '-O', '--optimize-size', action='append',
+    help='deprecated, use other options instead',
+    choices=('images', 'fonts', 'hinting', 'pdf', 'all', 'none'))
+PARSER.set_defaults(**DEFAULT_OPTIONS)
+
+
+def main(argv=None, stdout=None, stdin=None, HTML=HTML):
    """The ``weasyprint`` program takes at least two arguments:

    .. code-block:: sh

        weasyprint [options] <input> <output>

-    The input is a filename or URL to an HTML document, or ``-`` to read
-    HTML from stdin. The output is a filename, or ``-`` to write to stdout.
-
-    Options can be mixed anywhere before, between, or after the input and
-    output.
-
-    .. option:: -e <input_encoding>, --encoding <input_encoding>
-
-        Force the input character encoding (e.g. ``-e utf-8``).
-
-    .. option:: -s <filename_or_URL>, --stylesheet <filename_or_URL>
-
-        Filename or URL of a user cascading stylesheet (see
-        :ref:`Stylesheet Origins`) to add to the document
-        (e.g. ``-s print.css``). Multiple stylesheets are allowed.
-
-    .. option:: -m <type>, --media-type <type>
-
-        Set the media type to use for ``@media``. Defaults to ``print``.
-
-    .. option:: -u <URL>, --base-url <URL>
-
-        Set the base for relative URLs in the HTML input.
-        Defaults to the input’s own URL, or the current directory for stdin.
-
-    .. option:: -a <file>, --attachment <file>
-
-        Adds an attachment to the document. The attachment is included in the
-        PDF output. This option can be used multiple times.
-
-    .. option:: --pdf-identifier <identifier>
-
-        PDF file identifier, used to check whether two different files
-        are two different versions of the same original document.
-
-    .. option:: --pdf-variant <variant-name>
-
-        PDF variant to generate (e.g. ``--pdf-variant pdf/a-3b``).
-
-    .. option:: --pdf-version <version-number>
-
-        PDF version number (default is 1.7).
-
-    .. option:: --custom-metadata
-
-        Include custom HTML meta tags in PDF metadata.
-
-    .. option:: -p, --presentational-hints
-
-        Follow `HTML presentational hints
-        <https://www.w3.org/TR/html/rendering.html\
-        #the-css-user-agent-style-sheet-and-presentational-hints>`_.
-
-    .. option:: -O <type>, --optimize-size <type>
-
-        Optimize the size of generated documents. Supported types are
-        ``images``, ``fonts``, ``all`` and ``none``. This option can be used
-        multiple times, ``all`` adds all allowed values, ``none`` removes all
-        previously set values.
-
-    .. option:: -v, --verbose
-
-        Show warnings and information messages.
-
-    .. option:: -d, --debug
-
-        Show debugging messages.
-
-    .. option:: -q, --quiet
-
-        Hide logging messages.
-
-    .. option:: --version
-
-        Show the version number. Other options and arguments are ignored.
-
-    .. option:: -h, --help
-
-        Show the command-line usage. Other options and arguments are ignored.
-
    """
-    parser = argparse.ArgumentParser(
-        prog='weasyprint', description='Render web pages to PDF.')
-    parser.add_argument(
-        '--version', action='version',
-        version=f'WeasyPrint version {__version__}',
-        help='print WeasyPrint’s version number and exit')
-    parser.add_argument(
-        '-i', '--info', action=PrintInfo, nargs=0,
-        help='print system information and exit')
-    parser.add_argument(
-        '-e', '--encoding', help='character encoding of the input')
-    parser.add_argument(
-        '-s', '--stylesheet', action='append',
-        help='URL or filename for a user CSS stylesheet, '
-        'may be given multiple times')
-    parser.add_argument(
-        '-m', '--media-type', default='print',
-        help='media type to use for @media, defaults to print')
-    parser.add_argument(
-        '-u', '--base-url',
-        help='base for relative URLs in the HTML input, defaults to the '
-        'input’s own filename or URL or the current directory for stdin')
-    parser.add_argument(
-        '-a', '--attachment', action='append',
-        help='URL or filename of a file to attach to the PDF document')
-    parser.add_argument('--pdf-identifier', help='PDF file identifier')
-    parser.add_argument(
-        '--pdf-variant', choices=VARIANTS, help='PDF variant to generate')
-    parser.add_argument('--pdf-version', help='PDF version number')
-    parser.add_argument(
-        '--pdf-forms', action='store_true', help='Include PDF forms')
-    parser.add_argument(
-        '--custom-metadata', action='store_true',
-        help='include custom HTML meta tags in PDF metadata')
-    parser.add_argument(
-        '-p', '--presentational-hints', action='store_true',
-        help='follow HTML presentational hints')
-    parser.add_argument(
-        '-O', '--optimize-size', action='append',
-        help='optimize output size for specified features',
-        choices=('images', 'fonts', 'all', 'none'), default=['fonts'])
-    parser.add_argument(
-        '-v', '--verbose', action='store_true',
-        help='show warnings and information messages')
-    parser.add_argument(
-        '-d', '--debug', action='store_true', help='show debugging messages')
-    parser.add_argument(
-        '-q', '--quiet', action='store_true', help='hide logging messages')
-    parser.add_argument(
-        'input', help='URL or filename of the HTML input, or - for stdin')
-    parser.add_argument(
-        'output', help='filename where output is written, or - for stdout')
-
-    args = parser.parse_args(argv)
+    args = PARSER.parse_args(argv)

    if args.input == '-':
        source = stdin or sys.stdin.buffer
@ -184,26 +162,34 @@ def main(argv=None, stdout=None, stdin=None):
    else:
        output = args.output

-    optimize_size = set()
-    for arg in args.optimize_size:
-        if arg == 'none':
-            optimize_size.clear()
-        elif arg == 'all':
-            optimize_size |= {'images', 'fonts'}
-        else:
-            optimize_size.add(arg)
+    # TODO: to be removed when --optimize-size is removed
+    optimize_size = {'fonts', 'hinting', 'pdf'}
+    if args.optimize_size is not None:
+        warn(
+            'The --optimize-size option is now deprecated '
+            'and will be removed in the next version. '
+            'Please use the other options available in --help instead.',
+            category=FutureWarning)
+        for arg in args.optimize_size:
+            if arg == 'none':
+                optimize_size.clear()
+            elif arg == 'all':
+                optimize_size |= {'images', 'fonts', 'hinting', 'pdf'}
+            else:
+                optimize_size.add(arg)
+    del args.optimize_size

-    kwargs = {
-        'stylesheets': args.stylesheet,
-        'presentational_hints': args.presentational_hints,
-        'optimize_size': tuple(optimize_size),
-        'attachments': args.attachment,
-        'identifier': args.pdf_identifier,
-        'variant': args.pdf_variant,
-        'version': args.pdf_version,
-        'forms': args.pdf_forms,
-        'custom_metadata': args.custom_metadata,
-    }
+    options = vars(args)
+
+    # TODO: to be removed when --optimize-size is removed
+    if 'images' in optimize_size:
+        options['optimize_images'] = True
+    if 'fonts' not in optimize_size:
+        options['full_fonts'] = True
+    if 'hinting' not in optimize_size:
+        options['hinting'] = True
+    if 'pdf' not in optimize_size:
+        options['uncompressed_pdf'] = True

    # Default to logging to stderr.
    if args.debug:
@ -218,7 +204,10 @@ def main(argv=None, stdout=None, stdin=None):
    html = HTML(
        source, base_url=args.base_url, encoding=args.encoding,
        media_type=args.media_type)
-    html.write_pdf(output, **kwargs)
+    html.write_pdf(output, **options)
+
+
+main.__doc__ += '\n\n' + PARSER.docstring


 if __name__ == '__main__':  # pragma: no cover
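
Note on the deprecated `--optimize-size` handling above: the old flag values are folded into the new boolean options, and three of the four mappings are inverted (a missing `fonts`, `hinting` or `pdf` flag switches the corresponding new option on). The sketch below restates that logic as a standalone function so the mapping can be checked in isolation. `forced_options` is a hypothetical name, and the function returns only the options that `main()` would force to `True`.

```python
def forced_options(optimize_size_args=None):
    """Return the new-style options forced to True by old -O values.

    Hypothetical helper mirroring the compatibility code in main().
    """
    # Starting set used when -O is not given at all.
    flags = {'fonts', 'hinting', 'pdf'}
    for arg in optimize_size_args or ():
        if arg == 'none':
            flags.clear()
        elif arg == 'all':
            flags |= {'images', 'fonts', 'hinting', 'pdf'}
        else:
            flags.add(arg)

    forced = {}
    if 'images' in flags:
        forced['optimize_images'] = True
    if 'fonts' not in flags:
        forced['full_fonts'] = True        # no font optimization requested
    if 'hinting' not in flags:
        forced['hinting'] = True           # keep hinting information
    if 'pdf' not in flags:
        forced['uncompressed_pdf'] = True  # skip PDF compression
    return forced


print(forced_options())
print(forced_options(['none']))
print(forced_options(['none', 'images']))
```

Order matters, as in the original loop: `-O none -O images` clears the set first and then adds `images`, while `-O images -O none` ends with an empty set.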
--- a/weasyprint/document.py
+++ b/weasyprint/document.py
@ -2,9 +2,10 @@

 import functools
 import io
-import shutil
+from hashlib import md5
+from pathlib import Path

-from . import CSS
+from . import CSS, DEFAULT_OPTIONS
 from .anchors import gather_anchors, make_page_bookmark_tree
 from .css import get_all_computed_styles
 from .css.counters import CounterStyle
@ -159,6 +160,51 @@ class DocumentMetadata:
        self.custom = custom or {}


+class DiskCache:
+    """Dict-like object storing image content on disk.
+
+    Bytestring values are stored on disk. Other lightweight Python objects
+    (e.g. RasterImage instances) are kept in memory.
+
+    """
+    def __init__(self, folder):
+        self._path = Path(folder)
+        self._path.mkdir(parents=True, exist_ok=True)
+        self._memory_cache = {}
+        self._disk_paths = set()
+
+    def _path_from_key(self, key):
+        return self._path / md5(key.encode()).hexdigest()
+
+    def __getitem__(self, key):
+        if key in self._memory_cache:
+            return self._memory_cache[key]
+        else:
+            return self._path_from_key(key).read_bytes()
+
+    def __setitem__(self, key, value):
+        if isinstance(value, bytes):
+            path = self._path_from_key(key)
+            self._disk_paths.add(path)
+            path.write_bytes(value)
+        else:
+            self._memory_cache[key] = value
+
+    def __contains__(self, key):
+        return (
+            key in self._memory_cache or
+            self._path_from_key(key).exists())
+
+    def __del__(self):
+        try:
+            for path in self._disk_paths:
+                path.unlink(missing_ok=True)
+            self._path.rmdir()
+        except Exception:
+            # Silently ignore errors while clearing cache
+            pass
+
+
 class Document:
    """A rendered document ready to be painted in a pydyf stream.

@ -171,9 +217,7 @@ class Document:
    """

    @classmethod
-    def _build_layout_context(cls, html, stylesheets, presentational_hints,
-                              optimize_size, font_config, counter_style,
-                              image_cache, forms):
+    def _build_layout_context(cls, html, font_config, counter_style, options):
        if font_config is None:
            font_config = FontConfiguration()
        if counter_style is None:
@ -181,19 +225,24 @@ class Document:
        target_collector = TargetCollector()
        page_rules = []
        user_stylesheets = []
-        image_cache = {} if image_cache is None else image_cache
-        for css in stylesheets or []:
+        cache = options['cache']
+        if cache is None:
+            cache = {}
+        elif not isinstance(cache, (dict, DiskCache)):
+            cache = DiskCache(cache)
+        for css in options['stylesheets'] or []:
            if not hasattr(css, 'matcher'):
                css = CSS(
                    guess=css, media_type=html.media_type,
                    font_config=font_config, counter_style=counter_style)
            user_stylesheets.append(css)
        style_for = get_all_computed_styles(
-            html, user_stylesheets, presentational_hints, font_config,
-            counter_style, page_rules, target_collector, forms)
+            html, user_stylesheets, options['presentational_hints'],
+            font_config, counter_style, page_rules, target_collector,
+            options['pdf_forms'])
        get_image_from_uri = functools.partial(
-            original_get_image_from_uri, cache=image_cache,
-            url_fetcher=html.url_fetcher, optimize_size=optimize_size)
+            original_get_image_from_uri, cache=cache,
+            url_fetcher=html.url_fetcher, options=options)
        PROGRESS_LOGGER.info('Step 4 - Creating formatting structure')
        context = LayoutContext(
            style_for, get_image_from_uri, font_config, counter_style,
@ -201,8 +250,7 @@ class Document:
        return context

    @classmethod
-    def _render(cls, html, stylesheets, presentational_hints, optimize_size,
-                font_config, counter_style, image_cache, forms):
+    def _render(cls, html, font_config, counter_style, options):
        if font_config is None:
            font_config = FontConfiguration()

@ -210,8 +258,7 @@ class Document:
            counter_style = CounterStyle()

        context = cls._build_layout_context(
-            html, stylesheets, presentational_hints, optimize_size,
-            font_config, counter_style, image_cache, forms)
+            html, font_config, counter_style, options)

        root_box = build_formatting_structure(
            html.etree_element, context.style_for, context.get_image_from_uri,
@ -222,12 +269,11 @@ class Document:
        rendering = cls(
            [Page(page_box) for page_box in page_boxes],
            DocumentMetadata(**get_html_metadata(html)),
-            html.url_fetcher, font_config, optimize_size)
+            html.url_fetcher, font_config)
        rendering._html = html
        return rendering

-    def __init__(self, pages, metadata, url_fetcher, font_config,
-                 optimize_size):
+    def __init__(self, pages, metadata, url_fetcher, font_config):
        #: A list of :class:`Page` objects.
        self.pages = pages
        #: A :class:`DocumentMetadata` object.
@ -246,9 +292,6 @@ class Document:
        # rendering is destroyed. This is needed as font_config.__del__ removes
        # fonts that may be used when rendering
        self.font_config = font_config
-        # Set of flags for PDF size optimization. Can contain "images" and
-        # "fonts".
-        self._optimize_size = optimize_size

    def build_element_structure(self, structure, etree_element=None):
        if etree_element is None:
@ -288,8 +331,7 @@ class Document:
        elif not isinstance(pages, list):
            pages = list(pages)
        return type(self)(
-            pages, self.metadata, self.url_fetcher, self.font_config,
-            self._optimize_size)
+            pages, self.metadata, self.url_fetcher, self.font_config)

    def make_bookmark_tree(self, scale=1, transform_pages=False):
        """Make a tree of all bookmarks in the document.
@ -324,9 +366,7 @@ class Document:
                page_number, matrix)
        return root

-    def write_pdf(self, target=None, zoom=1, attachments=None, finisher=None,
-                  identifier=None, variant=None, version=None,
-                  custom_metadata=False):
+    def write_pdf(self, target=None, zoom=1, finisher=None, **options):
        """Paint the pages in a PDF file, with metadata.

        :type target:
@ -339,40 +379,38 @@ class Document:
            All CSS units are affected, including physical units like
            ``cm`` and named sizes like ``A4``.  For values other than
            1, the physical CSS units will thus be "wrong".
-        :param list attachments: A list of additional file attachments for the
-            generated PDF document or :obj:`None`. The list's elements are
-            :class:`weasyprint.Attachment` objects, filenames, URLs or
-            file-like objects.
-        :param finisher: A finisher function, that accepts the document and a
-            :class:`pydyf.PDF` object as parameters, can be passed to perform
+        :type finisher: :term:`callable`
+        :param finisher:
+            A finisher function or callable that accepts the document and a
+            :class:`pydyf.PDF` object as parameters. Can be passed to perform
            post-processing on the PDF right before the trailer is written.
-        :param bytes identifier: A bytestring used as PDF file identifier.
-        :param str variant: A PDF variant name.
-        :param str version: A PDF version number.
-        :param bool custom_metadata: A boolean defining whether custom HTML
-            metadata should be stored in the generated PDF.
+        :param options:
+            Named options overriding the :data:`weasyprint.DEFAULT_OPTIONS`
+            values. Options that are not given keep their default values.
        :returns:
            The PDF as :obj:`bytes` if ``target`` is not provided or
            :obj:`None`, otherwise :obj:`None` (the PDF is written to
            ``target``).

        """
-        pdf = generate_pdf(
-            self, target, zoom, attachments, self._optimize_size, identifier,
-            variant, version, custom_metadata)
+        new_options = DEFAULT_OPTIONS.copy()
+        new_options.update(options)
+        options = new_options
+        pdf = generate_pdf(self, target, zoom, **options)
+
+        identifier = options['pdf_identifier']
+        compress = not options['uncompressed_pdf']

        if finisher:
            finisher(self, pdf)

-        output = io.BytesIO()
-        pdf.write(output, version=pdf.version, identifier=identifier)
-
        if target is None:
+            output = io.BytesIO()
+            pdf.write(output, pdf.version, identifier, compress)
            return output.getvalue()
+
+        if hasattr(target, 'write'):
+            pdf.write(target, pdf.version, identifier, compress)
        else:
-            output.seek(0)
-            if hasattr(target, 'write'):
-                shutil.copyfileobj(output, target)
-            else:
-                with open(target, 'wb') as fd:
-                    shutil.copyfileobj(output, fd)
+            with open(target, 'wb') as fd:
+                pdf.write(fd, pdf.version, identifier, compress)
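
Note on the `DiskCache` class above: bytes values go to one file per key, named after the MD5 hex digest of the key, and every other value stays in an in-memory dict. The following reduced stand-in shows that split and can be run without WeasyPrint. `SketchDiskCache` is a hypothetical name, the URLs and payload are made up, and the cleanup done in `__del__` is left out here.

```python
import tempfile
from hashlib import md5
from pathlib import Path


class SketchDiskCache:
    """Reduced stand-in for the DiskCache class in the diff (no cleanup)."""

    def __init__(self, folder):
        self._path = Path(folder)
        self._path.mkdir(parents=True, exist_ok=True)
        self._memory_cache = {}

    def _path_from_key(self, key):
        # One file per key, named after the MD5 hex digest of the key.
        return self._path / md5(key.encode()).hexdigest()

    def __setitem__(self, key, value):
        if isinstance(value, bytes):
            self._path_from_key(key).write_bytes(value)
        else:
            self._memory_cache[key] = value

    def __getitem__(self, key):
        if key in self._memory_cache:
            return self._memory_cache[key]
        return self._path_from_key(key).read_bytes()

    def __contains__(self, key):
        return key in self._memory_cache or self._path_from_key(key).exists()


with tempfile.TemporaryDirectory() as folder:
    images = Path(folder) / 'images'
    cache = SketchDiskCache(images)
    cache['https://example.org/a.png'] = b'\x89PNG...'  # bytes: written to disk
    cache['parsed'] = ('any', 'object')                  # other: kept in memory
    on_disk = sorted(path.name for path in images.iterdir())
    round_trip = cache['https://example.org/a.png']
    missing = 'https://example.org/b.png' in cache
```

Two properties of this design are worth keeping in mind. Looking up an unknown key surfaces as `FileNotFoundError` from `read_bytes()` rather than `KeyError`, so callers are expected to test with `in` first. Also, `Path.unlink(missing_ok=True)`, used by the original `__del__`, requires Python 3.8 or later; on 3.7 the resulting `TypeError` would be swallowed by the `except Exception` clause and the files would stay in place.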
--- a/weasyprint/draw.py
+++ b/weasyprint/draw.py
@ -1052,6 +1052,10 @@ def draw_emojis(stream, font_size, x, y, emojis):
 def draw_first_line(stream, textbox, text_overflow, block_ellipsis, x, y,
                    angle=0):
    """Draw the given ``textbox`` line to the document ``stream``."""
+    # Don’t draw lines with only invisible characters
+    if not textbox.text.strip():
+        return []
+
    font_size = textbox.style['font_size']
    if font_size < 1e-6:  # Default float precision used by pydyf
        return []
@ -1198,8 +1202,7 @@ def draw_first_line(stream, textbox, text_overflow, block_ellipsis, x, y,
                    png_data = ffi.unpack(hb_data, int(stream.length[0]))
                    pillow_image = Image.open(BytesIO(png_data))
                    image_id = f'{font.hash}{glyph}'
-                    image = RasterImage(
-                        pillow_image, image_id, optimize_size=())
+                    image = RasterImage(pillow_image, image_id, png_data)
                    d = font.widths[glyph] / 1000
                    a = pillow_image.width / pillow_image.height * d
                    pango.pango_font_get_glyph_extents(
@ -1210,7 +1213,7 @@ def draw_first_line(stream, textbox, text_overflow, block_ellipsis, x, y,
                    f = f / font_size - font_size
                    emojis.append([image, font, a, d, x_advance, f])

-            x_advance += (font.widths[glyph] + offset) / 1000
+            x_advance += (font.widths[glyph] + offset - kerning) / 1000

        # Close the last glyphs list, remove if empty
        if string[-1] == '<':
--- a/weasyprint/html.py
+++ b/weasyprint/html.py
@ -45,9 +45,6 @@ HTML_SPACE_SEPARATED_TOKENS_RE = re.compile(f'[^{HTML_WHITESPACE}]+')
 def ascii_lower(string):
    r"""Transform (only) ASCII letters to lower case: A-Z is mapped to a-z.

-    :param string: An Unicode string.
-    :returns: A new Unicode string.
-
    This is used for `ASCII case-insensitive
    <https://whatwg.org/C#ascii-case-insensitive>`_ matching.

@ -66,15 +63,9 @@ def ascii_lower(string):


 def element_has_link_type(element, link_type):
-    """
-    Return whether the given element has a ``rel`` attribute with the
-    given link type.
-
-    :param link_type: Must be a lower-case string.
-
-    """
-    return any(ascii_lower(token) == link_type for token in
-               HTML_SPACE_SEPARATED_TOKENS_RE.findall(element.get('rel', '')))
+    """Return whether element has a ``rel`` attribute with given link type."""
+    tokens = HTML_SPACE_SEPARATED_TOKENS_RE.findall(element.get('rel', ''))
+    return any(ascii_lower(token) == link_type for token in tokens)


 # Maps HTML tag names to function taking an HTML element and returning a Box.
--- a/weasyprint/images.py
+++ b/weasyprint/images.py
@ -1,14 +1,21 @@
 """Fetch and decode images in various formats."""

+import io
 import math
+import struct
 from hashlib import md5
 from io import BytesIO
 from itertools import cycle
 from math import inf
+from pathlib import Path
+from urllib.parse import urlparse
+from urllib.request import url2pathname
 from xml.etree import ElementTree

+import pydyf
 from PIL import Image, ImageFile, ImageOps

+from . import DEFAULT_OPTIONS
 from .layout.percent import percentage
 from .logger import LOGGER
 from .svg import SVG
@ -33,32 +40,211 @@ class ImageLoadingError(ValueError):


 class RasterImage:
-    def __init__(self, pillow_image, image_id, optimize_size):
-        pillow_image.id = image_id
-        self._pillow_image = pillow_image
-        self._optimize_size = optimize_size
-        self._intrinsic_width = pillow_image.width
-        self._intrinsic_height = pillow_image.height
-        self._intrinsic_ratio = (
-            self._intrinsic_width / self._intrinsic_height
-            if self._intrinsic_height != 0 else inf)
+    def __init__(self, pillow_image, image_id, image_data, filename=None,
+                 cache=None, orientation='none', options=DEFAULT_OPTIONS):
+        # Transpose image
+        original_pillow_image = pillow_image
+        pillow_image = rotate_pillow_image(pillow_image, orientation)
+        if original_pillow_image is not pillow_image:
+            # Keep image format as it is discarded by transposition
+            pillow_image.format = original_pillow_image.format
+            # Discard original data, as the image has been transformed
+            image_data = filename = None

-    def get_intrinsic_size(self, image_resolution, font_size):
-        return (
-            self._intrinsic_width / image_resolution,
-            self._intrinsic_height / image_resolution,
-            self._intrinsic_ratio)
+        self.id = image_id
+        self._cache = {} if cache is None else cache
+        self._jpeg_quality = jpeg_quality = options['jpeg_quality']
+        self._dpi = options['dpi']
+
+        if 'transparency' in pillow_image.info:
+            pillow_image = pillow_image.convert('RGBA')
+        elif pillow_image.mode in ('1', 'P', 'I'):
+            pillow_image = pillow_image.convert('RGB')
+
+        self.mode = pillow_image.mode
+        self.width = pillow_image.width
+        self.height = pillow_image.height
+        self.ratio = (self.width / self.height) if self.height != 0 else inf
+        self.optimize = optimize = options['optimize_images']
+
+        if pillow_image.format in ('JPEG', 'MPO'):
+            self.format = 'JPEG'
+            if image_data is None or optimize or jpeg_quality is not None:
+                image_file = io.BytesIO()
+                options = {'format': 'JPEG', 'optimize': optimize}
+                if self._jpeg_quality is not None:
+                    options['quality'] = self._jpeg_quality
+                pillow_image.save(image_file, **options)
+                image_data = image_file.getvalue()
+                filename = None
+        else:
+            self.format = 'PNG'
+            if image_data is None or optimize or pillow_image.format != 'PNG':
+                image_file = io.BytesIO()
+                pillow_image.save(image_file, format='PNG', optimize=optimize)
+                image_data = image_file.getvalue()
+                filename = None
+        self.image_data = self.cache_image_data(image_data, filename)
+
+    def get_intrinsic_size(self, resolution, font_size):
+        return self.width / resolution, self.height / resolution, self.ratio

    def draw(self, stream, concrete_width, concrete_height, image_rendering):
-        if self._intrinsic_width <= 0 or self._intrinsic_height <= 0:
+        if self.width <= 0 or self.height <= 0:
            return

-        image_name = stream.add_image(
-            self._pillow_image, image_rendering, self._optimize_size)
+        width, height = self.width, self.height
+        if self._dpi:
+            pt_to_in = 4 / 3 / 96
+            width_inches = abs(concrete_width * stream.ctm[0][0] * pt_to_in)
+            height_inches = abs(concrete_height * stream.ctm[1][1] * pt_to_in)
+            dpi = max(self.width / width_inches, self.height / height_inches)
+            if dpi > self._dpi:
+                ratio = self._dpi / dpi
+                image = Image.open(io.BytesIO(self.image_data.data))
+                width = int(round(self.width * ratio))
+                height = int(round(self.height * ratio))
+                image.thumbnail((max(1, width), max(1, height)))
+                image_file = io.BytesIO()
+                image.save(
+                    image_file, format=image.format, optimize=self.optimize)
+                width, height = image.width, image.height
+                self.image_data = self.cache_image_data(image_file.getvalue())
+        else:
+            dpi = None
+
+        interpolate = 'true' if image_rendering == 'auto' else 'false'
+
+        image_name = stream.add_image(self, width, height, interpolate)
        stream.transform(
            concrete_width, 0, 0, -concrete_height, 0, concrete_height)
        stream.draw_x_object(image_name)

+    def cache_image_data(self, data, filename=None, alpha=False):
+        if filename:
+            return LazyLocalImage(filename)
+        else:
+            key = f'{self.id}{int(alpha)}{self._dpi or ""}'
+            return LazyImage(self._cache, key, data)
+
+    def get_xobject(self, width, height, interpolate):
+        if self.mode in ('RGB', 'RGBA'):
+            color_space = '/DeviceRGB'
+        elif self.mode in ('L', 'LA'):
+            color_space = '/DeviceGray'
+        elif self.mode == 'CMYK':
+            color_space = '/DeviceCMYK'
+        else:
+            LOGGER.warning('Unknown image mode: %s', self.mode)
+            color_space = '/DeviceRGB'
+
+        extra = pydyf.Dictionary({
+            'Type': '/XObject',
+            'Subtype': '/Image',
+            'Width': width,
+            'Height': height,
+            'ColorSpace': color_space,
+            'BitsPerComponent': 8,
+            'Interpolate': interpolate,
+        })
+
+        if self.format == 'JPEG':
+            extra['Filter'] = '/DCTDecode'
+            return pydyf.Stream([self.image_data], extra)
+
+        extra['Filter'] = '/FlateDecode'
+        extra['DecodeParms'] = pydyf.Dictionary({
+            # Predictor 15 specifies that we're providing PNG data,
+            # ostensibly using an "optimum predictor", but doesn't actually
+            # matter as long as the predictor value is 10+ according to the
+            # spec. (Other PNG predictor values assert that we're using
+            # specific predictors that we don't want to commit to, but
+            # "optimum" can vary.)
+            'Predictor': 15,
+            'Columns': width,
+        })
+        if self.mode in ('RGB', 'RGBA'):
+            # Defaults to 1.
+            extra['DecodeParms']['Colors'] = 3
+        if self.mode in ('RGBA', 'LA'):
+            # Remove alpha channel from image
+            pillow_image = Image.open(io.BytesIO(self.image_data.data))
+            alpha = pillow_image.getchannel('A')
+            pillow_image = pillow_image.convert(self.mode[:-1])
+            png_data = self._get_png_data(pillow_image)
+            # Save alpha channel as mask
+            alpha_data = self._get_png_data(alpha)
+            stream = self.cache_image_data(alpha_data, alpha=True)
+            extra['SMask'] = pydyf.Stream([stream], extra={
+                'Filter': '/FlateDecode',
+                'Type': '/XObject',
+                'Subtype': '/Image',
+                'DecodeParms': pydyf.Dictionary({
+                    'Predictor': 15,
+                    'Columns': width,
+                }),
+                'Width': width,
+                'Height': height,
+                'ColorSpace': '/DeviceGray',
+                'BitsPerComponent': 8,
+                'Interpolate': interpolate,
+            })
+        else:
+            png_data = self._get_png_data(
+                Image.open(io.BytesIO(self.image_data.data)))
+
+        return pydyf.Stream([self.cache_image_data(png_data)], extra)
+
+    @staticmethod
+    def _get_png_data(pillow_image):
+        image_file = BytesIO()
+        pillow_image.save(image_file, format='PNG')
+
+        # Read the PNG header, then discard it because we know it's a PNG. If
+        # this weren't just output from Pillow, we should actually check it.
+        image_file.seek(8)
+
+        png_data = []
+        raw_chunk_length = image_file.read(4)
+        # PNG files consist of a series of chunks.
+        while raw_chunk_length:
+            # Each chunk begins with its data length (four bytes, may be zero),
+            # then its type (four ASCII characters), then the data, then four
+            # bytes of a CRC.
+            chunk_length, = struct.unpack('!I', raw_chunk_length)
+            chunk_type = image_file.read(4)
+            if chunk_type == b'IDAT':
+                png_data.append(image_file.read(chunk_length))
+            else:
+                image_file.seek(chunk_length, io.SEEK_CUR)
+            # We aren't checking the CRC, we assume this is a valid PNG.
+            image_file.seek(4, io.SEEK_CUR)
+            raw_chunk_length = image_file.read(4)
+
+        return b''.join(png_data)
+
+
+class LazyImage(pydyf.Object):
+    def __init__(self, cache, key, data):
+        super().__init__()
+        self._key = key
+        self._cache = cache
+        cache[key] = data
+
+    @property
+    def data(self):
+        return self._cache[self._key]
+
+
+class LazyLocalImage(pydyf.Object):
+    def __init__(self, filename):
+        super().__init__()
+        self._filename = filename
+
+    @property
+    def data(self):
+        return Path(self._filename).read_bytes()
+

 class SVGImage:
    def __init__(self, tree, base_url, url_fetcher, context):
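Review note: `_get_png_data` in the hunk above relies on the fact that the payload of a PNG's `IDAT` chunks is a zlib stream of filtered scanlines, which is why it can be embedded under `/FlateDecode` with a PNG predictor. The chunk walk can be sketched with the standard library alone. `make_png` and `extract_idat` are hypothetical names for this illustration; `make_png` only exists to build a tiny test image without Pillow.

```python
import io
import struct
import zlib


def make_png(width, height, raw_rows):
    """Build a tiny 8-bit grayscale PNG by hand (illustration fixture)."""
    def chunk(chunk_type, data):
        # Length, type, data, then a CRC over type + data.
        crc = zlib.crc32(chunk_type + data)
        return (
            struct.pack('!I', len(data)) + chunk_type + data +
            struct.pack('!I', crc))

    # Width, height, bit depth 8, color type 0 (gray), then compression,
    # filter and interlace methods all set to 0.
    header = struct.pack('!IIBBBBB', width, height, 8, 0, 0, 0, 0)
    return (
        b'\x89PNG\r\n\x1a\n' +
        chunk(b'IHDR', header) +
        chunk(b'IDAT', zlib.compress(raw_rows)) +
        chunk(b'IEND', b''))


def extract_idat(png_bytes):
    """Return the concatenated payload of all IDAT chunks."""
    image_file = io.BytesIO(png_bytes)
    image_file.seek(8)  # Skip the 8-byte PNG signature.
    idat = []
    raw_length = image_file.read(4)
    while raw_length:
        length, = struct.unpack('!I', raw_length)
        chunk_type = image_file.read(4)
        if chunk_type == b'IDAT':
            idat.append(image_file.read(length))
        else:
            image_file.seek(length, io.SEEK_CUR)
        image_file.seek(4, io.SEEK_CUR)  # Skip the CRC, assumed valid.
        raw_length = image_file.read(4)
    return b''.join(idat)
```

Decompressing the result of `extract_idat` with `zlib.decompress` gives back the filtered rows, each starting with its filter-type byte.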
@ -91,75 +277,88 @@ class SVGImage:
            self._url_fetcher, self._context)


-def get_image_from_uri(cache, url_fetcher, optimize_size, url,
-                       forced_mime_type=None, context=None,
-                       orientation='from-image'):
+def get_image_from_uri(cache, url_fetcher, options, url, forced_mime_type=None,
+                       context=None, orientation='from-image'):
    """Get an Image instance from an image URI."""
    if url in cache:
        return cache[url]

    try:
        with fetch(url_fetcher, url) as result:
+            parsed_url = urlparse(result.get('redirected_url'))
+            if parsed_url.scheme == 'file':
+                filename = url2pathname(parsed_url.path)
+            else:
+                filename = None
            if 'string' in result:
                string = result['string']
            else:
                string = result['file_obj'].read()
            mime_type = forced_mime_type or result['mime_type']

-            image = None
-            svg_exceptions = []
-            # Try to rely on given mimetype for SVG
-            if mime_type == 'image/svg+xml':
+        image = None
+        svg_exceptions = []
+        # Try to rely on given mimetype for SVG
+        if mime_type == 'image/svg+xml':
+            try:
+                tree = ElementTree.fromstring(string)
+                image = SVGImage(tree, url, url_fetcher, context)
+            except Exception as svg_exception:
+                svg_exceptions.append(svg_exception)
+        # Try pillow for raster images, or for failing SVG
+        if image is None:
+            try:
+                pillow_image = Image.open(BytesIO(string))
+            except Exception as raster_exception:
+                if mime_type == 'image/svg+xml':
+                    # Tried SVGImage then Pillow for a SVG, abort
+                    raise ImageLoadingError.from_exception(svg_exceptions[0])
                try:
+                    # Last chance, try SVG
                    tree = ElementTree.fromstring(string)
                    image = SVGImage(tree, url, url_fetcher, context)
-                except Exception as svg_exception:
-                    svg_exceptions.append(svg_exception)
-            # Try pillow for raster images, or for failing SVG
-            if image is None:
-                try:
-                    pillow_image = Image.open(BytesIO(string))
-                except Exception as raster_exception:
-                    if mime_type == 'image/svg+xml':
-                        # Tried SVGImage then Pillow for a SVG, abort
-                        raise ImageLoadingError.from_exception(
-                            svg_exceptions[0])
-                    try:
-                        # Last chance, try SVG
-                        tree = ElementTree.fromstring(string)
-                        image = SVGImage(tree, url, url_fetcher, context)
-                    except Exception:
-                        # Tried Pillow then SVGImage for a raster, abort
-                        raise ImageLoadingError.from_exception(
-                            raster_exception)
-                else:
-                    # Store image id to enable cache in Stream.add_image
-                    image_id = md5(url.encode()).hexdigest()
-                    # Keep image format as it is discarded by transposition
-                    image_format = pillow_image.format
-                    if orientation == 'from-image':
-                        if 'exif' in pillow_image.info:
-                            pillow_image = ImageOps.exif_transpose(
-                                pillow_image)
-                    elif orientation != 'none':
-                        angle, flip = orientation
-                        if angle > 0:
-                            rotation = getattr(
-                                Image.Transpose, f'ROTATE_{angle}')
-                            pillow_image = pillow_image.transpose(rotation)
-                        if flip:
-                            pillow_image = pillow_image.transpose(
-                                Image.Transpose.FLIP_LEFT_RIGHT)
-                    pillow_image.format = image_format
-                    image = RasterImage(pillow_image, image_id, optimize_size)
+                except Exception:
+                    # Tried Pillow then SVGImage for a raster, abort
+                    raise ImageLoadingError.from_exception(raster_exception)
+            else:
+                # Store image id to enable cache in Stream.add_image
+                image_id = md5(url.encode()).hexdigest()
+                image = RasterImage(
+                    pillow_image, image_id, string, filename, cache,
+                    orientation, options)

    except (URLFetchingError, ImageLoadingError) as exception:
        LOGGER.error('Failed to load image at %r: %s', url, exception)
        image = None
+
    cache[url] = image
    return image


+def rotate_pillow_image(pillow_image, orientation):
+    """Return a copy of a Pillow image with modified orientation.
+
+    If orientation is not changed, return the same image.
+
+    """
+    image_format = pillow_image.format
+    if orientation == 'from-image':
+        if 'exif' in pillow_image.info:
+            pillow_image = ImageOps.exif_transpose(pillow_image)
+    elif orientation != 'none':
+        angle, flip = orientation
+        if angle > 0:
+            rotation = getattr(Image.Transpose, f'ROTATE_{angle}')
+            pillow_image = pillow_image.transpose(rotation)
+        if flip:
+            pillow_image = pillow_image.transpose(
+                Image.Transpose.FLIP_LEFT_RIGHT)
+
+    # Keep image format as it is discarded by transposition
+    pillow_image.format = image_format
+    return pillow_image
+
+
 def process_color_stops(vector_length, positions):
    """Give color stops positions on the gradient vector.

--- a/weasyprint/layout/__init__.py
+++ b/weasyprint/layout/__init__.py
@ -105,10 +105,12 @@ def layout_document(html, root_box, context, max_loops=8):
    This includes line breaks, page breaks, absolute size and position for all
    boxes. Page based counters might require multiple passes.

-    :param root_box: root of the box tree (formatting structure of the html)
-                     the pages' boxes are created from that tree, i.e. this
-                     structure is not lost during pagination
-    :returns: a list of laid out Page objects.
+    :param root_box:
+        Root of the box tree (formatting structure of the HTML). The page boxes
+        are created from that tree; this structure is not lost during
+        pagination.
+    :returns:
+        A list of laid out Page objects.

    """
    initialize_page_maker(context, root_box)
@ -287,13 +289,18 @@ class LayoutContext:
        Value depends on current page.
        https://drafts.csswg.org/css-gcpm/#funcdef-string

-        :param store: dictionary where the resolved value is stored.
-        :param page: current page.
-        :param name: name of the named string or running element.
-        :param keyword: indicates which value of the named string or running
-                        element to use. Default is the first assignment on the
-                        current page else the most recent assignment.
-        :returns: text for string set, box for running element
+        :param dict store:
+            Dictionary where the resolved value is stored.
+        :param page:
+            Current page.
+        :param str name:
+            Name of the named string or running element.
+        :param str keyword:
+            Indicates which value of the named string or running element to
+            use. Default is the first assignment on the current page else the
+            most recent assignment.
+        :returns:
+            Text for string set, box for running element.

        """
        if self.current_page in store[name]:
--- a/weasyprint/layout/page.py
+++ b/weasyprint/layout/page.py
@ -174,20 +174,21 @@ def compute_fixed_dimension(context, box, outer, vertical, top_or_left):


 def compute_variable_dimension(context, side_boxes, vertical, outer_sum):
-    """
-    Compute and set a margin box fixed dimension on ``box``, as described in:
-    https://drafts.csswg.org/css-page-3/#margin-dimension
+    """Compute and set a margin box fixed dimension on ``box``.

-    :param side_boxes: Three boxes on a same side (as opposed to a corner.)
+    Described in: https://drafts.csswg.org/css-page-3/#margin-dimension
+
+    :param side_boxes:
+        Three boxes on a same side (as opposed to a corner).
        A list of:
        - A @*-left or @*-top margin box
        - A @*-center or @*-middle margin box
        - A @*-right or @*-bottom margin box
    :param vertical:
-        True to set height, margin-top and margin-bottom; False for width,
-        margin-left and margin-right
+        ``True`` to set height, margin-top and margin-bottom;
+        ``False`` for width, margin-left and margin-right.
    :param outer_sum:
-        The target total outer dimension (max box width or height)
+        The target total outer dimension (max box width or height).

    """
    box_class = VerticalBox if vertical else HorizontalBox
@ -310,8 +311,10 @@ def make_margin_boxes(context, page, state):

        Return ``None`` if this margin box should not be generated.

-        :param at_keyword: which margin box to return, eg. '@top-left'
-        :param containing_block: as expected by :func:`resolve_percentages`.
+        :param at_keyword:
+            Which margin box to return, e.g. '@top-left'.
+        :param containing_block:
+            As expected by :func:`resolve_percentages`.

        """
        style = context.style_for(page.page_type, at_keyword)
@ -507,9 +510,11 @@ def make_page(context, root_box, page_type, resume_at, page_number,
    and ``resume_at`` indicates where in the document to start the next page,
    or is ``None`` if this was the last page.

-    :param page_number: integer, start at 1 for the first page
-    :param resume_at: as returned by ``make_page()`` for the previous page,
-                      or ``None`` for the first page.
+    :param int page_number:
+        Page number, starts at 1 for the first page.
+    :param resume_at:
+        As returned by ``make_page()`` for the previous page, or ``None`` for
+        the first page.

    """
    style = context.style_for(page_type)
--- a/weasyprint/layout/preferred.py
+++ b/weasyprint/layout/preferred.py
@ -744,9 +744,12 @@ def trailing_whitespace_size(context, box):
        stripped_box = box.copy_with_text(stripped_text)
        stripped_box, resume, _ = split_text_box(
            context, stripped_box, None, old_resume)
-        assert stripped_box is not None
-        assert resume is None
-        return old_box.width - stripped_box.width
+        if stripped_box is None:
+            # old_box split just before the trailing spaces
+            return old_box.width
+        else:
+            assert resume is None
+            return old_box.width - stripped_box.width
    else:
        _, _, _, width, _, _ = split_first_line(
            box.text, box.style, context, None, box.justification_spacing)
--- a/weasyprint/pdf/init.py
+++ b/weasyprint/pdf/init.py
@ -45,7 +45,7 @@ def _w3c_date_to_pdf(string, attr_name):
            pdf_date += f"{tz_hour:+03d}'{tz_minute:02d}"
        else:
            pdf_date += 'Z'
-    return pdf_date
+    return f'D:{pdf_date}'


 def _reference_resources(pdf, resources, images, fonts):
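Review note: the hunk above adds the `D:` prefix that PDF date strings carry. As a rough illustration of the resulting format, here is a simplified stand-in that takes a `datetime` rather than a W3C date string. `to_pdf_date` is a hypothetical name, and this is not the body of `_w3c_date_to_pdf`; the offset formatting mirrors the `+HH'mm` shape visible in the context lines.

```python
from datetime import datetime, timedelta, timezone


def to_pdf_date(moment):
    """Format a ``datetime`` as a PDF date string (simplified sketch)."""
    pdf_date = moment.strftime('%Y%m%d%H%M%S')
    offset = moment.utcoffset()
    if offset is not None:
        if offset == timedelta(0):
            # UTC is written as a bare "Z".
            pdf_date += 'Z'
        else:
            total_minutes = int(offset.total_seconds()) // 60
            sign = '+' if total_minutes >= 0 else '-'
            hours, minutes = divmod(abs(total_minutes), 60)
            pdf_date += f"{sign}{hours:02d}'{minutes:02d}"
    # The "D:" prefix is what the hunk above adds.
    return f'D:{pdf_date}'


example = datetime(
    2023, 4, 24, 15, 29, 58, tzinfo=timezone(timedelta(hours=2)))
print(to_pdf_date(example))  # D:20230424152958+02'00
```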
@ -100,8 +100,7 @@ def _use_references(pdf, resources, images):
            alpha['SMask']['G'] = alpha['SMask']['G'].reference


-def generate_pdf(document, target, zoom, attachments, optimize_size,
-                 identifier, variant, version, custom_metadata):
+def generate_pdf(document, target, zoom, **options):
    # 0.75 = 72 PDF point per inch / 96 CSS pixel per inch
    scale = zoom * 0.75

@ -109,6 +108,7 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,

    # Set properties according to PDF variants
    mark = False
+    variant, version = options['pdf_variant'], options['pdf_version']
    if variant:
        variant_function, properties = VARIANTS[variant]
        if 'version' in properties:
@ -116,6 +116,7 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,
        if 'mark' in properties:
            mark = properties['mark']

+    identifier = options['pdf_identifier']
    pdf = pydyf.PDF((version or '1.7'), identifier)
    states = pydyf.Dictionary()
    x_objects = pydyf.Dictionary()
@ -136,6 +137,7 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,

    annot_files = {}
    pdf_pages, page_streams = [], []
+    compress = not options['uncompressed_pdf']
    for page_number, (page, links_and_anchors) in enumerate(
            zip(document.pages, page_links_and_anchors)):
        # Draw from the top-left corner
@ -155,7 +157,7 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,
            (right - left) / scale, (bottom - top) / scale)
        stream = Stream(
            document.fonts, page_rectangle, states, x_objects, patterns,
-            shadings, images, mark)
+            shadings, images, mark, compress=compress)
        stream.transform(d=-1, f=(page.height * scale))
        pdf.add_object(stream)
        page_streams.append(stream)
@ -175,10 +177,11 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,

        add_links(links_and_anchors, matrix, pdf, pdf_page, pdf_names, mark)
        add_annotations(
-            links_and_anchors[0], matrix, document, pdf, pdf_page, annot_files)
+            links_and_anchors[0], matrix, document, pdf, pdf_page, annot_files,
+            compress)
        add_inputs(
            page.inputs, matrix, pdf, pdf_page, resources, stream,
-            document.font_config.font_map)
+            document.font_config.font_map, compress)
        page.paint(stream, scale=scale)

        # Bleed
@ -227,7 +230,7 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,
            _w3c_date_to_pdf(metadata.modified, 'modified'))
    if metadata.lang:
        pdf.catalog['Lang'] = pydyf.String(metadata.lang)
-    if custom_metadata:
+    if options['custom_metadata']:
        for key, value in metadata.custom.items():
            key = ''.join(char for char in key if char.isalnum())
            key = key.encode('ascii', errors='ignore').decode()
@ -235,7 +238,7 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,
                pdf.info[key] = pydyf.String(value)

    # Embedded files
-    attachments = metadata.attachments + (attachments or [])
+    attachments = metadata.attachments + (options['attachments'] or [])
    pdf_attachments = []
    for attachment in attachments:
        pdf_attachment = write_pdf_attachment(
@ -256,7 +259,10 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,
        pdf.catalog['Names']['EmbeddedFiles'] = content.reference

    # Embedded fonts
-    pdf_fonts = build_fonts_dictionary(pdf, document.fonts, optimize_size)
+    subset = not options['full_fonts']
+    hinting = options['hinting']
+    pdf_fonts = build_fonts_dictionary(
+        pdf, document.fonts, compress, subset, hinting)
    pdf.add_object(pdf_fonts)
    if 'AcroForm' in pdf.catalog:
        # Include Dingbats for forms
@ -284,6 +290,6 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,

    # Apply PDF variants functions
    if variant:
-        variant_function(pdf, metadata, document, page_streams)
+        variant_function(pdf, metadata, document, page_streams, compress)

    return pdf
--- a/weasyprint/pdf/anchors.py
+++ b/weasyprint/pdf/anchors.py
@ -92,7 +92,8 @@ def add_outlines(pdf, bookmarks, parent=None):
    return outlines, count


-def add_inputs(inputs, matrix, pdf, page, resources, stream, font_map):
+def add_inputs(inputs, matrix, pdf, page, resources, stream, font_map,
+               compress):
    """Include form inputs in PDF."""
    if not inputs:
        return
@ -119,7 +120,7 @@ def add_inputs(inputs, matrix, pdf, page, resources, stream, font_map):
        input_name = pydyf.String(element.attrib.get('name', default_name))
        # TODO: where does this 0.75 scale come from?
        font_size = style['font_size'] * 0.75
-        field_stream = pydyf.Stream()
+        field_stream = pydyf.Stream(compress=compress)
        field_stream.set_color_rgb(*style['color'][:3])
        if input_type == 'checkbox':
            # Checkboxes
@ -130,7 +131,7 @@ def add_inputs(inputs, matrix, pdf, page, resources, stream, font_map):
                'Type': '/XObject',
                'Subtype': '/Form',
                'BBox': pydyf.Array((0, 0, width, height)),
-            })
+            }, compress=compress)
            checked_stream.push_state()
            checked_stream.begin_text()
            checked_stream.set_color_rgb(*style['color'][:3])
@ -138,9 +139,8 @@ def add_inputs(inputs, matrix, pdf, page, resources, stream, font_map):
            # Center (let’s assume that Dingbat’s check has a 0.8em size)
            x = (width - font_size * 0.8) / 2
            y = (height - font_size * 0.8) / 2
-            # TODO: we should have these operators in pydyf
-            checked_stream.stream.append(f'{x} {y} Td')
-            checked_stream.stream.append('(4) Tj')
+            checked_stream.move_text_to(x, y)
+            checked_stream.show_text_string('4')
            checked_stream.end_text()
            checked_stream.pop_state()
            pdf.add_object(checked_stream)
@ -195,7 +195,7 @@ def add_inputs(inputs, matrix, pdf, page, resources, stream, font_map):
        pdf.catalog['AcroForm']['Fields'].append(field.reference)


-def add_annotations(links, matrix, document, pdf, page, annot_files):
+def add_annotations(links, matrix, document, pdf, page, annot_files, compress):
    """Include annotations in PDF."""
    # TODO: splitting a link into multiple independent rectangular
    # annotations works well for pure links, but rather mediocre for
@ -226,8 +226,7 @@ def add_annotations(links, matrix, document, pdf, page, annot_files):
            'Type': '/XObject',
            'Subtype': '/Form',
            'BBox': pydyf.Array(rectangle),
-            'Length': 0,
-        })
+        }, compress)
        pdf.add_object(stream)
        annot = pydyf.Dictionary({
            'Type': '/Annot',
@ -286,7 +285,7 @@ def write_pdf_attachment(pdf, attachment, url_fetcher):
                    'ModDate': attachment.modified,
                })
            })
-            file_stream = pydyf.Stream([stream], file_extra)
+            file_stream = pydyf.Stream([stream], file_extra, compress)
            pdf.add_object(file_stream)

    except URLFetchingError as exception:
--- a/weasyprint/pdf/fonts.py
+++ b/weasyprint/pdf/fonts.py
@ -7,7 +7,7 @@ import pydyf
 from ..logger import LOGGER


-def build_fonts_dictionary(pdf, fonts, optimize_size):
+def build_fonts_dictionary(pdf, fonts, compress_pdf, subset, hinting):
    pdf_fonts = pydyf.Dictionary()
    fonts_by_file_hash = {}
    for font in fonts.values():
@ -21,10 +21,10 @@ def build_fonts_dictionary(pdf, fonts, optimize_size):

        # Clean font, optimize and handle emojis
        cmap = {}
-        if 'fonts' in optimize_size and not font.used_in_forms:
+        if subset and not font.used_in_forms:
            for file_font in file_fonts:
                cmap = {**cmap, **file_font.cmap}
-        font.clean(cmap)
+        font.clean(cmap, hinting)

        # Include font
        if font.type == 'otf':
@ -32,28 +32,24 @@ def build_fonts_dictionary(pdf, fonts, optimize_size):
        else:
            font_extra = pydyf.Dictionary({'Length1': len(font.file_content)})
        font_stream = pydyf.Stream(
-            [font.file_content], font_extra, compress=True)
+            [font.file_content], font_extra, compress=compress_pdf)
        pdf.add_object(font_stream)
        font_references_by_file_hash[file_hash] = font_stream.reference

    for font in fonts.values():
-        optimize = (
-            'fonts' in optimize_size and
-            font.ttfont and not font.used_in_forms)
-        if optimize:
+        if subset and font.ttfont and not font.used_in_forms:
            # Only store widths and map for used glyphs
            font_widths = font.widths
            cmap = font.cmap
        else:
            # Store width and Unicode map for all glyphs
            font_widths, cmap = {}, {}
-            ratio = 1024 / font.ttfont['head'].unitsPerEm
            for letter, key in font.ttfont.getBestCmap().items():
                glyph = font.ttfont.getGlyphID(key)
                if glyph not in cmap:
                    cmap[glyph] = chr(letter)
                width = font.ttfont.getGlyphSet()[key].width
-                font_widths[glyph] = width * ratio
+                font_widths[glyph] = width * 1000 / font.upem

        max_x = max(font_widths.values()) if font_widths else 0
        bbox = (0, font.descent, max_x, font.ascent)
@ -81,7 +77,7 @@ def build_fonts_dictionary(pdf, fonts, optimize_size):
            b'1 begincodespacerange',
            b'<0000> <ffff>',
            b'endcodespacerange',
-            f'{len(cmap)} beginbfchar'.encode()])
+            f'{len(cmap)} beginbfchar'.encode()], compress=compress_pdf)
        for glyph, text in cmap.items():
            unicode_codepoints = ''.join(
                f'{letter.encode("utf-16-be").hex()}' for letter in text)
@ -103,7 +99,7 @@ def build_fonts_dictionary(pdf, fonts, optimize_size):

        if font.bitmap:
            _build_bitmap_font_dictionary(
-                font_dictionary, pdf, font, widths, optimize_size)
+                font_dictionary, pdf, font, widths, compress_pdf, subset)
        else:
            font_descriptor = pydyf.Dictionary({
                'Type': '/FontDescriptor',
@ -126,7 +122,8 @@ def build_fonts_dictionary(pdf, fonts, optimize_size):
                for cid in cids:
                    bits[cid] = '1'
                stream = pydyf.Stream(
-                    (int(''.join(bits), 2).to_bytes(padded_width, 'big'),))
+                    (int(''.join(bits), 2).to_bytes(padded_width, 'big'),),
+                    compress=compress_pdf)
                pdf.add_object(stream)
                font_descriptor['CIDSet'] = stream.reference
            if font.type == 'otf':
@ -156,11 +153,11 @@ def build_fonts_dictionary(pdf, fonts, optimize_size):


 def _build_bitmap_font_dictionary(font_dictionary, pdf, font, widths,
-                                  optimize_size):
+                                  compress_pdf, subset):
    # https://docs.microsoft.com/typography/opentype/spec/ebdt
    font_dictionary['FontBBox'] = pydyf.Array([0, 0, 1, 1])
    font_dictionary['FontMatrix'] = pydyf.Array([1, 0, 0, 1, 0, 0])
-    if 'fonts' in optimize_size:
+    if subset:
        chars = tuple(sorted(font.cmap))
    else:
        chars = tuple(range(256))
@ -309,7 +306,7 @@ def _build_bitmap_font_dictionary(font_dictionary, pdf, font, widths,
            b'/BPC 1',
            b'/D [1 0]',
            b'ID', bitmap, b'EI'
-        ])
+        ], compress=compress_pdf)
        pdf.add_object(bitmap_stream)
        char_procs[glyph_id] = bitmap_stream.reference

--- a/weasyprint/pdf/metadata.py
+++ b/weasyprint/pdf/metadata.py
@ -20,7 +20,7 @@ for key, value in NS.items():
    register_namespace(key, value)


-def add_metadata(pdf, metadata, variant, version, conformance):
+def add_metadata(pdf, metadata, variant, version, conformance, compress):
    """Add PDF stream of metadata.

    Described in ISO-32000-1:2008, 14.3.2.
@ -88,6 +88,6 @@ def add_metadata(pdf, metadata, variant, version, conformance):
    footer = b'<?xpacket end="r"?>'
    stream_content = b'\n'.join((header, xml, footer))
    extra = {'Type': '/Metadata', 'Subtype': '/XML'}
-    metadata = pydyf.Stream([stream_content], extra=extra)
+    metadata = pydyf.Stream([stream_content], extra, compress)
    pdf.add_object(metadata)
    pdf.catalog['Metadata'] = metadata.reference
--- a/weasyprint/pdf/pdfa.py
+++ b/weasyprint/pdf/pdfa.py
@ -18,7 +18,7 @@ from ..logger import LOGGER
 from .metadata import add_metadata


-def pdfa(pdf, metadata, document, page_streams, version):
+def pdfa(pdf, metadata, document, page_streams, compress, version):
    """Set metadata for PDF/A documents."""
    LOGGER.warning(
        'PDF/A support is experimental, '
@ -29,7 +29,7 @@ def pdfa(pdf, metadata, document, page_streams, version):
    profile = pydyf.Stream(
        [read_binary(__package__, 'sRGB2014.icc')],
        pydyf.Dictionary({'N': 3, 'Alternate': '/DeviceRGB'}),
-        compress=True)
+        compress=compress)
    pdf.add_object(profile)
    pdf.catalog['OutputIntents'] = pydyf.Array([
        pydyf.Dictionary({
@ -46,7 +46,7 @@ def pdfa(pdf, metadata, document, page_streams, version):
            pdf_object['F'] = 2 ** (3 - 1)

    # Common PDF metadata stream
-    add_metadata(pdf, metadata, 'a', version, 'B')
+    add_metadata(pdf, metadata, 'a', version, 'B', compress)


 VARIANTS = {
--- a/weasyprint/pdf/pdfua.py
+++ b/weasyprint/pdf/pdfua.py
@ -6,7 +6,7 @@ from ..logger import LOGGER
 from .metadata import add_metadata


-def pdfua(pdf, metadata, document, page_streams):
+def pdfua(pdf, metadata, document, page_streams, compress):
    """Set metadata for PDF/UA documents."""
    LOGGER.warning(
        'PDF/UA support is experimental, '
@ -117,7 +117,7 @@ def pdfua(pdf, metadata, document, page_streams):
        annotation['F'] = 2 ** (2 - 1)

    # Common PDF metadata stream
-    add_metadata(pdf, metadata, 'ua', version=1, conformance=None)
+    add_metadata(pdf, metadata, 'ua', 1, conformance=None, compress=compress)

    # PDF document extra metadata
    if 'Lang' not in pdf.catalog:
--- a/weasyprint/pdf/stream.py
+++ b/weasyprint/pdf/stream.py
@ -1,7 +1,6 @@
 """PDF stream."""

 import io
-import struct
 from functools import lru_cache
 from hashlib import md5

@ -98,7 +97,7 @@ class Font:
        if len(widths) > 1 and len(set(widths)) == 1:
            self.flags += 2 ** (1 - 1)  # FixedPitch

-    def clean(self, cmap):
+    def clean(self, cmap, hinting):
        if self.ttfont is None:
            return

@ -107,7 +106,7 @@ class Font:
            optimized_font = io.BytesIO()
            options = subset.Options(
                retain_gids=True, passthrough_tables=True,
-                ignore_missing_glyphs=True, hinting=False,
+                ignore_missing_glyphs=True, hinting=hinting,
                desubroutinize=True)
            options.drop_tables += ['GSUB', 'GPOS', 'SVG']
            subsetter = subset.Subsetter(options)
@ -196,7 +195,6 @@ class Stream(pydyf.Stream):
    def __init__(self, fonts, page_rectangle, states, x_objects, patterns,
                 shadings, images, mark, *args, **kwargs):
        super().__init__(*args, **kwargs)
-        self.compress = True
        self.page_rectangle = page_rectangle
        self.marked = []
        self._fonts = fonts
@ -357,113 +355,20 @@ class Stream(pydyf.Stream):
        })
        group = Stream(
            self._fonts, self.page_rectangle, states, x_objects, patterns,
-            shadings, self._images, self._mark, extra=extra)
+            shadings, self._images, self._mark, extra=extra,
+            compress=self.compress)
        group.id = f'x{len(self._x_objects)}'
        self._x_objects[group.id] = group
        return group

-    def _get_png_data(self, pillow_image, optimize):
-        image_file = io.BytesIO()
-        pillow_image.save(image_file, format='PNG', optimize=optimize)
-
-        # Read the PNG header, then discard it because we know it's a PNG. If
-        # this weren't just output from Pillow, we should actually check it.
-        image_file.seek(8)
-
-        png_data = b''
-        raw_chunk_length = image_file.read(4)
-        # PNG files consist of a series of chunks.
-        while len(raw_chunk_length) > 0:
-            # Each chunk begins with its data length (four bytes, may be zero),
-            # then its type (four ASCII characters), then the data, then four
-            # bytes of a CRC.
-            chunk_len, = struct.unpack('!I', raw_chunk_length)
-            chunk_type = image_file.read(4)
-            if chunk_type == b'IDAT':
-                png_data += image_file.read(chunk_len)
-            else:
-                image_file.seek(chunk_len, io.SEEK_CUR)
-            # We aren't checking the CRC, we assume this is a valid PNG.
-            image_file.seek(4, io.SEEK_CUR)
-            raw_chunk_length = image_file.read(4)
-
-        return png_data
-
-    def add_image(self, pillow_image, image_rendering, optimize_size):
-        image_name = f'i{pillow_image.id}'
+    def add_image(self, image, width, height, interpolate):
+        image_name = f'i{image.id}{width}{height}{interpolate}'
        self._x_objects[image_name] = None  # Set by write_pdf
        if image_name in self._images:
            # Reuse image already stored in document
            return image_name

-        if 'transparency' in pillow_image.info:
-            pillow_image = pillow_image.convert('RGBA')
-        elif pillow_image.mode in ('1', 'P', 'I'):
-            pillow_image = pillow_image.convert('RGB')
-
-        if pillow_image.mode in ('RGB', 'RGBA'):
-            color_space = '/DeviceRGB'
-        elif pillow_image.mode in ('L', 'LA'):
-            color_space = '/DeviceGray'
-        elif pillow_image.mode == 'CMYK':
-            color_space = '/DeviceCMYK'
-        else:
-            LOGGER.warning('Unknown image mode: %s', pillow_image.mode)
-            color_space = '/DeviceRGB'
-
-        interpolate = 'true' if image_rendering == 'auto' else 'false'
-        extra = pydyf.Dictionary({
-            'Type': '/XObject',
-            'Subtype': '/Image',
-            'Width': pillow_image.width,
-            'Height': pillow_image.height,
-            'ColorSpace': color_space,
-            'BitsPerComponent': 8,
-            'Interpolate': interpolate,
-        })
-
-        optimize = 'images' in optimize_size
-        if pillow_image.format in ('JPEG', 'MPO'):
-            extra['Filter'] = '/DCTDecode'
-            image_file = io.BytesIO()
-            pillow_image.save(image_file, format='JPEG', optimize=optimize)
-            stream = [image_file.getvalue()]
-        else:
-            extra['Filter'] = '/FlateDecode'
-            extra['DecodeParms'] = pydyf.Dictionary({
-                # Predictor 15 specifies that we're providing PNG data,
-                # ostensibly using an "optimum predictor", but doesn't actually
-                # matter as long as the predictor value is 10+ according to the
-                # spec. (Other PNG predictor values assert that we're using
-                # specific predictors that we don't want to commit to, but
-                # "optimum" can vary.)
-                'Predictor': 15,
-                'Columns': pillow_image.width,
-            })
-            if pillow_image.mode in ('RGB', 'RGBA'):
-                # Defaults to 1.
-                extra['DecodeParms']['Colors'] = 3
-            if pillow_image.mode in ('RGBA', 'LA'):
-                alpha = pillow_image.getchannel('A')
-                pillow_image = pillow_image.convert(pillow_image.mode[:-1])
-                alpha_data = self._get_png_data(alpha, optimize)
-                extra['SMask'] = pydyf.Stream([alpha_data], extra={
-                    'Filter': '/FlateDecode',
-                    'Type': '/XObject',
-                    'Subtype': '/Image',
-                    'DecodeParms': pydyf.Dictionary({
-                        'Predictor': 15,
-                        'Columns': pillow_image.width,
-                    }),
-                    'Width': pillow_image.width,
-                    'Height': pillow_image.height,
-                    'ColorSpace': '/DeviceGray',
-                    'BitsPerComponent': 8,
-                    'Interpolate': interpolate,
-                    })
-            stream = [self._get_png_data(pillow_image, optimize)]
-
-        xobject = pydyf.Stream(stream, extra=extra)
+        xobject = image.get_xobject(width, height, interpolate)
        self._images[image_name] = xobject
        return image_name

@ -493,7 +398,8 @@ class Stream(pydyf.Stream):
        })
        pattern = Stream(
            self._fonts, self.page_rectangle, states, x_objects, patterns,
-            shadings, self._images, self._mark, extra=extra)
+            shadings, self._images, self._mark, extra=extra,
+            compress=self.compress)
        pattern.id = f'p{len(self._patterns)}'
        self._patterns[pattern.id] = pattern
        return pattern
--- a/weasyprint/svg/images.py
+++ b/weasyprint/svg/images.py
@ -14,7 +14,7 @@ def svg(svg, node, font_size):
            node.get('width'), node.get('height'), font_size)
    scale_x, scale_y, translate_x, translate_y = preserve_ratio(
        svg, node, font_size, width, height)
-    if svg.tree != node:
+    if svg.tree != node and node.get('overflow', 'hidden') == 'hidden':
        svg.stream.rectangle(0, 0, width, height)
        svg.stream.clip()
        svg.stream.end()
--- a/weasyprint/svg/text.py
+++ b/weasyprint/svg/text.py
@ -12,6 +12,10 @@ class TextBox:
        self.pango_layout = pango_layout
        self.style = style

+    @property
+    def text(self):
+        return self.pango_layout.text
+

 def text(svg, node, font_size):
    """Draw text node."""
--- a/weasyprint/text/line_break.py
+++ b/weasyprint/text/line_break.py
@ -182,11 +182,20 @@ class Layout:
                add_attr(0, len(bytestring), letter_spacing)

            if word_spacing:
+                if bytestring == b' ':
+                    # We need more than one space to set word spacing
+                    self.text = ' \u200b'  # Space + zero-width space
+                    text, bytestring = unicode_to_char_p(self.text)
+                    pango.pango_layout_set_text(self.layout, text, -1)
+
                space_spacing = (
                    units_from_double(word_spacing) + letter_spacing)
                position = bytestring.find(b' ')
+                # Pango gives only half of word-spacing on boundaries
+                boundary_positions = (0, len(bytestring) - 1)
                while position != -1:
-                    add_attr(position, position + 1, space_spacing)
+                    factor = 1 + (position in boundary_positions)
+                    add_attr(position, position + 1, factor * space_spacing)
                    position = bytestring.find(b' ', position + 1)

            if word_breaking:
@ -226,15 +235,7 @@ class Layout:


 def create_layout(text, style, context, max_width, justification_spacing):
-    """Return an opaque Pango layout with default Pango line-breaks.
-
-    :param text: Unicode
-    :param style: a style dict of computed values
-    :param max_width:
-        The maximum available width in the same unit as ``style['font_size']``,
-        or ``None`` for unlimited width.
-
-    """
+    """Return an opaque Pango layout with default Pango line-breaks."""
    layout = Layout(context, style, justification_spacing, max_width)

    # Make sure that max_width * Pango.SCALE == max_width * 1024 fits in a
--- a/weasyprint/urls.py
+++ b/weasyprint/urls.py
@ -175,9 +175,12 @@ def default_url_fetcher(url, timeout=10, ssl_context=None):
    ``url_fetcher`` argument to :class:`HTML` or :class:`CSS`.
    (See :ref:`URL Fetchers`.)

-    :param str url: The URL of the resource to fetch.
-    :param int timeout: The number of seconds before HTTP requests are dropped.
-    :param ssl.SSLContext ssl_context: An SSL context used for HTTP requests.
+    :param str url:
+        The URL of the resource to fetch.
+    :param int timeout:
+        The number of seconds before HTTP requests are dropped.
+    :param ssl.SSLContext ssl_context:
+        An SSL context used for HTTP requests.
    :raises: An exception indicating failure, e.g. :obj:`ValueError` on
        syntactically invalid URL.
    :returns: A :obj:`dict` with the following keys: