mirror of https://github.com/Kozea/WeasyPrint.git synced 2024-09-11 20:47:56 +03:00

Merge branch 'master' of github.com:timoramsauer/WeasyPrint into HEAD

Timo Ramsauer 2023-04-24 15:29:58 +02:00
commit b505d56199
31 changed files with 1005 additions and 647 deletions

View File

@ -43,7 +43,7 @@ jobs:
- name: Ticket
run: python -m weasyprint weasyprint-samples/ticket/ticket.html ${{env.REPORTS_FOLDER}}/ticket.pdf
- name: Archive generated PDFs
uses: actions/upload-artifact@v2
uses: actions/upload-artifact@v3
with:
name: generated-documents
path: ${{env.REPORTS_FOLDER}}

View File

@ -8,11 +8,12 @@ jobs:
strategy:
matrix:
os: [ubuntu-latest, macos-latest, windows-latest]
python-version: ['3.7', '3.8', '3.9', '3.10', '3.11', 'pypy-3.8']
exclude:
# Wheels missing for this configuration
- os: macos-latest
python-version: pypy-3.8
python-version: ['3.11']
include:
- os: ubuntu-latest
python-version: '3.7'
- os: ubuntu-latest
python-version: 'pypy-3.8'
steps:
- uses: actions/checkout@v3
- uses: actions/setup-python@v4

View File

@ -57,6 +57,7 @@ Python API
.. autoclass:: CSS(input, **kwargs)
.. autoclass:: Attachment(input, **kwargs)
.. autofunction:: default_url_fetcher
.. autodata:: DEFAULT_OPTIONS
.. module:: weasyprint.document
.. autoclass:: Document
@ -645,6 +646,8 @@ supported.
The ``attr()`` functional notation is allowed in the ``content`` and
``string-set`` properties.
The ``calc()`` function is **not** supported.
Viewport-percentage lengths (``vw``, ``vh``, ``vmin``, ``vmax``) are **not**
supported.

View File

@ -2,6 +2,115 @@ Changelog
=========
Version 59.0b1
--------------
Released on 2023-04-14.
**This version is experimental, don't use it in production. If you find bugs,
please report them!**
Command-line API:
* The ``--optimize-size`` option and its short equivalent ``-O`` have been
deprecated. To activate or deactivate different size optimizations, you can
now use:
* ``--uncompressed-pdf``,
* ``--optimize-images``,
* ``--full-fonts``,
* ``--hinting``,
* ``--dpi <resolution>``, and
* ``--jpeg-quality <quality>``.
* A new ``--cache-folder <folder>`` option has been added to store temporary
data in the given folder on the disk instead of keeping them in memory.
Python API:
* Global rendering options are now given in ``**options`` instead of dedicated
parameters, with slightly different names. It means that the signatures of
``HTML.render()``, ``HTML.write_pdf()`` and ``Document.write_pdf()`` have
changed. Here are the steps to port your Python code to v59.0:
1. Use named parameters for these functions, not positional parameters.
2. Rename some of the parameters:
* ``image_cache`` becomes ``cache`` (see below),
* ``identifier`` becomes ``pdf_identifier``,
* ``variant`` becomes ``pdf_variant``,
* ``version`` becomes ``pdf_version``,
* ``forms`` becomes ``pdf_forms``,
* The ``optimize_size`` parameter of ``HTML.render()``, ``HTML.write_pdf()``
and ``Document()`` has been removed and will be ignored. You can now use the
``uncompressed_pdf``, ``full_fonts``, ``hinting``, ``dpi`` and
``jpeg_quality`` parameters that are included in ``**options``.
* The ``cache`` parameter can be included in ``**options`` to replace
``image_cache``. If it is a dictionary, this dictionary will be used to store
temporary data in memory, and can even be shared between multiple documents.
If it's a folder Path or string, WeasyPrint stores temporary data in the
given temporary folder on disk instead of keeping it in memory.
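The renaming steps above can be sketched as a small keyword-translation helper. This is only an illustration under stated assumptions: ``port_options``, ``RENAMED`` and ``REMOVED`` are hypothetical names that are not part of WeasyPrint, and the helper only rewrites keyword names according to the list above without checking the values.

```python
# Hypothetical helper, NOT part of WeasyPrint: it only rewrites keyword names
# following the renames listed in the changelog entry above.
RENAMED = {
    'image_cache': 'cache',
    'identifier': 'pdf_identifier',
    'variant': 'pdf_variant',
    'version': 'pdf_version',
    'forms': 'pdf_forms',
}
# Parameters that no longer exist and have no one-to-one replacement.
REMOVED = {'optimize_size'}


def port_options(old_kwargs):
    """Return a copy of ``old_kwargs`` using the v59.0 option names."""
    new_kwargs = {}
    for name, value in old_kwargs.items():
        if name in REMOVED:
            # Dropped: use the finer-grained options instead.
            continue
        # Unknown names are passed through unchanged.
        new_kwargs[RENAMED.get(name, name)] = value
    return new_kwargs


old = {'image_cache': {}, 'forms': True, 'optimize_size': ('fonts',)}
print(port_options(old))  # {'cache': {}, 'pdf_forms': True}
```

The returned dictionary could then be passed as ``**options``, for example ``html.write_pdf('out.pdf', **port_options(old))``, assuming ``html`` is an existing ``HTML`` object.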
New features:
* `#1853 <https://github.com/Kozea/WeasyPrint/pull/1853>`_,
`#1854 <https://github.com/Kozea/WeasyPrint/issues/1854>`_:
Reduce PDF size, with financial support from Code & Co.
* `#1824 <https://github.com/Kozea/WeasyPrint/issues/1824>`_,
`#1829 <https://github.com/Kozea/WeasyPrint/pull/1829>`_:
Reduce memory use for images
* `#1858 <https://github.com/Kozea/WeasyPrint/issues/1858>`_:
Add an option to keep hinting information in embedded fonts
Bug fixes:
* `#1855 <https://github.com/Kozea/WeasyPrint/issues/1855>`_:
Fix position of emojis in justified text
* `#1852 <https://github.com/Kozea/WeasyPrint/issues/1852>`_:
Don't crash when line can be split before trailing spaces
* `#1843 <https://github.com/Kozea/WeasyPrint/issues/1843>`_:
Fix syntax of dates in metadata
* `#1827 <https://github.com/Kozea/WeasyPrint/issues/1827>`_,
`#1832 <https://github.com/Kozea/WeasyPrint/pull/1832>`_:
Fix word-spacing problems with nested tags
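For the date syntax fix (#1843), the tests updated in this commit expect metadata dates with a ``D:`` prefix, such as ``D:20110421230000Z`` and ``D:20130721234600+01'00``. A minimal sketch of that syntax, assuming timezone-aware ``datetime`` objects; ``pdf_date`` is a hypothetical helper written for illustration, not WeasyPrint's actual implementation:

```python
from datetime import datetime, timedelta, timezone


def pdf_date(moment):
    """Format an aware datetime as a PDF date string with the ``D:`` prefix."""
    offset = moment.utcoffset()
    if offset is None:
        raise ValueError('an aware datetime is required')
    text = moment.strftime('D:%Y%m%d%H%M%S')
    if offset == timedelta(0):
        # UTC is written with a plain "Z" suffix.
        return text + 'Z'
    sign = '+' if offset > timedelta(0) else '-'
    minutes = abs(offset) // timedelta(minutes=1)
    # Hours and minutes of the offset are separated by an apostrophe.
    return f"{text}{sign}{minutes // 60:02d}'{minutes % 60:02d}"


created = datetime(2011, 4, 21, 23, 0, 0, tzinfo=timezone.utc)
modified = datetime(
    2013, 7, 21, 23, 46, 0, tzinfo=timezone(timedelta(hours=1)))
print(pdf_date(created))   # D:20110421230000Z
print(pdf_date(modified))  # D:20130721234600+01'00
```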
Documentation:
* `#1841 <https://github.com/Kozea/WeasyPrint/issues/1841>`_:
Add a paragraph about unsupported calc() function
Contributors:
* Guillaume Ayoub
* Lucie Anglade
* Alex Ch
* whi_ne
* Jonas Castro
Backers and sponsors:
* Castedo Ellerman
* Kobalt
* Spacinov
* Grip Angebotssoftware
* Crisp BV
* Manuel Barkhau
* SimonSoft
* Menutech
* KontextWork
* NCC Group
* René Fritz
* Moritz Mahringer
* Yanal-Yvez Fargialla
* Piotr Horzycki
* Healthchecks.io
* TrainingSparkle
* Hammerbacher
* Synapsium
Version 58.1
------------

View File

@ -11,7 +11,7 @@ WeasyPrint |version| depends on:
* Python_ ≥ 3.7.0
* Pango_ ≥ 1.44.0
* pydyf_ ≥ 0.5.0
* pydyf_ ≥ 0.6.0
* CFFI_ ≥ 0.6
* html5lib_ ≥ 1.1
* tinycss2_ ≥ 1.0.0
@ -513,7 +513,8 @@ WeasyPrint provides two options to deal with images: ``optimize_size`` and
``optimize_size`` can enable size optimization for images, but also for fonts.
When enabled, the generated PDF will include smaller images and fonts, but the
rendering time may be slightly increased.
rendering time may be slightly increased. The whole structure of the PDF can be
compressed too.
.. code-block:: python
@ -523,7 +524,7 @@ rendering time may be slightly increased.
# Full size optimization, slower, but generated PDF is smaller
HTML('https://example.org/').write_pdf(
'example.pdf', optimize_size=('fonts', 'images'))
'example.pdf', optimize_size=('fonts', 'images', 'hinting', 'pdf'))
``image_cache`` gives the possibility to use a cache for images, which avoids
downloading, parsing and optimizing them each time they are used.
@ -539,6 +540,11 @@ time when you render a lot of documents that use the same images.
HTML(f'https://example.org/?id={i}').write_pdf(
f'example-{i}.pdf', image_cache=cache)
It's also possible to cache images on disk instead of keeping them in memory.
The ``--cache-folder`` CLI option can be used to define the folder used to
store temporary images. You can also provide this folder path as a string for
``image_cache``.
Logging
~~~~~~~

View File

@ -12,7 +12,7 @@ requires-python = '>=3.7'
readme = {file = 'README.rst', content-type = 'text/x-rst'}
license = {file = 'LICENSE'}
dependencies = [
'pydyf >=0.5.0',
'pydyf >=0.6.0',
'cffi >=0.6',
'html5lib >=1.1',
'tinycss2 >=1.0.0',

View File

@ -73,14 +73,10 @@ def document_write_png(self, target=None, resolution=96, antialiasing=1,
shutil.copyfileobj(png, fd)
def html_write_png(self, target=None, stylesheets=None, resolution=96,
presentational_hints=False, optimize_size=('fonts',),
font_config=None, counter_style=None, image_cache=None):
return self.render(
stylesheets, presentational_hints=presentational_hints,
optimize_size=optimize_size, font_config=font_config,
counter_style=counter_style, image_cache=image_cache).write_png(
target, resolution)
def html_write_png(self, target=None, font_config=None, counter_style=None,
resolution=96, **options):
document = self.render(font_config, counter_style, **options)
return document.write_png(target, resolution)
Document.write_png = document_write_png

View File

@ -6,6 +6,7 @@ import os
import sys
import unicodedata
import zlib
from functools import partial
from pathlib import Path
from urllib.parse import urljoin, uses_relative
@ -78,11 +79,8 @@ def _check_doc1(html, has_base_url=True):
def _run(args, stdin=b''):
stdin = io.BytesIO(stdin)
stdout = io.BytesIO()
try:
__main__.HTML = FakeHTML
__main__.main(args.split(), stdin=stdin, stdout=stdout)
finally:
__main__.HTML = HTML
HTML = partial(FakeHTML, force_uncompressed_pdf=False)
__main__.main(args.split(), stdin=stdin, stdout=stdout, HTML=HTML)
return stdout.getvalue()
@ -303,11 +301,12 @@ def test_command_line_render(tmpdir):
tmpdir.join(name).write_binary(pattern_bytes)
# Reference
html_obj = FakeHTML(string=combined, base_url='dummy.html')
html_obj = FakeHTML(
string=combined, base_url='dummy.html', force_uncompressed_pdf=False)
pdf_bytes = html_obj.write_pdf()
rotated_pdf_bytes = FakeHTML(
string=combined, base_url='dummy.html',
media_type='screen').write_pdf()
media_type='screen', force_uncompressed_pdf=False).write_pdf()
tmpdir.join('no_css.html').write_binary(html)
tmpdir.join('combined.html').write_binary(combined)
@ -360,35 +359,34 @@ def test_command_line_render(tmpdir):
os.environ['SOURCE_DATE_EPOCH'] = '0'
_run('not_optimized.html out15.pdf')
_run('not_optimized.html out16.pdf -O images')
_run('not_optimized.html out17.pdf -O fonts')
_run('not_optimized.html out18.pdf -O fonts -O images')
_run('not_optimized.html out19.pdf -O all')
_run('not_optimized.html out20.pdf -O none')
_run('not_optimized.html out21.pdf -O none -O all')
_run('not_optimized.html out22.pdf -O all -O none')
_run('not_optimized.html out16.pdf --optimize-images')
_run('not_optimized.html out17.pdf --optimize-images -j 10')
_run('not_optimized.html out18.pdf --optimize-images -j 10 -D 1')
_run('not_optimized.html out19.pdf --hinting')
_run('not_optimized.html out20.pdf --full-fonts')
_run('not_optimized.html out21.pdf --full-fonts --uncompressed-pdf')
_run(f'not_optimized.html out22.pdf -c {tmpdir}')
assert (
len(tmpdir.join('out18.pdf').read_binary()) <
len(tmpdir.join('out17.pdf').read_binary()) <
len(tmpdir.join('out16.pdf').read_binary()) <
len(tmpdir.join('out15.pdf').read_binary()) <
len(tmpdir.join('out20.pdf').read_binary()))
len(tmpdir.join('out19.pdf').read_binary()) <
len(tmpdir.join('out20.pdf').read_binary()) <
len(tmpdir.join('out21.pdf').read_binary()))
assert len({
tmpdir.join(f'out{i}.pdf').read_binary()
for i in (16, 18, 19, 21)}) == 1
assert len({
tmpdir.join(f'out{i}.pdf').read_binary()
for i in (15, 17)}) == 1
assert len({
tmpdir.join(f'out{i}.pdf').read_binary()
for i in (20, 22)}) == 1
for i in (15, 22)}) == 1
os.environ.pop('SOURCE_DATE_EPOCH')
stdout = _run('combined.html -')
stdout = _run('combined.html --uncompressed-pdf -')
assert stdout.count(b'attachment') == 0
stdout = _run('combined.html -')
stdout = _run('combined.html --uncompressed-pdf -')
assert stdout.count(b'attachment') == 0
stdout = _run('-a pattern.png combined.html -')
stdout = _run('-a pattern.png --uncompressed-pdf combined.html -')
assert stdout.count(b'attachment') == 1
stdout = _run('-a style.css -a pattern.png combined.html -')
stdout = _run(
'-a style.css -a pattern.png --uncompressed-pdf combined.html -')
assert stdout.count(b'attachment') == 2
os.mkdir('subdirectory')
@ -423,42 +421,59 @@ def test_command_line_render(tmpdir):
(4, '2.0'),
))
def test_pdfa(version, pdf_version):
stdout = _run(f'--pdf-variant=pdf/a-{version}b - -', b'test')
stdout = _run(
f'--pdf-variant=pdf/a-{version}b --uncompressed-pdf - -', b'test')
assert f'PDF-{pdf_version}'.encode() in stdout
assert f'part="{version}"'.encode() in stdout
@pytest.mark.parametrize('version, pdf_version', (
(1, '1.4'),
(2, '1.7'),
(3, '1.7'),
(4, '2.0'),
))
def test_pdfa_compressed(version, pdf_version):
_run(f'--pdf-variant=pdf/a-{version}b - -', b'test')
def test_pdfua():
stdout = _run('--pdf-variant=pdf/ua-1 - -', b'test')
stdout = _run('--pdf-variant=pdf/ua-1 --uncompressed-pdf - -', b'test')
assert b'part="1"' in stdout
def test_pdfua_compressed():
_run('--pdf-variant=pdf/ua-1 - -', b'test')
def test_pdf_identifier():
stdout = _run('--pdf-identifier=abc - -', b'test')
stdout = _run('--pdf-identifier=abc --uncompressed-pdf - -', b'test')
assert b'abc' in stdout
def test_pdf_version():
stdout = _run('--pdf-version=1.4 - -', b'test')
stdout = _run('--pdf-version=1.4 --uncompressed-pdf - -', b'test')
assert b'PDF-1.4' in stdout
def test_pdf_custom_metadata():
stdout = _run('--custom-metadata - -', b'<meta name=key content=value />')
stdout = _run(
'--custom-metadata --uncompressed-pdf - -',
b'<meta name=key content=value />')
assert b'/key' in stdout
assert b'value' in stdout
def test_bad_pdf_custom_metadata():
stdout = _run(
'--custom-metadata - -',
'--custom-metadata --uncompressed-pdf - -',
'<meta name=é content=value />'.encode('latin1'))
assert b'value' not in stdout
def test_partial_pdf_custom_metadata():
stdout = _run(
'--custom-metadata - -',
'--custom-metadata --uncompressed-pdf - -',
'<meta name=a.b/céd0 content=value />'.encode('latin1'))
assert b'/abcd0' in stdout
assert b'value' in stdout
@ -470,10 +485,10 @@ def test_partial_pdf_custom_metadata():
(b'<textarea></textarea>', b'/Tx'),
))
def test_pdf_inputs(html, field):
stdout = _run('--pdf-forms - -', html)
stdout = _run('--pdf-forms --uncompressed-pdf - -', html)
assert b'AcroForm' in stdout
assert field in stdout
stdout = _run('- -', html)
stdout = _run('--uncompressed-pdf - -', html)
assert b'AcroForm' not in stdout
@ -484,8 +499,10 @@ def test_pdf_inputs(html, field):
))
def test_appearance(css, with_forms, without_forms):
html = f'<input style="{css}">'.encode()
assert (b'AcroForm' in _run('--pdf-forms - -', html)) is with_forms
assert (b'AcroForm' in _run('- -', html)) is without_forms
assert with_forms is (
b'AcroForm' in _run('--pdf-forms --uncompressed-pdf - -', html))
assert without_forms is (
b'AcroForm' in _run(' --uncompressed-pdf - -', html))
def test_reproducible():
@ -541,20 +558,20 @@ def test_low_level_api(assert_pixels_equal):
assert pdf_bytes.startswith(b'%PDF')
png_bytes = html.write_png(stylesheets=[css])
document = html.render([css])
document = html.render(stylesheets=[css])
page, = document.pages
assert page.width == 8
assert page.height == 8
assert document.write_png() == png_bytes
assert document.copy([page]).write_png() == png_bytes
document = html.render([css])
document = html.render(stylesheets=[css])
page, = document.pages
assert (page.width, page.height) == (8, 8)
png_bytes = document.write_png(resolution=192)
check_png_pattern(assert_pixels_equal, png_bytes, x2=True)
document = html.render([css])
document = html.render(stylesheets=[css])
page, = document.pages
assert (page.width, page.height) == (8, 8)
# A resolution that is not multiple of 96:

View File

@ -26,7 +26,7 @@ RIGHT = round(210 * 72 / 25.4, 6)
def test_page_size_zoom(zoom):
pdf = FakeHTML(string='<style>@page{size:3in 4in').write_pdf(zoom=zoom)
width, height = int(216 * zoom), int(288 * zoom)
assert f'/MediaBox [ 0 0 {width} {height} ]'.encode() in pdf
assert f'/MediaBox [0 0 {width} {height}]'.encode() in pdf
@assert_no_logs
@ -57,7 +57,7 @@ def test_bookmarks_2():
@assert_no_logs
def test_bookmarks_3():
pdf = FakeHTML(string='<h1>a nbsp…</h1>').write_pdf()
assert re.findall(b'/Title <(.*)>', pdf) == [
assert re.findall(b'/Title <(\\w*)>', pdf) == [
b'feff006100a0006e0062007300702026']
@ -327,11 +327,11 @@ def test_links():
''', base_url=resource_filename('<inline HTML>')).write_pdf()
uris = re.findall(b'/URI \\((.*)\\)', pdf)
types = re.findall(b'/S (.*)', pdf)
subtypes = re.findall(b'/Subtype (.*)', pdf)
types = re.findall(b'/S (/\\w*)', pdf)
subtypes = re.findall(b'/Subtype (/\\w*)', pdf)
rects = [
[float(number) for number in match.split()] for match in re.findall(
b'/Rect \\[ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+ [\\d\\.]+) \\]', pdf)]
b'/Rect \\[([\\d\\.]+ [\\d\\.]+ [\\d\\.]+ [\\d\\.]+)\\]', pdf)]
# 30pt wide (like the image), 20pt high (like line-height)
assert uris.pop(0) == b'https://weasyprint.org'
@ -349,7 +349,7 @@ def test_links():
assert subtypes.pop(0) == b'/Link'
assert b'/Dest (lipsum)' in pdf
link = re.search(
b'\\(lipsum\\) \\[ \\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+) ]',
b'\\(lipsum\\) \\[\\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+)]',
pdf).group(1)
assert [float(number) for number in link.split()] == [0, TOP, 0]
assert rects.pop(0) == [10, TOP - 100, 10 + 32, TOP - 100 - 20]
@ -362,7 +362,7 @@ def test_links():
assert subtypes.pop(0) == b'/Link'
assert b'/Dest (hello)' in pdf
link = re.search(
b'\\(hello\\) \\[ \\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+) ]',
b'\\(hello\\) \\[\\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+)]',
pdf).group(1)
assert [float(number) for number in link.split()] == [0, TOP - 200, 0]
assert rects.pop(0) == [0, TOP, RIGHT, TOP - 30]
@ -387,7 +387,7 @@ def test_relative_links_no_height():
string='<a href="../lipsum" style="display: block"></a>a',
base_url='https://weasyprint.org/foo/bar/').write_pdf()
assert b'/S /URI\n/URI (https://weasyprint.org/foo/lipsum)'
assert f'/Rect [ 0 {TOP} {RIGHT} {TOP} ]'.encode() in pdf
assert f'/Rect [0 {TOP} {RIGHT} {TOP}]'.encode() in pdf
@assert_no_logs
@ -397,7 +397,7 @@ def test_relative_links_missing_base():
string='<a href="../lipsum" style="display: block"></a>a',
base_url=None).write_pdf()
assert b'/S /URI\n/URI (../lipsum)'
assert f'/Rect [ 0 {TOP} {RIGHT} {TOP} ]'.encode() in pdf
assert f'/Rect [0 {TOP} {RIGHT} {TOP}]'.encode() in pdf
@assert_no_logs
@ -421,11 +421,11 @@ def test_relative_links_internal():
base_url=None).write_pdf()
assert b'/Dest (lipsum)' in pdf
link = re.search(
b'\\(lipsum\\) \\[ \\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+) ]',
b'\\(lipsum\\) \\[\\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+)]',
pdf).group(1)
assert [float(number) for number in link.split()] == [0, TOP, 0]
rect = re.search(
b'/Rect \\[ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+ [\\d\\.]+) \\]',
b'/Rect \\[([\\d\\.]+ [\\d\\.]+ [\\d\\.]+ [\\d\\.]+)\\]',
pdf).group(1)
assert [float(number) for number in rect.split()] == [0, TOP, RIGHT, TOP]
@ -437,11 +437,11 @@ def test_relative_links_anchors():
base_url=None).write_pdf()
assert b'/Dest (lipsum)' in pdf
link = re.search(
b'\\(lipsum\\) \\[ \\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+) ]',
b'\\(lipsum\\) \\[\\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+)]',
pdf).group(1)
assert [float(number) for number in link.split()] == [0, TOP, 0]
rect = re.search(
b'/Rect \\[ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+ [\\d\\.]+) \\]',
b'/Rect \\[([\\d\\.]+ [\\d\\.]+ [\\d\\.]+ [\\d\\.]+)\\]',
pdf).group(1)
assert [float(number) for number in rect.split()] == [0, TOP, RIGHT, TOP]
@ -474,11 +474,11 @@ def test_missing_links():
assert b'/Dest (lipsum)' in pdf
assert len(logs) == 1
link = re.search(
b'\\(lipsum\\) \\[ \\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+) ]',
b'\\(lipsum\\) \\[\\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+)]',
pdf).group(1)
assert [float(number) for number in link.split()] == [0, TOP - 15, 0]
rect = re.search(
b'/Rect \\[ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+ [\\d\\.]+) \\]',
b'/Rect \\[([\\d\\.]+ [\\d\\.]+ [\\d\\.]+ [\\d\\.]+)\\]',
pdf).group(1)
assert [float(number) for number in rect.split()] == [
0, TOP, RIGHT, TOP - 15]
@ -495,8 +495,8 @@ def test_anchor_multiple_pages():
<a href="#lipsum"></a>
</div>
''', base_url=None).write_pdf()
first_page, = re.findall(b'/Kids \\[ (\\d+) 0 R', pdf)
assert b'/Names [ (lipsum) [ ' + first_page in pdf
first_page, = re.findall(b'/Kids \\[(\\d+) 0 R', pdf)
assert b'/Names [(lipsum) [' + first_page in pdf
@assert_no_logs
@ -537,8 +537,7 @@ def test_embed_images_from_pages():
string='<img src="not-optimized.jpg">').render().pages
document = Document(
(page1, page2), metadata=DocumentMetadata(),
font_config=FontConfiguration(), url_fetcher=None,
optimize_size=()).write_pdf()
font_config=FontConfiguration(), url_fetcher=None).write_pdf()
assert document.count(b'/Filter /DCTDecode') == 2
@ -562,8 +561,8 @@ def test_document_info():
b'006600740065007200a00061006c006c>') in pdf
assert b'/Keywords (html, css, pdf)' in pdf
assert b'/Subject <feff0042006c0061006820260020>' in pdf
assert b'/CreationDate (20110421230000Z)' in pdf
assert b"/ModDate (20130721234600+01'00)" in pdf
assert b'/CreationDate (D:20110421230000Z)' in pdf
assert b"/ModDate (D:20130721234600+01'00)" in pdf
@assert_no_logs
@ -717,6 +716,6 @@ def test_bleed(style, media, bleed, trim):
<style>@page { %s }</style>
<body>test
''' % style).write_pdf()
assert '/MediaBox [ {} {} {} {} ]'.format(*media).encode() in pdf
assert '/BleedBox [ {} {} {} {} ]'.format(*bleed).encode() in pdf
assert '/TrimBox [ {} {} {} {} ]'.format(*trim).encode() in pdf
assert '/MediaBox [{} {} {} {}]'.format(*media).encode() in pdf
assert '/BleedBox [{} {} {} {}]'.format(*bleed).encode() in pdf
assert '/TrimBox [{} {} {} {}]'.format(*trim).encode() in pdf

View File

@ -458,8 +458,16 @@ def test_text_align_justify_no_break_between_children():
assert span_3.position_x == 5 * 16 # (3 + 1) characters + 1 space
@pytest.mark.parametrize('text', (
'Lorem ipsum dolor<em>sit amet</em>',
'Lorem ipsum <em>dolorsit</em> amet',
'Lorem ipsum <em></em>dolorsit amet',
'Lorem ipsum<em> </em>dolorsit amet',
'Lorem ipsum<em> dolorsit</em> amet',
'Lorem ipsum <em>dolorsit </em>amet',
))
@assert_no_logs
def test_word_spacing():
def test_word_spacing(text):
# keep the empty <style> as a regression test: element.text is None
# (Not a string.)
page, = render_pages('''
@ -470,15 +478,14 @@ def test_word_spacing():
line, = body.children
strong_1, = line.children
# TODO: Pango gives only half of word-spacing to a space at the end
# of a TextBox. Is this what we want?
page, = render_pages('''
<style>strong { word-spacing: 11px }</style>
<body><strong>Lorem ipsum dolor<em>sit amet</em></strong>''')
<body><strong>%s</strong>''' % text)
html, = page.children
body, = html.children
line, = body.children
strong_2, = line.children
assert strong_2.width - strong_1.width == 33
@ -1018,6 +1025,19 @@ def test_overflow_wrap_trailing_space(wrap, text, body_width, expected_width):
assert td.width == expected_width
def test_line_break_before_trailing_space():
# Test regression: https://github.com/Kozea/WeasyPrint/issues/1852
page, = render_pages('''
<p style="display: inline-block">test\u2028 </p>a
<p style="display: inline-block">test\u2028</p>a
''')
html, = page.children
body, = html.children
line, = body.children
p1, space1, p2, space2 = line.children
assert p1.width == p2.width
def white_space_lines(width, space):
page, = render_pages('''
<style>

View File

@ -8,7 +8,7 @@ import sys
import threading
import wsgiref.simple_server
from weasyprint import CSS, HTML, images
from weasyprint import CSS, DEFAULT_OPTIONS, HTML, images
from weasyprint.css import get_all_computed_styles
from weasyprint.css.counters import CounterStyle
from weasyprint.css.targets import TargetCollector
@ -29,30 +29,40 @@ TEST_UA_STYLESHEET = CSS(filename=os.path.join(
os.path.dirname(__file__), '..', 'weasyprint', 'css', 'tests_ua.css'
))
PROPER_CHILDREN = dict((key, tuple(map(tuple, value))) for key, value in {
PROPER_CHILDREN = {
# Children can be of *any* type in *one* of the lists.
boxes.BlockContainerBox: [[boxes.BlockLevelBox], [boxes.LineBox]],
boxes.LineBox: [[boxes.InlineLevelBox]],
boxes.InlineBox: [[boxes.InlineLevelBox]],
boxes.TableBox: [[boxes.TableCaptionBox,
boxes.TableColumnGroupBox, boxes.TableColumnBox,
boxes.TableRowGroupBox, boxes.TableRowBox]],
boxes.InlineTableBox: [[boxes.TableCaptionBox,
boxes.TableColumnGroupBox, boxes.TableColumnBox,
boxes.TableRowGroupBox, boxes.TableRowBox]],
boxes.TableColumnGroupBox: [[boxes.TableColumnBox]],
boxes.TableRowGroupBox: [[boxes.TableRowBox]],
boxes.TableRowBox: [[boxes.TableCellBox]],
}.items())
boxes.BlockContainerBox: ((boxes.BlockLevelBox,), (boxes.LineBox,)),
boxes.LineBox: ((boxes.InlineLevelBox,),),
boxes.InlineBox: ((boxes.InlineLevelBox,),),
boxes.TableBox: ((
boxes.TableCaptionBox, boxes.TableColumnGroupBox, boxes.TableColumnBox,
boxes.TableRowGroupBox, boxes.TableRowBox),),
boxes.InlineTableBox: ((
boxes.TableCaptionBox, boxes.TableColumnGroupBox, boxes.TableColumnBox,
boxes.TableRowGroupBox, boxes.TableRowBox),),
boxes.TableColumnGroupBox: ((boxes.TableColumnBox,),),
boxes.TableRowGroupBox: ((boxes.TableRowBox,),),
boxes.TableRowBox: ((boxes.TableCellBox,),),
}
class FakeHTML(HTML):
"""Like weasyprint.HTML, but with a lighter UA stylesheet."""
def __init__(self, *args, force_uncompressed_pdf=True, **kwargs):
super().__init__(*args, **kwargs)
self._force_uncompressed_pdf = force_uncompressed_pdf
def _ua_stylesheets(self, forms=False):
return [
TEST_UA_STYLESHEET if stylesheet == HTML5_UA_STYLESHEET
else stylesheet for stylesheet in super()._ua_stylesheets(forms)]
def write_pdf(self, target=None, zoom=1, finisher=None, **options):
# Override function to force the generation of uncompressed PDFs
if self._force_uncompressed_pdf:
options['uncompressed_pdf'] = True
return super().write_pdf(target, zoom, finisher, **options)
def resource_filename(basename):
"""Return the absolute path of the resource called ``basename``."""
@ -182,7 +192,7 @@ def _parse_base(html_content, base_url=BASE_URL):
style_for = get_all_computed_styles(document, counter_style=counter_style)
get_image_from_uri = functools.partial(
images.get_image_from_uri, cache={}, url_fetcher=document.url_fetcher,
optimize_size=())
options=DEFAULT_OPTIONS)
target_collector = TargetCollector()
footnotes = []
return (

View File

@ -15,11 +15,73 @@ import cssselect2
import html5lib
import tinycss2
VERSION = __version__ = '58.1'
VERSION = __version__ = '59.0b1'
#: Default values for command-line and Python API options. See
#: :func:`__main__.main` to learn more about specific options for
#: command-line.
#:
#: :param list stylesheets:
#: An optional list of user stylesheets. The list can include
#: :class:`CSS` objects, filenames, URLs, or file-like
#: objects. (See :ref:`Stylesheet Origins`.)
#: :param str media_type:
#: Media type to use for @media.
#: :param list attachments:
#: A list of additional file attachments for the generated PDF
#: document or :obj:`None`. The list's elements are
#: :class:`Attachment` objects, filenames, URLs or file-like objects.
#: :param bytes pdf_identifier:
#: A bytestring used as PDF file identifier.
#: :param str pdf_variant:
#: A PDF variant name.
#: :param str pdf_version:
#: A PDF version number.
#: :param bool pdf_forms:
#: Whether PDF forms have to be included.
#: :param bool uncompressed_pdf:
#: Whether PDF content should be left uncompressed.
#: :param bool custom_metadata:
#: Whether custom HTML metadata should be stored in the generated PDF.
#: :param bool presentational_hints:
#: Whether HTML presentational hints are followed.
#: :param bool optimize_images:
#: Whether size of embedded images should be optimized, with no quality
#: loss.
#: :param int jpeg_quality:
#: JPEG quality between 0 (worst) and 95 (best).
#: :param int dpi:
#: Maximum resolution of images embedded in the PDF.
#: :param bool full_fonts:
#: Whether unmodified font files should be embedded when possible.
#: :param bool hinting:
#: Whether hinting information should be kept in embedded fonts.
#: :type cache: :obj:`dict`, :class:`pathlib.Path` or :obj:`str`
#: :param cache:
#: A dictionary used to cache images in memory, or a folder path where
#: images are temporarily stored.
DEFAULT_OPTIONS = {
'stylesheets': None,
'media_type': 'print',
'attachments': None,
'pdf_identifier': None,
'pdf_variant': None,
'pdf_version': None,
'pdf_forms': None,
'uncompressed_pdf': False,
'custom_metadata': False,
'presentational_hints': False,
'optimize_images': False,
'jpeg_quality': None,
'dpi': None,
'full_fonts': False,
'hinting': False,
'cache': None,
}
__all__ = [
'HTML', 'CSS', 'Attachment', 'Document', 'Page', 'default_url_fetcher',
'VERSION', '__version__']
'HTML', 'CSS', 'DEFAULT_OPTIONS', 'Attachment', 'Document', 'Page',
'default_url_fetcher', 'VERSION', '__version__']
# Import after setting the version, as the version is used in other modules
@ -55,12 +117,15 @@ class HTML:
Alternatively, use **one** named argument so that no guessing is involved:
:type filename: str or pathlib.Path
:param filename: A filename, relative to the current directory, or
absolute.
:param str url: An absolute, fully qualified URL.
:param filename:
A filename, relative to the current directory, or absolute.
:param str url:
An absolute, fully qualified URL.
:type file_obj: :term:`file object`
:param file_obj: Any object with a ``read`` method.
:param str string: A string of HTML source.
:param file_obj:
Any object with a ``read`` method.
:param str string:
A string of HTML source.
Specifying multiple inputs is an error:
``HTML(filename="foo.html", url="localhost://bar.html")``
@ -68,20 +133,22 @@ class HTML:
You can also pass optional named arguments:
:param str encoding: Force the source character encoding.
:param str base_url: The base used to resolve relative URLs
(e.g. in ``<img src="../foo.png">``). If not provided, try to use
the input filename, URL, or ``name`` attribute of :term:`file objects
<file object>`.
:type url_fetcher: :term:`function`
:param url_fetcher: A function or other callable
with the same signature as :func:`default_url_fetcher` called to
fetch external resources such as stylesheets and images.
(See :ref:`URL Fetchers`.)
:param str media_type: The media type to use for ``@media``.
Defaults to ``'print'``. **Note:** In some cases like
``HTML(string=foo)`` relative URLs will be invalid if ``base_url``
is not provided.
:param str encoding:
Force the source character encoding.
:param str base_url:
The base used to resolve relative URLs (e.g. in
``<img src="../foo.png">``). If not provided, try to use the input
filename, URL, or ``name`` attribute of
:term:`file objects <file object>`.
:type url_fetcher: :term:`callable`
:param url_fetcher:
A function or other callable with the same signature as
:func:`default_url_fetcher` called to fetch external resources such as
stylesheets and images. (See :ref:`URL Fetchers`.)
:param str media_type:
The media type to use for ``@media``. Defaults to ``'print'``.
**Note:** In some cases like ``HTML(string=foo)`` relative URLs will be
invalid if ``base_url`` is not provided.
"""
def __init__(self, guess=None, filename=None, url=None, file_obj=None,
@ -119,42 +186,32 @@ class HTML:
def _ph_stylesheets(self):
return [HTML5_PH_STYLESHEET]
def render(self, stylesheets=None, presentational_hints=False,
optimize_size=('fonts',), font_config=None, counter_style=None,
image_cache=None, forms=False):
def render(self, font_config=None, counter_style=None, **options):
"""Lay out and paginate the document, but do not (yet) export it.
This returns a :class:`document.Document` object which provides
access to individual pages and various meta-data.
See :meth:`write_pdf` to get a PDF directly.
:param list stylesheets:
An optional list of user stylesheets. List elements are
:class:`CSS` objects, filenames, URLs, or file
objects. (See :ref:`Stylesheet Origins`.)
:param bool presentational_hints:
Whether HTML presentational hints are followed.
:param tuple optimize_size:
Optimize size of generated PDF. Can contain "images" and "fonts".
:type font_config: :class:`text.fonts.FontConfiguration`
:param font_config: A font configuration handling ``@font-face`` rules.
:param font_config:
A font configuration handling ``@font-face`` rules.
:type counter_style: :class:`css.counters.CounterStyle`
:param counter_style: A dictionary storing ``@counter-style`` rules.
:param dict image_cache: A dictionary used to cache images.
:param bool forms: Whether PDF forms have to be included.
:param counter_style:
A dictionary storing ``@counter-style`` rules.
:param options:
The ``options`` parameter includes by default the
:data:`DEFAULT_OPTIONS` values.
:returns: A :class:`document.Document` object.
"""
return Document._render(
self, stylesheets, presentational_hints, optimize_size,
font_config, counter_style, image_cache, forms)
new_options = DEFAULT_OPTIONS.copy()
new_options.update(options)
options = new_options
return Document._render(self, font_config, counter_style, options)
def write_pdf(self, target=None, stylesheets=None, zoom=1,
attachments=None, finisher=None, presentational_hints=False,
optimize_size=('fonts',), font_config=None,
counter_style=None, image_cache=None, identifier=None,
variant=None, version=None, forms=False,
custom_metadata=False):
def write_pdf(self, target=None, zoom=1, finisher=None,
font_config=None, counter_style=None, **options):
"""Render the document to a PDF file.
This is a shortcut for calling :meth:`render`, then
@ -165,49 +222,37 @@ class HTML:
:param target:
A filename where the PDF file is generated, a file object, or
:obj:`None`.
:param list stylesheets:
An optional list of user stylesheets. The list's elements
are :class:`CSS` objects, filenames, URLs, or file-like
objects. (See :ref:`Stylesheet Origins`.)
:param float zoom:
The zoom factor in PDF units per CSS units. **Warning**:
All CSS units are affected, including physical units like
``cm`` and named sizes like ``A4``. For values other than
1, the physical CSS units will thus be "wrong".
:param list attachments: A list of additional file attachments for the
generated PDF document or :obj:`None`. The list's elements are
:class:`Attachment` objects, filenames, URLs or file-like objects.
:param finisher: A finisher function, that accepts the document and a
:class:`pydyf.PDF` object as parameters, can be passed to perform
:type finisher: :term:`callable`
:param finisher:
A finisher function or callable that accepts the document and a
:class:`pydyf.PDF` object as parameters. Can be passed to perform
post-processing on the PDF right before the trailer is written.
:param bool presentational_hints: Whether HTML presentational hints are
followed.
:param tuple optimize_size:
Optimize size of generated PDF. Can contain "images" and "fonts".
:type font_config: :class:`text.fonts.FontConfiguration`
:param font_config: A font configuration handling ``@font-face`` rules.
:param font_config:
A font configuration handling ``@font-face`` rules.
:type counter_style: :class:`css.counters.CounterStyle`
:param counter_style: A dictionary storing ``@counter-style`` rules.
:param dict image_cache: A dictionary used to cache images.
:param bytes identifier: A bytestring used as PDF file identifier.
:param str variant: A PDF variant name.
:param str version: A PDF version number.
:param bool forms: Whether PDF forms have to be included.
:param bool custom_metadata: Whether custom HTML metadata should be
stored in the generated PDF.
:param counter_style:
A dictionary storing ``@counter-style`` rules.
:param options:
The ``options`` parameter includes by default the
:data:`DEFAULT_OPTIONS` values.
:returns:
The PDF as :obj:`bytes` if ``target`` is not provided or
:obj:`None`, otherwise :obj:`None` (the PDF is written to
``target``).
"""
new_options = DEFAULT_OPTIONS.copy()
new_options.update(options)
options = new_options
return (
self.render(
stylesheets, presentational_hints, optimize_size, font_config,
counter_style, image_cache, forms)
.write_pdf(
target, zoom, attachments, finisher, identifier, variant,
version, custom_metadata))
self.render(font_config, counter_style, **options)
.write_pdf(target, zoom, finisher, **options))
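The option handling that the new ``render()`` and ``write_pdf()`` share can be sketched in isolation. This is a minimal illustration, not the library's code: the ``DEFAULT_OPTIONS`` dict below is a small stand-in whose keys appear in this diff but whose values are assumptions, and ``merge_options`` is a hypothetical helper name.

```python
# Stand-in defaults for illustration only. The real DEFAULT_OPTIONS in
# WeasyPrint has more keys, and its values are assumed here.
DEFAULT_OPTIONS = {
    'stylesheets': None,
    'presentational_hints': False,
    'pdf_forms': False,
    'uncompressed_pdf': False,
}


def merge_options(**options):
    """Return a new dict: defaults overridden by caller-supplied options."""
    merged = DEFAULT_OPTIONS.copy()  # copy so the module-level dict is untouched
    merged.update(options)
    return merged


options = merge_options(pdf_forms=True)
```

Copying before updating means that every key is present in the merged dict, so later code can index it directly (``options['pdf_forms']``) without mutating the shared defaults.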
class CSS:
@ -263,8 +308,9 @@ class Attachment:
supported. An optional description can be provided with the ``description``
argument.
:param description: A description of the attachment to be included in the
PDF document. May be :obj:`None`.
:param description:
A description of the attachment to be included in the PDF document.
May be :obj:`None`.
"""
def __init__(self, guess=None, filename=None, url=None, file_obj=None,


@ -4,10 +4,11 @@ import argparse
import logging
import platform
import sys
from warnings import warn
import pydyf
from . import HTML, LOGGER, __version__
from . import DEFAULT_OPTIONS, HTML, LOGGER, __version__
from .pdf import VARIANTS
from .text.ffi import pango
@ -27,148 +28,125 @@ class PrintInfo(argparse.Action):
sys.exit()
def main(argv=None, stdout=None, stdin=None):
class Parser(argparse.ArgumentParser):
def __init__(self, *args, **kwargs):
self._arguments = {}
super().__init__(*args, **kwargs)
def add_argument(self, *args, **kwargs):
super().add_argument(*args, **kwargs)
key = args[-1].lstrip('-')
kwargs['flags'] = args
kwargs['positional'] = args[-1][0] != '-'
self._arguments[key] = kwargs
@property
def docstring(self):
self._arguments['help'] = self._arguments.pop('help')
data = []
for key, args in self._arguments.items():
data.append('.. option:: ')
action = args.get('action', 'store')
for flag in args['flags']:
data.append(flag)
if not args['positional'] and action in ('store', 'append'):
data.append(f' <{key}>')
data.append(', ')
data[-1] = '\n\n'
data.append(f' {args["help"][0].upper()}{args["help"][1:]}.\n\n')
if 'choices' in args:
choices = ", ".join(args['choices'])
data.append(f' Possible choices: {choices}.\n\n')
if action == 'append':
data.append(' This option can be passed multiple times.\n\n')
return ''.join(data)
PARSER = Parser(
prog='weasyprint', description='Render web pages to PDF.')
PARSER.add_argument(
'input', help='URL or filename of the HTML input, or - for stdin')
PARSER.add_argument(
'output', help='filename where output is written, or - for stdout')
PARSER.add_argument(
'-e', '--encoding', help='force the input character encoding')
PARSER.add_argument(
'-s', '--stylesheet', action='append', dest='stylesheets',
help='URL or filename for a user CSS stylesheet')
PARSER.add_argument(
'-m', '--media-type',
help='media type to use for @media, defaults to print')
PARSER.add_argument(
'-u', '--base-url',
help='base for relative URLs in the HTML input, defaults to the '
'input’s own filename or URL or the current directory for stdin')
PARSER.add_argument(
'-a', '--attachment', action='append', dest='attachments',
help='URL or filename of a file to attach to the PDF document')
PARSER.add_argument('--pdf-identifier', help='PDF file identifier')
PARSER.add_argument(
'--pdf-variant', choices=VARIANTS, help='PDF variant to generate')
PARSER.add_argument('--pdf-version', help='PDF version number')
PARSER.add_argument(
'--pdf-forms', action='store_true', help='include PDF forms')
PARSER.add_argument(
'--uncompressed-pdf', action='store_true',
help='do not compress PDF content, mainly for debugging purpose')
PARSER.add_argument(
'--custom-metadata', action='store_true',
help='include custom HTML meta tags in PDF metadata')
PARSER.add_argument(
'-p', '--presentational-hints', action='store_true',
help='follow HTML presentational hints')
PARSER.add_argument(
'--optimize-images', action='store_true',
help='optimize size of embedded images with no quality loss')
PARSER.add_argument(
'-j', '--jpeg-quality', type=int,
help='JPEG quality between 0 (worst) to 95 (best)')
PARSER.add_argument(
'--full-fonts', action='store_true',
help='embed unmodified font files when possible')
PARSER.add_argument(
'--hinting', action='store_true',
help='keep hinting information in embedded fonts')
PARSER.add_argument(
'-c', '--cache-folder', dest='cache',
help='store cache on disk instead of memory, folder is '
'created if needed and cleaned after the PDF is generated')
PARSER.add_argument(
'-D', '--dpi', type=int,
help='set maximum resolution of images embedded in the PDF')
PARSER.add_argument(
'-v', '--verbose', action='store_true',
help='show warnings and information messages')
PARSER.add_argument(
'-d', '--debug', action='store_true', help='show debugging messages')
PARSER.add_argument(
'-q', '--quiet', action='store_true', help='hide logging messages')
PARSER.add_argument(
'--version', action='version',
version=f'WeasyPrint version {__version__}',
help='print WeasyPrint’s version number and exit')
PARSER.add_argument(
'-i', '--info', action=PrintInfo, nargs=0,
help='print system information and exit')
PARSER.add_argument(
'-O', '--optimize-size', action='append',
help='deprecated, use other options instead',
choices=('images', 'fonts', 'hinting', 'pdf', 'all', 'none'))
PARSER.set_defaults(**DEFAULT_OPTIONS)
def main(argv=None, stdout=None, stdin=None, HTML=HTML):
"""The ``weasyprint`` program takes at least two arguments:
.. code-block:: sh
weasyprint [options] <input> <output>
The input is a filename or URL to an HTML document, or ``-`` to read
HTML from stdin. The output is a filename, or ``-`` to write to stdout.
Options can be mixed anywhere before, between, or after the input and
output.
.. option:: -e <input_encoding>, --encoding <input_encoding>
Force the input character encoding (e.g. ``-e utf-8``).
.. option:: -s <filename_or_URL>, --stylesheet <filename_or_URL>
Filename or URL of a user cascading stylesheet (see
:ref:`Stylesheet Origins`) to add to the document
(e.g. ``-s print.css``). Multiple stylesheets are allowed.
.. option:: -m <type>, --media-type <type>
Set the media type to use for ``@media``. Defaults to ``print``.
.. option:: -u <URL>, --base-url <URL>
Set the base for relative URLs in the HTML input.
Defaults to the input’s own URL, or the current directory for stdin.
.. option:: -a <file>, --attachment <file>
Adds an attachment to the document. The attachment is included in the
PDF output. This option can be used multiple times.
.. option:: --pdf-identifier <identifier>
PDF file identifier, used to check whether two different files
are two different versions of the same original document.
.. option:: --pdf-variant <variant-name>
PDF variant to generate (e.g. ``--pdf-variant pdf/a-3b``).
.. option:: --pdf-version <version-number>
PDF version number (default is 1.7).
.. option:: --custom-metadata
Include custom HTML meta tags in PDF metadata.
.. option:: -p, --presentational-hints
Follow `HTML presentational hints
<https://www.w3.org/TR/html/rendering.html\
#the-css-user-agent-style-sheet-and-presentational-hints>`_.
.. option:: -O <type>, --optimize-size <type>
Optimize the size of generated documents. Supported types are
``images``, ``fonts``, ``all`` and ``none``. This option can be used
multiple times, ``all`` adds all allowed values, ``none`` removes all
previously set values.
.. option:: -v, --verbose
Show warnings and information messages.
.. option:: -d, --debug
Show debugging messages.
.. option:: -q, --quiet
Hide logging messages.
.. option:: --version
Show the version number. Other options and arguments are ignored.
.. option:: -h, --help
Show the command-line usage. Other options and arguments are ignored.
"""
parser = argparse.ArgumentParser(
prog='weasyprint', description='Render web pages to PDF.')
parser.add_argument(
'--version', action='version',
version=f'WeasyPrint version {__version__}',
help='print WeasyPrint’s version number and exit')
parser.add_argument(
'-i', '--info', action=PrintInfo, nargs=0,
help='print system information and exit')
parser.add_argument(
'-e', '--encoding', help='character encoding of the input')
parser.add_argument(
'-s', '--stylesheet', action='append',
help='URL or filename for a user CSS stylesheet, '
'may be given multiple times')
parser.add_argument(
'-m', '--media-type', default='print',
help='media type to use for @media, defaults to print')
parser.add_argument(
'-u', '--base-url',
help='base for relative URLs in the HTML input, defaults to the '
'input’s own filename or URL or the current directory for stdin')
parser.add_argument(
'-a', '--attachment', action='append',
help='URL or filename of a file to attach to the PDF document')
parser.add_argument('--pdf-identifier', help='PDF file identifier')
parser.add_argument(
'--pdf-variant', choices=VARIANTS, help='PDF variant to generate')
parser.add_argument('--pdf-version', help='PDF version number')
parser.add_argument(
'--pdf-forms', action='store_true', help='Include PDF forms')
parser.add_argument(
'--custom-metadata', action='store_true',
help='include custom HTML meta tags in PDF metadata')
parser.add_argument(
'-p', '--presentational-hints', action='store_true',
help='follow HTML presentational hints')
parser.add_argument(
'-O', '--optimize-size', action='append',
help='optimize output size for specified features',
choices=('images', 'fonts', 'all', 'none'), default=['fonts'])
parser.add_argument(
'-v', '--verbose', action='store_true',
help='show warnings and information messages')
parser.add_argument(
'-d', '--debug', action='store_true', help='show debugging messages')
parser.add_argument(
'-q', '--quiet', action='store_true', help='hide logging messages')
parser.add_argument(
'input', help='URL or filename of the HTML input, or - for stdin')
parser.add_argument(
'output', help='filename where output is written, or - for stdout')
args = parser.parse_args(argv)
args = PARSER.parse_args(argv)
if args.input == '-':
source = stdin or sys.stdin.buffer
@ -184,26 +162,34 @@ def main(argv=None, stdout=None, stdin=None):
else:
output = args.output
optimize_size = set()
for arg in args.optimize_size:
if arg == 'none':
optimize_size.clear()
elif arg == 'all':
optimize_size |= {'images', 'fonts'}
else:
optimize_size.add(arg)
# TODO: to be removed when --optimize-size is removed
optimize_size = {'fonts', 'hinting', 'pdf'}
if args.optimize_size is not None:
warn(
'The --optimize-size option is now deprecated '
'and will be removed in next version. '
'Please use the other options available in --help instead.',
category=FutureWarning)
for arg in args.optimize_size:
if arg == 'none':
optimize_size.clear()
elif arg == 'all':
optimize_size |= {'images', 'fonts', 'hinting', 'pdf'}
else:
optimize_size.add(arg)
del args.optimize_size
kwargs = {
'stylesheets': args.stylesheet,
'presentational_hints': args.presentational_hints,
'optimize_size': tuple(optimize_size),
'attachments': args.attachment,
'identifier': args.pdf_identifier,
'variant': args.pdf_variant,
'version': args.pdf_version,
'forms': args.pdf_forms,
'custom_metadata': args.custom_metadata,
}
options = vars(args)
# TODO: to be removed when --optimize-size is removed
if 'images' in optimize_size:
options['optimize_images'] = True
if 'fonts' not in optimize_size:
options['full_fonts'] = True
if 'hinting' not in optimize_size:
options['hinting'] = True
if 'pdf' not in optimize_size:
options['uncompressed_pdf'] = True
# Default to logging to stderr.
if args.debug:
@ -218,7 +204,10 @@ def main(argv=None, stdout=None, stdin=None):
html = HTML(
source, base_url=args.base_url, encoding=args.encoding,
media_type=args.media_type)
html.write_pdf(output, **kwargs)
html.write_pdf(output, **options)
main.__doc__ += '\n\n' + PARSER.docstring
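The compatibility shim for the deprecated ``--optimize-size`` option can be read as a pure function from the old values to the new boolean options. The sketch below mirrors the logic in this diff. The function name ``translate_optimize_size`` is hypothetical, and it returns only the keys it sets instead of updating the parsed arguments in place.

```python
def translate_optimize_size(values):
    """Map deprecated -O values to the newer boolean options.

    ``values`` is the list collected by argparse, or None when -O was
    not passed at all.
    """
    # Starting set used when -O is absent, as in the diff above.
    optimize_size = {'fonts', 'hinting', 'pdf'}
    for value in values or []:
        if value == 'none':
            optimize_size.clear()
        elif value == 'all':
            optimize_size |= {'images', 'fonts', 'hinting', 'pdf'}
        else:
            optimize_size.add(value)

    # Except for 'images', each new flag turns an optimization off,
    # so it is set when the matching value is missing from the set.
    options = {}
    if 'images' in optimize_size:
        options['optimize_images'] = True
    if 'fonts' not in optimize_size:
        options['full_fonts'] = True
    if 'hinting' not in optimize_size:
        options['hinting'] = True
    if 'pdf' not in optimize_size:
        options['uncompressed_pdf'] = True
    return options


nothing_passed = translate_optimize_size(None)
only_none = translate_optimize_size(['none'])
```

Values are applied in order, so ``-O none -O images`` first clears the set and then enables image optimization only.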
if __name__ == '__main__': # pragma: no cover


@ -2,9 +2,10 @@
import functools
import io
import shutil
from hashlib import md5
from pathlib import Path
from . import CSS
from . import CSS, DEFAULT_OPTIONS
from .anchors import gather_anchors, make_page_bookmark_tree
from .css import get_all_computed_styles
from .css.counters import CounterStyle
@ -159,6 +160,51 @@ class DocumentMetadata:
self.custom = custom or {}
class DiskCache:
"""Dict-like storing images content on disk.
Bytestring values are stored on disk. Other lightweight Python objects
(i.e. RasterImage instances) are still stored in memory.
"""
def __init__(self, folder):
self._path = Path(folder)
self._path.mkdir(parents=True, exist_ok=True)
self._memory_cache = {}
self._disk_paths = set()
def _path_from_key(self, key):
return self._path / md5(key.encode()).hexdigest()
def __getitem__(self, key):
if key in self._memory_cache:
return self._memory_cache[key]
else:
return self._path_from_key(key).read_bytes()
def __setitem__(self, key, value):
if isinstance(value, bytes):
path = self._path_from_key(key)
self._disk_paths.add(path)
path.write_bytes(value)
else:
self._memory_cache[key] = value
def __contains__(self, key):
return (
key in self._memory_cache or
self._path_from_key(key).exists())
def __del__(self):
try:
for path in self._disk_paths:
path.unlink(missing_ok=True)
self._path.rmdir()
except Exception:
# Silently ignore errors while clearing cache
pass
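The storage rule of ``DiskCache`` (bytestrings go to a file named after the MD5 of the key, everything else stays in memory) can be tried without WeasyPrint installed. The class below is a simplified stand-in with a hypothetical name, ``TinyDiskCache``. It omits ``__contains__`` and the cleanup in ``__del__``, and the URL used as a key is only an example.

```python
import tempfile
from hashlib import md5
from pathlib import Path


class TinyDiskCache:
    """Simplified stand-in for the DiskCache class above."""

    def __init__(self, folder):
        self._path = Path(folder)
        self._path.mkdir(parents=True, exist_ok=True)
        self._memory = {}

    def _path_from_key(self, key):
        # Hash the key so that any string maps to a safe file name.
        return self._path / md5(key.encode()).hexdigest()

    def __setitem__(self, key, value):
        if isinstance(value, bytes):
            self._path_from_key(key).write_bytes(value)
        else:
            self._memory[key] = value

    def __getitem__(self, key):
        if key in self._memory:
            return self._memory[key]
        return self._path_from_key(key).read_bytes()


with tempfile.TemporaryDirectory() as folder:
    cache = TinyDiskCache(folder)
    cache['https://example.com/a.png'] = b'raw image bytes'
    cache['parsed'] = ('not', 'bytes')
    on_disk = sorted(path.name for path in Path(folder).iterdir())
    from_disk = cache['https://example.com/a.png']
    from_memory = cache['parsed']
```

In this sketch a key that was never stored raises ``FileNotFoundError`` from ``read_bytes()`` rather than ``KeyError``, so callers would check membership before reading.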
class Document:
"""A rendered document ready to be painted in a pydyf stream.
@ -171,9 +217,7 @@ class Document:
"""
@classmethod
def _build_layout_context(cls, html, stylesheets, presentational_hints,
optimize_size, font_config, counter_style,
image_cache, forms):
def _build_layout_context(cls, html, font_config, counter_style, options):
if font_config is None:
font_config = FontConfiguration()
if counter_style is None:
@ -181,19 +225,24 @@ class Document:
target_collector = TargetCollector()
page_rules = []
user_stylesheets = []
image_cache = {} if image_cache is None else image_cache
for css in stylesheets or []:
cache = options['cache']
if cache is None:
cache = {}
elif not isinstance(cache, (dict, DiskCache)):
cache = DiskCache(cache)
for css in options['stylesheets'] or []:
if not hasattr(css, 'matcher'):
css = CSS(
guess=css, media_type=html.media_type,
font_config=font_config, counter_style=counter_style)
user_stylesheets.append(css)
style_for = get_all_computed_styles(
html, user_stylesheets, presentational_hints, font_config,
counter_style, page_rules, target_collector, forms)
html, user_stylesheets, options['presentational_hints'],
font_config, counter_style, page_rules, target_collector,
options['pdf_forms'])
get_image_from_uri = functools.partial(
original_get_image_from_uri, cache=image_cache,
url_fetcher=html.url_fetcher, optimize_size=optimize_size)
original_get_image_from_uri, cache=cache,
url_fetcher=html.url_fetcher, options=options)
PROGRESS_LOGGER.info('Step 4 - Creating formatting structure')
context = LayoutContext(
style_for, get_image_from_uri, font_config, counter_style,
@ -201,8 +250,7 @@ class Document:
return context
@classmethod
def _render(cls, html, stylesheets, presentational_hints, optimize_size,
font_config, counter_style, image_cache, forms):
def _render(cls, html, font_config, counter_style, options):
if font_config is None:
font_config = FontConfiguration()
@ -210,8 +258,7 @@ class Document:
counter_style = CounterStyle()
context = cls._build_layout_context(
html, stylesheets, presentational_hints, optimize_size,
font_config, counter_style, image_cache, forms)
html, font_config, counter_style, options)
root_box = build_formatting_structure(
html.etree_element, context.style_for, context.get_image_from_uri,
@ -222,12 +269,11 @@ class Document:
rendering = cls(
[Page(page_box) for page_box in page_boxes],
DocumentMetadata(**get_html_metadata(html)),
html.url_fetcher, font_config, optimize_size)
html.url_fetcher, font_config)
rendering._html = html
return rendering
def __init__(self, pages, metadata, url_fetcher, font_config,
optimize_size):
def __init__(self, pages, metadata, url_fetcher, font_config):
#: A list of :class:`Page` objects.
self.pages = pages
#: A :class:`DocumentMetadata` object.
@ -246,9 +292,6 @@ class Document:
# rendering is destroyed. This is needed as font_config.__del__ removes
# fonts that may be used when rendering
self.font_config = font_config
# Set of flags for PDF size optimization. Can contain "images" and
# "fonts".
self._optimize_size = optimize_size
def build_element_structure(self, structure, etree_element=None):
if etree_element is None:
@ -288,8 +331,7 @@ class Document:
elif not isinstance(pages, list):
pages = list(pages)
return type(self)(
pages, self.metadata, self.url_fetcher, self.font_config,
self._optimize_size)
pages, self.metadata, self.url_fetcher, self.font_config)
def make_bookmark_tree(self, scale=1, transform_pages=False):
"""Make a tree of all bookmarks in the document.
@ -324,9 +366,7 @@ class Document:
page_number, matrix)
return root
def write_pdf(self, target=None, zoom=1, attachments=None, finisher=None,
identifier=None, variant=None, version=None,
custom_metadata=False):
def write_pdf(self, target=None, zoom=1, finisher=None, **options):
"""Paint the pages in a PDF file, with metadata.
:type target:
@ -339,40 +379,38 @@ class Document:
All CSS units are affected, including physical units like
``cm`` and named sizes like ``A4``. For values other than
1, the physical CSS units will thus be "wrong".
:param list attachments: A list of additional file attachments for the
generated PDF document or :obj:`None`. The list's elements are
:class:`weasyprint.Attachment` objects, filenames, URLs or
file-like objects.
:param finisher: A finisher function, that accepts the document and a
:class:`pydyf.PDF` object as parameters, can be passed to perform
:type finisher: :term:`callable`
:param finisher:
A finisher function or callable that accepts the document and a
:class:`pydyf.PDF` object as parameters. Can be passed to perform
post-processing on the PDF right before the trailer is written.
:param bytes identifier: A bytestring used as PDF file identifier.
:param str variant: A PDF variant name.
:param str version: A PDF version number.
:param bool custom_metadata: A boolean defining whether custom HTML
metadata should be stored in the generated PDF.
:param options:
The ``options`` parameter includes by default the
:data:`weasyprint.DEFAULT_OPTIONS` values.
:returns:
The PDF as :obj:`bytes` if ``target`` is not provided or
:obj:`None`, otherwise :obj:`None` (the PDF is written to
``target``).
"""
pdf = generate_pdf(
self, target, zoom, attachments, self._optimize_size, identifier,
variant, version, custom_metadata)
new_options = DEFAULT_OPTIONS.copy()
new_options.update(options)
options = new_options
pdf = generate_pdf(self, target, zoom, **options)
identifier = options['pdf_identifier']
compress = not options['uncompressed_pdf']
if finisher:
finisher(self, pdf)
output = io.BytesIO()
pdf.write(output, version=pdf.version, identifier=identifier)
if target is None:
output = io.BytesIO()
pdf.write(output, pdf.version, identifier, compress)
return output.getvalue()
if hasattr(target, 'write'):
pdf.write(target, pdf.version, identifier, compress)
else:
output.seek(0)
if hasattr(target, 'write'):
shutil.copyfileobj(output, target)
else:
with open(target, 'wb') as fd:
shutil.copyfileobj(output, fd)
with open(target, 'wb') as fd:
pdf.write(fd, pdf.version, identifier, compress)


@ -1052,6 +1052,10 @@ def draw_emojis(stream, font_size, x, y, emojis):
def draw_first_line(stream, textbox, text_overflow, block_ellipsis, x, y,
angle=0):
"""Draw the given ``textbox`` line to the document ``stream``."""
# Don’t draw lines with only invisible characters
if not textbox.text.strip():
return []
font_size = textbox.style['font_size']
if font_size < 1e-6: # Default float precision used by pydyf
return []
@ -1198,8 +1202,7 @@ def draw_first_line(stream, textbox, text_overflow, block_ellipsis, x, y,
png_data = ffi.unpack(hb_data, int(stream.length[0]))
pillow_image = Image.open(BytesIO(png_data))
image_id = f'{font.hash}{glyph}'
image = RasterImage(
pillow_image, image_id, optimize_size=())
image = RasterImage(pillow_image, image_id, png_data)
d = font.widths[glyph] / 1000
a = pillow_image.width / pillow_image.height * d
pango.pango_font_get_glyph_extents(
@ -1210,7 +1213,7 @@ def draw_first_line(stream, textbox, text_overflow, block_ellipsis, x, y,
f = f / font_size - font_size
emojis.append([image, font, a, d, x_advance, f])
x_advance += (font.widths[glyph] + offset) / 1000
x_advance += (font.widths[glyph] + offset - kerning) / 1000
# Close the last glyphs list, remove if empty
if string[-1] == '<':


@ -45,9 +45,6 @@ HTML_SPACE_SEPARATED_TOKENS_RE = re.compile(f'[^{HTML_WHITESPACE}]+')
def ascii_lower(string):
r"""Transform (only) ASCII letters to lower case: A-Z is mapped to a-z.
:param string: An Unicode string.
:returns: A new Unicode string.
This is used for `ASCII case-insensitive
<https://whatwg.org/C#ascii-case-insensitive>`_ matching.
@ -66,15 +63,9 @@ def ascii_lower(string):
def element_has_link_type(element, link_type):
"""
Return whether the given element has a ``rel`` attribute with the
given link type.
:param link_type: Must be a lower-case string.
"""
return any(ascii_lower(token) == link_type for token in
HTML_SPACE_SEPARATED_TOKENS_RE.findall(element.get('rel', '')))
"""Return whether element has a ``rel`` attribute with given link type."""
tokens = HTML_SPACE_SEPARATED_TOKENS_RE.findall(element.get('rel', ''))
return any(ascii_lower(token) == link_type for token in tokens)
# Maps HTML tag names to function taking an HTML element and returning a Box.


@ -1,14 +1,21 @@
"""Fetch and decode images in various formats."""
import io
import math
import struct
from hashlib import md5
from io import BytesIO
from itertools import cycle
from math import inf
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import url2pathname
from xml.etree import ElementTree
import pydyf
from PIL import Image, ImageFile, ImageOps
from . import DEFAULT_OPTIONS
from .layout.percent import percentage
from .logger import LOGGER
from .svg import SVG
@ -33,32 +40,211 @@ class ImageLoadingError(ValueError):
class RasterImage:
def __init__(self, pillow_image, image_id, optimize_size):
pillow_image.id = image_id
self._pillow_image = pillow_image
self._optimize_size = optimize_size
self._intrinsic_width = pillow_image.width
self._intrinsic_height = pillow_image.height
self._intrinsic_ratio = (
self._intrinsic_width / self._intrinsic_height
if self._intrinsic_height != 0 else inf)
def __init__(self, pillow_image, image_id, image_data, filename=None,
cache=None, orientation='none', options=DEFAULT_OPTIONS):
# Transpose image
original_pillow_image = pillow_image
pillow_image = rotate_pillow_image(pillow_image, orientation)
if original_pillow_image is not pillow_image:
# Keep image format as it is discarded by transposition
pillow_image.format = original_pillow_image.format
# Discard original data, as the image has been transformed
image_data = filename = None
def get_intrinsic_size(self, image_resolution, font_size):
return (
self._intrinsic_width / image_resolution,
self._intrinsic_height / image_resolution,
self._intrinsic_ratio)
self.id = image_id
self._cache = {} if cache is None else cache
self._jpeg_quality = jpeg_quality = options['jpeg_quality']
self._dpi = options['dpi']
if 'transparency' in pillow_image.info:
pillow_image = pillow_image.convert('RGBA')
elif pillow_image.mode in ('1', 'P', 'I'):
pillow_image = pillow_image.convert('RGB')
self.mode = pillow_image.mode
self.width = pillow_image.width
self.height = pillow_image.height
self.ratio = (self.width / self.height) if self.height != 0 else inf
self.optimize = optimize = options['optimize_images']
if pillow_image.format in ('JPEG', 'MPO'):
self.format = 'JPEG'
if image_data is None or optimize or jpeg_quality is not None:
image_file = io.BytesIO()
options = {'format': 'JPEG', 'optimize': optimize}
if self._jpeg_quality is not None:
options['quality'] = self._jpeg_quality
pillow_image.save(image_file, **options)
image_data = image_file.getvalue()
filename = None
else:
self.format = 'PNG'
if image_data is None or optimize or pillow_image.format != 'PNG':
image_file = io.BytesIO()
pillow_image.save(image_file, format='PNG', optimize=optimize)
image_data = image_file.getvalue()
filename = None
self.image_data = self.cache_image_data(image_data, filename)
def get_intrinsic_size(self, resolution, font_size):
return self.width / resolution, self.height / resolution, self.ratio
def draw(self, stream, concrete_width, concrete_height, image_rendering):
if self._intrinsic_width <= 0 or self._intrinsic_height <= 0:
if self.width <= 0 or self.height <= 0:
return
image_name = stream.add_image(
self._pillow_image, image_rendering, self._optimize_size)
width, height = self.width, self.height
if self._dpi:
pt_to_in = 4 / 3 / 96
width_inches = abs(concrete_width * stream.ctm[0][0] * pt_to_in)
height_inches = abs(concrete_height * stream.ctm[1][1] * pt_to_in)
dpi = max(self.width / width_inches, self.height / height_inches)
if dpi > self._dpi:
ratio = self._dpi / dpi
image = Image.open(io.BytesIO(self.image_data.data))
width = int(round(self.width * ratio))
height = int(round(self.height * ratio))
image.thumbnail((max(1, width), max(1, height)))
image_file = io.BytesIO()
image.save(
image_file, format=image.format, optimize=self.optimize)
width, height = image.width, image.height
self.image_data = self.cache_image_data(image_file.getvalue())
else:
dpi = None
interpolate = 'true' if image_rendering == 'auto' else 'false'
image_name = stream.add_image(self, width, height, interpolate)
stream.transform(
concrete_width, 0, 0, -concrete_height, 0, concrete_height)
stream.draw_x_object(image_name)
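The resolution check in ``draw()`` is plain arithmetic and can be worked through on its own. The sketch below assumes the drawn size is already expressed in PDF points (in the diff this is ``concrete_width * stream.ctm[0][0]``). The function name ``downscaled_size`` and its parameters are hypothetical, and it only computes the target box that the diff passes to ``Image.thumbnail``.

```python
def downscaled_size(width, height, drawn_width_pt, drawn_height_pt, max_dpi):
    """Return the pixel size to use so the image does not exceed max_dpi."""
    pt_to_in = 4 / 3 / 96  # points to inches, equal to 1 / 72
    width_inches = abs(drawn_width_pt * pt_to_in)
    height_inches = abs(drawn_height_pt * pt_to_in)
    # Effective resolution is the denser of the two axes.
    dpi = max(width / width_inches, height / height_inches)
    if dpi <= max_dpi:
        return width, height
    ratio = max_dpi / dpi
    return (
        max(1, int(round(width * ratio))),
        max(1, int(round(height * ratio))))


# A 3000x2000 image drawn on 720pt x 480pt (10in x 6.67in) is about 300 dpi.
halved = downscaled_size(3000, 2000, 720, 480, max_dpi=150)
unchanged = downscaled_size(3000, 2000, 720, 480, max_dpi=600)
```

With a 150 dpi limit the ratio is one half, so the image would be resampled to 1500x1000 pixels. With a 600 dpi limit it is left untouched.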
def cache_image_data(self, data, filename=None, alpha=False):
if filename:
return LazyLocalImage(filename)
else:
key = f'{self.id}{int(alpha)}{self._dpi or ""}'
return LazyImage(self._cache, key, data)
def get_xobject(self, width, height, interpolate):
if self.mode in ('RGB', 'RGBA'):
color_space = '/DeviceRGB'
elif self.mode in ('L', 'LA'):
color_space = '/DeviceGray'
elif self.mode == 'CMYK':
color_space = '/DeviceCMYK'
else:
LOGGER.warning('Unknown image mode: %s', self.mode)
color_space = '/DeviceRGB'
extra = pydyf.Dictionary({
'Type': '/XObject',
'Subtype': '/Image',
'Width': width,
'Height': height,
'ColorSpace': color_space,
'BitsPerComponent': 8,
'Interpolate': interpolate,
})
if self.format == 'JPEG':
extra['Filter'] = '/DCTDecode'
return pydyf.Stream([self.image_data], extra)
extra['Filter'] = '/FlateDecode'
extra['DecodeParms'] = pydyf.Dictionary({
# Predictor 15 specifies that we're providing PNG data,
# ostensibly using an "optimum predictor", but doesn't actually
# matter as long as the predictor value is 10+ according to the
# spec. (Other PNG predictor values assert that we're using
# specific predictors that we don't want to commit to, but
# "optimum" can vary.)
'Predictor': 15,
'Columns': width,
})
if self.mode in ('RGB', 'RGBA'):
# Defaults to 1.
extra['DecodeParms']['Colors'] = 3
if self.mode in ('RGBA', 'LA'):
# Remove alpha channel from image
pillow_image = Image.open(io.BytesIO(self.image_data.data))
alpha = pillow_image.getchannel('A')
pillow_image = pillow_image.convert(self.mode[:-1])
png_data = self._get_png_data(pillow_image)
# Save alpha channel as mask
alpha_data = self._get_png_data(alpha)
stream = self.cache_image_data(alpha_data, alpha=True)
extra['SMask'] = pydyf.Stream([stream], extra={
'Filter': '/FlateDecode',
'Type': '/XObject',
'Subtype': '/Image',
'DecodeParms': pydyf.Dictionary({
'Predictor': 15,
'Columns': width,
}),
'Width': width,
'Height': height,
'ColorSpace': '/DeviceGray',
'BitsPerComponent': 8,
'Interpolate': interpolate,
})
else:
png_data = self._get_png_data(
Image.open(io.BytesIO(self.image_data.data)))
return pydyf.Stream([self.cache_image_data(png_data)], extra)
@staticmethod
def _get_png_data(pillow_image):
image_file = BytesIO()
pillow_image.save(image_file, format='PNG')
# Read the PNG header, then discard it because we know it's a PNG. If
# this weren't just output from Pillow, we should actually check it.
image_file.seek(8)
png_data = []
raw_chunk_length = image_file.read(4)
# PNG files consist of a series of chunks.
while raw_chunk_length:
# Each chunk begins with its data length (four bytes, may be zero),
# then its type (four ASCII characters), then the data, then four
# bytes of a CRC.
chunk_length, = struct.unpack('!I', raw_chunk_length)
chunk_type = image_file.read(4)
if chunk_type == b'IDAT':
png_data.append(image_file.read(chunk_length))
else:
image_file.seek(chunk_length, io.SEEK_CUR)
# We aren't checking the CRC, we assume this is a valid PNG.
image_file.seek(4, io.SEEK_CUR)
raw_chunk_length = image_file.read(4)
return b''.join(png_data)
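The chunk walk in ``_get_png_data`` can be exercised without Pillow by building a tiny PNG by hand. This is a sketch under stated assumptions: ``make_chunk`` and ``extract_idat`` are hypothetical names, the image is a 1x1 grayscale pixel, and the compressed data is split across two ``IDAT`` chunks only to show that the payloads are concatenated.

```python
import io
import struct
import zlib


def make_chunk(chunk_type, data):
    """Build one PNG chunk: length, type, data, CRC."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return (
        struct.pack('!I', len(data)) + chunk_type + data +
        struct.pack('!I', crc))


def extract_idat(png_bytes):
    """Concatenate the payloads of all IDAT chunks, skipping the rest."""
    png_file = io.BytesIO(png_bytes)
    png_file.seek(8)  # skip the 8-byte PNG signature
    idat = []
    raw_length = png_file.read(4)
    while raw_length:
        length, = struct.unpack('!I', raw_length)
        chunk_type = png_file.read(4)
        if chunk_type == b'IDAT':
            idat.append(png_file.read(length))
        else:
            png_file.seek(length, io.SEEK_CUR)
        png_file.seek(4, io.SEEK_CUR)  # skip the CRC without checking it
        raw_length = png_file.read(4)
    return b''.join(idat)


# One scanline: filter type byte 0 followed by a single gray pixel.
scanline = b'\x00\x7f'
compressed = zlib.compress(scanline)
header = struct.pack('!IIBBBBB', 1, 1, 8, 0, 0, 0, 0)  # 1x1, 8-bit gray
png = (
    b'\x89PNG\r\n\x1a\n' +
    make_chunk(b'IHDR', header) +
    make_chunk(b'IDAT', compressed[:3]) +
    make_chunk(b'IDAT', compressed[3:]) +
    make_chunk(b'IEND', b''))

payload = extract_idat(png)
```

The joined payload is a zlib stream of filtered scanlines, which is why the diff can hand it to the PDF with ``/FlateDecode`` and a PNG predictor instead of re-encoding the pixels.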
class LazyImage(pydyf.Object):
def __init__(self, cache, key, data):
super().__init__()
self._key = key
self._cache = cache
cache[key] = data
@property
def data(self):
return self._cache[self._key]
class LazyLocalImage(pydyf.Object):
def __init__(self, filename):
super().__init__()
self._filename = filename
@property
def data(self):
return Path(self._filename).read_bytes()
class SVGImage:
def __init__(self, tree, base_url, url_fetcher, context):
@ -91,75 +277,88 @@ class SVGImage:
self._url_fetcher, self._context)
def get_image_from_uri(cache, url_fetcher, optimize_size, url,
forced_mime_type=None, context=None,
orientation='from-image'):
def get_image_from_uri(cache, url_fetcher, options, url, forced_mime_type=None,
context=None, orientation='from-image'):
"""Get an Image instance from an image URI."""
if url in cache:
return cache[url]
try:
with fetch(url_fetcher, url) as result:
parsed_url = urlparse(result.get('redirected_url'))
if parsed_url.scheme == 'file':
filename = url2pathname(parsed_url.path)
else:
filename = None
if 'string' in result:
string = result['string']
else:
string = result['file_obj'].read()
mime_type = forced_mime_type or result['mime_type']
image = None
svg_exceptions = []
# Try to rely on given mimetype for SVG
if mime_type == 'image/svg+xml':
image = None
svg_exceptions = []
# Try to rely on given mimetype for SVG
if mime_type == 'image/svg+xml':
try:
tree = ElementTree.fromstring(string)
image = SVGImage(tree, url, url_fetcher, context)
except Exception as svg_exception:
svg_exceptions.append(svg_exception)
# Try pillow for raster images, or for failing SVG
if image is None:
try:
pillow_image = Image.open(BytesIO(string))
except Exception as raster_exception:
if mime_type == 'image/svg+xml':
# Tried SVGImage then Pillow for a SVG, abort
raise ImageLoadingError.from_exception(svg_exceptions[0])
try:
# Last chance, try SVG
tree = ElementTree.fromstring(string)
image = SVGImage(tree, url, url_fetcher, context)
except Exception as svg_exception:
svg_exceptions.append(svg_exception)
# Try pillow for raster images, or for failing SVG
if image is None:
try:
pillow_image = Image.open(BytesIO(string))
except Exception as raster_exception:
if mime_type == 'image/svg+xml':
# Tried SVGImage then Pillow for a SVG, abort
raise ImageLoadingError.from_exception(
svg_exceptions[0])
try:
# Last chance, try SVG
tree = ElementTree.fromstring(string)
image = SVGImage(tree, url, url_fetcher, context)
except Exception:
# Tried Pillow then SVGImage for a raster, abort
raise ImageLoadingError.from_exception(
raster_exception)
else:
# Store image id to enable cache in Stream.add_image
image_id = md5(url.encode()).hexdigest()
# Keep image format as it is discarded by transposition
image_format = pillow_image.format
if orientation == 'from-image':
if 'exif' in pillow_image.info:
pillow_image = ImageOps.exif_transpose(
pillow_image)
elif orientation != 'none':
angle, flip = orientation
if angle > 0:
rotation = getattr(
Image.Transpose, f'ROTATE_{angle}')
pillow_image = pillow_image.transpose(rotation)
if flip:
pillow_image = pillow_image.transpose(
Image.Transpose.FLIP_LEFT_RIGHT)
pillow_image.format = image_format
image = RasterImage(pillow_image, image_id, optimize_size)
except Exception:
# Tried Pillow then SVGImage for a raster, abort
raise ImageLoadingError.from_exception(raster_exception)
else:
# Store image id to enable cache in Stream.add_image
image_id = md5(url.encode()).hexdigest()
image = RasterImage(
pillow_image, image_id, string, filename, cache,
orientation, options)
except (URLFetchingError, ImageLoadingError) as exception:
LOGGER.error('Failed to load image at %r: %s', url, exception)
image = None
cache[url] = image
return image
def rotate_pillow_image(pillow_image, orientation):
"""Return a copy of a Pillow image with modified orientation.
If orientation is not changed, return the same image.
"""
image_format = pillow_image.format
if orientation == 'from-image':
if 'exif' in pillow_image.info:
pillow_image = ImageOps.exif_transpose(pillow_image)
elif orientation != 'none':
angle, flip = orientation
if angle > 0:
rotation = getattr(Image.Transpose, f'ROTATE_{angle}')
pillow_image = pillow_image.transpose(rotation)
if flip:
pillow_image = pillow_image.transpose(
Image.Transpose.FLIP_LEFT_RIGHT)
# Keep image format as it is discarded by transposition
pillow_image.format = image_format
return pillow_image
def process_color_stops(vector_length, positions):
"""Give color stops positions on the gradient vector.


@ -105,10 +105,12 @@ def layout_document(html, root_box, context, max_loops=8):
This includes line breaks, page breaks, absolute size and position for all
boxes. Page based counters might require multiple passes.
:param root_box: root of the box tree (formatting structure of the html)
the pages' boxes are created from that tree, i.e. this
structure is not lost during pagination
:returns: a list of laid out Page objects.
:param root_box:
Root of the box tree (formatting structure of the HTML). The page boxes
are created from that tree, this structure is not lost during
pagination.
:returns:
A list of laid out Page objects.
"""
initialize_page_maker(context, root_box)
@ -287,13 +289,18 @@ class LayoutContext:
Value depends on current page.
https://drafts.csswg.org/css-gcpm/#funcdef-string
:param store: dictionary where the resolved value is stored.
:param page: current page.
:param name: name of the named string or running element.
:param keyword: indicates which value of the named string or running
element to use. Default is the first assignment on the
current page else the most recent assignment.
:returns: text for string set, box for running element
:param dict store:
Dictionary where the resolved value is stored.
:param page:
Current page.
:param str name:
Name of the named string or running element.
:param str keyword:
Indicates which value of the named string or running element to
use. Default is the first assignment on the current page else the
most recent assignment.
:returns:
Text for string set, box for running element.
"""
if self.current_page in store[name]:


@ -174,20 +174,21 @@ def compute_fixed_dimension(context, box, outer, vertical, top_or_left):
def compute_variable_dimension(context, side_boxes, vertical, outer_sum):
"""
Compute and set a margin box fixed dimension on ``box``, as described in:
https://drafts.csswg.org/css-page-3/#margin-dimension
"""Compute and set a margin box fixed dimension on ``box``
:param side_boxes: Three boxes on a same side (as opposed to a corner.)
Described in: https://drafts.csswg.org/css-page-3/#margin-dimension
:param side_boxes:
Three boxes on a same side (as opposed to a corner).
A list of:
- A @*-left or @*-top margin box
- A @*-center or @*-middle margin box
- A @*-right or @*-bottom margin box
:param vertical:
True to set height, margin-top and margin-bottom; False for width,
margin-left and margin-right
``True`` to set height, margin-top and margin-bottom;
``False`` for width, margin-left and margin-right.
:param outer_sum:
The target total outer dimension (max box width or height)
The target total outer dimension (max box width or height).
"""
box_class = VerticalBox if vertical else HorizontalBox
@ -310,8 +311,10 @@ def make_margin_boxes(context, page, state):
Return ``None`` if this margin box should not be generated.
:param at_keyword: which margin box to return, eg. '@top-left'
:param containing_block: as expected by :func:`resolve_percentages`.
:param at_keyword:
Which margin box to return, e.g. '@top-left'
:param containing_block:
As expected by :func:`resolve_percentages`.
"""
style = context.style_for(page.page_type, at_keyword)
@ -507,9 +510,11 @@ def make_page(context, root_box, page_type, resume_at, page_number,
and ``resume_at`` indicates where in the document to start the next page,
or is ``None`` if this was the last page.
:param page_number: integer, start at 1 for the first page
:param resume_at: as returned by ``make_page()`` for the previous page,
or ``None`` for the first page.
:param int page_number:
Page number, starts at 1 for the first page.
:param resume_at:
As returned by ``make_page()`` for the previous page, or ``None`` for
the first page.
"""
style = context.style_for(page_type)


@ -744,9 +744,12 @@ def trailing_whitespace_size(context, box):
stripped_box = box.copy_with_text(stripped_text)
stripped_box, resume, _ = split_text_box(
context, stripped_box, None, old_resume)
assert stripped_box is not None
assert resume is None
return old_box.width - stripped_box.width
if stripped_box is None:
# old_box split just before the trailing spaces
return old_box.width
else:
assert resume is None
return old_box.width - stripped_box.width
else:
_, _, _, width, _, _ = split_first_line(
box.text, box.style, context, None, box.justification_spacing)


@ -45,7 +45,7 @@ def _w3c_date_to_pdf(string, attr_name):
pdf_date += f"{tz_hour:+03d}'{tz_minute:02d}"
else:
pdf_date += 'Z'
return pdf_date
return f'D:{pdf_date}'
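The change above adds the `D:` prefix that PDF date strings carry. As a rough illustration of the resulting format, here is a minimal sketch that formats a `datetime` the same way; `to_pdf_date` is a hypothetical helper, not part of WeasyPrint, and it assumes a naive datetime should be treated as UTC.

```python
from datetime import datetime, timedelta, timezone


def to_pdf_date(moment):
    """Format a datetime as a PDF date string (hypothetical helper)."""
    pdf_date = moment.strftime('%Y%m%d%H%M%S')
    offset = moment.utcoffset()
    if offset:
        total_minutes = int(offset.total_seconds() // 60)
        sign = '-' if total_minutes < 0 else '+'
        tz_hour, tz_minute = divmod(abs(total_minutes), 60)
        pdf_date += f"{sign}{tz_hour:02d}'{tz_minute:02d}"
    else:
        # Assumption: no offset (naive or UTC) is written as 'Z'
        pdf_date += 'Z'
    return f'D:{pdf_date}'


paris = timezone(timedelta(hours=2))
with_offset = to_pdf_date(datetime(2023, 4, 24, 15, 29, 58, tzinfo=paris))
in_utc = to_pdf_date(datetime(2023, 4, 24, 13, 29, 58, tzinfo=timezone.utc))
```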
def _reference_resources(pdf, resources, images, fonts):
@ -100,8 +100,7 @@ def _use_references(pdf, resources, images):
alpha['SMask']['G'] = alpha['SMask']['G'].reference
def generate_pdf(document, target, zoom, attachments, optimize_size,
identifier, variant, version, custom_metadata):
def generate_pdf(document, target, zoom, **options):
# 0.75 = 72 PDF point per inch / 96 CSS pixel per inch
scale = zoom * 0.75
@ -109,6 +108,7 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,
# Set properties according to PDF variants
mark = False
variant, version = options['pdf_variant'], options['pdf_version']
if variant:
variant_function, properties = VARIANTS[variant]
if 'version' in properties:
@ -116,6 +116,7 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,
if 'mark' in properties:
mark = properties['mark']
identifier = options['pdf_identifier']
pdf = pydyf.PDF((version or '1.7'), identifier)
states = pydyf.Dictionary()
x_objects = pydyf.Dictionary()
@ -136,6 +137,7 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,
annot_files = {}
pdf_pages, page_streams = [], []
compress = not options['uncompressed_pdf']
for page_number, (page, links_and_anchors) in enumerate(
zip(document.pages, page_links_and_anchors)):
# Draw from the top-left corner
@ -155,7 +157,7 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,
(right - left) / scale, (bottom - top) / scale)
stream = Stream(
document.fonts, page_rectangle, states, x_objects, patterns,
shadings, images, mark)
shadings, images, mark, compress=compress)
stream.transform(d=-1, f=(page.height * scale))
pdf.add_object(stream)
page_streams.append(stream)
@ -175,10 +177,11 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,
add_links(links_and_anchors, matrix, pdf, pdf_page, pdf_names, mark)
add_annotations(
links_and_anchors[0], matrix, document, pdf, pdf_page, annot_files)
links_and_anchors[0], matrix, document, pdf, pdf_page, annot_files,
compress)
add_inputs(
page.inputs, matrix, pdf, pdf_page, resources, stream,
document.font_config.font_map)
document.font_config.font_map, compress)
page.paint(stream, scale=scale)
# Bleed
@ -227,7 +230,7 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,
_w3c_date_to_pdf(metadata.modified, 'modified'))
if metadata.lang:
pdf.catalog['Lang'] = pydyf.String(metadata.lang)
if custom_metadata:
if options['custom_metadata']:
for key, value in metadata.custom.items():
key = ''.join(char for char in key if char.isalnum())
key = key.encode('ascii', errors='ignore').decode()
@ -235,7 +238,7 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,
pdf.info[key] = pydyf.String(value)
# Embedded files
attachments = metadata.attachments + (attachments or [])
attachments = metadata.attachments + (options['attachments'] or [])
pdf_attachments = []
for attachment in attachments:
pdf_attachment = write_pdf_attachment(
@ -256,7 +259,10 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,
pdf.catalog['Names']['EmbeddedFiles'] = content.reference
# Embedded fonts
pdf_fonts = build_fonts_dictionary(pdf, document.fonts, optimize_size)
subset = not options['full_fonts']
hinting = options['hinting']
pdf_fonts = build_fonts_dictionary(
pdf, document.fonts, compress, subset, hinting)
pdf.add_object(pdf_fonts)
if 'AcroForm' in pdf.catalog:
# Include Dingbats for forms
@ -284,6 +290,6 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,
# Apply PDF variants functions
if variant:
variant_function(pdf, metadata, document, page_streams)
variant_function(pdf, metadata, document, page_streams, compress)
return pdf
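With this change, `generate_pdf` reads its settings from a single `options` mapping and derives internal flags from it. The sketch below shows that derivation on its own. The option names come from the diff; `derive_pdf_flags` is a hypothetical helper and the sample values are assumptions, not WeasyPrint's actual defaults.

```python
def derive_pdf_flags(options):
    """Translate user-facing options into internal flags (hypothetical)."""
    return {
        # Streams are compressed unless the user asks for a raw PDF
        'compress': not options['uncompressed_pdf'],
        # Fonts are subset unless full fonts are requested
        'subset': not options['full_fonts'],
        'hinting': options['hinting'],
    }


# Assumed values for illustration only
options = {'uncompressed_pdf': False, 'full_fonts': True, 'hinting': False}
flags = derive_pdf_flags(options)
```

Negating the "opt-out" options once, near the top, keeps the rest of the function working with positive flags such as `compress` and `subset`.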


@ -92,7 +92,8 @@ def add_outlines(pdf, bookmarks, parent=None):
return outlines, count
def add_inputs(inputs, matrix, pdf, page, resources, stream, font_map):
def add_inputs(inputs, matrix, pdf, page, resources, stream, font_map,
compress):
"""Include form inputs in PDF."""
if not inputs:
return
@ -119,7 +120,7 @@ def add_inputs(inputs, matrix, pdf, page, resources, stream, font_map):
input_name = pydyf.String(element.attrib.get('name', default_name))
# TODO: where does this 0.75 scale come from?
font_size = style['font_size'] * 0.75
field_stream = pydyf.Stream()
field_stream = pydyf.Stream(compress=compress)
field_stream.set_color_rgb(*style['color'][:3])
if input_type == 'checkbox':
# Checkboxes
@ -130,7 +131,7 @@ def add_inputs(inputs, matrix, pdf, page, resources, stream, font_map):
'Type': '/XObject',
'Subtype': '/Form',
'BBox': pydyf.Array((0, 0, width, height)),
})
}, compress=compress)
checked_stream.push_state()
checked_stream.begin_text()
checked_stream.set_color_rgb(*style['color'][:3])
@ -138,9 +139,8 @@ def add_inputs(inputs, matrix, pdf, page, resources, stream, font_map):
# Center (let's assume that Dingbats check has a 0.8em size)
x = (width - font_size * 0.8) / 2
y = (height - font_size * 0.8) / 2
# TODO: we should have these operators in pydyf
checked_stream.stream.append(f'{x} {y} Td')
checked_stream.stream.append('(4) Tj')
checked_stream.move_text_to(x, y)
checked_stream.show_text_string('4')
checked_stream.end_text()
checked_stream.pop_state()
pdf.add_object(checked_stream)
@ -195,7 +195,7 @@ def add_inputs(inputs, matrix, pdf, page, resources, stream, font_map):
pdf.catalog['AcroForm']['Fields'].append(field.reference)
def add_annotations(links, matrix, document, pdf, page, annot_files):
def add_annotations(links, matrix, document, pdf, page, annot_files, compress):
"""Include annotations in PDF."""
# TODO: splitting a link into multiple independent rectangular
# annotations works well for pure links, but rather mediocre for
@ -226,8 +226,7 @@ def add_annotations(links, matrix, document, pdf, page, annot_files):
'Type': '/XObject',
'Subtype': '/Form',
'BBox': pydyf.Array(rectangle),
'Length': 0,
})
}, compress)
pdf.add_object(stream)
annot = pydyf.Dictionary({
'Type': '/Annot',
@ -286,7 +285,7 @@ def write_pdf_attachment(pdf, attachment, url_fetcher):
'ModDate': attachment.modified,
})
})
file_stream = pydyf.Stream([stream], file_extra)
file_stream = pydyf.Stream([stream], file_extra, compress)
pdf.add_object(file_stream)
except URLFetchingError as exception:


@ -7,7 +7,7 @@ import pydyf
from ..logger import LOGGER
def build_fonts_dictionary(pdf, fonts, optimize_size):
def build_fonts_dictionary(pdf, fonts, compress_pdf, subset, hinting):
pdf_fonts = pydyf.Dictionary()
fonts_by_file_hash = {}
for font in fonts.values():
@ -21,10 +21,10 @@ def build_fonts_dictionary(pdf, fonts, optimize_size):
# Clean font, optimize and handle emojis
cmap = {}
if 'fonts' in optimize_size and not font.used_in_forms:
if subset and not font.used_in_forms:
for file_font in file_fonts:
cmap = {**cmap, **file_font.cmap}
font.clean(cmap)
font.clean(cmap, hinting)
# Include font
if font.type == 'otf':
@ -32,28 +32,24 @@ def build_fonts_dictionary(pdf, fonts, optimize_size):
else:
font_extra = pydyf.Dictionary({'Length1': len(font.file_content)})
font_stream = pydyf.Stream(
[font.file_content], font_extra, compress=True)
[font.file_content], font_extra, compress=compress_pdf)
pdf.add_object(font_stream)
font_references_by_file_hash[file_hash] = font_stream.reference
for font in fonts.values():
optimize = (
'fonts' in optimize_size and
font.ttfont and not font.used_in_forms)
if optimize:
if subset and font.ttfont and not font.used_in_forms:
# Only store widths and map for used glyphs
font_widths = font.widths
cmap = font.cmap
else:
# Store width and Unicode map for all glyphs
font_widths, cmap = {}, {}
ratio = 1024 / font.ttfont['head'].unitsPerEm
for letter, key in font.ttfont.getBestCmap().items():
glyph = font.ttfont.getGlyphID(key)
if glyph not in cmap:
cmap[glyph] = chr(letter)
width = font.ttfont.getGlyphSet()[key].width
font_widths[glyph] = width * ratio
font_widths[glyph] = width * 1000 / font.upem
max_x = max(font_widths.values()) if font_widths else 0
bbox = (0, font.descent, max_x, font.ascent)
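The width change above replaces a factor of 1024 with 1000. PDF glyph widths are expressed in thousandths of an em, so font units are scaled by `1000 / upem`. A small worked sketch of that arithmetic follows; `to_pdf_width` is a hypothetical name used only for illustration.

```python
def to_pdf_width(width, upem):
    """Scale a glyph advance from font units to 1/1000 em (hypothetical)."""
    return width * 1000 / upem


# A glyph half an em wide in a 2048 units-per-em font
half_em = to_pdf_width(1024, 2048)
# In a 1000 units-per-em font the value is unchanged
unchanged = to_pdf_width(600, 1000)
# The previous factor of 1024 would have given a slightly larger value
old_half_em = 1024 * (1024 / 2048)
```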
@ -81,7 +77,7 @@ def build_fonts_dictionary(pdf, fonts, optimize_size):
b'1 begincodespacerange',
b'<0000> <ffff>',
b'endcodespacerange',
f'{len(cmap)} beginbfchar'.encode()])
f'{len(cmap)} beginbfchar'.encode()], compress=compress_pdf)
for glyph, text in cmap.items():
unicode_codepoints = ''.join(
f'{letter.encode("utf-16-be").hex()}' for letter in text)
@ -103,7 +99,7 @@ def build_fonts_dictionary(pdf, fonts, optimize_size):
if font.bitmap:
_build_bitmap_font_dictionary(
font_dictionary, pdf, font, widths, optimize_size)
font_dictionary, pdf, font, widths, compress_pdf, subset)
else:
font_descriptor = pydyf.Dictionary({
'Type': '/FontDescriptor',
@ -126,7 +122,8 @@ def build_fonts_dictionary(pdf, fonts, optimize_size):
for cid in cids:
bits[cid] = '1'
stream = pydyf.Stream(
(int(''.join(bits), 2).to_bytes(padded_width, 'big'),))
(int(''.join(bits), 2).to_bytes(padded_width, 'big'),),
compress=compress_pdf)
pdf.add_object(stream)
font_descriptor['CIDSet'] = stream.reference
if font.type == 'otf':
@ -156,11 +153,11 @@ def build_fonts_dictionary(pdf, fonts, optimize_size):
def _build_bitmap_font_dictionary(font_dictionary, pdf, font, widths,
optimize_size):
compress_pdf, subset):
# https://docs.microsoft.com/typography/opentype/spec/ebdt
font_dictionary['FontBBox'] = pydyf.Array([0, 0, 1, 1])
font_dictionary['FontMatrix'] = pydyf.Array([1, 0, 0, 1, 0, 0])
if 'fonts' in optimize_size:
if subset:
chars = tuple(sorted(font.cmap))
else:
chars = tuple(range(256))
@ -309,7 +306,7 @@ def _build_bitmap_font_dictionary(font_dictionary, pdf, font, widths,
b'/BPC 1',
b'/D [1 0]',
b'ID', bitmap, b'EI'
])
], compress=compress_pdf)
pdf.add_object(bitmap_stream)
char_procs[glyph_id] = bitmap_stream.reference


@ -20,7 +20,7 @@ for key, value in NS.items():
register_namespace(key, value)
def add_metadata(pdf, metadata, variant, version, conformance):
def add_metadata(pdf, metadata, variant, version, conformance, compress):
"""Add PDF stream of metadata.
Described in ISO-32000-1:2008, 14.3.2.
@ -88,6 +88,6 @@ def add_metadata(pdf, metadata, variant, version, conformance):
footer = b'<?xpacket end="r"?>'
stream_content = b'\n'.join((header, xml, footer))
extra = {'Type': '/Metadata', 'Subtype': '/XML'}
metadata = pydyf.Stream([stream_content], extra=extra)
metadata = pydyf.Stream([stream_content], extra, compress)
pdf.add_object(metadata)
pdf.catalog['Metadata'] = metadata.reference


@ -18,7 +18,7 @@ from ..logger import LOGGER
from .metadata import add_metadata
def pdfa(pdf, metadata, document, page_streams, version):
def pdfa(pdf, metadata, document, page_streams, compress, version):
"""Set metadata for PDF/A documents."""
LOGGER.warning(
'PDF/A support is experimental, '
@ -29,7 +29,7 @@ def pdfa(pdf, metadata, document, page_streams, version):
profile = pydyf.Stream(
[read_binary(__package__, 'sRGB2014.icc')],
pydyf.Dictionary({'N': 3, 'Alternate': '/DeviceRGB'}),
compress=True)
compress=compress)
pdf.add_object(profile)
pdf.catalog['OutputIntents'] = pydyf.Array([
pydyf.Dictionary({
@ -46,7 +46,7 @@ def pdfa(pdf, metadata, document, page_streams, version):
pdf_object['F'] = 2 ** (3 - 1)
# Common PDF metadata stream
add_metadata(pdf, metadata, 'a', version, 'B')
add_metadata(pdf, metadata, 'a', version, 'B', compress)
VARIANTS = {


@ -6,7 +6,7 @@ from ..logger import LOGGER
from .metadata import add_metadata
def pdfua(pdf, metadata, document, page_streams):
def pdfua(pdf, metadata, document, page_streams, compress):
"""Set metadata for PDF/UA documents."""
LOGGER.warning(
'PDF/UA support is experimental, '
@ -117,7 +117,7 @@ def pdfua(pdf, metadata, document, page_streams):
annotation['F'] = 2 ** (2 - 1)
# Common PDF metadata stream
add_metadata(pdf, metadata, 'ua', version=1, conformance=None)
add_metadata(pdf, metadata, 'ua', 1, conformance=None, compress=compress)
# PDF document extra metadata
if 'Lang' not in pdf.catalog:


@ -1,7 +1,6 @@
"""PDF stream."""
import io
import struct
from functools import lru_cache
from hashlib import md5
@ -98,7 +97,7 @@ class Font:
if len(widths) > 1 and len(set(widths)) == 1:
self.flags += 2 ** (1 - 1) # FixedPitch
def clean(self, cmap):
def clean(self, cmap, hinting):
if self.ttfont is None:
return
@ -107,7 +106,7 @@ class Font:
optimized_font = io.BytesIO()
options = subset.Options(
retain_gids=True, passthrough_tables=True,
ignore_missing_glyphs=True, hinting=False,
ignore_missing_glyphs=True, hinting=hinting,
desubroutinize=True)
options.drop_tables += ['GSUB', 'GPOS', 'SVG']
subsetter = subset.Subsetter(options)
@ -196,7 +195,6 @@ class Stream(pydyf.Stream):
def __init__(self, fonts, page_rectangle, states, x_objects, patterns,
shadings, images, mark, *args, **kwargs):
super().__init__(*args, **kwargs)
self.compress = True
self.page_rectangle = page_rectangle
self.marked = []
self._fonts = fonts
@ -357,113 +355,20 @@ class Stream(pydyf.Stream):
})
group = Stream(
self._fonts, self.page_rectangle, states, x_objects, patterns,
shadings, self._images, self._mark, extra=extra)
shadings, self._images, self._mark, extra=extra,
compress=self.compress)
group.id = f'x{len(self._x_objects)}'
self._x_objects[group.id] = group
return group
def _get_png_data(self, pillow_image, optimize):
image_file = io.BytesIO()
pillow_image.save(image_file, format='PNG', optimize=optimize)
# Read the PNG header, then discard it because we know it's a PNG. If
# this weren't just output from Pillow, we should actually check it.
image_file.seek(8)
png_data = b''
raw_chunk_length = image_file.read(4)
# PNG files consist of a series of chunks.
while len(raw_chunk_length) > 0:
# Each chunk begins with its data length (four bytes, may be zero),
# then its type (four ASCII characters), then the data, then four
# bytes of a CRC.
chunk_len, = struct.unpack('!I', raw_chunk_length)
chunk_type = image_file.read(4)
if chunk_type == b'IDAT':
png_data += image_file.read(chunk_len)
else:
image_file.seek(chunk_len, io.SEEK_CUR)
# We aren't checking the CRC, we assume this is a valid PNG.
image_file.seek(4, io.SEEK_CUR)
raw_chunk_length = image_file.read(4)
return png_data
def add_image(self, pillow_image, image_rendering, optimize_size):
image_name = f'i{pillow_image.id}'
def add_image(self, image, width, height, interpolate):
image_name = f'i{image.id}{width}{height}{interpolate}'
self._x_objects[image_name] = None # Set by write_pdf
if image_name in self._images:
# Reuse image already stored in document
return image_name
if 'transparency' in pillow_image.info:
pillow_image = pillow_image.convert('RGBA')
elif pillow_image.mode in ('1', 'P', 'I'):
pillow_image = pillow_image.convert('RGB')
if pillow_image.mode in ('RGB', 'RGBA'):
color_space = '/DeviceRGB'
elif pillow_image.mode in ('L', 'LA'):
color_space = '/DeviceGray'
elif pillow_image.mode == 'CMYK':
color_space = '/DeviceCMYK'
else:
LOGGER.warning('Unknown image mode: %s', pillow_image.mode)
color_space = '/DeviceRGB'
interpolate = 'true' if image_rendering == 'auto' else 'false'
extra = pydyf.Dictionary({
'Type': '/XObject',
'Subtype': '/Image',
'Width': pillow_image.width,
'Height': pillow_image.height,
'ColorSpace': color_space,
'BitsPerComponent': 8,
'Interpolate': interpolate,
})
optimize = 'images' in optimize_size
if pillow_image.format in ('JPEG', 'MPO'):
extra['Filter'] = '/DCTDecode'
image_file = io.BytesIO()
pillow_image.save(image_file, format='JPEG', optimize=optimize)
stream = [image_file.getvalue()]
else:
extra['Filter'] = '/FlateDecode'
extra['DecodeParms'] = pydyf.Dictionary({
# Predictor 15 specifies that we're providing PNG data,
# ostensibly using an "optimum predictor", but doesn't actually
# matter as long as the predictor value is 10+ according to the
# spec. (Other PNG predictor values assert that we're using
# specific predictors that we don't want to commit to, but
# "optimum" can vary.)
'Predictor': 15,
'Columns': pillow_image.width,
})
if pillow_image.mode in ('RGB', 'RGBA'):
# Defaults to 1.
extra['DecodeParms']['Colors'] = 3
if pillow_image.mode in ('RGBA', 'LA'):
alpha = pillow_image.getchannel('A')
pillow_image = pillow_image.convert(pillow_image.mode[:-1])
alpha_data = self._get_png_data(alpha, optimize)
extra['SMask'] = pydyf.Stream([alpha_data], extra={
'Filter': '/FlateDecode',
'Type': '/XObject',
'Subtype': '/Image',
'DecodeParms': pydyf.Dictionary({
'Predictor': 15,
'Columns': pillow_image.width,
}),
'Width': pillow_image.width,
'Height': pillow_image.height,
'ColorSpace': '/DeviceGray',
'BitsPerComponent': 8,
'Interpolate': interpolate,
})
stream = [self._get_png_data(pillow_image, optimize)]
xobject = pydyf.Stream(stream, extra=extra)
xobject = image.get_xobject(width, height, interpolate)
self._images[image_name] = xobject
return image_name
@ -493,7 +398,8 @@ class Stream(pydyf.Stream):
})
pattern = Stream(
self._fonts, self.page_rectangle, states, x_objects, patterns,
shadings, self._images, self._mark, extra=extra)
shadings, self._images, self._mark, extra=extra,
compress=self.compress)
pattern.id = f'p{len(self._patterns)}'
self._patterns[pattern.id] = pattern
return pattern


@ -14,7 +14,7 @@ def svg(svg, node, font_size):
node.get('width'), node.get('height'), font_size)
scale_x, scale_y, translate_x, translate_y = preserve_ratio(
svg, node, font_size, width, height)
if svg.tree != node:
if svg.tree != node and node.get('overflow', 'hidden') == 'hidden':
svg.stream.rectangle(0, 0, width, height)
svg.stream.clip()
svg.stream.end()


@ -12,6 +12,10 @@ class TextBox:
self.pango_layout = pango_layout
self.style = style
@property
def text(self):
return self.pango_layout.text
def text(svg, node, font_size):
"""Draw text node."""


@ -182,11 +182,20 @@ class Layout:
add_attr(0, len(bytestring), letter_spacing)
if word_spacing:
if bytestring == b' ':
# We need more than one space to set word spacing
self.text = ' \u200b' # Space + zero-width space
text, bytestring = unicode_to_char_p(self.text)
pango.pango_layout_set_text(self.layout, text, -1)
space_spacing = (
units_from_double(word_spacing) + letter_spacing)
position = bytestring.find(b' ')
# Pango gives only half of word-spacing on boundaries
boundary_positions = (0, len(bytestring) - 1)
while position != -1:
add_attr(position, position + 1, space_spacing)
factor = 1 + (position in boundary_positions)
add_attr(position, position + 1, factor * space_spacing)
position = bytestring.find(b' ', position + 1)
if word_breaking:
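The boundary handling in the word-spacing hunk can be checked without Pango. The sketch below reproduces only the position and factor logic and returns the spans instead of calling `add_attr`; `space_spacing_spans` is a hypothetical name, and the spacing value is an arbitrary number rather than real Pango units.

```python
def space_spacing_spans(bytestring, space_spacing):
    """Return (start, end, spacing) for each space in the byte string.

    Spaces on the first or last byte get twice the spacing, mirroring the
    diff's comment that Pango gives only half of it on boundaries.
    """
    boundary_positions = (0, len(bytestring) - 1)
    spans = []
    position = bytestring.find(b' ')
    while position != -1:
        # bool adds as 0 or 1, so factor is 2 on boundaries and 1 elsewhere
        factor = 1 + (position in boundary_positions)
        spans.append((position, position + 1, factor * space_spacing))
        position = bytestring.find(b' ', position + 1)
    return spans


spans = space_spacing_spans(b' a b ', 10)
```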
@ -226,15 +235,7 @@ class Layout:
def create_layout(text, style, context, max_width, justification_spacing):
"""Return an opaque Pango layout with default Pango line-breaks.
:param text: Unicode
:param style: a style dict of computed values
:param max_width:
The maximum available width in the same unit as ``style['font_size']``,
or ``None`` for unlimited width.
"""
"""Return an opaque Pango layout with default Pango line-breaks."""
layout = Layout(context, style, justification_spacing, max_width)
# Make sure that max_width * Pango.SCALE == max_width * 1024 fits in a


@ -175,9 +175,12 @@ def default_url_fetcher(url, timeout=10, ssl_context=None):
``url_fetcher`` argument to :class:`HTML` or :class:`CSS`.
(See :ref:`URL Fetchers`.)
:param str url: The URL of the resource to fetch.
:param int timeout: The number of seconds before HTTP requests are dropped.
:param ssl.SSLContext ssl_context: An SSL context used for HTTP requests.
:param str url:
The URL of the resource to fetch.
:param int timeout:
The number of seconds before HTTP requests are dropped.
:param ssl.SSLContext ssl_context:
An SSL context used for HTTP requests.
:raises: An exception indicating failure, e.g. :obj:`ValueError` on
syntactically invalid URL.
:returns: A :obj:`dict` with the following keys: