mirror of
https://github.com/Kozea/WeasyPrint.git
synced 2024-09-11 20:47:56 +03:00
Merge branch 'master' of github.com:timoramsauer/WeasyPrint into HEAD
commit b505d56199
2
.github/workflows/test_samples.yml
vendored
@ -43,7 +43,7 @@ jobs:
|
||||
- name: Ticket
|
||||
run: python -m weasyprint weasyprint-samples/ticket/ticket.html ${{env.REPORTS_FOLDER}}/ticket.pdf
|
||||
- name: Archive generated PDFs
|
||||
uses: actions/upload-artifact@v2
|
||||
uses: actions/upload-artifact@v3
|
||||
with:
|
||||
name: generated-documents
|
||||
path: ${{env.REPORTS_FOLDER}}
|
||||
|
11
.github/workflows/tests.yml
vendored
@ -8,11 +8,12 @@ jobs:
|
||||
strategy:
|
||||
matrix:
|
||||
os: [ubuntu-latest, macos-latest, windows-latest]
|
||||
python-version: ['3.7', '3.8', '3.9', '3.10', '3.11', 'pypy-3.8']
|
||||
exclude:
|
||||
# Wheels missing for this configuration
|
||||
- os: macos-latest
|
||||
python-version: pypy-3.8
|
||||
python-version: ['3.11']
|
||||
include:
|
||||
- os: ubuntu-latest
|
||||
python-version: '3.7'
|
||||
- os: ubuntu-latest
|
||||
python-version: 'pypy-3.8'
|
||||
steps:
|
||||
- uses: actions/checkout@v3
|
||||
- uses: actions/setup-python@v4
|
||||
|
@ -57,6 +57,7 @@ Python API
|
||||
.. autoclass:: CSS(input, **kwargs)
|
||||
.. autoclass:: Attachment(input, **kwargs)
|
||||
.. autofunction:: default_url_fetcher
|
||||
.. autodata:: DEFAULT_OPTIONS
|
||||
|
||||
.. module:: weasyprint.document
|
||||
.. autoclass:: Document
|
||||
@ -645,6 +646,8 @@ supported.
|
||||
The ``attr()`` functional notation is allowed in the ``content`` and
|
||||
``string-set`` properties.
|
||||
|
||||
The ``calc()`` function is **not** supported.
|
||||
|
||||
Viewport-percentage lengths (``vw``, ``vh``, ``vmin``, ``vmax``) are **not**
|
||||
supported.
|
||||
|
||||
|
@ -2,6 +2,115 @@ Changelog
|
||||
=========
|
||||
|
||||
|
||||
Version 59.0b1
|
||||
--------------
|
||||
|
||||
Released on 2023-04-14.
|
||||
|
||||
**This version is experimental, don't use it in production. If you find bugs,
|
||||
please report them!**
|
||||
|
||||
Command-line API:
|
||||
|
||||
* The ``--optimize-size`` option and its short equivalent ``-O`` have been
|
||||
deprecated. To activate or deactivate different size optimizations, you can
|
||||
now use:
|
||||
|
||||
* ``--uncompressed-pdf``,
|
||||
* ``--optimize-images``,
|
||||
* ``--full-fonts``,
|
||||
* ``--hinting``,
|
||||
* ``--dpi <resolution>``, and
|
||||
* ``--jpeg-quality <quality>``.
|
||||
|
||||
* A new ``--cache-folder <folder>`` option has been added to store temporary
|
||||
data in the given folder on the disk instead of keeping them in memory.
|
||||
|
||||
Python API:
|
||||
|
||||
* Global rendering options are now given in ``**options`` instead of dedicated
|
||||
parameters, with slightly different names. It means that the signatures of
``HTML.render()``, ``HTML.write_pdf()`` and ``Document.write_pdf()`` have
changed. Here are the steps to port your Python code to v59.0:
|
||||
|
||||
1. Use named parameters for these functions, not positional parameters.
|
||||
2. Rename some of the parameters:
|
||||
|
||||
* ``image_cache`` becomes ``cache`` (see below),
|
||||
* ``identifier`` becomes ``pdf_identifier``,
|
||||
* ``variant`` becomes ``pdf_variant``,
|
||||
* ``version`` becomes ``pdf_version``,
|
||||
* ``forms`` becomes ``pdf_forms``.
|
||||
|
||||
* The ``optimize_size`` parameter of ``HTML.render()``, ``HTML.write_pdf()``
|
||||
and ``Document()`` has been removed and will be ignored. You can now use the
|
||||
``uncompressed_pdf``, ``full_fonts``, ``hinting``, ``dpi`` and
|
||||
``jpeg_quality`` parameters that are included in ``**options``.
|
||||
|
||||
* The ``cache`` parameter can be included in ``**options`` to replace
|
||||
``image_cache``. If it is a dictionary, this dictionary will be used to store
|
||||
temporary data in memory, and can even be shared between multiple documents.
|
||||
If it’s a folder Path or string, WeasyPrint stores temporary data in the
|
||||
given temporary folder on disk instead of keeping them in memory.
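
The renaming steps above can be sketched as a small dictionary transformation. This is only an illustration under stated assumptions: ``port_options`` and ``RENAMED`` are hypothetical names that do not exist in WeasyPrint, and the helper simply drops ``optimize_size`` because the changelog says it has been removed.

```python
# Hypothetical helper, not part of WeasyPrint: it only renames keyword
# arguments following the porting steps listed in the changelog.
RENAMED = {
    'image_cache': 'cache',
    'identifier': 'pdf_identifier',
    'variant': 'pdf_variant',
    'version': 'pdf_version',
    'forms': 'pdf_forms',
}


def port_options(old_kwargs):
    """Return a copy of ``old_kwargs`` using the v59.0 option names."""
    new_kwargs = {}
    for name, value in old_kwargs.items():
        if name == 'optimize_size':
            # Removed in v59.0: pick uncompressed_pdf, full_fonts, hinting,
            # dpi and jpeg_quality explicitly instead.
            continue
        # Names missing from RENAMED are kept unchanged.
        new_kwargs[RENAMED.get(name, name)] = value
    return new_kwargs


old = {'image_cache': {}, 'version': '1.7', 'optimize_size': ('fonts',)}
new = port_options(old)
```

The resulting dictionary would then be passed as named options, for example ``HTML(...).write_pdf(target, **new)``, since positional use of these parameters is no longer supported.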
|
||||
|
||||
New features:
|
||||
|
||||
* `#1853 <https://github.com/Kozea/WeasyPrint/pull/1853>`_,
|
||||
`#1854 <https://github.com/Kozea/WeasyPrint/issues/1854>`_:
|
||||
Reduce PDF size, with financial support from Code & Co.
|
||||
* `#1824 <https://github.com/Kozea/WeasyPrint/issues/1824>`_,
|
||||
`#1829 <https://github.com/Kozea/WeasyPrint/pull/1829>`_:
|
||||
Reduce memory use for images
|
||||
* `#1858 <https://github.com/Kozea/WeasyPrint/issues/1858>`_:
|
||||
Add an option to keep hinting information in embedded fonts
|
||||
|
||||
Bug fixes:
|
||||
|
||||
* `#1855 <https://github.com/Kozea/WeasyPrint/issues/1855>`_:
|
||||
Fix position of emojis in justified text
|
||||
* `#1852 <https://github.com/Kozea/WeasyPrint/issues/1852>`_:
|
||||
Don’t crash when line can be split before trailing spaces
|
||||
* `#1843 <https://github.com/Kozea/WeasyPrint/issues/1843>`_:
|
||||
Fix syntax of dates in metadata
|
||||
* `#1827 <https://github.com/Kozea/WeasyPrint/issues/1827>`_,
|
||||
`#1832 <https://github.com/Kozea/WeasyPrint/pull/1832>`_:
|
||||
Fix word-spacing problems with nested tags
|
||||
|
||||
Documentation:
|
||||
|
||||
* `#1841 <https://github.com/Kozea/WeasyPrint/issues/1841>`_:
|
||||
Add a paragraph about unsupported calc() function
|
||||
|
||||
Contributors:
|
||||
|
||||
* Guillaume Ayoub
|
||||
* Lucie Anglade
|
||||
* Alex Ch
|
||||
* whi_ne
|
||||
* Jonas Castro
|
||||
|
||||
Backers and sponsors:
|
||||
|
||||
* Castedo Ellerman
|
||||
* Kobalt
|
||||
* Spacinov
|
||||
* Grip Angebotssoftware
|
||||
* Crisp BV
|
||||
* Manuel Barkhau
|
||||
* SimonSoft
|
||||
* Menutech
|
||||
* KontextWork
|
||||
* NCC Group
|
||||
* René Fritz
|
||||
* Moritz Mahringer
|
||||
* Yanal-Yvez Fargialla
|
||||
* Piotr Horzycki
|
||||
* Healthchecks.io
|
||||
* TrainingSparkle
|
||||
* Hammerbacher
|
||||
* Synapsium
|
||||
|
||||
|
||||
Version 58.1
|
||||
------------
|
||||
|
||||
|
@ -11,7 +11,7 @@ WeasyPrint |version| depends on:
|
||||
|
||||
* Python_ ≥ 3.7.0
|
||||
* Pango_ ≥ 1.44.0
|
||||
* pydyf_ ≥ 0.5.0
|
||||
* pydyf_ ≥ 0.6.0
|
||||
* CFFI_ ≥ 0.6
|
||||
* html5lib_ ≥ 1.1
|
||||
* tinycss2_ ≥ 1.0.0
|
||||
@ -513,7 +513,8 @@ WeasyPrint provides two options to deal with images: ``optimize_size`` and
|
||||
|
||||
``optimize_size`` can enable size optimization for images, but also for fonts.
|
||||
When enabled, the generated PDF will include smaller images and fonts, but the
|
||||
rendering time may be slightly increased.
|
||||
rendering time may be slightly increased. The whole structure of the PDF can be
|
||||
compressed too.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
@ -523,7 +524,7 @@ rendering time may be slightly increased.
|
||||
|
||||
# Full size optimization, slower, but generated PDF is smaller
|
||||
HTML('https://example.org/').write_pdf(
|
||||
'example.pdf', optimize_size=('fonts', 'images'))
|
||||
'example.pdf', optimize_size=('fonts', 'images', 'hinting', 'pdf'))
|
||||
|
||||
``image_cache`` makes it possible to use a cache for images, which avoids
downloading, parsing and optimizing them each time they are used.
|
||||
@ -539,6 +540,11 @@ time when you render a lot of documents that use the same images.
|
||||
HTML(f'https://example.org/?id={i}').write_pdf(
|
||||
f'example-{i}.pdf', image_cache=cache)
|
||||
|
||||
It’s also possible to cache images on disk instead of keeping them in memory.
|
||||
The ``--cache-folder`` CLI option can be used to define the folder used to
|
||||
store temporary images. You can also provide this folder path as a string for
|
||||
``image_cache``.
|
||||
|
||||
|
||||
Logging
|
||||
~~~~~~~
|
||||
|
@ -12,7 +12,7 @@ requires-python = '>=3.7'
|
||||
readme = {file = 'README.rst', content-type = 'text/x-rst'}
|
||||
license = {file = 'LICENSE'}
|
||||
dependencies = [
|
||||
'pydyf >=0.5.0',
|
||||
'pydyf >=0.6.0',
|
||||
'cffi >=0.6',
|
||||
'html5lib >=1.1',
|
||||
'tinycss2 >=1.0.0',
|
||||
|
@ -73,14 +73,10 @@ def document_write_png(self, target=None, resolution=96, antialiasing=1,
|
||||
shutil.copyfileobj(png, fd)
|
||||
|
||||
|
||||
def html_write_png(self, target=None, stylesheets=None, resolution=96,
|
||||
presentational_hints=False, optimize_size=('fonts',),
|
||||
font_config=None, counter_style=None, image_cache=None):
|
||||
return self.render(
|
||||
stylesheets, presentational_hints=presentational_hints,
|
||||
optimize_size=optimize_size, font_config=font_config,
|
||||
counter_style=counter_style, image_cache=image_cache).write_png(
|
||||
target, resolution)
|
||||
def html_write_png(self, target=None, font_config=None, counter_style=None,
|
||||
resolution=96, **options):
|
||||
document = self.render(font_config, counter_style, **options)
|
||||
return document.write_png(target, resolution)
|
||||
|
||||
|
||||
Document.write_png = document_write_png
|
||||
|
@ -6,6 +6,7 @@ import os
|
||||
import sys
|
||||
import unicodedata
|
||||
import zlib
|
||||
from functools import partial
|
||||
from pathlib import Path
|
||||
from urllib.parse import urljoin, uses_relative
|
||||
|
||||
@ -78,11 +79,8 @@ def _check_doc1(html, has_base_url=True):
|
||||
def _run(args, stdin=b''):
|
||||
stdin = io.BytesIO(stdin)
|
||||
stdout = io.BytesIO()
|
||||
try:
|
||||
__main__.HTML = FakeHTML
|
||||
__main__.main(args.split(), stdin=stdin, stdout=stdout)
|
||||
finally:
|
||||
__main__.HTML = HTML
|
||||
HTML = partial(FakeHTML, force_uncompressed_pdf=False)
|
||||
__main__.main(args.split(), stdin=stdin, stdout=stdout, HTML=HTML)
|
||||
return stdout.getvalue()
|
||||
|
||||
|
||||
@ -303,11 +301,12 @@ def test_command_line_render(tmpdir):
|
||||
tmpdir.join(name).write_binary(pattern_bytes)
|
||||
|
||||
# Reference
|
||||
html_obj = FakeHTML(string=combined, base_url='dummy.html')
|
||||
html_obj = FakeHTML(
|
||||
string=combined, base_url='dummy.html', force_uncompressed_pdf=False)
|
||||
pdf_bytes = html_obj.write_pdf()
|
||||
rotated_pdf_bytes = FakeHTML(
|
||||
string=combined, base_url='dummy.html',
|
||||
media_type='screen').write_pdf()
|
||||
media_type='screen', force_uncompressed_pdf=False).write_pdf()
|
||||
|
||||
tmpdir.join('no_css.html').write_binary(html)
|
||||
tmpdir.join('combined.html').write_binary(combined)
|
||||
@ -360,35 +359,34 @@ def test_command_line_render(tmpdir):
|
||||
|
||||
os.environ['SOURCE_DATE_EPOCH'] = '0'
|
||||
_run('not_optimized.html out15.pdf')
|
||||
_run('not_optimized.html out16.pdf -O images')
|
||||
_run('not_optimized.html out17.pdf -O fonts')
|
||||
_run('not_optimized.html out18.pdf -O fonts -O images')
|
||||
_run('not_optimized.html out19.pdf -O all')
|
||||
_run('not_optimized.html out20.pdf -O none')
|
||||
_run('not_optimized.html out21.pdf -O none -O all')
|
||||
_run('not_optimized.html out22.pdf -O all -O none')
|
||||
_run('not_optimized.html out16.pdf --optimize-images')
|
||||
_run('not_optimized.html out17.pdf --optimize-images -j 10')
|
||||
_run('not_optimized.html out18.pdf --optimize-images -j 10 -D 1')
|
||||
_run('not_optimized.html out19.pdf --hinting')
|
||||
_run('not_optimized.html out20.pdf --full-fonts')
|
||||
_run('not_optimized.html out21.pdf --full-fonts --uncompressed-pdf')
|
||||
_run(f'not_optimized.html out22.pdf -c {tmpdir}')
|
||||
assert (
|
||||
len(tmpdir.join('out18.pdf').read_binary()) <
|
||||
len(tmpdir.join('out17.pdf').read_binary()) <
|
||||
len(tmpdir.join('out16.pdf').read_binary()) <
|
||||
len(tmpdir.join('out15.pdf').read_binary()) <
|
||||
len(tmpdir.join('out20.pdf').read_binary()))
|
||||
len(tmpdir.join('out19.pdf').read_binary()) <
|
||||
len(tmpdir.join('out20.pdf').read_binary()) <
|
||||
len(tmpdir.join('out21.pdf').read_binary()))
|
||||
assert len({
|
||||
tmpdir.join(f'out{i}.pdf').read_binary()
|
||||
for i in (16, 18, 19, 21)}) == 1
|
||||
assert len({
|
||||
tmpdir.join(f'out{i}.pdf').read_binary()
|
||||
for i in (15, 17)}) == 1
|
||||
assert len({
|
||||
tmpdir.join(f'out{i}.pdf').read_binary()
|
||||
for i in (20, 22)}) == 1
|
||||
for i in (15, 22)}) == 1
|
||||
os.environ.pop('SOURCE_DATE_EPOCH')
|
||||
|
||||
stdout = _run('combined.html -')
|
||||
stdout = _run('combined.html --uncompressed-pdf -')
|
||||
assert stdout.count(b'attachment') == 0
|
||||
stdout = _run('combined.html -')
|
||||
stdout = _run('combined.html --uncompressed-pdf -')
|
||||
assert stdout.count(b'attachment') == 0
|
||||
stdout = _run('-a pattern.png combined.html -')
|
||||
stdout = _run('-a pattern.png --uncompressed-pdf combined.html -')
|
||||
assert stdout.count(b'attachment') == 1
|
||||
stdout = _run('-a style.css -a pattern.png combined.html -')
|
||||
stdout = _run(
|
||||
'-a style.css -a pattern.png --uncompressed-pdf combined.html -')
|
||||
assert stdout.count(b'attachment') == 2
|
||||
|
||||
os.mkdir('subdirectory')
|
||||
@ -423,42 +421,59 @@ def test_command_line_render(tmpdir):
|
||||
(4, '2.0'),
|
||||
))
|
||||
def test_pdfa(version, pdf_version):
|
||||
stdout = _run(f'--pdf-variant=pdf/a-{version}b - -', b'test')
|
||||
stdout = _run(
|
||||
f'--pdf-variant=pdf/a-{version}b --uncompressed-pdf - -', b'test')
|
||||
assert f'PDF-{pdf_version}'.encode() in stdout
|
||||
assert f'part="{version}"'.encode() in stdout
|
||||
|
||||
|
||||
@pytest.mark.parametrize('version, pdf_version', (
|
||||
(1, '1.4'),
|
||||
(2, '1.7'),
|
||||
(3, '1.7'),
|
||||
(4, '2.0'),
|
||||
))
|
||||
def test_pdfa_compressed(version, pdf_version):
|
||||
_run(f'--pdf-variant=pdf/a-{version}b - -', b'test')
|
||||
|
||||
|
||||
def test_pdfua():
|
||||
stdout = _run('--pdf-variant=pdf/ua-1 - -', b'test')
|
||||
stdout = _run('--pdf-variant=pdf/ua-1 --uncompressed-pdf - -', b'test')
|
||||
assert b'part="1"' in stdout
|
||||
|
||||
|
||||
def test_pdfua_compressed():
|
||||
_run('--pdf-variant=pdf/ua-1 - -', b'test')
|
||||
|
||||
|
||||
def test_pdf_identifier():
|
||||
stdout = _run('--pdf-identifier=abc - -', b'test')
|
||||
stdout = _run('--pdf-identifier=abc --uncompressed-pdf - -', b'test')
|
||||
assert b'abc' in stdout
|
||||
|
||||
|
||||
def test_pdf_version():
|
||||
stdout = _run('--pdf-version=1.4 - -', b'test')
|
||||
stdout = _run('--pdf-version=1.4 --uncompressed-pdf - -', b'test')
|
||||
assert b'PDF-1.4' in stdout
|
||||
|
||||
|
||||
def test_pdf_custom_metadata():
|
||||
stdout = _run('--custom-metadata - -', b'<meta name=key content=value />')
|
||||
stdout = _run(
|
||||
'--custom-metadata --uncompressed-pdf - -',
|
||||
b'<meta name=key content=value />')
|
||||
assert b'/key' in stdout
|
||||
assert b'value' in stdout
|
||||
|
||||
|
||||
def test_bad_pdf_custom_metadata():
|
||||
stdout = _run(
|
||||
'--custom-metadata - -',
|
||||
'--custom-metadata --uncompressed-pdf - -',
|
||||
'<meta name=é content=value />'.encode('latin1'))
|
||||
assert b'value' not in stdout
|
||||
|
||||
|
||||
def test_partial_pdf_custom_metadata():
|
||||
stdout = _run(
|
||||
'--custom-metadata - -',
|
||||
'--custom-metadata --uncompressed-pdf - -',
|
||||
'<meta name=a.b/céd0 content=value />'.encode('latin1'))
|
||||
assert b'/abcd0' in stdout
|
||||
assert b'value' in stdout
|
||||
@ -470,10 +485,10 @@ def test_partial_pdf_custom_metadata():
|
||||
(b'<textarea></textarea>', b'/Tx'),
|
||||
))
|
||||
def test_pdf_inputs(html, field):
|
||||
stdout = _run('--pdf-forms - -', html)
|
||||
stdout = _run('--pdf-forms --uncompressed-pdf - -', html)
|
||||
assert b'AcroForm' in stdout
|
||||
assert field in stdout
|
||||
stdout = _run('- -', html)
|
||||
stdout = _run('--uncompressed-pdf - -', html)
|
||||
assert b'AcroForm' not in stdout
|
||||
|
||||
|
||||
@ -484,8 +499,10 @@ def test_pdf_inputs(html, field):
|
||||
))
|
||||
def test_appearance(css, with_forms, without_forms):
|
||||
html = f'<input style="{css}">'.encode()
|
||||
assert (b'AcroForm' in _run('--pdf-forms - -', html)) is with_forms
|
||||
assert (b'AcroForm' in _run('- -', html)) is without_forms
|
||||
assert with_forms is (
|
||||
b'AcroForm' in _run('--pdf-forms --uncompressed-pdf - -', html))
|
||||
assert without_forms is (
|
||||
b'AcroForm' in _run(' --uncompressed-pdf - -', html))
|
||||
|
||||
|
||||
def test_reproducible():
|
||||
@ -541,20 +558,20 @@ def test_low_level_api(assert_pixels_equal):
|
||||
assert pdf_bytes.startswith(b'%PDF')
|
||||
|
||||
png_bytes = html.write_png(stylesheets=[css])
|
||||
document = html.render([css])
|
||||
document = html.render(stylesheets=[css])
|
||||
page, = document.pages
|
||||
assert page.width == 8
|
||||
assert page.height == 8
|
||||
assert document.write_png() == png_bytes
|
||||
assert document.copy([page]).write_png() == png_bytes
|
||||
|
||||
document = html.render([css])
|
||||
document = html.render(stylesheets=[css])
|
||||
page, = document.pages
|
||||
assert (page.width, page.height) == (8, 8)
|
||||
png_bytes = document.write_png(resolution=192)
|
||||
check_png_pattern(assert_pixels_equal, png_bytes, x2=True)
|
||||
|
||||
document = html.render([css])
|
||||
document = html.render(stylesheets=[css])
|
||||
page, = document.pages
|
||||
assert (page.width, page.height) == (8, 8)
|
||||
# A resolution that is not multiple of 96:
|
||||
|
@ -26,7 +26,7 @@ RIGHT = round(210 * 72 / 25.4, 6)
|
||||
def test_page_size_zoom(zoom):
|
||||
pdf = FakeHTML(string='<style>@page{size:3in 4in').write_pdf(zoom=zoom)
|
||||
width, height = int(216 * zoom), int(288 * zoom)
|
||||
assert f'/MediaBox [ 0 0 {width} {height} ]'.encode() in pdf
|
||||
assert f'/MediaBox [0 0 {width} {height}]'.encode() in pdf
|
||||
|
||||
|
||||
@assert_no_logs
|
||||
@ -57,7 +57,7 @@ def test_bookmarks_2():
|
||||
@assert_no_logs
|
||||
def test_bookmarks_3():
|
||||
pdf = FakeHTML(string='<h1>a nbsp…</h1>').write_pdf()
|
||||
assert re.findall(b'/Title <(.*)>', pdf) == [
|
||||
assert re.findall(b'/Title <(\\w*)>', pdf) == [
|
||||
b'feff006100a0006e0062007300702026']
|
||||
|
||||
|
||||
@ -327,11 +327,11 @@ def test_links():
|
||||
''', base_url=resource_filename('<inline HTML>')).write_pdf()
|
||||
|
||||
uris = re.findall(b'/URI \\((.*)\\)', pdf)
|
||||
types = re.findall(b'/S (.*)', pdf)
|
||||
subtypes = re.findall(b'/Subtype (.*)', pdf)
|
||||
types = re.findall(b'/S (/\\w*)', pdf)
|
||||
subtypes = re.findall(b'/Subtype (/\\w*)', pdf)
|
||||
rects = [
|
||||
[float(number) for number in match.split()] for match in re.findall(
|
||||
b'/Rect \\[ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+ [\\d\\.]+) \\]', pdf)]
|
||||
b'/Rect \\[([\\d\\.]+ [\\d\\.]+ [\\d\\.]+ [\\d\\.]+)\\]', pdf)]
|
||||
|
||||
# 30pt wide (like the image), 20pt high (like line-height)
|
||||
assert uris.pop(0) == b'https://weasyprint.org'
|
||||
@ -349,7 +349,7 @@ def test_links():
|
||||
assert subtypes.pop(0) == b'/Link'
|
||||
assert b'/Dest (lipsum)' in pdf
|
||||
link = re.search(
|
||||
b'\\(lipsum\\) \\[ \\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+) ]',
|
||||
b'\\(lipsum\\) \\[\\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+)]',
|
||||
pdf).group(1)
|
||||
assert [float(number) for number in link.split()] == [0, TOP, 0]
|
||||
assert rects.pop(0) == [10, TOP - 100, 10 + 32, TOP - 100 - 20]
|
||||
@ -362,7 +362,7 @@ def test_links():
|
||||
assert subtypes.pop(0) == b'/Link'
|
||||
assert b'/Dest (hello)' in pdf
|
||||
link = re.search(
|
||||
b'\\(hello\\) \\[ \\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+) ]',
|
||||
b'\\(hello\\) \\[\\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+)]',
|
||||
pdf).group(1)
|
||||
assert [float(number) for number in link.split()] == [0, TOP - 200, 0]
|
||||
assert rects.pop(0) == [0, TOP, RIGHT, TOP - 30]
|
||||
@ -387,7 +387,7 @@ def test_relative_links_no_height():
|
||||
string='<a href="../lipsum" style="display: block"></a>a',
|
||||
base_url='https://weasyprint.org/foo/bar/').write_pdf()
|
||||
assert b'/S /URI\n/URI (https://weasyprint.org/foo/lipsum)'
|
||||
assert f'/Rect [ 0 {TOP} {RIGHT} {TOP} ]'.encode() in pdf
|
||||
assert f'/Rect [0 {TOP} {RIGHT} {TOP}]'.encode() in pdf
|
||||
|
||||
|
||||
@assert_no_logs
|
||||
@ -397,7 +397,7 @@ def test_relative_links_missing_base():
|
||||
string='<a href="../lipsum" style="display: block"></a>a',
|
||||
base_url=None).write_pdf()
|
||||
assert b'/S /URI\n/URI (../lipsum)'
|
||||
assert f'/Rect [ 0 {TOP} {RIGHT} {TOP} ]'.encode() in pdf
|
||||
assert f'/Rect [0 {TOP} {RIGHT} {TOP}]'.encode() in pdf
|
||||
|
||||
|
||||
@assert_no_logs
|
||||
@ -421,11 +421,11 @@ def test_relative_links_internal():
|
||||
base_url=None).write_pdf()
|
||||
assert b'/Dest (lipsum)' in pdf
|
||||
link = re.search(
|
||||
b'\\(lipsum\\) \\[ \\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+) ]',
|
||||
b'\\(lipsum\\) \\[\\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+)]',
|
||||
pdf).group(1)
|
||||
assert [float(number) for number in link.split()] == [0, TOP, 0]
|
||||
rect = re.search(
|
||||
b'/Rect \\[ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+ [\\d\\.]+) \\]',
|
||||
b'/Rect \\[([\\d\\.]+ [\\d\\.]+ [\\d\\.]+ [\\d\\.]+)\\]',
|
||||
pdf).group(1)
|
||||
assert [float(number) for number in rect.split()] == [0, TOP, RIGHT, TOP]
|
||||
|
||||
@ -437,11 +437,11 @@ def test_relative_links_anchors():
|
||||
base_url=None).write_pdf()
|
||||
assert b'/Dest (lipsum)' in pdf
|
||||
link = re.search(
|
||||
b'\\(lipsum\\) \\[ \\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+) ]',
|
||||
b'\\(lipsum\\) \\[\\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+)]',
|
||||
pdf).group(1)
|
||||
assert [float(number) for number in link.split()] == [0, TOP, 0]
|
||||
rect = re.search(
|
||||
b'/Rect \\[ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+ [\\d\\.]+) \\]',
|
||||
b'/Rect \\[([\\d\\.]+ [\\d\\.]+ [\\d\\.]+ [\\d\\.]+)\\]',
|
||||
pdf).group(1)
|
||||
assert [float(number) for number in rect.split()] == [0, TOP, RIGHT, TOP]
|
||||
|
||||
@ -474,11 +474,11 @@ def test_missing_links():
|
||||
assert b'/Dest (lipsum)' in pdf
|
||||
assert len(logs) == 1
|
||||
link = re.search(
|
||||
b'\\(lipsum\\) \\[ \\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+) ]',
|
||||
b'\\(lipsum\\) \\[\\d+ 0 R /XYZ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+)]',
|
||||
pdf).group(1)
|
||||
assert [float(number) for number in link.split()] == [0, TOP - 15, 0]
|
||||
rect = re.search(
|
||||
b'/Rect \\[ ([\\d\\.]+ [\\d\\.]+ [\\d\\.]+ [\\d\\.]+) \\]',
|
||||
b'/Rect \\[([\\d\\.]+ [\\d\\.]+ [\\d\\.]+ [\\d\\.]+)\\]',
|
||||
pdf).group(1)
|
||||
assert [float(number) for number in rect.split()] == [
|
||||
0, TOP, RIGHT, TOP - 15]
|
||||
@ -495,8 +495,8 @@ def test_anchor_multiple_pages():
|
||||
<a href="#lipsum"></a>
|
||||
</div>
|
||||
''', base_url=None).write_pdf()
|
||||
first_page, = re.findall(b'/Kids \\[ (\\d+) 0 R', pdf)
|
||||
assert b'/Names [ (lipsum) [ ' + first_page in pdf
|
||||
first_page, = re.findall(b'/Kids \\[(\\d+) 0 R', pdf)
|
||||
assert b'/Names [(lipsum) [' + first_page in pdf
|
||||
|
||||
|
||||
@assert_no_logs
|
||||
@ -537,8 +537,7 @@ def test_embed_images_from_pages():
|
||||
string='<img src="not-optimized.jpg">').render().pages
|
||||
document = Document(
|
||||
(page1, page2), metadata=DocumentMetadata(),
|
||||
font_config=FontConfiguration(), url_fetcher=None,
|
||||
optimize_size=()).write_pdf()
|
||||
font_config=FontConfiguration(), url_fetcher=None).write_pdf()
|
||||
assert document.count(b'/Filter /DCTDecode') == 2
|
||||
|
||||
|
||||
@ -562,8 +561,8 @@ def test_document_info():
|
||||
b'006600740065007200a00061006c006c>') in pdf
|
||||
assert b'/Keywords (html, css, pdf)' in pdf
|
||||
assert b'/Subject <feff0042006c0061006820260020>' in pdf
|
||||
assert b'/CreationDate (20110421230000Z)' in pdf
|
||||
assert b"/ModDate (20130721234600+01'00)" in pdf
|
||||
assert b'/CreationDate (D:20110421230000Z)' in pdf
|
||||
assert b"/ModDate (D:20130721234600+01'00)" in pdf
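
The updated assertions expect PDF date strings to start with ``D:``. As a rough sketch of that syntax (not WeasyPrint's actual implementation), the hypothetical ``pdf_date`` helper below formats a ``datetime`` the same way as the two expected values; treating a naive ``datetime`` as UTC is an assumption made here for simplicity.

```python
from datetime import datetime, timedelta, timezone


def pdf_date(moment):
    """Format ``moment`` as a PDF date string (hypothetical helper)."""
    stamp = moment.strftime('D:%Y%m%d%H%M%S')
    offset = moment.utcoffset()
    # Assumption: naive datetimes are treated as UTC.
    if offset is None or offset == timedelta(0):
        return stamp + 'Z'
    total_minutes = int(offset.total_seconds()) // 60
    sign = '+' if total_minutes >= 0 else '-'
    hours, minutes = divmod(abs(total_minutes), 60)
    # Offset is written as hours, an apostrophe, then minutes.
    return f"{stamp}{sign}{hours:02d}'{minutes:02d}"


created = pdf_date(datetime(2011, 4, 21, 23, 0, 0, tzinfo=timezone.utc))
modified = pdf_date(datetime(
    2013, 7, 21, 23, 46, 0, tzinfo=timezone(timedelta(hours=1))))
```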
|
||||
|
||||
|
||||
@assert_no_logs
|
||||
@ -717,6 +716,6 @@ def test_bleed(style, media, bleed, trim):
|
||||
<style>@page { %s }</style>
|
||||
<body>test
|
||||
''' % style).write_pdf()
|
||||
assert '/MediaBox [ {} {} {} {} ]'.format(*media).encode() in pdf
|
||||
assert '/BleedBox [ {} {} {} {} ]'.format(*bleed).encode() in pdf
|
||||
assert '/TrimBox [ {} {} {} {} ]'.format(*trim).encode() in pdf
|
||||
assert '/MediaBox [{} {} {} {}]'.format(*media).encode() in pdf
|
||||
assert '/BleedBox [{} {} {} {}]'.format(*bleed).encode() in pdf
|
||||
assert '/TrimBox [{} {} {} {}]'.format(*trim).encode() in pdf
|
||||
|
@ -458,8 +458,16 @@ def test_text_align_justify_no_break_between_children():
|
||||
assert span_3.position_x == 5 * 16 # (3 + 1) characters + 1 space
|
||||
|
||||
|
||||
@pytest.mark.parametrize('text', (
|
||||
'Lorem ipsum dolor<em>sit amet</em>',
|
||||
'Lorem ipsum <em>dolorsit</em> amet',
|
||||
'Lorem ipsum <em></em>dolorsit amet',
|
||||
'Lorem ipsum<em> </em>dolorsit amet',
|
||||
'Lorem ipsum<em> dolorsit</em> amet',
|
||||
'Lorem ipsum <em>dolorsit </em>amet',
|
||||
))
|
||||
@assert_no_logs
|
||||
def test_word_spacing():
|
||||
def test_word_spacing(text):
|
||||
# keep the empty <style> as a regression test: element.text is None
|
||||
# (Not a string.)
|
||||
page, = render_pages('''
|
||||
@ -470,15 +478,14 @@ def test_word_spacing():
|
||||
line, = body.children
|
||||
strong_1, = line.children
|
||||
|
||||
# TODO: Pango gives only half of word-spacing to a space at the end
|
||||
# of a TextBox. Is this what we want?
|
||||
page, = render_pages('''
|
||||
<style>strong { word-spacing: 11px }</style>
|
||||
<body><strong>Lorem ipsum dolor<em>sit amet</em></strong>''')
|
||||
<body><strong>%s</strong>''' % text)
|
||||
html, = page.children
|
||||
body, = html.children
|
||||
line, = body.children
|
||||
strong_2, = line.children
|
||||
|
||||
assert strong_2.width - strong_1.width == 33
|
||||
|
||||
|
||||
@ -1018,6 +1025,19 @@ def test_overflow_wrap_trailing_space(wrap, text, body_width, expected_width):
|
||||
assert td.width == expected_width
|
||||
|
||||
|
||||
def test_line_break_before_trailing_space():
|
||||
# Test regression: https://github.com/Kozea/WeasyPrint/issues/1852
|
||||
page, = render_pages('''
|
||||
<p style="display: inline-block">test\u2028 </p>a
|
||||
<p style="display: inline-block">test\u2028</p>a
|
||||
''')
|
||||
html, = page.children
|
||||
body, = html.children
|
||||
line, = body.children
|
||||
p1, space1, p2, space2 = line.children
|
||||
assert p1.width == p2.width
|
||||
|
||||
|
||||
def white_space_lines(width, space):
|
||||
page, = render_pages('''
|
||||
<style>
|
||||
|
@ -8,7 +8,7 @@ import sys
|
||||
import threading
|
||||
import wsgiref.simple_server
|
||||
|
||||
from weasyprint import CSS, HTML, images
|
||||
from weasyprint import CSS, DEFAULT_OPTIONS, HTML, images
|
||||
from weasyprint.css import get_all_computed_styles
|
||||
from weasyprint.css.counters import CounterStyle
|
||||
from weasyprint.css.targets import TargetCollector
|
||||
@ -29,30 +29,40 @@ TEST_UA_STYLESHEET = CSS(filename=os.path.join(
|
||||
os.path.dirname(__file__), '..', 'weasyprint', 'css', 'tests_ua.css'
|
||||
))
|
||||
|
||||
PROPER_CHILDREN = dict((key, tuple(map(tuple, value))) for key, value in {
|
||||
PROPER_CHILDREN = {
|
||||
# Children can be of *any* type in *one* of the lists.
|
||||
boxes.BlockContainerBox: [[boxes.BlockLevelBox], [boxes.LineBox]],
|
||||
boxes.LineBox: [[boxes.InlineLevelBox]],
|
||||
boxes.InlineBox: [[boxes.InlineLevelBox]],
|
||||
boxes.TableBox: [[boxes.TableCaptionBox,
|
||||
boxes.TableColumnGroupBox, boxes.TableColumnBox,
|
||||
boxes.TableRowGroupBox, boxes.TableRowBox]],
|
||||
boxes.InlineTableBox: [[boxes.TableCaptionBox,
|
||||
boxes.TableColumnGroupBox, boxes.TableColumnBox,
|
||||
boxes.TableRowGroupBox, boxes.TableRowBox]],
|
||||
boxes.TableColumnGroupBox: [[boxes.TableColumnBox]],
|
||||
boxes.TableRowGroupBox: [[boxes.TableRowBox]],
|
||||
boxes.TableRowBox: [[boxes.TableCellBox]],
|
||||
}.items())
|
||||
boxes.BlockContainerBox: ((boxes.BlockLevelBox,), (boxes.LineBox,)),
|
||||
boxes.LineBox: ((boxes.InlineLevelBox,),),
|
||||
boxes.InlineBox: ((boxes.InlineLevelBox,),),
|
||||
boxes.TableBox: ((
|
||||
boxes.TableCaptionBox, boxes.TableColumnGroupBox, boxes.TableColumnBox,
|
||||
boxes.TableRowGroupBox, boxes.TableRowBox),),
|
||||
boxes.InlineTableBox: ((
|
||||
boxes.TableCaptionBox, boxes.TableColumnGroupBox, boxes.TableColumnBox,
|
||||
boxes.TableRowGroupBox, boxes.TableRowBox),),
|
||||
boxes.TableColumnGroupBox: ((boxes.TableColumnBox,),),
|
||||
boxes.TableRowGroupBox: ((boxes.TableRowBox,),),
|
||||
boxes.TableRowBox: ((boxes.TableCellBox,),),
|
||||
}
|
||||
|
||||
|
||||
class FakeHTML(HTML):
|
||||
"""Like weasyprint.HTML, but with a lighter UA stylesheet."""
|
||||
def __init__(self, *args, force_uncompressed_pdf=True, **kwargs):
|
||||
super().__init__(*args, **kwargs)
|
||||
self._force_uncompressed_pdf = force_uncompressed_pdf
|
||||
|
||||
def _ua_stylesheets(self, forms=False):
|
||||
return [
|
||||
TEST_UA_STYLESHEET if stylesheet == HTML5_UA_STYLESHEET
|
||||
else stylesheet for stylesheet in super()._ua_stylesheets(forms)]
|
||||
|
||||
def write_pdf(self, target=None, zoom=1, finisher=None, **options):
|
||||
# Override function to force the generation of uncompressed PDFs
|
||||
if self._force_uncompressed_pdf:
|
||||
options['uncompressed_pdf'] = True
|
||||
return super().write_pdf(target, zoom, finisher, **options)
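
The pattern used by ``FakeHTML`` (a subclass that forces one option unless told otherwise, combined with ``functools.partial`` to pre-set the constructor flag) can be tried in isolation. The sketch below uses a stand-in ``Renderer`` class instead of ``weasyprint.HTML``; ``Renderer`` and ``FakeRenderer`` are hypothetical names, and ``write_pdf`` here only returns the options it receives.

```python
from functools import partial


class Renderer:
    """Stand-in for a class like ``HTML``; hypothetical, for illustration."""

    def write_pdf(self, target=None, **options):
        # Return the options instead of rendering anything.
        return options


class FakeRenderer(Renderer):
    def __init__(self, *args, force_uncompressed_pdf=True, **kwargs):
        super().__init__(*args, **kwargs)
        self._force_uncompressed_pdf = force_uncompressed_pdf

    def write_pdf(self, target=None, **options):
        # Force uncompressed output so tests can search the raw bytes.
        if self._force_uncompressed_pdf:
            options['uncompressed_pdf'] = True
        return super().write_pdf(target, **options)


forced = FakeRenderer().write_pdf()

# Same idea as ``partial(FakeHTML, force_uncompressed_pdf=False)``.
Unforced = partial(FakeRenderer, force_uncompressed_pdf=False)
untouched = Unforced().write_pdf(hinting=True)
```

Defaulting the flag to ``True`` keeps existing tests readable as plain text, while the command-line tests can opt out and exercise the real compression settings.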
|
||||
|
||||
|
||||
def resource_filename(basename):
|
||||
"""Return the absolute path of the resource called ``basename``."""
|
||||
@ -182,7 +192,7 @@ def _parse_base(html_content, base_url=BASE_URL):
|
||||
style_for = get_all_computed_styles(document, counter_style=counter_style)
|
||||
get_image_from_uri = functools.partial(
|
||||
images.get_image_from_uri, cache={}, url_fetcher=document.url_fetcher,
|
||||
optimize_size=())
|
||||
options=DEFAULT_OPTIONS)
|
||||
target_collector = TargetCollector()
|
||||
footnotes = []
|
||||
return (
|
||||
|
@ -15,11 +15,73 @@ import cssselect2
|
||||
import html5lib
|
||||
import tinycss2
|
||||
|
||||
VERSION = __version__ = '58.1'
|
||||
VERSION = __version__ = '59.0b1'
|
||||
|
||||
#: Default values for command-line and Python API options. See
|
||||
#: :func:`__main__.main` to learn more about specific options for
|
||||
#: command-line.
|
||||
#:
|
||||
#: :param list stylesheets:
|
||||
#: An optional list of user stylesheets. The list can include
|
||||
#: :class:`CSS` objects, filenames, URLs, or file-like
|
||||
#: objects. (See :ref:`Stylesheet Origins`.)
|
||||
#: :param str media_type:
|
||||
#: Media type to use for @media.
|
||||
#: :param list attachments:
|
||||
#: A list of additional file attachments for the generated PDF
|
||||
#: document or :obj:`None`. The list's elements are
|
||||
#: :class:`Attachment` objects, filenames, URLs or file-like objects.
|
||||
#: :param bytes pdf_identifier:
|
||||
#: A bytestring used as PDF file identifier.
|
||||
#: :param str pdf_variant:
|
||||
#: A PDF variant name.
|
||||
#: :param str pdf_version:
|
||||
#: A PDF version number.
|
||||
#: :param bool pdf_forms:
|
||||
#: Whether PDF forms have to be included.
|
||||
#: :param bool uncompressed_pdf:
|
||||
#: Whether PDF content should be left uncompressed.
|
||||
#: :param bool custom_metadata:
|
||||
#: Whether custom HTML metadata should be stored in the generated PDF.
|
||||
#: :param bool presentational_hints:
|
||||
#: Whether HTML presentational hints are followed.
|
||||
#: :param bool optimize_images:
|
||||
#: Whether size of embedded images should be optimized, with no quality
|
||||
#: loss.
|
||||
#: :param int jpeg_quality:
|
||||
#: JPEG quality between 0 (worst) and 95 (best).
|
||||
#: :param int dpi:
|
||||
#: Maximum resolution of images embedded in the PDF.
|
||||
#: :param bool full_fonts:
|
||||
#: Whether unmodified font files should be embedded when possible.
|
||||
#: :param bool hinting:
|
||||
#: Whether hinting information should be kept in embedded fonts.
|
||||
#: :type cache: :obj:`dict`, :class:`pathlib.Path` or :obj:`str`
|
||||
#: :param cache:
|
||||
#: A dictionary used to cache images in memory, or a folder path where
|
||||
#: images are temporarily stored.
|
||||
DEFAULT_OPTIONS = {
|
||||
'stylesheets': None,
|
||||
'media_type': 'print',
|
||||
'attachments': None,
|
||||
'pdf_identifier': None,
|
||||
'pdf_variant': None,
|
||||
'pdf_version': None,
|
||||
'pdf_forms': None,
|
||||
'uncompressed_pdf': False,
|
||||
'custom_metadata': False,
|
||||
'presentational_hints': False,
|
||||
'optimize_images': False,
|
||||
'jpeg_quality': None,
|
||||
'dpi': None,
|
||||
'full_fonts': False,
|
||||
'hinting': False,
|
||||
'cache': None,
|
||||
}
|
||||
|
||||
__all__ = [
|
||||
'HTML', 'CSS', 'Attachment', 'Document', 'Page', 'default_url_fetcher',
|
||||
'VERSION', '__version__']
|
||||
'HTML', 'CSS', 'DEFAULT_OPTIONS', 'Attachment', 'Document', 'Page',
|
||||
'default_url_fetcher', 'VERSION', '__version__']
|
||||
|
||||
|
||||
# Import after setting the version, as the version is used in other modules
|
||||
@ -55,12 +117,15 @@ class HTML:
|
||||
Alternatively, use **one** named argument so that no guessing is involved:
|
||||
|
||||
:type filename: str or pathlib.Path
|
||||
:param filename: A filename, relative to the current directory, or
|
||||
absolute.
|
||||
:param str url: An absolute, fully qualified URL.
|
||||
:param filename:
|
||||
A filename, relative to the current directory, or absolute.
|
||||
:param str url:
|
||||
An absolute, fully qualified URL.
|
||||
:type file_obj: :term:`file object`
|
||||
:param file_obj: Any object with a ``read`` method.
|
||||
:param str string: A string of HTML source.
|
||||
:param file_obj:
|
||||
Any object with a ``read`` method.
|
||||
:param str string:
|
||||
A string of HTML source.
|
||||
|
||||
Specifying multiple inputs is an error:
|
||||
``HTML(filename="foo.html", url="localhost://bar.html")``
|
||||
@ -68,20 +133,22 @@ class HTML:
|
||||
|
||||
You can also pass optional named arguments:
|
||||
|
||||
:param str encoding: Force the source character encoding.
|
||||
:param str base_url: The base used to resolve relative URLs
|
||||
(e.g. in ``<img src="../foo.png">``). If not provided, try to use
|
||||
the input filename, URL, or ``name`` attribute of :term:`file objects
|
||||
<file object>`.
|
||||
:type url_fetcher: :term:`function`
|
||||
:param url_fetcher: A function or other callable
|
||||
with the same signature as :func:`default_url_fetcher` called to
|
||||
fetch external resources such as stylesheets and images.
|
||||
(See :ref:`URL Fetchers`.)
|
||||
:param str media_type: The media type to use for ``@media``.
|
||||
Defaults to ``'print'``. **Note:** In some cases like
|
||||
``HTML(string=foo)`` relative URLs will be invalid if ``base_url``
|
||||
is not provided.
|
||||
:param str encoding:
|
||||
Force the source character encoding.
|
||||
:param str base_url:
|
||||
The base used to resolve relative URLs (e.g. in
|
||||
``<img src="../foo.png">``). If not provided, try to use the input
|
||||
filename, URL, or ``name`` attribute of
|
||||
:term:`file objects <file object>`.
|
||||
:type url_fetcher: :term:`callable`
|
||||
:param url_fetcher:
|
||||
A function or other callable with the same signature as
|
||||
:func:`default_url_fetcher` called to fetch external resources such as
|
||||
stylesheets and images. (See :ref:`URL Fetchers`.)
|
||||
:param str media_type:
|
||||
The media type to use for ``@media``. Defaults to ``'print'``.
|
||||
**Note:** In some cases like ``HTML(string=foo)`` relative URLs will be
|
||||
invalid if ``base_url`` is not provided.
|
||||
|
||||
"""
|
||||
def __init__(self, guess=None, filename=None, url=None, file_obj=None,
|
||||
@ -119,42 +186,32 @@ class HTML:
|
||||
def _ph_stylesheets(self):
|
||||
return [HTML5_PH_STYLESHEET]
|
||||
|
||||
def render(self, stylesheets=None, presentational_hints=False,
|
||||
optimize_size=('fonts',), font_config=None, counter_style=None,
|
||||
image_cache=None, forms=False):
|
||||
def render(self, font_config=None, counter_style=None, **options):
|
||||
"""Lay out and paginate the document, but do not (yet) export it.
|
||||
|
||||
This returns a :class:`document.Document` object which provides
|
||||
access to individual pages and various meta-data.
|
||||
See :meth:`write_pdf` to get a PDF directly.
|
||||
|
||||
:param list stylesheets:
|
||||
An optional list of user stylesheets. List elements are
|
||||
:class:`CSS` objects, filenames, URLs, or file
|
||||
objects. (See :ref:`Stylesheet Origins`.)
|
||||
:param bool presentational_hints:
|
||||
Whether HTML presentational hints are followed.
|
||||
:param tuple optimize_size:
|
||||
Optimize size of generated PDF. Can contain "images" and "fonts".
|
||||
:type font_config: :class:`text.fonts.FontConfiguration`
|
||||
:param font_config: A font configuration handling ``@font-face`` rules.
|
||||
:param font_config:
|
||||
A font configuration handling ``@font-face`` rules.
|
||||
:type counter_style: :class:`css.counters.CounterStyle`
|
||||
:param counter_style: A dictionary storing ``@counter-style`` rules.
|
||||
:param dict image_cache: A dictionary used to cache images.
|
||||
:param bool forms: Whether PDF forms have to be included.
|
||||
:param counter_style:
|
||||
A dictionary storing ``@counter-style`` rules.
|
||||
:param options:
|
||||
The ``options`` parameter includes by default the
|
||||
:data:`DEFAULT_OPTIONS` values.
|
||||
:returns: A :class:`document.Document` object.
|
||||
|
||||
"""
|
||||
return Document._render(
|
||||
self, stylesheets, presentational_hints, optimize_size,
|
||||
font_config, counter_style, image_cache, forms)
|
||||
new_options = DEFAULT_OPTIONS.copy()
|
||||
new_options.update(options)
|
||||
options = new_options
|
||||
return Document._render(self, font_config, counter_style, options)
|
||||
|
||||
def write_pdf(self, target=None, stylesheets=None, zoom=1,
|
||||
attachments=None, finisher=None, presentational_hints=False,
|
||||
optimize_size=('fonts',), font_config=None,
|
||||
counter_style=None, image_cache=None, identifier=None,
|
||||
variant=None, version=None, forms=False,
|
||||
custom_metadata=False):
|
||||
def write_pdf(self, target=None, zoom=1, finisher=None,
|
||||
font_config=None, counter_style=None, **options):
|
||||
"""Render the document to a PDF file.
|
||||
|
||||
This is a shortcut for calling :meth:`render`, then
|
||||
@ -165,49 +222,37 @@ class HTML:
|
||||
:param target:
|
||||
A filename where the PDF file is generated, a file object, or
|
||||
:obj:`None`.
|
||||
:param list stylesheets:
|
||||
An optional list of user stylesheets. The list's elements
|
||||
are :class:`CSS` objects, filenames, URLs, or file-like
|
||||
objects. (See :ref:`Stylesheet Origins`.)
|
||||
:param float zoom:
|
||||
The zoom factor in PDF units per CSS units. **Warning**:
|
||||
All CSS units are affected, including physical units like
|
||||
``cm`` and named sizes like ``A4``. For values other than
|
||||
1, the physical CSS units will thus be "wrong".
|
||||
:param list attachments: A list of additional file attachments for the
|
||||
generated PDF document or :obj:`None`. The list's elements are
|
||||
:class:`Attachment` objects, filenames, URLs or file-like objects.
|
||||
:param finisher: A finisher function, that accepts the document and a
|
||||
:class:`pydyf.PDF` object as parameters, can be passed to perform
|
||||
:type finisher: :term:`callable`
|
||||
:param finisher:
|
||||
A finisher function or callable that accepts the document and a
|
||||
:class:`pydyf.PDF` object as parameters. Can be passed to perform
|
||||
post-processing on the PDF right before the trailer is written.
|
||||
:param bool presentational_hints: Whether HTML presentational hints are
|
||||
followed.
|
||||
:param tuple optimize_size:
|
||||
Optimize size of generated PDF. Can contain "images" and "fonts".
|
||||
:type font_config: :class:`text.fonts.FontConfiguration`
|
||||
:param font_config: A font configuration handling ``@font-face`` rules.
|
||||
:param font_config:
|
||||
A font configuration handling ``@font-face`` rules.
|
||||
:type counter_style: :class:`css.counters.CounterStyle`
|
||||
:param counter_style: A dictionary storing ``@counter-style`` rules.
|
||||
:param dict image_cache: A dictionary used to cache images.
|
||||
:param bytes identifier: A bytestring used as PDF file identifier.
|
||||
:param str variant: A PDF variant name.
|
||||
:param str version: A PDF version number.
|
||||
:param bool forms: Whether PDF forms have to be included.
|
||||
:param bool custom_metadata: Whether custom HTML metadata should be
|
||||
stored in the generated PDF.
|
||||
:param counter_style:
|
||||
A dictionary storing ``@counter-style`` rules.
|
||||
:param options:
|
||||
The ``options`` parameter includes by default the
|
||||
:data:`DEFAULT_OPTIONS` values.
|
||||
:returns:
|
||||
The PDF as :obj:`bytes` if ``target`` is not provided or
|
||||
:obj:`None`, otherwise :obj:`None` (the PDF is written to
|
||||
``target``).
|
||||
|
||||
"""
|
||||
new_options = DEFAULT_OPTIONS.copy()
|
||||
new_options.update(options)
|
||||
options = new_options
|
||||
return (
|
||||
self.render(
|
||||
stylesheets, presentational_hints, optimize_size, font_config,
|
||||
counter_style, image_cache, forms)
|
||||
.write_pdf(
|
||||
target, zoom, attachments, finisher, identifier, variant,
|
||||
version, custom_metadata))
|
||||
self.render(font_config, counter_style, **options)
|
||||
.write_pdf(target, zoom, finisher, **options))
|
||||
|
||||
|
||||
class CSS:
|
||||
@ -263,8 +308,9 @@ class Attachment:
|
||||
supported. An optional description can be provided with the ``description``
|
||||
argument.
|
||||
|
||||
:param description: A description of the attachment to be included in the
|
||||
PDF document. May be :obj:`None`.
|
||||
:param description:
|
||||
A description of the attachment to be included in the PDF document.
|
||||
May be :obj:`None`.
|
||||
|
||||
"""
|
||||
def __init__(self, guess=None, filename=None, url=None, file_obj=None,
|
||||
|
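The option handling introduced in ``render`` and ``write_pdf`` above can be sketched in isolation. This is a minimal illustration, not WeasyPrint's own code: the dictionary is a reduced copy of the defaults shown in the diff, and ``merge_options`` is a hypothetical helper name used only here.

```python
# Reduced copy of the defaults shown in the diff (only a few keys are kept).
DEFAULT_OPTIONS = {
    'stylesheets': None,
    'media_type': 'print',
    'pdf_forms': None,
    'uncompressed_pdf': False,
    'cache': None,
}


def merge_options(**options):
    """Hypothetical helper: overlay caller options on a copy of the defaults."""
    merged = DEFAULT_OPTIONS.copy()  # shallow copy, the module-level dict is untouched
    merged.update(options)  # caller-supplied values win over defaults
    return merged


merged = merge_options(uncompressed_pdf=True, media_type='screen')
```

Because the defaults are copied before being updated, repeated calls never leak one caller's overrides into the next call. Keys missing from the call keep their default value.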
@ -4,10 +4,11 @@ import argparse
|
||||
import logging
|
||||
import platform
|
||||
import sys
|
||||
from warnings import warn
|
||||
|
||||
import pydyf
|
||||
|
||||
from . import HTML, LOGGER, __version__
|
||||
from . import DEFAULT_OPTIONS, HTML, LOGGER, __version__
|
||||
from .pdf import VARIANTS
|
||||
from .text.ffi import pango
|
||||
|
||||
@ -27,148 +28,125 @@ class PrintInfo(argparse.Action):
|
||||
sys.exit()
|
||||
|
||||
|
||||
def main(argv=None, stdout=None, stdin=None):
|
||||
class Parser(argparse.ArgumentParser):
|
||||
def __init__(self, *args, **kwargs):
|
||||
self._arguments = {}
|
||||
super().__init__(*args, **kwargs)
|
||||
|
||||
def add_argument(self, *args, **kwargs):
|
||||
super().add_argument(*args, **kwargs)
|
||||
key = args[-1].lstrip('-')
|
||||
kwargs['flags'] = args
|
||||
kwargs['positional'] = args[-1][0] != '-'
|
||||
self._arguments[key] = kwargs
|
||||
|
||||
@property
|
||||
def docstring(self):
|
||||
self._arguments['help'] = self._arguments.pop('help')
|
||||
data = []
|
||||
for key, args in self._arguments.items():
|
||||
data.append('.. option:: ')
|
||||
action = args.get('action', 'store')
|
||||
for flag in args['flags']:
|
||||
data.append(flag)
|
||||
if not args['positional'] and action in ('store', 'append'):
|
||||
data.append(f' <{key}>')
|
||||
data.append(', ')
|
||||
data[-1] = '\n\n'
|
||||
data.append(f' {args["help"][0].upper()}{args["help"][1:]}.\n\n')
|
||||
if 'choices' in args:
|
||||
choices = ", ".join(args['choices'])
|
||||
data.append(f' Possible choices: {choices}.\n\n')
|
||||
if action == 'append':
|
||||
data.append(' This option can be passed multiple times.\n\n')
|
||||
return ''.join(data)
|
||||
|
||||
|
||||
PARSER = Parser(
|
||||
prog='weasyprint', description='Render web pages to PDF.')
|
||||
PARSER.add_argument(
|
||||
'input', help='URL or filename of the HTML input, or - for stdin')
|
||||
PARSER.add_argument(
|
||||
'output', help='filename where output is written, or - for stdout')
|
||||
PARSER.add_argument(
|
||||
'-e', '--encoding', help='force the input character encoding')
|
||||
PARSER.add_argument(
|
||||
'-s', '--stylesheet', action='append', dest='stylesheets',
|
||||
help='URL or filename for a user CSS stylesheet')
|
||||
PARSER.add_argument(
|
||||
'-m', '--media-type',
|
||||
help='media type to use for @media, defaults to print')
|
||||
PARSER.add_argument(
|
||||
'-u', '--base-url',
|
||||
help='base for relative URLs in the HTML input, defaults to the '
|
||||
'input’s own filename or URL or the current directory for stdin')
|
||||
PARSER.add_argument(
|
||||
'-a', '--attachment', action='append', dest='attachments',
|
||||
help='URL or filename of a file to attach to the PDF document')
|
||||
PARSER.add_argument('--pdf-identifier', help='PDF file identifier')
|
||||
PARSER.add_argument(
|
||||
'--pdf-variant', choices=VARIANTS, help='PDF variant to generate')
|
||||
PARSER.add_argument('--pdf-version', help='PDF version number')
|
||||
PARSER.add_argument(
|
||||
'--pdf-forms', action='store_true', help='include PDF forms')
|
||||
PARSER.add_argument(
|
||||
'--uncompressed-pdf', action='store_true',
|
||||
help='do not compress PDF content, mainly for debugging purposes')
|
||||
PARSER.add_argument(
|
||||
'--custom-metadata', action='store_true',
|
||||
help='include custom HTML meta tags in PDF metadata')
|
||||
PARSER.add_argument(
|
||||
'-p', '--presentational-hints', action='store_true',
|
||||
help='follow HTML presentational hints')
|
||||
PARSER.add_argument(
|
||||
'--optimize-images', action='store_true',
|
||||
help='optimize size of embedded images with no quality loss')
|
||||
PARSER.add_argument(
|
||||
'-j', '--jpeg-quality', type=int,
|
||||
help='JPEG quality between 0 (worst) and 95 (best)')
|
||||
PARSER.add_argument(
|
||||
'--full-fonts', action='store_true',
|
||||
help='embed unmodified font files when possible')
|
||||
PARSER.add_argument(
|
||||
'--hinting', action='store_true',
|
||||
help='keep hinting information in embedded fonts')
|
||||
PARSER.add_argument(
|
||||
'-c', '--cache-folder', dest='cache',
|
||||
help='store cache on disk instead of memory, folder is '
|
||||
'created if needed and cleaned after the PDF is generated')
|
||||
PARSER.add_argument(
|
||||
'-D', '--dpi', type=int,
|
||||
help='set maximum resolution of images embedded in the PDF')
|
||||
PARSER.add_argument(
|
||||
'-v', '--verbose', action='store_true',
|
||||
help='show warnings and information messages')
|
||||
PARSER.add_argument(
|
||||
'-d', '--debug', action='store_true', help='show debugging messages')
|
||||
PARSER.add_argument(
|
||||
'-q', '--quiet', action='store_true', help='hide logging messages')
|
||||
PARSER.add_argument(
|
||||
'--version', action='version',
|
||||
version=f'WeasyPrint version {__version__}',
|
||||
help='print WeasyPrint’s version number and exit')
|
||||
PARSER.add_argument(
|
||||
'-i', '--info', action=PrintInfo, nargs=0,
|
||||
help='print system information and exit')
|
||||
PARSER.add_argument(
|
||||
'-O', '--optimize-size', action='append',
|
||||
help='deprecated, use other options instead',
|
||||
choices=('images', 'fonts', 'hinting', 'pdf', 'all', 'none'))
|
||||
PARSER.set_defaults(**DEFAULT_OPTIONS)
|
||||
|
||||
|
||||
def main(argv=None, stdout=None, stdin=None, HTML=HTML):
|
||||
"""The ``weasyprint`` program takes at least two arguments:
|
||||
|
||||
.. code-block:: sh
|
||||
|
||||
weasyprint [options] <input> <output>
|
||||
|
||||
The input is a filename or URL to an HTML document, or ``-`` to read
|
||||
HTML from stdin. The output is a filename, or ``-`` to write to stdout.
|
||||
|
||||
Options can be mixed anywhere before, between, or after the input and
|
||||
output.
|
||||
|
||||
.. option:: -e <input_encoding>, --encoding <input_encoding>
|
||||
|
||||
Force the input character encoding (e.g. ``-e utf-8``).
|
||||
|
||||
.. option:: -s <filename_or_URL>, --stylesheet <filename_or_URL>
|
||||
|
||||
Filename or URL of a user cascading stylesheet (see
|
||||
:ref:`Stylesheet Origins`) to add to the document
|
||||
(e.g. ``-s print.css``). Multiple stylesheets are allowed.
|
||||
|
||||
.. option:: -m <type>, --media-type <type>
|
||||
|
||||
Set the media type to use for ``@media``. Defaults to ``print``.
|
||||
|
||||
.. option:: -u <URL>, --base-url <URL>
|
||||
|
||||
Set the base for relative URLs in the HTML input.
|
||||
Defaults to the input’s own URL, or the current directory for stdin.
|
||||
|
||||
.. option:: -a <file>, --attachment <file>
|
||||
|
||||
Adds an attachment to the document. The attachment is included in the
|
||||
PDF output. This option can be used multiple times.
|
||||
|
||||
.. option:: --pdf-identifier <identifier>
|
||||
|
||||
PDF file identifier, used to check whether two different files
|
||||
are two different versions of the same original document.
|
||||
|
||||
.. option:: --pdf-variant <variant-name>
|
||||
|
||||
PDF variant to generate (e.g. ``--pdf-variant pdf/a-3b``).
|
||||
|
||||
.. option:: --pdf-version <version-number>
|
||||
|
||||
PDF version number (default is 1.7).
|
||||
|
||||
.. option:: --custom-metadata
|
||||
|
||||
Include custom HTML meta tags in PDF metadata.
|
||||
|
||||
.. option:: -p, --presentational-hints
|
||||
|
||||
Follow `HTML presentational hints
|
||||
<https://www.w3.org/TR/html/rendering.html\
|
||||
#the-css-user-agent-style-sheet-and-presentational-hints>`_.
|
||||
|
||||
.. option:: -O <type>, --optimize-size <type>
|
||||
|
||||
Optimize the size of generated documents. Supported types are
|
||||
``images``, ``fonts``, ``all`` and ``none``. This option can be used
|
||||
multiple times, ``all`` adds all allowed values, ``none`` removes all
|
||||
previously set values.
|
||||
|
||||
.. option:: -v, --verbose
|
||||
|
||||
Show warnings and information messages.
|
||||
|
||||
.. option:: -d, --debug
|
||||
|
||||
Show debugging messages.
|
||||
|
||||
.. option:: -q, --quiet
|
||||
|
||||
Hide logging messages.
|
||||
|
||||
.. option:: --version
|
||||
|
||||
Show the version number. Other options and arguments are ignored.
|
||||
|
||||
.. option:: -h, --help
|
||||
|
||||
Show the command-line usage. Other options and arguments are ignored.
|
||||
|
||||
"""
|
||||
parser = argparse.ArgumentParser(
|
||||
prog='weasyprint', description='Render web pages to PDF.')
|
||||
parser.add_argument(
|
||||
'--version', action='version',
|
||||
version=f'WeasyPrint version {__version__}',
|
||||
help='print WeasyPrint’s version number and exit')
|
||||
parser.add_argument(
|
||||
'-i', '--info', action=PrintInfo, nargs=0,
|
||||
help='print system information and exit')
|
||||
parser.add_argument(
|
||||
'-e', '--encoding', help='character encoding of the input')
|
||||
parser.add_argument(
|
||||
'-s', '--stylesheet', action='append',
|
||||
help='URL or filename for a user CSS stylesheet, '
|
||||
'may be given multiple times')
|
||||
parser.add_argument(
|
||||
'-m', '--media-type', default='print',
|
||||
help='media type to use for @media, defaults to print')
|
||||
parser.add_argument(
|
||||
'-u', '--base-url',
|
||||
help='base for relative URLs in the HTML input, defaults to the '
|
||||
'input’s own filename or URL or the current directory for stdin')
|
||||
parser.add_argument(
|
||||
'-a', '--attachment', action='append',
|
||||
help='URL or filename of a file to attach to the PDF document')
|
||||
parser.add_argument('--pdf-identifier', help='PDF file identifier')
|
||||
parser.add_argument(
|
||||
'--pdf-variant', choices=VARIANTS, help='PDF variant to generate')
|
||||
parser.add_argument('--pdf-version', help='PDF version number')
|
||||
parser.add_argument(
|
||||
'--pdf-forms', action='store_true', help='Include PDF forms')
|
||||
parser.add_argument(
|
||||
'--custom-metadata', action='store_true',
|
||||
help='include custom HTML meta tags in PDF metadata')
|
||||
parser.add_argument(
|
||||
'-p', '--presentational-hints', action='store_true',
|
||||
help='follow HTML presentational hints')
|
||||
parser.add_argument(
|
||||
'-O', '--optimize-size', action='append',
|
||||
help='optimize output size for specified features',
|
||||
choices=('images', 'fonts', 'all', 'none'), default=['fonts'])
|
||||
parser.add_argument(
|
||||
'-v', '--verbose', action='store_true',
|
||||
help='show warnings and information messages')
|
||||
parser.add_argument(
|
||||
'-d', '--debug', action='store_true', help='show debugging messages')
|
||||
parser.add_argument(
|
||||
'-q', '--quiet', action='store_true', help='hide logging messages')
|
||||
parser.add_argument(
|
||||
'input', help='URL or filename of the HTML input, or - for stdin')
|
||||
parser.add_argument(
|
||||
'output', help='filename where output is written, or - for stdout')
|
||||
|
||||
args = parser.parse_args(argv)
|
||||
args = PARSER.parse_args(argv)
|
||||
|
||||
if args.input == '-':
|
||||
source = stdin or sys.stdin.buffer
|
||||
@ -184,26 +162,34 @@ def main(argv=None, stdout=None, stdin=None):
|
||||
else:
|
||||
output = args.output
|
||||
|
||||
optimize_size = set()
|
||||
for arg in args.optimize_size:
|
||||
if arg == 'none':
|
||||
optimize_size.clear()
|
||||
elif arg == 'all':
|
||||
optimize_size |= {'images', 'fonts'}
|
||||
else:
|
||||
optimize_size.add(arg)
|
||||
# TODO: to be removed when --optimize-size is removed
|
||||
optimize_size = {'fonts', 'hinting', 'pdf'}
|
||||
if args.optimize_size is not None:
|
||||
warn(
|
||||
'The --optimize-size option is now deprecated '
|
||||
'and will be removed in the next version. '
|
||||
'Please use the other options available in --help instead.',
|
||||
category=FutureWarning)
|
||||
for arg in args.optimize_size:
|
||||
if arg == 'none':
|
||||
optimize_size.clear()
|
||||
elif arg == 'all':
|
||||
optimize_size |= {'images', 'fonts', 'hinting', 'pdf'}
|
||||
else:
|
||||
optimize_size.add(arg)
|
||||
del args.optimize_size
|
||||
|
||||
kwargs = {
|
||||
'stylesheets': args.stylesheet,
|
||||
'presentational_hints': args.presentational_hints,
|
||||
'optimize_size': tuple(optimize_size),
|
||||
'attachments': args.attachment,
|
||||
'identifier': args.pdf_identifier,
|
||||
'variant': args.pdf_variant,
|
||||
'version': args.pdf_version,
|
||||
'forms': args.pdf_forms,
|
||||
'custom_metadata': args.custom_metadata,
|
||||
}
|
||||
options = vars(args)
|
||||
|
||||
# TODO: to be removed when --optimize-size is removed
|
||||
if 'images' in optimize_size:
|
||||
options['optimize_images'] = True
|
||||
if 'fonts' not in optimize_size:
|
||||
options['full_fonts'] = True
|
||||
if 'hinting' not in optimize_size:
|
||||
options['hinting'] = True
|
||||
if 'pdf' not in optimize_size:
|
||||
options['uncompressed_pdf'] = True
|
||||
|
||||
# Default to logging to stderr.
|
||||
if args.debug:
|
||||
@ -218,7 +204,10 @@ def main(argv=None, stdout=None, stdin=None):
|
||||
html = HTML(
|
||||
source, base_url=args.base_url, encoding=args.encoding,
|
||||
media_type=args.media_type)
|
||||
html.write_pdf(output, **kwargs)
|
||||
html.write_pdf(output, **options)
|
||||
|
||||
|
||||
main.__doc__ += '\n\n' + PARSER.docstring
|
||||
|
||||
|
||||
if __name__ == '__main__': # pragma: no cover
|
||||
|
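The translation of the deprecated ``--optimize-size`` values into the new boolean options, as done in ``main()`` above, can be sketched as a standalone function. ``legacy_optimize_size_to_options`` is a hypothetical name, and unlike ``main()``, which writes into ``vars(args)``, this sketch returns only the overrides.

```python
def legacy_optimize_size_to_options(values):
    """Hypothetical helper mirroring the deprecated -O handling in main()."""
    # Default set used by the diff when -O is not given at all.
    optimize_size = {'fonts', 'hinting', 'pdf'}
    for value in values or ():
        if value == 'none':
            optimize_size.clear()
        elif value == 'all':
            optimize_size |= {'images', 'fonts', 'hinting', 'pdf'}
        else:
            optimize_size.add(value)

    # Map the legacy flags onto the new options. Three of the four new
    # options are phrased negatively relative to the old flags.
    options = {}
    if 'images' in optimize_size:
        options['optimize_images'] = True
    if 'fonts' not in optimize_size:
        options['full_fonts'] = True
    if 'hinting' not in optimize_size:
        options['hinting'] = True
    if 'pdf' not in optimize_size:
        options['uncompressed_pdf'] = True
    return options


no_flag = legacy_optimize_size_to_options(None)
only_none = legacy_optimize_size_to_options(['none'])
only_all = legacy_optimize_size_to_options(['all'])
```

Values are processed in order, so ``none`` followed by ``images`` clears the defaults first and then re-enables image optimization only.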
@ -2,9 +2,10 @@
|
||||
|
||||
import functools
|
||||
import io
|
||||
import shutil
|
||||
from hashlib import md5
|
||||
from pathlib import Path
|
||||
|
||||
from . import CSS
|
||||
from . import CSS, DEFAULT_OPTIONS
|
||||
from .anchors import gather_anchors, make_page_bookmark_tree
|
||||
from .css import get_all_computed_styles
|
||||
from .css.counters import CounterStyle
|
||||
@ -159,6 +160,51 @@ class DocumentMetadata:
|
||||
self.custom = custom or {}
|
||||
|
||||
|
||||
class DiskCache:
|
||||
"""Dict-like object storing image content on disk.
|
||||
|
||||
Bytestring values are stored on disk. Other lightweight Python objects
|
||||
(i.e. RasterImage instances) are still stored in memory.
|
||||
|
||||
"""
|
||||
def __init__(self, folder):
|
||||
self._path = Path(folder)
|
||||
self._path.mkdir(parents=True, exist_ok=True)
|
||||
self._memory_cache = {}
|
||||
self._disk_paths = set()
|
||||
|
||||
def _path_from_key(self, key):
|
||||
return self._path / md5(key.encode()).hexdigest()
|
||||
|
||||
def __getitem__(self, key):
|
||||
if key in self._memory_cache:
|
||||
return self._memory_cache[key]
|
||||
else:
|
||||
return self._path_from_key(key).read_bytes()
|
||||
|
||||
def __setitem__(self, key, value):
|
||||
if isinstance(value, bytes):
|
||||
path = self._path_from_key(key)
|
||||
self._disk_paths.add(path)
|
||||
path.write_bytes(value)
|
||||
else:
|
||||
self._memory_cache[key] = value
|
||||
|
||||
def __contains__(self, key):
|
||||
return (
|
||||
key in self._memory_cache or
|
||||
self._path_from_key(key).exists())
|
||||
|
||||
def __del__(self):
|
||||
try:
|
||||
for path in self._disk_paths:
|
||||
path.unlink(missing_ok=True)
|
||||
self._path.rmdir()
|
||||
except Exception:
|
||||
# Silently ignore errors while clearing cache
|
||||
pass
|
||||
|
||||
|
||||
class Document:
|
||||
"""A rendered document ready to be painted in a pydyf stream.
|
||||
|
||||
@ -171,9 +217,7 @@ class Document:
|
||||
"""
|
||||
|
||||
@classmethod
|
||||
def _build_layout_context(cls, html, stylesheets, presentational_hints,
|
||||
optimize_size, font_config, counter_style,
|
||||
image_cache, forms):
|
||||
def _build_layout_context(cls, html, font_config, counter_style, options):
|
||||
if font_config is None:
|
||||
font_config = FontConfiguration()
|
||||
if counter_style is None:
|
||||
@ -181,19 +225,24 @@ class Document:
|
||||
target_collector = TargetCollector()
|
||||
page_rules = []
|
||||
user_stylesheets = []
|
||||
image_cache = {} if image_cache is None else image_cache
|
||||
for css in stylesheets or []:
|
||||
cache = options['cache']
|
||||
if cache is None:
|
||||
cache = {}
|
||||
elif not isinstance(cache, (dict, DiskCache)):
|
||||
cache = DiskCache(cache)
|
||||
for css in options['stylesheets'] or []:
|
||||
if not hasattr(css, 'matcher'):
|
||||
css = CSS(
|
||||
guess=css, media_type=html.media_type,
|
||||
font_config=font_config, counter_style=counter_style)
|
||||
user_stylesheets.append(css)
|
||||
style_for = get_all_computed_styles(
|
||||
html, user_stylesheets, presentational_hints, font_config,
|
||||
counter_style, page_rules, target_collector, forms)
|
||||
html, user_stylesheets, options['presentational_hints'],
|
||||
font_config, counter_style, page_rules, target_collector,
|
||||
options['pdf_forms'])
|
||||
get_image_from_uri = functools.partial(
|
||||
original_get_image_from_uri, cache=image_cache,
|
||||
url_fetcher=html.url_fetcher, optimize_size=optimize_size)
|
||||
original_get_image_from_uri, cache=cache,
|
||||
url_fetcher=html.url_fetcher, options=options)
|
||||
PROGRESS_LOGGER.info('Step 4 - Creating formatting structure')
|
||||
context = LayoutContext(
|
||||
style_for, get_image_from_uri, font_config, counter_style,
|
||||
@ -201,8 +250,7 @@ class Document:
|
||||
return context
|
||||
|
||||
@classmethod
|
||||
def _render(cls, html, stylesheets, presentational_hints, optimize_size,
|
||||
font_config, counter_style, image_cache, forms):
|
||||
def _render(cls, html, font_config, counter_style, options):
|
||||
if font_config is None:
|
||||
font_config = FontConfiguration()
|
||||
|
||||
@ -210,8 +258,7 @@ class Document:
|
||||
counter_style = CounterStyle()
|
||||
|
||||
context = cls._build_layout_context(
|
||||
html, stylesheets, presentational_hints, optimize_size,
|
||||
font_config, counter_style, image_cache, forms)
|
||||
html, font_config, counter_style, options)
|
||||
|
||||
root_box = build_formatting_structure(
|
||||
html.etree_element, context.style_for, context.get_image_from_uri,
|
||||
@ -222,12 +269,11 @@ class Document:
|
||||
rendering = cls(
|
||||
[Page(page_box) for page_box in page_boxes],
|
||||
DocumentMetadata(**get_html_metadata(html)),
|
||||
html.url_fetcher, font_config, optimize_size)
|
||||
html.url_fetcher, font_config)
|
||||
rendering._html = html
|
||||
return rendering
|
||||
|
||||
def __init__(self, pages, metadata, url_fetcher, font_config,
|
||||
optimize_size):
|
||||
def __init__(self, pages, metadata, url_fetcher, font_config):
|
||||
#: A list of :class:`Page` objects.
|
||||
self.pages = pages
|
||||
#: A :class:`DocumentMetadata` object.
|
||||
@ -246,9 +292,6 @@ class Document:
|
||||
# rendering is destroyed. This is needed as font_config.__del__ removes
|
||||
# fonts that may be used when rendering
|
||||
self.font_config = font_config
|
||||
# Set of flags for PDF size optimization. Can contain "images" and
|
||||
# "fonts".
|
||||
self._optimize_size = optimize_size
|
||||
|
||||
def build_element_structure(self, structure, etree_element=None):
|
||||
if etree_element is None:
|
||||
@ -288,8 +331,7 @@ class Document:
|
||||
elif not isinstance(pages, list):
|
||||
pages = list(pages)
|
||||
return type(self)(
|
||||
pages, self.metadata, self.url_fetcher, self.font_config,
|
||||
self._optimize_size)
|
||||
pages, self.metadata, self.url_fetcher, self.font_config)
|
||||
|
||||
def make_bookmark_tree(self, scale=1, transform_pages=False):
|
||||
"""Make a tree of all bookmarks in the document.
|
||||
@ -324,9 +366,7 @@ class Document:
|
||||
page_number, matrix)
|
||||
return root
|
||||
|
||||
def write_pdf(self, target=None, zoom=1, attachments=None, finisher=None,
|
||||
identifier=None, variant=None, version=None,
|
||||
custom_metadata=False):
|
||||
def write_pdf(self, target=None, zoom=1, finisher=None, **options):
|
||||
"""Paint the pages in a PDF file, with metadata.
|
||||
|
||||
:type target:
|
||||
@ -339,40 +379,38 @@ class Document:
|
||||
All CSS units are affected, including physical units like
|
||||
``cm`` and named sizes like ``A4``. For values other than
|
||||
1, the physical CSS units will thus be "wrong".
|
||||
:param list attachments: A list of additional file attachments for the
|
||||
generated PDF document or :obj:`None`. The list's elements are
|
||||
:class:`weasyprint.Attachment` objects, filenames, URLs or
|
||||
file-like objects.
|
||||
:param finisher: A finisher function, that accepts the document and a
|
||||
:class:`pydyf.PDF` object as parameters, can be passed to perform
|
||||
:type finisher: :term:`callable`
|
||||
:param finisher:
|
||||
A finisher function or callable that accepts the document and a
|
||||
:class:`pydyf.PDF` object as parameters. Can be passed to perform
|
||||
post-processing on the PDF right before the trailer is written.
|
||||
:param bytes identifier: A bytestring used as PDF file identifier.
|
||||
:param str variant: A PDF variant name.
|
||||
:param str version: A PDF version number.
|
||||
:param bool custom_metadata: A boolean defining whether custom HTML
|
||||
metadata should be stored in the generated PDF.
|
||||
:param options:
|
||||
The ``options`` parameter includes by default the
|
||||
:data:`weasyprint.DEFAULT_OPTIONS` values.
|
||||
:returns:
|
||||
The PDF as :obj:`bytes` if ``target`` is not provided or
|
||||
:obj:`None`, otherwise :obj:`None` (the PDF is written to
|
||||
``target``).
|
||||
|
||||
"""
|
||||
pdf = generate_pdf(
|
||||
self, target, zoom, attachments, self._optimize_size, identifier,
|
||||
variant, version, custom_metadata)
|
||||
new_options = DEFAULT_OPTIONS.copy()
|
||||
new_options.update(options)
|
||||
options = new_options
|
||||
pdf = generate_pdf(self, target, zoom, **options)
|
||||
|
||||
identifier = options['pdf_identifier']
|
||||
compress = not options['uncompressed_pdf']
|
||||
|
||||
if finisher:
|
||||
finisher(self, pdf)
|
||||
|
||||
output = io.BytesIO()
|
||||
pdf.write(output, version=pdf.version, identifier=identifier)
|
||||
|
||||
if target is None:
|
||||
output = io.BytesIO()
|
||||
pdf.write(output, pdf.version, identifier, compress)
|
||||
return output.getvalue()
|
||||
|
||||
if hasattr(target, 'write'):
|
||||
pdf.write(target, pdf.version, identifier, compress)
|
||||
else:
|
||||
output.seek(0)
|
||||
if hasattr(target, 'write'):
|
||||
shutil.copyfileobj(output, target)
|
||||
else:
|
||||
with open(target, 'wb') as fd:
|
||||
shutil.copyfileobj(output, fd)
|
||||
with open(target, 'wb') as fd:
|
||||
pdf.write(fd, pdf.version, identifier, compress)
|
||||
|
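The new ``write_pdf`` body in the hunk above copies the defaults and lets the caller's keyword options override them. A minimal sketch of that merge pattern follows. The ``DEFAULT_OPTIONS`` dict here is a hypothetical stand-in that lists only the two keys read in this hunk (the real mapping has more entries), and ``merge_options`` is an illustrative helper name, not part of WeasyPrint.

```python
# Hypothetical defaults: only the keys read in the hunk above are listed.
DEFAULT_OPTIONS = {
    'pdf_identifier': None,
    'uncompressed_pdf': False,
}


def merge_options(**options):
    """Return a new dict of defaults overridden by the caller's options."""
    merged = DEFAULT_OPTIONS.copy()  # never mutate the shared defaults
    merged.update(options)
    return merged


options = merge_options(uncompressed_pdf=True)
# Same derivation as in the hunk: compression is on unless disabled.
compress = not options['uncompressed_pdf']
```

Copying before updating keeps the module-level defaults untouched, so one call's options cannot leak into the next call.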
@ -1052,6 +1052,10 @@ def draw_emojis(stream, font_size, x, y, emojis):
|
||||
def draw_first_line(stream, textbox, text_overflow, block_ellipsis, x, y,
|
||||
angle=0):
|
||||
"""Draw the given ``textbox`` line to the document ``stream``."""
|
||||
# Don’t draw lines with only invisible characters
|
||||
if not textbox.text.strip():
|
||||
return []
|
||||
|
||||
font_size = textbox.style['font_size']
|
||||
if font_size < 1e-6: # Default float precision used by pydyf
|
||||
return []
|
||||
@ -1198,8 +1202,7 @@ def draw_first_line(stream, textbox, text_overflow, block_ellipsis, x, y,
|
||||
png_data = ffi.unpack(hb_data, int(stream.length[0]))
|
||||
pillow_image = Image.open(BytesIO(png_data))
|
||||
image_id = f'{font.hash}{glyph}'
|
||||
image = RasterImage(
|
||||
pillow_image, image_id, optimize_size=())
|
||||
image = RasterImage(pillow_image, image_id, png_data)
|
||||
d = font.widths[glyph] / 1000
|
||||
a = pillow_image.width / pillow_image.height * d
|
||||
pango.pango_font_get_glyph_extents(
|
||||
@ -1210,7 +1213,7 @@ def draw_first_line(stream, textbox, text_overflow, block_ellipsis, x, y,
|
||||
f = f / font_size - font_size
|
||||
emojis.append([image, font, a, d, x_advance, f])
|
||||
|
||||
x_advance += (font.widths[glyph] + offset) / 1000
|
||||
x_advance += (font.widths[glyph] + offset - kerning) / 1000
|
||||
|
||||
# Close the last glyphs list, remove if empty
|
||||
if string[-1] == '<':
|
||||
|
@ -45,9 +45,6 @@ HTML_SPACE_SEPARATED_TOKENS_RE = re.compile(f'[^{HTML_WHITESPACE}]+')
|
||||
def ascii_lower(string):
|
||||
r"""Transform (only) ASCII letters to lower case: A-Z is mapped to a-z.
|
||||
|
||||
:param string: An Unicode string.
|
||||
:returns: A new Unicode string.
|
||||
|
||||
This is used for `ASCII case-insensitive
|
||||
<https://whatwg.org/C#ascii-case-insensitive>`_ matching.
|
||||
|
||||
@ -66,15 +63,9 @@ def ascii_lower(string):
|
||||
|
||||
|
||||
def element_has_link_type(element, link_type):
|
||||
"""
|
||||
Return whether the given element has a ``rel`` attribute with the
|
||||
given link type.
|
||||
|
||||
:param link_type: Must be a lower-case string.
|
||||
|
||||
"""
|
||||
return any(ascii_lower(token) == link_type for token in
|
||||
HTML_SPACE_SEPARATED_TOKENS_RE.findall(element.get('rel', '')))
|
||||
"""Return whether element has a ``rel`` attribute with given link type."""
|
||||
tokens = HTML_SPACE_SEPARATED_TOKENS_RE.findall(element.get('rel', ''))
|
||||
return any(ascii_lower(token) == link_type for token in tokens)
|
||||
|
||||
|
||||
# Maps HTML tag names to function taking an HTML element and returning a Box.
|
||||
|
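The rewritten ``element_has_link_type`` above splits the ``rel`` attribute into whitespace-separated tokens and compares each one ASCII case-insensitively. The sketch below shows the same idea on a plain string instead of an element. The value of ``HTML_WHITESPACE``, the body of ``ascii_lower`` and the helper name ``has_link_type`` are assumptions made for illustration; they are not shown in this diff.

```python
import re

# Assumption: HTML "ASCII whitespace" is space, tab, LF, FF and CR.
HTML_WHITESPACE = ' \t\n\f\r'
HTML_SPACE_SEPARATED_TOKENS_RE = re.compile(f'[^{HTML_WHITESPACE}]+')


def ascii_lower(string):
    """Lower-case A-Z only, leaving non-ASCII characters untouched."""
    # bytes.lower() only affects ASCII letters, unlike str.lower().
    return string.encode().lower().decode()


def has_link_type(rel, link_type):
    """Return whether ``rel`` contains ``link_type`` (a lower-case string)."""
    tokens = HTML_SPACE_SEPARATED_TOKENS_RE.findall(rel)
    return any(ascii_lower(token) == link_type for token in tokens)


is_stylesheet = has_link_type('Alternate\tStyleSheet', 'stylesheet')
```

Using ``str.lower()`` instead would also fold characters such as the Kelvin sign (U+212A) to ``k``, which ASCII case-insensitive matching must not do.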
@ -1,14 +1,21 @@
|
||||
"""Fetch and decode images in various formats."""
|
||||
|
||||
import io
|
||||
import math
|
||||
import struct
|
||||
from hashlib import md5
|
||||
from io import BytesIO
|
||||
from itertools import cycle
|
||||
from math import inf
|
||||
from pathlib import Path
|
||||
from urllib.parse import urlparse
|
||||
from urllib.request import url2pathname
|
||||
from xml.etree import ElementTree
|
||||
|
||||
import pydyf
|
||||
from PIL import Image, ImageFile, ImageOps
|
||||
|
||||
from . import DEFAULT_OPTIONS
|
||||
from .layout.percent import percentage
|
||||
from .logger import LOGGER
|
||||
from .svg import SVG
|
||||
@ -33,32 +40,211 @@ class ImageLoadingError(ValueError):
|
||||
|
||||
|
||||
class RasterImage:
|
||||
def __init__(self, pillow_image, image_id, optimize_size):
|
||||
pillow_image.id = image_id
|
||||
self._pillow_image = pillow_image
|
||||
self._optimize_size = optimize_size
|
||||
self._intrinsic_width = pillow_image.width
|
||||
self._intrinsic_height = pillow_image.height
|
||||
self._intrinsic_ratio = (
|
||||
self._intrinsic_width / self._intrinsic_height
|
||||
if self._intrinsic_height != 0 else inf)
|
||||
def __init__(self, pillow_image, image_id, image_data, filename=None,
|
||||
cache=None, orientation='none', options=DEFAULT_OPTIONS):
|
||||
# Transpose image
|
||||
original_pillow_image = pillow_image
|
||||
pillow_image = rotate_pillow_image(pillow_image, orientation)
|
||||
if original_pillow_image is not pillow_image:
|
||||
# Keep image format as it is discarded by transposition
|
||||
pillow_image.format = original_pillow_image.format
|
||||
# Discard original data, as the image has been transformed
|
||||
image_data = filename = None
|
||||
|
||||
def get_intrinsic_size(self, image_resolution, font_size):
|
||||
return (
|
||||
self._intrinsic_width / image_resolution,
|
||||
self._intrinsic_height / image_resolution,
|
||||
self._intrinsic_ratio)
|
||||
self.id = image_id
|
||||
self._cache = {} if cache is None else cache
|
||||
self._jpeg_quality = jpeg_quality = options['jpeg_quality']
|
||||
self._dpi = options['dpi']
|
||||
|
||||
if 'transparency' in pillow_image.info:
|
||||
pillow_image = pillow_image.convert('RGBA')
|
||||
elif pillow_image.mode in ('1', 'P', 'I'):
|
||||
pillow_image = pillow_image.convert('RGB')
|
||||
|
||||
self.mode = pillow_image.mode
|
||||
self.width = pillow_image.width
|
||||
self.height = pillow_image.height
|
||||
self.ratio = (self.width / self.height) if self.height != 0 else inf
|
||||
self.optimize = optimize = options['optimize_images']
|
||||
|
||||
if pillow_image.format in ('JPEG', 'MPO'):
|
||||
self.format = 'JPEG'
|
||||
if image_data is None or optimize or jpeg_quality is not None:
|
||||
image_file = io.BytesIO()
|
||||
options = {'format': 'JPEG', 'optimize': optimize}
|
||||
if self._jpeg_quality is not None:
|
||||
options['quality'] = self._jpeg_quality
|
||||
pillow_image.save(image_file, **options)
|
||||
image_data = image_file.getvalue()
|
||||
filename = None
|
||||
else:
|
||||
self.format = 'PNG'
|
||||
if image_data is None or optimize or pillow_image.format != 'PNG':
|
||||
image_file = io.BytesIO()
|
||||
pillow_image.save(image_file, format='PNG', optimize=optimize)
|
||||
image_data = image_file.getvalue()
|
||||
filename = None
|
||||
self.image_data = self.cache_image_data(image_data, filename)
|
||||
|
||||
def get_intrinsic_size(self, resolution, font_size):
|
||||
return self.width / resolution, self.height / resolution, self.ratio
|
||||
|
||||
def draw(self, stream, concrete_width, concrete_height, image_rendering):
|
||||
if self._intrinsic_width <= 0 or self._intrinsic_height <= 0:
|
||||
if self.width <= 0 or self.height <= 0:
|
||||
return
|
||||
|
||||
image_name = stream.add_image(
|
||||
self._pillow_image, image_rendering, self._optimize_size)
|
||||
width, height = self.width, self.height
|
||||
if self._dpi:
|
||||
pt_to_in = 4 / 3 / 96
|
||||
width_inches = abs(concrete_width * stream.ctm[0][0] * pt_to_in)
|
||||
height_inches = abs(concrete_height * stream.ctm[1][1] * pt_to_in)
|
||||
dpi = max(self.width / width_inches, self.height / height_inches)
|
||||
if dpi > self._dpi:
|
||||
ratio = self._dpi / dpi
|
||||
image = Image.open(io.BytesIO(self.image_data.data))
|
||||
width = int(round(self.width * ratio))
|
||||
height = int(round(self.height * ratio))
|
||||
image.thumbnail((max(1, width), max(1, height)))
|
||||
image_file = io.BytesIO()
|
||||
image.save(
|
||||
image_file, format=image.format, optimize=self.optimize)
|
||||
width, height = image.width, image.height
|
||||
self.image_data = self.cache_image_data(image_file.getvalue())
|
||||
else:
|
||||
dpi = None
|
||||
|
||||
interpolate = 'true' if image_rendering == 'auto' else 'false'
|
||||
|
||||
image_name = stream.add_image(self, width, height, interpolate)
|
||||
stream.transform(
|
||||
concrete_width, 0, 0, -concrete_height, 0, concrete_height)
|
||||
stream.draw_x_object(image_name)
|
||||
|
||||
def cache_image_data(self, data, filename=None, alpha=False):
|
||||
if filename:
|
||||
return LazyLocalImage(filename)
|
||||
else:
|
||||
key = f'{self.id}{int(alpha)}{self._dpi or ""}'
|
||||
return LazyImage(self._cache, key, data)
|
||||
|
||||
def get_xobject(self, width, height, interpolate):
|
||||
if self.mode in ('RGB', 'RGBA'):
|
||||
color_space = '/DeviceRGB'
|
||||
elif self.mode in ('L', 'LA'):
|
||||
color_space = '/DeviceGray'
|
||||
elif self.mode == 'CMYK':
|
||||
color_space = '/DeviceCMYK'
|
||||
else:
|
||||
LOGGER.warning('Unknown image mode: %s', self.mode)
|
||||
color_space = '/DeviceRGB'
|
||||
|
||||
extra = pydyf.Dictionary({
|
||||
'Type': '/XObject',
|
||||
'Subtype': '/Image',
|
||||
'Width': width,
|
||||
'Height': height,
|
||||
'ColorSpace': color_space,
|
||||
'BitsPerComponent': 8,
|
||||
'Interpolate': interpolate,
|
||||
})
|
||||
|
||||
if self.format == 'JPEG':
|
||||
extra['Filter'] = '/DCTDecode'
|
||||
return pydyf.Stream([self.image_data], extra)
|
||||
|
||||
extra['Filter'] = '/FlateDecode'
|
||||
extra['DecodeParms'] = pydyf.Dictionary({
|
||||
# Predictor 15 specifies that we're providing PNG data,
|
||||
# ostensibly using an "optimum predictor", but doesn't actually
|
||||
# matter as long as the predictor value is 10+ according to the
|
||||
# spec. (Other PNG predictor values assert that we're using
|
||||
# specific predictors that we don't want to commit to, but
|
||||
# "optimum" can vary.)
|
||||
'Predictor': 15,
|
||||
'Columns': width,
|
||||
})
|
||||
if self.mode in ('RGB', 'RGBA'):
|
||||
# Defaults to 1.
|
||||
extra['DecodeParms']['Colors'] = 3
|
||||
if self.mode in ('RGBA', 'LA'):
|
||||
# Remove alpha channel from image
|
||||
pillow_image = Image.open(io.BytesIO(self.image_data.data))
|
||||
alpha = pillow_image.getchannel('A')
|
||||
pillow_image = pillow_image.convert(self.mode[:-1])
|
||||
png_data = self._get_png_data(pillow_image)
|
||||
# Save alpha channel as mask
|
||||
alpha_data = self._get_png_data(alpha)
|
||||
stream = self.cache_image_data(alpha_data, alpha=True)
|
||||
extra['SMask'] = pydyf.Stream([stream], extra={
|
||||
'Filter': '/FlateDecode',
|
||||
'Type': '/XObject',
|
||||
'Subtype': '/Image',
|
||||
'DecodeParms': pydyf.Dictionary({
|
||||
'Predictor': 15,
|
||||
'Columns': width,
|
||||
}),
|
||||
'Width': width,
|
||||
'Height': height,
|
||||
'ColorSpace': '/DeviceGray',
|
||||
'BitsPerComponent': 8,
|
||||
'Interpolate': interpolate,
|
||||
})
|
||||
else:
|
||||
png_data = self._get_png_data(
|
||||
Image.open(io.BytesIO(self.image_data.data)))
|
||||
|
||||
return pydyf.Stream([self.cache_image_data(png_data)], extra)
|
||||
|
||||
@staticmethod
|
||||
def _get_png_data(pillow_image):
|
||||
image_file = BytesIO()
|
||||
pillow_image.save(image_file, format='PNG')
|
||||
|
||||
# Read the PNG header, then discard it because we know it's a PNG. If
|
||||
# this weren't just output from Pillow, we should actually check it.
|
||||
image_file.seek(8)
|
||||
|
||||
png_data = []
|
||||
raw_chunk_length = image_file.read(4)
|
||||
# PNG files consist of a series of chunks.
|
||||
while raw_chunk_length:
|
||||
# Each chunk begins with its data length (four bytes, may be zero),
|
||||
# then its type (four ASCII characters), then the data, then four
|
||||
# bytes of a CRC.
|
||||
chunk_length, = struct.unpack('!I', raw_chunk_length)
|
||||
chunk_type = image_file.read(4)
|
||||
if chunk_type == b'IDAT':
|
||||
png_data.append(image_file.read(chunk_length))
|
||||
else:
|
||||
image_file.seek(chunk_length, io.SEEK_CUR)
|
||||
# We aren't checking the CRC, we assume this is a valid PNG.
|
||||
image_file.seek(4, io.SEEK_CUR)
|
||||
raw_chunk_length = image_file.read(4)
|
||||
|
||||
return b''.join(png_data)
|
||||
|
||||
|
||||
class LazyImage(pydyf.Object):
|
||||
def __init__(self, cache, key, data):
|
||||
super().__init__()
|
||||
self._key = key
|
||||
self._cache = cache
|
||||
cache[key] = data
|
||||
|
||||
@property
|
||||
def data(self):
|
||||
return self._cache[self._key]
|
||||
|
||||
|
||||
class LazyLocalImage(pydyf.Object):
|
||||
def __init__(self, filename):
|
||||
super().__init__()
|
||||
self._filename = filename
|
||||
|
||||
@property
|
||||
def data(self):
|
||||
return Path(self._filename).read_bytes()
|
||||
|
||||
|
||||
class SVGImage:
|
||||
def __init__(self, tree, base_url, url_fetcher, context):
|
||||
@ -91,75 +277,88 @@ class SVGImage:
|
||||
self._url_fetcher, self._context)
|
||||
|
||||
|
||||
def get_image_from_uri(cache, url_fetcher, optimize_size, url,
|
||||
forced_mime_type=None, context=None,
|
||||
orientation='from-image'):
|
||||
def get_image_from_uri(cache, url_fetcher, options, url, forced_mime_type=None,
|
||||
context=None, orientation='from-image'):
|
||||
"""Get an Image instance from an image URI."""
|
||||
if url in cache:
|
||||
return cache[url]
|
||||
|
||||
try:
|
||||
with fetch(url_fetcher, url) as result:
|
||||
parsed_url = urlparse(result.get('redirected_url'))
|
||||
if parsed_url.scheme == 'file':
|
||||
filename = url2pathname(parsed_url.path)
|
||||
else:
|
||||
filename = None
|
||||
if 'string' in result:
|
||||
string = result['string']
|
||||
else:
|
||||
string = result['file_obj'].read()
|
||||
mime_type = forced_mime_type or result['mime_type']
|
||||
|
||||
image = None
|
||||
svg_exceptions = []
|
||||
# Try to rely on given mimetype for SVG
|
||||
if mime_type == 'image/svg+xml':
|
||||
image = None
|
||||
svg_exceptions = []
|
||||
# Try to rely on given mimetype for SVG
|
||||
if mime_type == 'image/svg+xml':
|
||||
try:
|
||||
tree = ElementTree.fromstring(string)
|
||||
image = SVGImage(tree, url, url_fetcher, context)
|
||||
except Exception as svg_exception:
|
||||
svg_exceptions.append(svg_exception)
|
||||
# Try pillow for raster images, or for failing SVG
|
||||
if image is None:
|
||||
try:
|
||||
pillow_image = Image.open(BytesIO(string))
|
||||
except Exception as raster_exception:
|
||||
if mime_type == 'image/svg+xml':
|
||||
# Tried SVGImage then Pillow for a SVG, abort
|
||||
raise ImageLoadingError.from_exception(svg_exceptions[0])
|
||||
try:
|
||||
# Last chance, try SVG
|
||||
tree = ElementTree.fromstring(string)
|
||||
image = SVGImage(tree, url, url_fetcher, context)
|
||||
except Exception as svg_exception:
|
||||
svg_exceptions.append(svg_exception)
|
||||
# Try pillow for raster images, or for failing SVG
|
||||
if image is None:
|
||||
try:
|
||||
pillow_image = Image.open(BytesIO(string))
|
||||
except Exception as raster_exception:
|
||||
if mime_type == 'image/svg+xml':
|
||||
# Tried SVGImage then Pillow for a SVG, abort
|
||||
raise ImageLoadingError.from_exception(
|
||||
svg_exceptions[0])
|
||||
try:
|
||||
# Last chance, try SVG
|
||||
tree = ElementTree.fromstring(string)
|
||||
image = SVGImage(tree, url, url_fetcher, context)
|
||||
except Exception:
|
||||
# Tried Pillow then SVGImage for a raster, abort
|
||||
raise ImageLoadingError.from_exception(
|
||||
raster_exception)
|
||||
else:
|
||||
# Store image id to enable cache in Stream.add_image
|
||||
image_id = md5(url.encode()).hexdigest()
|
||||
# Keep image format as it is discarded by transposition
|
||||
image_format = pillow_image.format
|
||||
if orientation == 'from-image':
|
||||
if 'exif' in pillow_image.info:
|
||||
pillow_image = ImageOps.exif_transpose(
|
||||
pillow_image)
|
||||
elif orientation != 'none':
|
||||
angle, flip = orientation
|
||||
if angle > 0:
|
||||
rotation = getattr(
|
||||
Image.Transpose, f'ROTATE_{angle}')
|
||||
pillow_image = pillow_image.transpose(rotation)
|
||||
if flip:
|
||||
pillow_image = pillow_image.transpose(
|
||||
Image.Transpose.FLIP_LEFT_RIGHT)
|
||||
pillow_image.format = image_format
|
||||
image = RasterImage(pillow_image, image_id, optimize_size)
|
||||
except Exception:
|
||||
# Tried Pillow then SVGImage for a raster, abort
|
||||
raise ImageLoadingError.from_exception(raster_exception)
|
||||
else:
|
||||
# Store image id to enable cache in Stream.add_image
|
||||
image_id = md5(url.encode()).hexdigest()
|
||||
image = RasterImage(
|
||||
pillow_image, image_id, string, filename, cache,
|
||||
orientation, options)
|
||||
|
||||
except (URLFetchingError, ImageLoadingError) as exception:
|
||||
LOGGER.error('Failed to load image at %r: %s', url, exception)
|
||||
image = None
|
||||
|
||||
cache[url] = image
|
||||
return image
|
||||
|
||||
|
||||
def rotate_pillow_image(pillow_image, orientation):
|
||||
"""Return a copy of a Pillow image with modified orientation.
|
||||
|
||||
If orientation is not changed, return the same image.
|
||||
|
||||
"""
|
||||
image_format = pillow_image.format
|
||||
if orientation == 'from-image':
|
||||
if 'exif' in pillow_image.info:
|
||||
pillow_image = ImageOps.exif_transpose(pillow_image)
|
||||
elif orientation != 'none':
|
||||
angle, flip = orientation
|
||||
if angle > 0:
|
||||
rotation = getattr(Image.Transpose, f'ROTATE_{angle}')
|
||||
pillow_image = pillow_image.transpose(rotation)
|
||||
if flip:
|
||||
pillow_image = pillow_image.transpose(
|
||||
Image.Transpose.FLIP_LEFT_RIGHT)
|
||||
|
||||
# Keep image format as it is discarded by transposition
|
||||
pillow_image.format = image_format
|
||||
return pillow_image
|
||||
|
||||
|
||||
def process_color_stops(vector_length, positions):
|
||||
"""Give color stops positions on the gradient vector.
|
||||
|
||||
|
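``RasterImage._get_png_data`` in the diff above walks the PNG chunk list and keeps only the ``IDAT`` payloads, which a PDF ``/FlateDecode`` stream with a PNG predictor can carry. The standalone sketch below reproduces that loop with the standard library only. The hand-built 1x1 grayscale PNG and the helper names ``png_chunk`` and ``extract_idat`` are illustrative assumptions, not code from the commit.

```python
import io
import struct
import zlib


def png_chunk(chunk_type, data):
    """Serialize one PNG chunk: length, type, data, CRC."""
    crc = zlib.crc32(chunk_type + data)
    return (
        struct.pack('!I', len(data)) + chunk_type + data +
        struct.pack('!I', crc))


def extract_idat(png_bytes):
    """Concatenate the payload of every IDAT chunk, skipping all others."""
    image_file = io.BytesIO(png_bytes)
    image_file.seek(8)  # skip the 8-byte PNG signature
    idat = []
    raw_length = image_file.read(4)
    while raw_length:
        chunk_length, = struct.unpack('!I', raw_length)
        chunk_type = image_file.read(4)
        if chunk_type == b'IDAT':
            idat.append(image_file.read(chunk_length))
        else:
            image_file.seek(chunk_length, io.SEEK_CUR)
        image_file.seek(4, io.SEEK_CUR)  # the CRC is not verified
        raw_length = image_file.read(4)
    return b''.join(idat)


# A 1x1 8-bit grayscale PNG whose pixel data is split over two IDAT chunks.
scanline = b'\x00\x7f'  # filter type 0, then one gray sample
compressed = zlib.compress(scanline)
png = (
    b'\x89PNG\r\n\x1a\n' +
    png_chunk(b'IHDR', struct.pack('!IIBBBBB', 1, 1, 8, 0, 0, 0, 0)) +
    png_chunk(b'IDAT', compressed[:3]) +
    png_chunk(b'IDAT', compressed[3:]) +
    png_chunk(b'IEND', b''))

idat = extract_idat(png)
```

Joining the chunks matters because an encoder may split one zlib stream over several ``IDAT`` chunks; only the concatenation is a valid deflate stream.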
@ -105,10 +105,12 @@ def layout_document(html, root_box, context, max_loops=8):
|
||||
This includes line breaks, page breaks, absolute size and position for all
|
||||
boxes. Page based counters might require multiple passes.
|
||||
|
||||
:param root_box: root of the box tree (formatting structure of the html)
|
||||
the pages' boxes are created from that tree, i.e. this
|
||||
structure is not lost during pagination
|
||||
:returns: a list of laid out Page objects.
|
||||
:param root_box:
|
||||
Root of the box tree (formatting structure of the HTML). The page boxes
|
||||
are created from that tree; this structure is not lost during
|
||||
pagination.
|
||||
:returns:
|
||||
A list of laid out Page objects.
|
||||
|
||||
"""
|
||||
initialize_page_maker(context, root_box)
|
||||
@ -287,13 +289,18 @@ class LayoutContext:
|
||||
Value depends on current page.
|
||||
https://drafts.csswg.org/css-gcpm/#funcdef-string
|
||||
|
||||
:param store: dictionary where the resolved value is stored.
|
||||
:param page: current page.
|
||||
:param name: name of the named string or running element.
|
||||
:param keyword: indicates which value of the named string or running
|
||||
element to use. Default is the first assignment on the
|
||||
current page else the most recent assignment.
|
||||
:returns: text for string set, box for running element
|
||||
:param dict store:
|
||||
Dictionary where the resolved value is stored.
|
||||
:param page:
|
||||
Current page.
|
||||
:param str name:
|
||||
Name of the named string or running element.
|
||||
:param str keyword:
|
||||
Indicates which value of the named string or running element to
|
||||
use. Default is the first assignment on the current page else the
|
||||
most recent assignment.
|
||||
:returns:
|
||||
Text for string set, box for running element.
|
||||
|
||||
"""
|
||||
if self.current_page in store[name]:
|
||||
|
@ -174,20 +174,21 @@ def compute_fixed_dimension(context, box, outer, vertical, top_or_left):
|
||||
|
||||
|
||||
def compute_variable_dimension(context, side_boxes, vertical, outer_sum):
|
||||
"""
|
||||
Compute and set a margin box fixed dimension on ``box``, as described in:
|
||||
https://drafts.csswg.org/css-page-3/#margin-dimension
|
||||
"""Compute and set a margin box fixed dimension on ``box``
|
||||
|
||||
:param side_boxes: Three boxes on a same side (as opposed to a corner.)
|
||||
Described in: https://drafts.csswg.org/css-page-3/#margin-dimension
|
||||
|
||||
:param side_boxes:
|
||||
Three boxes on the same side (as opposed to a corner).
|
||||
A list of:
|
||||
- A @*-left or @*-top margin box
|
||||
- A @*-center or @*-middle margin box
|
||||
- A @*-right or @*-bottom margin box
|
||||
:param vertical:
|
||||
True to set height, margin-top and margin-bottom; False for width,
|
||||
margin-left and margin-right
|
||||
``True`` to set height, margin-top and margin-bottom;
|
||||
``False`` for width, margin-left and margin-right.
|
||||
:param outer_sum:
|
||||
The target total outer dimension (max box width or height)
|
||||
The target total outer dimension (max box width or height).
|
||||
|
||||
"""
|
||||
box_class = VerticalBox if vertical else HorizontalBox
|
||||
@ -310,8 +311,10 @@ def make_margin_boxes(context, page, state):
|
||||
|
||||
Return ``None`` if this margin box should not be generated.
|
||||
|
||||
:param at_keyword: which margin box to return, eg. '@top-left'
|
||||
:param containing_block: as expected by :func:`resolve_percentages`.
|
||||
:param at_keyword:
|
||||
Which margin box to return, e.g. '@top-left'.
|
||||
:param containing_block:
|
||||
As expected by :func:`resolve_percentages`.
|
||||
|
||||
"""
|
||||
style = context.style_for(page.page_type, at_keyword)
|
||||
@ -507,9 +510,11 @@ def make_page(context, root_box, page_type, resume_at, page_number,
|
||||
and ``resume_at`` indicates where in the document to start the next page,
|
||||
or is ``None`` if this was the last page.
|
||||
|
||||
:param page_number: integer, start at 1 for the first page
|
||||
:param resume_at: as returned by ``make_page()`` for the previous page,
|
||||
or ``None`` for the first page.
|
||||
:param int page_number:
|
||||
Page number, starts at 1 for the first page.
|
||||
:param resume_at:
|
||||
As returned by ``make_page()`` for the previous page, or ``None`` for
|
||||
the first page.
|
||||
|
||||
"""
|
||||
style = context.style_for(page_type)
|
||||
|
@ -744,9 +744,12 @@ def trailing_whitespace_size(context, box):
|
||||
stripped_box = box.copy_with_text(stripped_text)
|
||||
stripped_box, resume, _ = split_text_box(
|
||||
context, stripped_box, None, old_resume)
|
||||
assert stripped_box is not None
|
||||
assert resume is None
|
||||
return old_box.width - stripped_box.width
|
||||
if stripped_box is None:
|
||||
# old_box split just before the trailing spaces
|
||||
return old_box.width
|
||||
else:
|
||||
assert resume is None
|
||||
return old_box.width - stripped_box.width
|
||||
else:
|
||||
_, _, _, width, _, _ = split_first_line(
|
||||
box.text, box.style, context, None, box.justification_spacing)
|
||||
|
@ -45,7 +45,7 @@ def _w3c_date_to_pdf(string, attr_name):
|
||||
pdf_date += f"{tz_hour:+03d}'{tz_minute:02d}"
|
||||
else:
|
||||
pdf_date += 'Z'
|
||||
return pdf_date
|
||||
return f'D:{pdf_date}'
|
||||
|
||||
|
||||
def _reference_resources(pdf, resources, images, fonts):
|
||||
@ -100,8 +100,7 @@ def _use_references(pdf, resources, images):
|
||||
alpha['SMask']['G'] = alpha['SMask']['G'].reference
|
||||
|
||||
|
||||
def generate_pdf(document, target, zoom, attachments, optimize_size,
|
||||
identifier, variant, version, custom_metadata):
|
||||
def generate_pdf(document, target, zoom, **options):
|
||||
# 0.75 = 72 PDF point per inch / 96 CSS pixel per inch
|
||||
scale = zoom * 0.75
|
||||
|
||||
@ -109,6 +108,7 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,
|
||||
|
||||
# Set properties according to PDF variants
|
||||
mark = False
|
||||
variant, version = options['pdf_variant'], options['pdf_version']
|
||||
if variant:
|
||||
variant_function, properties = VARIANTS[variant]
|
||||
if 'version' in properties:
|
||||
@ -116,6 +116,7 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,
|
||||
if 'mark' in properties:
|
||||
mark = properties['mark']
|
||||
|
||||
identifier = options['pdf_identifier']
|
||||
pdf = pydyf.PDF((version or '1.7'), identifier)
|
||||
states = pydyf.Dictionary()
|
||||
x_objects = pydyf.Dictionary()
|
||||
@ -136,6 +137,7 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,
|
||||
|
||||
annot_files = {}
|
||||
pdf_pages, page_streams = [], []
|
||||
compress = not options['uncompressed_pdf']
|
||||
for page_number, (page, links_and_anchors) in enumerate(
|
||||
zip(document.pages, page_links_and_anchors)):
|
||||
# Draw from the top-left corner
|
||||
@ -155,7 +157,7 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,
|
||||
(right - left) / scale, (bottom - top) / scale)
|
||||
stream = Stream(
|
||||
document.fonts, page_rectangle, states, x_objects, patterns,
|
||||
shadings, images, mark)
|
||||
shadings, images, mark, compress=compress)
|
||||
stream.transform(d=-1, f=(page.height * scale))
|
||||
pdf.add_object(stream)
|
||||
page_streams.append(stream)
|
||||
@ -175,10 +177,11 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,
|
||||
|
||||
add_links(links_and_anchors, matrix, pdf, pdf_page, pdf_names, mark)
|
||||
add_annotations(
|
||||
links_and_anchors[0], matrix, document, pdf, pdf_page, annot_files)
|
||||
links_and_anchors[0], matrix, document, pdf, pdf_page, annot_files,
|
||||
compress)
|
||||
add_inputs(
|
||||
page.inputs, matrix, pdf, pdf_page, resources, stream,
|
||||
document.font_config.font_map)
|
||||
document.font_config.font_map, compress)
|
||||
page.paint(stream, scale=scale)
|
||||
|
||||
# Bleed
|
||||
@ -227,7 +230,7 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,
|
||||
_w3c_date_to_pdf(metadata.modified, 'modified'))
|
||||
if metadata.lang:
|
||||
pdf.catalog['Lang'] = pydyf.String(metadata.lang)
|
||||
if custom_metadata:
|
||||
if options['custom_metadata']:
|
||||
for key, value in metadata.custom.items():
|
||||
key = ''.join(char for char in key if char.isalnum())
|
||||
key = key.encode('ascii', errors='ignore').decode()
|
||||
@ -235,7 +238,7 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,
|
||||
pdf.info[key] = pydyf.String(value)
|
||||
|
||||
# Embedded files
|
||||
attachments = metadata.attachments + (attachments or [])
|
||||
attachments = metadata.attachments + (options['attachments'] or [])
|
||||
pdf_attachments = []
|
||||
for attachment in attachments:
|
||||
pdf_attachment = write_pdf_attachment(
|
||||
@ -256,7 +259,10 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,
|
||||
pdf.catalog['Names']['EmbeddedFiles'] = content.reference
|
||||
|
||||
# Embedded fonts
|
||||
pdf_fonts = build_fonts_dictionary(pdf, document.fonts, optimize_size)
|
||||
subset = not options['full_fonts']
|
||||
hinting = options['hinting']
|
||||
pdf_fonts = build_fonts_dictionary(
|
||||
pdf, document.fonts, compress, subset, hinting)
|
||||
pdf.add_object(pdf_fonts)
|
||||
if 'AcroForm' in pdf.catalog:
|
||||
# Include Dingbats for forms
|
||||
@ -284,6 +290,6 @@ def generate_pdf(document, target, zoom, attachments, optimize_size,
|
||||
|
||||
# Apply PDF variants functions
|
||||
if variant:
|
||||
variant_function(pdf, metadata, document, page_streams)
|
||||
variant_function(pdf, metadata, document, page_streams, compress)
|
||||
|
||||
return pdf
|
||||
|
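The ``_w3c_date_to_pdf`` hunk earlier in this file's diff now returns the date with a ``D:`` prefix. The sketch below shows the resulting string shape for a ``datetime`` value. It is a simplified stand-in that assumes an already parsed date; the function name ``to_pdf_date`` is hypothetical, and the real helper parses W3C date strings rather than ``datetime`` objects.

```python
from datetime import datetime, timedelta, timezone


def to_pdf_date(moment):
    """Format a datetime as a PDF date string, 'D:' prefix included."""
    pdf_date = moment.strftime('%Y%m%d%H%M%S')
    # Naive datetimes and zero offsets are both written as UTC ('Z').
    offset = moment.utcoffset()
    if offset:
        minutes = int(offset.total_seconds() // 60)
        sign = '+' if minutes >= 0 else '-'
        tz_hour, tz_minute = divmod(abs(minutes), 60)
        pdf_date += f"{sign}{tz_hour:02d}'{tz_minute:02d}"
    else:
        pdf_date += 'Z'
    return f'D:{pdf_date}'


utc_date = to_pdf_date(datetime(2023, 5, 1, 12, 30, 0, tzinfo=timezone.utc))
```

The offset is written as hours and minutes separated by an apostrophe, matching the format string visible in the hunk.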
@ -92,7 +92,8 @@ def add_outlines(pdf, bookmarks, parent=None):
|
||||
return outlines, count
|
||||
|
||||
|
||||
def add_inputs(inputs, matrix, pdf, page, resources, stream, font_map):
|
||||
def add_inputs(inputs, matrix, pdf, page, resources, stream, font_map,
|
||||
compress):
|
||||
"""Include form inputs in PDF."""
|
||||
if not inputs:
|
||||
return
|
||||
@ -119,7 +120,7 @@ def add_inputs(inputs, matrix, pdf, page, resources, stream, font_map):
|
||||
input_name = pydyf.String(element.attrib.get('name', default_name))
|
||||
# TODO: where does this 0.75 scale come from?
|
||||
font_size = style['font_size'] * 0.75
|
||||
field_stream = pydyf.Stream()
|
||||
field_stream = pydyf.Stream(compress=compress)
|
||||
field_stream.set_color_rgb(*style['color'][:3])
|
||||
if input_type == 'checkbox':
|
||||
# Checkboxes
|
||||
@ -130,7 +131,7 @@ def add_inputs(inputs, matrix, pdf, page, resources, stream, font_map):
|
||||
'Type': '/XObject',
|
||||
'Subtype': '/Form',
|
||||
'BBox': pydyf.Array((0, 0, width, height)),
|
||||
})
|
||||
}, compress=compress)
|
||||
checked_stream.push_state()
|
||||
checked_stream.begin_text()
|
||||
checked_stream.set_color_rgb(*style['color'][:3])
|
||||
@ -138,9 +139,8 @@ def add_inputs(inputs, matrix, pdf, page, resources, stream, font_map):
|
||||
# Center (let’s assume that Dingbat’s check has a 0.8em size)
|
||||
x = (width - font_size * 0.8) / 2
|
||||
y = (height - font_size * 0.8) / 2
|
||||
# TODO: we should have these operators in pydyf
|
||||
checked_stream.stream.append(f'{x} {y} Td')
|
||||
checked_stream.stream.append('(4) Tj')
|
||||
checked_stream.move_text_to(x, y)
|
||||
checked_stream.show_text_string('4')
|
||||
checked_stream.end_text()
|
||||
checked_stream.pop_state()
|
||||
pdf.add_object(checked_stream)
|
||||
@ -195,7 +195,7 @@ def add_inputs(inputs, matrix, pdf, page, resources, stream, font_map):
|
||||
pdf.catalog['AcroForm']['Fields'].append(field.reference)
|
||||
|
||||
|
||||
def add_annotations(links, matrix, document, pdf, page, annot_files):
|
||||
def add_annotations(links, matrix, document, pdf, page, annot_files, compress):
|
||||
"""Include annotations in PDF."""
|
||||
# TODO: splitting a link into multiple independent rectangular
|
||||
# annotations works well for pure links, but rather mediocre for
|
||||
@ -226,8 +226,7 @@ def add_annotations(links, matrix, document, pdf, page, annot_files):
|
||||
'Type': '/XObject',
|
||||
'Subtype': '/Form',
|
||||
'BBox': pydyf.Array(rectangle),
|
||||
'Length': 0,
|
||||
})
|
||||
}, compress)
|
||||
pdf.add_object(stream)
|
||||
annot = pydyf.Dictionary({
|
||||
'Type': '/Annot',
|
||||
@ -286,7 +285,7 @@ def write_pdf_attachment(pdf, attachment, url_fetcher):
|
||||
'ModDate': attachment.modified,
|
||||
})
|
||||
})
|
||||
file_stream = pydyf.Stream([stream], file_extra)
|
||||
file_stream = pydyf.Stream([stream], file_extra, compress)
|
||||
pdf.add_object(file_stream)
|
||||
|
||||
except URLFetchingError as exception:
|
||||
|
@ -7,7 +7,7 @@ import pydyf
|
||||
from ..logger import LOGGER
|
||||
|
||||
|
||||
def build_fonts_dictionary(pdf, fonts, optimize_size):
|
||||
def build_fonts_dictionary(pdf, fonts, compress_pdf, subset, hinting):
|
||||
pdf_fonts = pydyf.Dictionary()
|
||||
fonts_by_file_hash = {}
|
||||
for font in fonts.values():
|
||||
@ -21,10 +21,10 @@ def build_fonts_dictionary(pdf, fonts, optimize_size):
|
||||
|
||||
# Clean font, optimize and handle emojis
|
||||
cmap = {}
|
||||
if 'fonts' in optimize_size and not font.used_in_forms:
|
||||
if subset and not font.used_in_forms:
|
||||
for file_font in file_fonts:
|
||||
cmap = {**cmap, **file_font.cmap}
|
||||
font.clean(cmap)
|
||||
font.clean(cmap, hinting)
|
||||
|
||||
# Include font
|
||||
if font.type == 'otf':
|
||||
@@ -32,28 +32,24 @@ def build_fonts_dictionary(pdf, fonts, optimize_size):
         else:
             font_extra = pydyf.Dictionary({'Length1': len(font.file_content)})
         font_stream = pydyf.Stream(
-            [font.file_content], font_extra, compress=True)
+            [font.file_content], font_extra, compress=compress_pdf)
         pdf.add_object(font_stream)
         font_references_by_file_hash[file_hash] = font_stream.reference

     for font in fonts.values():
-        optimize = (
-            'fonts' in optimize_size and
-            font.ttfont and not font.used_in_forms)
-        if optimize:
+        if subset and font.ttfont and not font.used_in_forms:
             # Only store widths and map for used glyphs
             font_widths = font.widths
             cmap = font.cmap
         else:
             # Store width and Unicode map for all glyphs
             font_widths, cmap = {}, {}
-            ratio = 1024 / font.ttfont['head'].unitsPerEm
             for letter, key in font.ttfont.getBestCmap().items():
                 glyph = font.ttfont.getGlyphID(key)
                 if glyph not in cmap:
                     cmap[glyph] = chr(letter)
                 width = font.ttfont.getGlyphSet()[key].width
-                font_widths[glyph] = width * ratio
+                font_widths[glyph] = width * 1000 / font.upem

         max_x = max(font_widths.values()) if font_widths else 0
         bbox = (0, font.descent, max_x, font.ascent)
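Review note on the width change above: PDF font dictionaries express glyph widths in thousandths of a text-space unit, so an advance read in font design units is scaled by 1000 over the font's units-per-em. A minimal sketch of that arithmetic follows; `pdf_glyph_width` and its parameter names are hypothetical and not part of WeasyPrint.

```python
def pdf_glyph_width(width, units_per_em):
    """Convert an advance width in font design units to PDF glyph units.

    Hypothetical helper for illustration: it mirrors the expression
    ``width * 1000 / font.upem`` from the hunk, nothing more.
    """
    return width * 1000 / units_per_em


# A glyph as wide as half an em in a 2048-units-per-em font.
half_em = pdf_glyph_width(1024, 2048)
# A font already designed on a 1000-unit em keeps its widths unchanged.
unchanged = pdf_glyph_width(600, 1000)
```

With the old `1024 / unitsPerEm` ratio, the same half-em glyph would have been stored as 512 rather than 500.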
@@ -81,7 +77,7 @@ def build_fonts_dictionary(pdf, fonts, optimize_size):
             b'1 begincodespacerange',
             b'<0000> <ffff>',
             b'endcodespacerange',
-            f'{len(cmap)} beginbfchar'.encode()])
+            f'{len(cmap)} beginbfchar'.encode()], compress=compress_pdf)
         for glyph, text in cmap.items():
             unicode_codepoints = ''.join(
                 f'{letter.encode("utf-16-be").hex()}' for letter in text)
@@ -103,7 +99,7 @@ def build_fonts_dictionary(pdf, fonts, optimize_size):

         if font.bitmap:
             _build_bitmap_font_dictionary(
-                font_dictionary, pdf, font, widths, optimize_size)
+                font_dictionary, pdf, font, widths, compress_pdf, subset)
         else:
             font_descriptor = pydyf.Dictionary({
                 'Type': '/FontDescriptor',
@@ -126,7 +122,8 @@ def build_fonts_dictionary(pdf, fonts, optimize_size):
                 for cid in cids:
                     bits[cid] = '1'
                 stream = pydyf.Stream(
-                    (int(''.join(bits), 2).to_bytes(padded_width, 'big'),))
+                    (int(''.join(bits), 2).to_bytes(padded_width, 'big'),),
+                    compress=compress_pdf)
                 pdf.add_object(stream)
                 font_descriptor['CIDSet'] = stream.reference
             if font.type == 'otf':
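Review note on the `CIDSet` stream above: the stream is a table of bits indexed by CID, stored with the high-order bit first, which is what the `int(''.join(bits), 2).to_bytes(..., 'big')` expression produces. The sketch below isolates that packing. `cid_set_bytes` is a hypothetical helper, and the derivation of `padded_width` and `bits` is an assumption, since the hunk does not show where they are defined.

```python
from math import ceil


def cid_set_bytes(cids):
    """Pack a non-empty collection of CIDs into a big-endian bit table.

    Assumption: the table is padded to a whole number of bytes that is
    just large enough to hold the highest CID.
    """
    cids = sorted(cids)
    padded_width = ceil((cids[-1] + 1) / 8)  # number of bytes
    bits = ['0'] * padded_width * 8
    for cid in cids:
        bits[cid] = '1'  # bit 0 is the most significant bit of byte 0
    return int(''.join(bits), 2).to_bytes(padded_width, 'big')


# CID 0 sets the top bit of the first byte, CID 9 the second bit of the
# second byte.
table = cid_set_bytes([0, 9])
```

An empty CID collection is not handled here and would raise an `IndexError`.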
@@ -156,11 +153,11 @@ def build_fonts_dictionary(pdf, fonts, optimize_size):


 def _build_bitmap_font_dictionary(font_dictionary, pdf, font, widths,
-                                  optimize_size):
+                                  compress_pdf, subset):
     # https://docs.microsoft.com/typography/opentype/spec/ebdt
     font_dictionary['FontBBox'] = pydyf.Array([0, 0, 1, 1])
     font_dictionary['FontMatrix'] = pydyf.Array([1, 0, 0, 1, 0, 0])
-    if 'fonts' in optimize_size:
+    if subset:
         chars = tuple(sorted(font.cmap))
     else:
         chars = tuple(range(256))
@@ -309,7 +306,7 @@ def _build_bitmap_font_dictionary(font_dictionary, pdf, font, widths,
             b'/BPC 1',
             b'/D [1 0]',
             b'ID', bitmap, b'EI'
-        ])
+        ], compress=compress_pdf)
         pdf.add_object(bitmap_stream)
         char_procs[glyph_id] = bitmap_stream.reference

@@ -20,7 +20,7 @@ for key, value in NS.items():
     register_namespace(key, value)


-def add_metadata(pdf, metadata, variant, version, conformance):
+def add_metadata(pdf, metadata, variant, version, conformance, compress):
     """Add PDF stream of metadata.

     Described in ISO-32000-1:2008, 14.3.2.
@@ -88,6 +88,6 @@ def add_metadata(pdf, metadata, variant, version, conformance):
     footer = b'<?xpacket end="r"?>'
     stream_content = b'\n'.join((header, xml, footer))
     extra = {'Type': '/Metadata', 'Subtype': '/XML'}
-    metadata = pydyf.Stream([stream_content], extra=extra)
+    metadata = pydyf.Stream([stream_content], extra, compress)
     pdf.add_object(metadata)
     pdf.catalog['Metadata'] = metadata.reference
@@ -18,7 +18,7 @@ from ..logger import LOGGER
 from .metadata import add_metadata


-def pdfa(pdf, metadata, document, page_streams, version):
+def pdfa(pdf, metadata, document, page_streams, compress, version):
     """Set metadata for PDF/A documents."""
     LOGGER.warning(
         'PDF/A support is experimental, '
@@ -29,7 +29,7 @@ def pdfa(pdf, metadata, document, page_streams, version):
     profile = pydyf.Stream(
         [read_binary(__package__, 'sRGB2014.icc')],
         pydyf.Dictionary({'N': 3, 'Alternate': '/DeviceRGB'}),
-        compress=True)
+        compress=compress)
     pdf.add_object(profile)
     pdf.catalog['OutputIntents'] = pydyf.Array([
         pydyf.Dictionary({
@@ -46,7 +46,7 @@ def pdfa(pdf, metadata, document, page_streams, version):
                 pdf_object['F'] = 2 ** (3 - 1)

     # Common PDF metadata stream
-    add_metadata(pdf, metadata, 'a', version, 'B')
+    add_metadata(pdf, metadata, 'a', version, 'B', compress)


 VARIANTS = {
@@ -6,7 +6,7 @@ from ..logger import LOGGER
 from .metadata import add_metadata


-def pdfua(pdf, metadata, document, page_streams):
+def pdfua(pdf, metadata, document, page_streams, compress):
     """Set metadata for PDF/UA documents."""
     LOGGER.warning(
         'PDF/UA support is experimental, '
@@ -117,7 +117,7 @@ def pdfua(pdf, metadata, document, page_streams):
         annotation['F'] = 2 ** (2 - 1)

     # Common PDF metadata stream
-    add_metadata(pdf, metadata, 'ua', version=1, conformance=None)
+    add_metadata(pdf, metadata, 'ua', 1, conformance=None, compress=compress)

     # PDF document extra metadata
     if 'Lang' not in pdf.catalog:
@@ -1,7 +1,6 @@
 """PDF stream."""

 import io
-import struct
 from functools import lru_cache
 from hashlib import md5

@@ -98,7 +97,7 @@ class Font:
         if len(widths) > 1 and len(set(widths)) == 1:
             self.flags += 2 ** (1 - 1)  # FixedPitch

-    def clean(self, cmap):
+    def clean(self, cmap, hinting):
         if self.ttfont is None:
             return

@@ -107,7 +106,7 @@ class Font:
             optimized_font = io.BytesIO()
             options = subset.Options(
                 retain_gids=True, passthrough_tables=True,
-                ignore_missing_glyphs=True, hinting=False,
+                ignore_missing_glyphs=True, hinting=hinting,
                 desubroutinize=True)
             options.drop_tables += ['GSUB', 'GPOS', 'SVG']
             subsetter = subset.Subsetter(options)
@@ -196,7 +195,6 @@ class Stream(pydyf.Stream):
     def __init__(self, fonts, page_rectangle, states, x_objects, patterns,
                  shadings, images, mark, *args, **kwargs):
         super().__init__(*args, **kwargs)
-        self.compress = True
         self.page_rectangle = page_rectangle
         self.marked = []
         self._fonts = fonts
@@ -357,113 +355,20 @@ class Stream(pydyf.Stream):
         })
         group = Stream(
             self._fonts, self.page_rectangle, states, x_objects, patterns,
-            shadings, self._images, self._mark, extra=extra)
+            shadings, self._images, self._mark, extra=extra,
+            compress=self.compress)
         group.id = f'x{len(self._x_objects)}'
         self._x_objects[group.id] = group
         return group

-    def _get_png_data(self, pillow_image, optimize):
-        image_file = io.BytesIO()
-        pillow_image.save(image_file, format='PNG', optimize=optimize)
-
-        # Read the PNG header, then discard it because we know it's a PNG. If
-        # this weren't just output from Pillow, we should actually check it.
-        image_file.seek(8)
-
-        png_data = b''
-        raw_chunk_length = image_file.read(4)
-        # PNG files consist of a series of chunks.
-        while len(raw_chunk_length) > 0:
-            # Each chunk begins with its data length (four bytes, may be zero),
-            # then its type (four ASCII characters), then the data, then four
-            # bytes of a CRC.
-            chunk_len, = struct.unpack('!I', raw_chunk_length)
-            chunk_type = image_file.read(4)
-            if chunk_type == b'IDAT':
-                png_data += image_file.read(chunk_len)
-            else:
-                image_file.seek(chunk_len, io.SEEK_CUR)
-            # We aren't checking the CRC, we assume this is a valid PNG.
-            image_file.seek(4, io.SEEK_CUR)
-            raw_chunk_length = image_file.read(4)
-
-        return png_data
-
-    def add_image(self, pillow_image, image_rendering, optimize_size):
-        image_name = f'i{pillow_image.id}'
+    def add_image(self, image, width, height, interpolate):
+        image_name = f'i{image.id}{width}{height}{interpolate}'
         self._x_objects[image_name] = None  # Set by write_pdf
         if image_name in self._images:
             # Reuse image already stored in document
             return image_name

-        if 'transparency' in pillow_image.info:
-            pillow_image = pillow_image.convert('RGBA')
-        elif pillow_image.mode in ('1', 'P', 'I'):
-            pillow_image = pillow_image.convert('RGB')
-
-        if pillow_image.mode in ('RGB', 'RGBA'):
-            color_space = '/DeviceRGB'
-        elif pillow_image.mode in ('L', 'LA'):
-            color_space = '/DeviceGray'
-        elif pillow_image.mode == 'CMYK':
-            color_space = '/DeviceCMYK'
-        else:
-            LOGGER.warning('Unknown image mode: %s', pillow_image.mode)
-            color_space = '/DeviceRGB'
-
-        interpolate = 'true' if image_rendering == 'auto' else 'false'
-        extra = pydyf.Dictionary({
-            'Type': '/XObject',
-            'Subtype': '/Image',
-            'Width': pillow_image.width,
-            'Height': pillow_image.height,
-            'ColorSpace': color_space,
-            'BitsPerComponent': 8,
-            'Interpolate': interpolate,
-        })
-
-        optimize = 'images' in optimize_size
-        if pillow_image.format in ('JPEG', 'MPO'):
-            extra['Filter'] = '/DCTDecode'
-            image_file = io.BytesIO()
-            pillow_image.save(image_file, format='JPEG', optimize=optimize)
-            stream = [image_file.getvalue()]
-        else:
-            extra['Filter'] = '/FlateDecode'
-            extra['DecodeParms'] = pydyf.Dictionary({
-                # Predictor 15 specifies that we're providing PNG data,
-                # ostensibly using an "optimum predictor", but doesn't actually
-                # matter as long as the predictor value is 10+ according to the
-                # spec. (Other PNG predictor values assert that we're using
-                # specific predictors that we don't want to commit to, but
-                # "optimum" can vary.)
-                'Predictor': 15,
-                'Columns': pillow_image.width,
-            })
-            if pillow_image.mode in ('RGB', 'RGBA'):
-                # Defaults to 1.
-                extra['DecodeParms']['Colors'] = 3
-            if pillow_image.mode in ('RGBA', 'LA'):
-                alpha = pillow_image.getchannel('A')
-                pillow_image = pillow_image.convert(pillow_image.mode[:-1])
-                alpha_data = self._get_png_data(alpha, optimize)
-                extra['SMask'] = pydyf.Stream([alpha_data], extra={
-                    'Filter': '/FlateDecode',
-                    'Type': '/XObject',
-                    'Subtype': '/Image',
-                    'DecodeParms': pydyf.Dictionary({
-                        'Predictor': 15,
-                        'Columns': pillow_image.width,
-                    }),
-                    'Width': pillow_image.width,
-                    'Height': pillow_image.height,
-                    'ColorSpace': '/DeviceGray',
-                    'BitsPerComponent': 8,
-                    'Interpolate': interpolate,
-                })
-            stream = [self._get_png_data(pillow_image, optimize)]
-
-        xobject = pydyf.Stream(stream, extra=extra)
+        xobject = image.get_xobject(width, height, interpolate)
         self._images[image_name] = xobject
         return image_name

@@ -493,7 +398,8 @@ class Stream(pydyf.Stream):
         })
         pattern = Stream(
             self._fonts, self.page_rectangle, states, x_objects, patterns,
-            shadings, self._images, self._mark, extra=extra)
+            shadings, self._images, self._mark, extra=extra,
+            compress=self.compress)
         pattern.id = f'p{len(self._patterns)}'
         self._patterns[pattern.id] = pattern
         return pattern
@@ -14,7 +14,7 @@ def svg(svg, node, font_size):
         node.get('width'), node.get('height'), font_size)
     scale_x, scale_y, translate_x, translate_y = preserve_ratio(
         svg, node, font_size, width, height)
-    if svg.tree != node:
+    if svg.tree != node and node.get('overflow', 'hidden') == 'hidden':
         svg.stream.rectangle(0, 0, width, height)
         svg.stream.clip()
         svg.stream.end()
@@ -12,6 +12,10 @@ class TextBox:
         self.pango_layout = pango_layout
         self.style = style

+    @property
+    def text(self):
+        return self.pango_layout.text
+

 def text(svg, node, font_size):
     """Draw text node."""
@@ -182,11 +182,20 @@ class Layout:
                 add_attr(0, len(bytestring), letter_spacing)

             if word_spacing:
+                if bytestring == b' ':
+                    # We need more than one space to set word spacing
+                    self.text = ' \u200b'  # Space + zero-width space
+                    text, bytestring = unicode_to_char_p(self.text)
+                    pango.pango_layout_set_text(self.layout, text, -1)
+
                 space_spacing = (
                     units_from_double(word_spacing) + letter_spacing)
                 position = bytestring.find(b' ')
+                # Pango gives only half of word-spacing on boundaries
+                boundary_positions = (0, len(bytestring) - 1)
                 while position != -1:
-                    add_attr(position, position + 1, space_spacing)
+                    factor = 1 + (position in boundary_positions)
+                    add_attr(position, position + 1, factor * space_spacing)
                     position = bytestring.find(b' ', position + 1)

             if word_breaking:
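Review note on the word-spacing change above: each space byte receives a spacing attribute, and the spaces on the first and last byte get twice the value because, per the hunk's comment, Pango only applies half of the spacing at boundaries. The sketch below replays that loop without Pango. `space_spacing_attrs` is a hypothetical stand-in that collects `(start, end, spacing)` triples instead of calling `add_attr`.

```python
def space_spacing_attrs(bytestring, space_spacing):
    """Return (start, end, spacing) triples, one per space byte.

    Hypothetical helper for illustration: the real code calls add_attr()
    on a Pango attribute list rather than building a Python list.
    """
    attrs = []
    boundary_positions = (0, len(bytestring) - 1)
    position = bytestring.find(b' ')
    while position != -1:
        # True counts as 1, so boundary spaces get a factor of 2.
        factor = 1 + (position in boundary_positions)
        attrs.append((position, position + 1, factor * space_spacing))
        position = bytestring.find(b' ', position + 1)
    return attrs


# Leading and trailing spaces are doubled, the inner one is not.
attrs = space_spacing_attrs(b' a b ', 10)
```

Positions are byte offsets into the UTF-8 string, matching the `bytestring.find` calls in the hunk.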
@@ -226,15 +235,7 @@ class Layout:


 def create_layout(text, style, context, max_width, justification_spacing):
-    """Return an opaque Pango layout with default Pango line-breaks.
-
-    :param text: Unicode
-    :param style: a style dict of computed values
-    :param max_width:
-        The maximum available width in the same unit as ``style['font_size']``,
-        or ``None`` for unlimited width.
-
-    """
+    """Return an opaque Pango layout with default Pango line-breaks."""
     layout = Layout(context, style, justification_spacing, max_width)

     # Make sure that max_width * Pango.SCALE == max_width * 1024 fits in a
@@ -175,9 +175,12 @@ def default_url_fetcher(url, timeout=10, ssl_context=None):
     ``url_fetcher`` argument to :class:`HTML` or :class:`CSS`.
     (See :ref:`URL Fetchers`.)

-    :param str url: The URL of the resource to fetch.
-    :param int timeout: The number of seconds before HTTP requests are dropped.
-    :param ssl.SSLContext ssl_context: An SSL context used for HTTP requests.
+    :param str url:
+        The URL of the resource to fetch.
+    :param int timeout:
+        The number of seconds before HTTP requests are dropped.
+    :param ssl.SSLContext ssl_context:
+        An SSL context used for HTTP requests.
     :raises: An exception indicating failure, e.g. :obj:`ValueError` on
         syntactically invalid URL.
     :returns: A :obj:`dict` with the following keys: