Documentation
=============

* `Installing </install/>`_
* **Using**
* `Hacking </hacking/>`_
* `Features </features/>`_

Using WeasyPrint
~~~~~~~~~~~~~~~~

As a standalone program
-----------------------

Once you have WeasyPrint `installed </install/>`_, you should have a
``weasyprint`` executable. Using it can be as simple as this::

    weasyprint http://weasyprint.org /tmp/weasyprint-website.pdf

You may see warnings on stderr about unsupported CSS.

The ``weasyprint`` command takes two arguments: its input and output.
The input is a filename or URL to an HTML document, or ``-`` to read
HTML from stdin. The output is a filename, or ``-`` to write to stdout.

More options are available:

``-e`` or ``--encoding``
    Force the input character encoding (e.g. ``-e utf8``).

``-f`` or ``--format``
    Choose the output file format, either PDF or PNG (e.g. ``-f png``).
    Required if the output is not a ``.pdf`` or ``.png`` filename.

``-s`` or ``--stylesheet``
    Add a user CSS stylesheet to the document (e.g. ``-s print.css``).
    Multiple stylesheets are allowed.

``-m`` or ``--media-type``
    Set the media type to use for ``@media``. Defaults to ``print``.

``-r`` or ``--resolution``
    For PNG output only. Set the resolution in PNG pixels per CSS inch.
    Defaults to 96, which means that PNG pixels match CSS pixels.

``--base-url``
    Set the base for relative URLs in the HTML input. Defaults to the input’s
    own URL or the current directory for stdin.

``--version``
    Show the version number.

``-h`` or ``--help``
    Show the command-line usage.

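If you drive the executable from a script, the options above can be assembled
into an argument list. The following is only a sketch: the helper name
``weasyprint_command`` and the file names are made up for illustration, and
only the flags documented above are used.

.. code-block:: python

    import subprocess

    def weasyprint_command(source, target, stylesheets=(), encoding=None,
                           output_format=None, media_type=None, base_url=None):
        """Build the argument list for the ``weasyprint`` executable.

        Hypothetical helper: it only maps keyword arguments to the
        command-line flags documented above.
        """
        command = ['weasyprint']
        if encoding:
            command += ['-e', encoding]
        if output_format:
            command += ['-f', output_format]
        for stylesheet in stylesheets:
            # -s may be repeated, once per user stylesheet.
            command += ['-s', stylesheet]
        if media_type:
            command += ['-m', media_type]
        if base_url:
            command += ['--base-url', base_url]
        return command + [source, target]

    command = weasyprint_command(
        'report.html', 'report.pdf',
        stylesheets=['print.css'], encoding='utf8')

    # Run it only where the executable is installed:
    # subprocess.check_call(command)

Passing a list to ``subprocess`` rather than a shell string avoids quoting
problems with file names that contain spaces.
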
As a Python library
-------------------

If you’re writing Python code you can import and use WeasyPrint just like
any other Python library:

.. code-block:: python

    import weasyprint
    weasyprint.HTML('http://weasyprint.org/').write_pdf('/tmp/weasyprint-website.pdf')

The public API is made of two classes: ``HTML`` and ``CSS``.

API stability
.............

Everything described here is considered “public”: this is what you can rely
on. We will try to maintain backward-compatibility, although there is no
hard promise until version 1.0.

Anything else should not be used outside of WeasyPrint itself: we reserve
the right to change it or remove it at any point. Please do `tell us`_
if you feel like something should be in the public API. It can probably
be added in the next version.

.. _tell us: /community/

The ``weasyprint.HTML`` class
.............................

An ``HTML`` object represents an HTML document parsed by lxml_.

.. _lxml: http://lxml.de/

You can just create an instance with a positional argument:
``doc = HTML(something)``
The class will try to guess if the input is a filename, an absolute URL,
or a file-like object.

Alternatively, you can name the argument so that no guessing is
involved:

* ``HTML(filename=foo)`` a filename, relative to the current directory
  or absolute.
* ``HTML(url=foo)`` an absolute, fully qualified URL.
* ``HTML(file_obj=foo)`` a file-like: any object with a ``read()`` method.
* ``HTML(string=foo)`` a string of HTML source. (This argument must be named.)
* ``HTML(tree=foo)`` a parsed lxml tree. (This argument must be named.)

Specifying multiple inputs is an error: ``HTML(filename=foo, url=bar)``
will raise an exception.

You can also pass optional named arguments:

* ``encoding``: force the source character encoding
* ``base_url``: used to resolve relative URLs (e.g. in
  ``<img src="../foo.png">``).
  If not passed explicitly, try to use the input filename, URL, or
  ``name`` attribute of file objects.
* ``url_fetcher``: override the URL fetcher. (See `below <#url-fetchers>`_.)
* ``media_type``: the media type to use for ``@media``. Defaults to ``print``.

**Note:** In some cases like ``HTML(string=foo)`` you need to pass ``base_url``
explicitly, or relative URLs will be invalid.

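To see why a base URL matters, here is how a relative reference resolves under
the standard URL rules. This sketch uses only ``urljoin`` from the Python
standard library and an ``example.com`` address chosen for illustration. It
shows the general resolution rule, not WeasyPrint’s internal code.

.. code-block:: python

    try:
        from urllib.parse import urljoin  # Python 3
    except ImportError:
        from urlparse import urljoin  # Python 2

    # With a base URL, "../foo.png" resolves to an absolute URL.
    base_url = 'http://example.com/docs/page.html'
    resolved = urljoin(base_url, '../foo.png')

    # Without one, the reference stays relative and cannot be fetched.
    unresolved = urljoin('', '../foo.png')

    print(resolved)
    print(unresolved)
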
``HTML`` objects have three public methods:

``HTML.write_pdf(target=None, stylesheets=None)``
    Render the document with stylesheets from three *origins*:

    * The HTML5 `user agent stylesheet`_;
    * Author stylesheets embedded in the document in ``<style>`` elements or
      linked by ``<link rel=stylesheet>`` elements;
    * User stylesheets provided in the ``stylesheets`` parameter to this
      method. If provided, ``stylesheets`` must be an iterable where elements
      are ``CSS`` instances (see below) or anything that can be passed
      as an unnamed argument to ``CSS()``.

    If you use this ``stylesheets`` parameter or the ``-s`` option of the
    command-line API, keep in mind that *user* stylesheets have a lower
    priority than *author* stylesheets in the cascade_.

    ``target`` can be a filename or a file-like object (anything with a
    ``write()`` method) where the PDF output is written.
    If ``target`` is not provided, the method returns the PDF content
    as a byte string.

``HTML.write_png(target=None, stylesheets=None, resolution=96)``
    Like ``write_pdf()``, but writes a single PNG image instead of PDF.

    ``resolution`` is counted in pixels in the PNG output per CSS inch.
    Note however that CSS pixels are always 1/96 CSS inch.
    With the default resolution of 96, CSS pixels match PNG pixels.

    Pages are painted in order from top to bottom, and horizontally centered.
    The resulting image is as wide as the widest page, and as high as the
    sum of all page heights. There is no decoration around pages other than
    specified in CSS.

``HTML.get_png_pages(stylesheets=None, resolution=96)``
    Render each page to a separate PNG image.

    ``stylesheets`` and ``resolution`` are the same as in ``write_png()``.

    Returns a generator of ``(width, height, png_bytes)`` tuples, one for
    each page, in order. ``width`` and ``height`` are the size of the page
    in PNG pixels, ``png_bytes`` is a byte string.

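The tuples returned by ``get_png_pages()`` can be written out one file per
page. The sketch below is not part of WeasyPrint: the helper names
``css_px_to_png_px`` and ``save_png_pages`` and the file naming scheme are
assumptions made for this example. The first helper only restates the
resolution rule given above (96 CSS pixels per CSS inch).

.. code-block:: python

    import os

    def css_px_to_png_px(css_pixels, resolution=96):
        """Convert a length in CSS pixels to PNG pixels."""
        # CSS pixels are always 1/96 CSS inch.
        return css_pixels * resolution / 96.0

    def save_png_pages(pages, directory, prefix='page'):
        """Write each ``(width, height, png_bytes)`` tuple to its own file.

        Returns a list of ``(filename, width, height)`` tuples, in page order.
        """
        saved = []
        for number, (width, height, png_bytes) in enumerate(pages, 1):
            filename = os.path.join(
                directory, '%s-%03d.png' % (prefix, number))
            with open(filename, 'wb') as fd:
                fd.write(png_bytes)
            saved.append((filename, width, height))
        return saved

    # Possible usage, where WeasyPrint is installed:
    # from weasyprint import HTML
    # pages = HTML('http://weasyprint.org/').get_png_pages(resolution=192)
    # save_png_pages(pages, '/tmp')

Because ``get_png_pages()`` returns a generator, a loop like this one handles
a single page at a time instead of keeping every image in memory.
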
.. _user agent stylesheet: https://github.com/Kozea/WeasyPrint/blob/master/weasyprint/css/html5_ua.css
.. _cascade: http://www.w3.org/TR/CSS21/cascade.html#cascading-order

The ``weasyprint.CSS`` class
............................

A ``CSS`` object represents a CSS stylesheet parsed by tinycss.
An instance is created in the same way as ``HTML``, except that
the ``tree`` parameter is not available.

``CSS`` objects have no public attribute or method. They are only meant to
be passed in the ``stylesheets`` parameter of the ``HTML`` rendering methods.
(See above.)

The above warning on ``base_url`` and string input applies too: relative
URLs will be invalid if there is no base URL.

URL fetchers
............

Flask-WeasyPrint_ makes use of a custom URL fetcher to integrate WeasyPrint
with a Flask_ application.

.. _Flask-WeasyPrint: http://packages.python.org/Flask-WeasyPrint/
.. _Flask: http://flask.pocoo.org/

The URL fetcher is used for resources with a ``url`` input as well as
linked images and stylesheets. It is a function (or any callable) that
takes a single parameter (the URL). It should either raise an exception to
indicate failure, or return a dict with the following keys:

* One of ``string`` (a byte string) or ``file_obj`` (a file-like object)
* Optionally: ``mime_type``, a MIME type extracted e.g. from a *Content-Type*
  header. If not provided, the type is guessed from the file extension
  in the URL.
* Optionally: ``encoding``, a character encoding extracted e.g. from a
  *charset* parameter in a *Content-Type* header
* Optionally: ``redirected_url``, the actual URL of the resource in case
  there were e.g. HTTP redirects.

URL fetchers can defer to the default fetcher:

.. code-block:: python

    from weasyprint import default_url_fetcher, HTML

    def my_fetcher(url):
        # Handle the custom "graph:" scheme, defer everything else.
        if url.startswith('graph:'):
            graph_data = [float(value) for value in url[6:].split(',')]
            # generate_graph() is not part of WeasyPrint: supply your own
            # function that returns PNG data as a byte string.
            return dict(string=generate_graph(graph_data),
                        mime_type='image/png')
        else:
            return default_url_fetcher(url)

    source = '<img src="graph:42,10.3,87">'
    HTML(string=source, url_fetcher=my_fetcher).write_pdf('out.pdf')

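The same pattern can serve resources that only exist in memory. In the sketch
below the ``memory:`` scheme, the ``make_memory_fetcher`` helper and the
``fake_default_fetcher`` stand-in are all invented for this example. In real
code you would pass ``weasyprint.default_url_fetcher`` as the fallback.

.. code-block:: python

    def make_memory_fetcher(resources, fallback):
        """Return a URL fetcher that serves ``memory:`` URLs from a dict.

        ``resources`` maps names to ``(byte_string, mime_type)`` pairs.
        Any other URL is handed to ``fallback``.
        """
        prefix = 'memory:'

        def fetcher(url):
            if url.startswith(prefix):
                # An unknown name raises KeyError, which signals failure.
                content, mime_type = resources[url[len(prefix):]]
                return dict(string=content, mime_type=mime_type)
            return fallback(url)

        return fetcher

    def fake_default_fetcher(url):
        # Stand-in so that this sketch runs without WeasyPrint installed.
        raise IOError('no network in this sketch: %s' % url)

    resources = {'style.css': (b'body { color: red }', 'text/css')}
    fetcher = make_memory_fetcher(resources, fake_default_fetcher)
    result = fetcher('memory:style.css')

    # Possible usage, where WeasyPrint is installed:
    # from weasyprint import HTML, default_url_fetcher
    # fetcher = make_memory_fetcher(resources, default_url_fetcher)
    # source = '<link rel=stylesheet href="memory:style.css"><p>Hello'
    # HTML(string=source, url_fetcher=fetcher).write_pdf('out.pdf')
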
Logging
.......

Most errors (syntax error in CSS, unsupported CSS property, missing image, ...)
are not fatal and will not prevent a document from being rendered.

WeasyPrint uses the ``logging`` module from the Python standard library
to log these errors and let you know about them.

Logged messages will go to stderr by default. You can change that by
configuring the ``weasyprint`` logger object:

.. code-block:: python

    import logging
    logger = logging.getLogger('weasyprint')
    logger.handlers = []  # Remove the default stderr handler
    logger.addHandler(logging.FileHandler('/path/to/weasyprint.log'))

See the `logging documentation <http://docs.python.org/library/logging.html>`_
for details.

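A handler can also collect messages in memory, for instance to show them to a
user after rendering. This is a minimal sketch using only the standard
``logging`` module. The ``ListHandler`` class is written for this example, and
the warning logged at the end is a made-up message that stands in for what
WeasyPrint would log during a real rendering.

.. code-block:: python

    import logging

    class ListHandler(logging.Handler):
        """Keep ``(level name, message)`` pairs in a list."""

        def __init__(self):
            logging.Handler.__init__(self)
            self.messages = []

        def emit(self, record):
            self.messages.append((record.levelname, record.getMessage()))

    logger = logging.getLogger('weasyprint')
    handler = ListHandler()
    logger.addHandler(handler)
    try:
        # Render your document here. This made-up warning takes the place
        # of the messages that rendering would log.
        logger.warning('Ignored declaration: %s', 'foo: bar')
    finally:
        logger.removeHandler(handler)

    for level, message in handler.messages:
        print('%s: %s' % (level, message))
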
Navigator
---------

WeasyPrint Navigator is a very limited web browser, running in your web
browser. Start it with:

.. code-block:: sh

    python -m weasyprint.navigator

… and open your browser at http://127.0.0.1:5000/.

It does not support cookies, forms, or many other things that you would
expect from a “real” browser. It only shows the PNG output from WeasyPrint
with overlaid clickable hyperlinks. It is mostly useful for playing and testing.

Errors
------

If you get an exception during rendering, it is probably a bug in WeasyPrint.
Please copy the full traceback and report it on our `issue tracker`_.

.. _issue tracker: http://redmine.kozea.fr/projects/weasyprint/issues

What’s next
-----------

If you want to change something in WeasyPrint or just see how it works,
it’s time to `start hacking </hacking>`_!