Documentation
=============
* `Installing </install/>`_
* **Using**
* `Hacking </hacking/>`_
* `Features </features/>`_

Using WeasyPrint
~~~~~~~~~~~~~~~~
As a standalone program
-----------------------
Once you have WeasyPrint `installed </install/>`_, you should have a
``weasyprint`` executable. Using it can be as simple as this::

    weasyprint http://weasyprint.org /tmp/weasyprint-website.pdf

You may see warnings on stderr about unsupported CSS.

The ``weasyprint`` command takes two arguments: its input and output.
The input is a filename or URL to an HTML document, or ``-`` to read
HTML from stdin. The output is a filename, or ``-`` to write to stdout.

More options are available:

``-e`` or ``--encoding``
    Force the input character encoding (e.g. ``-e utf8``).
``-f`` or ``--format``
    Choose the output file format among PDF and PNG (e.g. ``-f png``).
    Required if the output is not a ``.pdf`` or ``.png`` filename.
``-s`` or ``--stylesheet``
    Add a user CSS stylesheet to the document (e.g. ``-s print.css``).
    Multiple stylesheets are allowed.
``-m`` or ``--media-type``
    Set the media type to use for ``@media``. Defaults to ``print``.
``-r`` or ``--resolution``
    For PNG output only. Set the resolution in PNG pixels per CSS inch.
    Defaults to 96, which means that PNG pixels match CSS pixels.
``--base-url``
    Set the base for relative URLs in the HTML input. Defaults to the input's
    own URL, or the current directory for stdin.
``--version``
    Show the version number.
``-h`` or ``--help``
    Show the command-line usage.
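
When the command is driven from a script, the options above can be assembled
programmatically. The following is a minimal sketch, not part of WeasyPrint:
``build_command`` and ``run_weasyprint`` are hypothetical helpers that only use
the flags documented above. Running the result assumes that the ``weasyprint``
executable is on your ``PATH`` and that it reports failures through a non-zero
exit status.

.. code-block:: python

    import subprocess


    def build_command(source, output, stylesheets=(), media_type=None,
                      output_format=None, resolution=None, base_url=None,
                      encoding=None):
        """Return the argument list for one ``weasyprint`` invocation."""
        command = ['weasyprint']
        if encoding is not None:
            command += ['-e', encoding]
        if output_format is not None:
            command += ['-f', output_format]
        for stylesheet in stylesheets:
            # -s may be repeated: multiple stylesheets are allowed.
            command += ['-s', stylesheet]
        if media_type is not None:
            command += ['-m', media_type]
        if resolution is not None:
            command += ['-r', str(resolution)]
        if base_url is not None:
            command += ['--base-url', base_url]
        # The two positional arguments come last: input, then output.
        return command + [source, output]


    def run_weasyprint(*args, **kwargs):
        """Build the command and run it (assumes non-zero exit on failure)."""
        subprocess.check_call(build_command(*args, **kwargs))

For instance, ``run_weasyprint('page.html', '-', output_format='pdf')`` would
write a PDF to stdout; the format is given explicitly because ``-`` is not a
``.pdf`` or ``.png`` filename.
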
As a Python library
-------------------
If you're writing Python code, you can import and use WeasyPrint just like
any other Python library:

.. code-block:: python

    import weasyprint
    weasyprint.HTML('http://weasyprint.org/').write_pdf('/tmp/weasyprint-website.pdf')

The public API is made of two classes: ``HTML`` and ``CSS``.

API stability
.............
Everything described here is considered “public”: this is what you can rely
on. We will try to maintain backward-compatibility, although there is no
hard promise until version 1.0.
Anything else should not be used outside of WeasyPrint itself: we reserve
the right to change it or remove it at any point. Please do `tell us`_
if you feel like something should be in the public API. It can probably
be added in the next version.
.. _tell us: /community/
The ``weasyprint.HTML`` class
.............................
An ``HTML`` object represents an HTML document parsed by lxml_.

.. _lxml: http://lxml.de/

You can just create an instance with a positional argument:
``doc = HTML(something)``
The class will try to guess if the input is a filename, an absolute URL,
or a file-like object.

Alternatively, you can name the argument so that no guessing is
involved:

* ``HTML(filename=foo)``: a filename, relative to the current directory
  or absolute.
* ``HTML(url=foo)``: an absolute, fully qualified URL.
* ``HTML(file_obj=foo)``: a file-like object, that is, any object with a
  ``read()`` method.
* ``HTML(string=foo)``: a string of HTML source. (This argument must be named.)
* ``HTML(tree=foo)``: a parsed lxml tree. (This argument must be named.)

Specifying multiple inputs is an error: ``HTML(filename=foo, url=bar)``
will raise an exception.

You can also pass optional named arguments:

* ``encoding``: force the source character encoding.
* ``base_url``: used to resolve relative URLs (e.g. in
  ``<img src="../foo.png">``).
  If not passed explicitly, WeasyPrint tries to use the input filename, URL,
  or ``name`` attribute of file objects.
* ``url_fetcher``: override the URL fetcher. (See `below <#url-fetchers>`_.)
* ``media_type``: the media type to use for ``@media``. Defaults to ``print``.

**Note:** In some cases like ``HTML(string=foo)`` you need to pass ``base_url``
explicitly, or relative URLs will be invalid.
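
To see why ``base_url`` matters, it can help to look at how a relative URL is
resolved. The sketch below uses ``urljoin`` from the Python 3 standard library
rather than WeasyPrint itself, on the assumption that WeasyPrint follows the
same standard resolution rules. The ``example.com`` address is a placeholder.

.. code-block:: python

    from urllib.parse import urljoin

    relative = '../foo.png'

    # With a base URL, the relative reference becomes a full address.
    base_url = 'http://example.com/docs/page.html'
    resolved = urljoin(base_url, relative)
    print(resolved)  # http://example.com/foo.png

    # Without a base (as with HTML(string=...) and no base_url), nothing
    # anchors the reference and it stays relative.
    unresolved = urljoin('', relative)
    print(unresolved)  # ../foo.png
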

``HTML`` objects have three public methods:

``HTML.write_pdf(target=None, stylesheets=None)``
    Render the document with stylesheets from three *origins*:

    * The HTML5 `user agent stylesheet`_;
    * Author stylesheets embedded in the document in ``<style>`` elements or
      linked by ``<link rel=stylesheet>`` elements;
    * User stylesheets provided in the ``stylesheets`` parameter to this
      method. If provided, ``stylesheets`` must be an iterable where elements
      are ``CSS`` instances (see below) or anything that can be passed
      as an unnamed argument to ``CSS()``.

    If you use this ``stylesheets`` parameter or the ``-s`` option of the
    command-line API, keep in mind that *user* stylesheets have a lower
    priority than *author* stylesheets in the cascade_.

    ``target`` can be a filename or a file-like object (anything with a
    ``write()`` method) where the PDF output is written.
    If ``target`` is not provided, the method returns the PDF content
    as a byte string.

``HTML.write_png(target=None, stylesheets=None, resolution=96)``
    Like ``write_pdf()``, but writes a single PNG image instead of PDF.

    ``resolution`` is counted in pixels in the PNG output per CSS inch.
    Note however that CSS pixels are always 1/96 CSS inch.
    With the default resolution of 96, CSS pixels match PNG pixels.

    Pages are painted in order from top to bottom, and horizontally centered.
    The resulting image is as wide as the widest page, and as high as the
    sum of all page heights. There is no decoration around pages other than
    specified in CSS.

``HTML.get_png_pages(stylesheets=None, resolution=96)``
    Render each page to a separate PNG image.

    ``stylesheets`` and ``resolution`` are the same as in ``write_png()``.

    Returns a generator of ``(width, height, png_bytes)`` tuples, one for
    each page, in order. ``width`` and ``height`` are the size of the page
    in PNG pixels, ``png_bytes`` is a byte string.
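
The size arithmetic described for ``write_png()`` can be sketched in plain
Python. This is an illustration rather than WeasyPrint's own code:
``css_to_png_pixels``, ``stacked_size`` and ``page_offsets`` are hypothetical
helpers, and rounding the centering offset down is an assumption.

.. code-block:: python

    def css_to_png_pixels(css_pixels, resolution=96):
        """Convert a length in CSS pixels to PNG pixels.

        A CSS pixel is always 1/96 of a CSS inch, and ``resolution`` is
        the number of PNG pixels per CSS inch.
        """
        return css_pixels * resolution / 96


    def stacked_size(page_sizes):
        """Size of the single image holding all pages, top to bottom.

        ``page_sizes`` is a list of ``(width, height)`` pairs in PNG pixels.
        """
        total_width = max(w for w, _ in page_sizes)
        total_height = sum(h for _, h in page_sizes)
        return total_width, total_height


    def page_offsets(page_sizes):
        """Top-left ``(x, y)`` position of each page in the stacked image."""
        total_width, _ = stacked_size(page_sizes)
        offsets = []
        y = 0
        for width, height in page_sizes:
            # Center horizontally (assumption: odd differences round down).
            offsets.append(((total_width - width) // 2, y))
            y += height
        return offsets

Given the ``(width, height, png_bytes)`` tuples from ``get_png_pages()``,
``stacked_size([(w, h) for w, h, _ in pages])`` would be expected to match the
dimensions of the image written by ``write_png()`` with the same stylesheets
and resolution.
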

.. _user agent stylesheet: https://github.com/Kozea/WeasyPrint/blob/master/weasyprint/css/html5_ua.css
.. _cascade: http://www.w3.org/TR/CSS21/cascade.html#cascading-order

The ``weasyprint.CSS`` class
............................
A ``CSS`` object represents a CSS stylesheet parsed by tinycss.
An instance is created in the same way as ``HTML``, except that
the ``tree`` parameter is not available.

``CSS`` objects have no public attributes or methods. They are only meant to
be passed to the ``write_pdf`` or ``write_png`` methods. (See above.)

The above warning on ``base_url`` and string input applies too: relative
URLs will be invalid if there is no base URL.

URL fetchers
............
Flask-WeasyPrint_ makes use of a custom URL fetcher to integrate WeasyPrint
with a Flask_ application.

.. _Flask-WeasyPrint: http://packages.python.org/Flask-WeasyPrint/
.. _Flask: http://flask.pocoo.org/

The URL fetcher is used for resources with a ``url`` input as well as
linked images and stylesheets. It is a function (or any callable) that
takes a single parameter (the URL). It should either raise an exception to
indicate failure, or return a dict with the following keys:

* One of ``string`` (a byte string) or ``file_obj`` (a file-like object).
* Optionally: ``mime_type``, a MIME type extracted e.g. from a *Content-Type*
  header. If not provided, the type is guessed from the file extension
  in the URL.
* Optionally: ``encoding``, a character encoding extracted e.g. from a
  *charset* parameter in a *Content-Type* header.
* Optionally: ``redirected_url``, the actual URL of the resource in case
  there were e.g. HTTP redirects.

URL fetchers can defer to the default fetcher:

.. code-block:: python

    from weasyprint import default_url_fetcher, HTML

    def my_fetcher(url):
        if url.startswith('graph:'):
            # Parse the comma-separated values after the 'graph:' prefix.
            graph_data = [float(value) for value in url[6:].split(',')]
            # generate_graph() is a placeholder that should return PNG bytes.
            return dict(string=generate_graph(graph_data),
                        mime_type='image/png')
        else:
            # Any other URL is handled by the default fetcher.
            return default_url_fetcher(url)

    source = '<img src="graph:42,10.3,87">'
    HTML(string=source, url_fetcher=my_fetcher).write_pdf('out.pdf')
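
Another possible use, sketched here under assumptions, is serving some
resources from memory. ``make_fetcher`` and the ``memory:`` URL scheme are
hypothetical names, not part of WeasyPrint. The fallback fetcher is passed in
explicitly so that the sketch does not depend on WeasyPrint being importable.

.. code-block:: python

    def make_fetcher(resources, fallback):
        """Return a URL fetcher that serves ``resources`` before ``fallback``.

        ``resources`` maps URLs to ``(byte_string, mime_type)`` pairs.
        ``fallback`` is any other URL fetcher, typically
        ``weasyprint.default_url_fetcher``.
        """
        def fetcher(url):
            if url in resources:
                content, mime_type = resources[url]
                # Use the dict keys described above.
                return dict(string=content, mime_type=mime_type)
            # Unknown URLs are left to the fallback, which may raise.
            return fallback(url)
        return fetcher


    # Hypothetical in-memory stylesheet, addressed with a made-up scheme.
    resources = {
        'memory:print.css': (b'body { font-size: 12pt }', 'text/css'),
    }

The result could then be passed as
``HTML(..., url_fetcher=make_fetcher(resources, default_url_fetcher))``.
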

Logging
.......

Most errors (syntax error in CSS, unsupported CSS property, missing image, ...)
are not fatal and will not prevent a document from being rendered.

WeasyPrint uses the ``logging`` module from the Python standard library
to log these errors and let you know about them.
Logged messages will go to stderr by default. You can change that by
configuring the ``weasyprint`` logger object:

.. code-block:: python

    import logging
    logger = logging.getLogger('weasyprint')
    logger.handlers = []  # Remove the default stderr handler
    logger.addHandler(logging.FileHandler('/path/to/weasyprint.log'))

See the `logging documentation <http://docs.python.org/library/logging.html>`_
for details.
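
The same logger can also be pointed at an in-memory buffer, for example to
show warnings to the user after rendering. This is a minimal sketch using only
the standard library. The warning logged at the end is a made-up message
standing in for what WeasyPrint would emit during rendering.

.. code-block:: python

    import io
    import logging

    logger = logging.getLogger('weasyprint')

    # Collect messages in memory instead of a file.
    buffer = io.StringIO()
    handler = logging.StreamHandler(buffer)
    handler.setFormatter(logging.Formatter('%(levelname)s: %(message)s'))
    logger.handlers = []  # Remove any handler installed earlier
    logger.addHandler(handler)
    logger.propagate = False  # Keep messages away from the root logger

    # Stand-in for a message logged during rendering (hypothetical text).
    logger.warning('Unsupported CSS property: %s', 'example-property')

    captured = buffer.getvalue()
    print(captured)  # WARNING: Unsupported CSS property: example-property
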
Navigator
---------
WeasyPrint Navigator is a very limited web browser, running in your web browser.
Start it with:

.. code-block:: sh

    python -m weasyprint.navigator

… and open your browser at http://127.0.0.1:5000/.

It does not support cookies, forms, or many other things that you would
expect from a “real” browser. It only shows the PNG output from WeasyPrint
with overlaid clickable hyperlinks. It is mostly useful for playing and testing.
Errors
------
If you get an exception during rendering, it is probably a bug in WeasyPrint.
Please copy the full traceback and report it on our `issue tracker`_.
.. _issue tracker: http://redmine.kozea.fr/projects/weasyprint/issues
What's next
-----------

If you want to change something in WeasyPrint or just see how it works,
it's time to `start hacking </hacking>`_!