2012-10-08 15:48:22 +04:00
|
|
|
|
Tutorial
|
|
|
|
|
========
|
2011-10-28 19:14:31 +04:00
|
|
|
|
|
2012-10-08 21:51:18 +04:00
|
|
|
|
As a standalone program
|
|
|
|
|
-----------------------
|
2011-10-28 19:14:31 +04:00
|
|
|
|
|
2012-10-08 21:51:18 +04:00
|
|
|
|
Once you have WeasyPrint :doc:`installed </install>`, you should have a
|
|
|
|
|
``weasyprint`` executable. Using it can be as simple as this:
|
2011-10-28 19:14:31 +04:00
|
|
|
|
|
2012-10-08 21:51:18 +04:00
|
|
|
|
.. code-block:: sh
|
2011-10-28 19:14:31 +04:00
|
|
|
|
|
2012-10-08 21:51:18 +04:00
|
|
|
|
weasyprint http://weasyprint.org /tmp/weasyprint-website.pdf
|
2012-02-29 22:13:10 +04:00
|
|
|
|
|
2012-10-08 21:51:18 +04:00
|
|
|
|
You may see warnings on *stderr* about unsupported CSS properties.
|
|
|
|
|
See :ref:`command-line-api` for the details of all available options.
|
2012-02-29 22:13:10 +04:00
|
|
|
|
|
2012-10-08 21:51:18 +04:00
|
|
|
|
In particular, the ``-s`` option can add a filename for a
|
|
|
|
|
:ref:`user stylesheet <stylesheet-origins>`. For quick experimentation
|
|
|
|
|
however, you may not want to create a file. In bash or zsh, you can
|
|
|
|
|
use the shell’s redirection instead:
|
2012-02-29 22:13:10 +04:00
|
|
|
|
|
2012-10-08 21:51:18 +04:00
|
|
|
|
.. code-block:: sh
|
2012-02-29 22:13:10 +04:00
|
|
|
|
|
2012-10-08 21:51:18 +04:00
|
|
|
|
weasyprint http://weasyprint.org /tmp/weasyprint-website.pdf \
|
|
|
|
|
-s <(echo 'body { font-family: serif !important }')
|
2012-02-29 22:13:10 +04:00
|
|
|
|
|
2012-10-08 21:51:18 +04:00
|
|
|
|
If you have many documents to convert you may prefer using the Python API
|
|
|
|
|
in long-lived processes to avoid paying the start-up costs every time.
|
2012-02-29 22:13:10 +04:00
|
|
|
|
|
2012-08-03 20:03:44 +04:00
|
|
|
|
|
2012-10-08 21:51:18 +04:00
|
|
|
|
As a Python library
|
|
|
|
|
-------------------
|
|
|
|
|
.. currentmodule:: weasyprint
|
2012-08-03 20:03:44 +04:00
|
|
|
|
|
2012-10-08 21:51:18 +04:00
|
|
|
|
Quickstart
|
|
|
|
|
..........
|
2012-08-03 20:03:44 +04:00
|
|
|
|
|
2012-10-08 21:51:18 +04:00
|
|
|
|
The Python version of the above example goes like this:
|
2011-10-28 19:14:31 +04:00
|
|
|
|
|
2012-10-08 21:51:18 +04:00
|
|
|
|
.. code-block:: python
|
|
|
|
|
|
|
|
|
|
from weasyprint import import HTML
|
|
|
|
|
HTML('http://weasyprint.org/').write_pdf('/tmp/weasyprint-website.pdf')
|
|
|
|
|
|
|
|
|
|
… or with the inline stylesheet:
|
|
|
|
|
|
|
|
|
|
.. code-block:: python
|
|
|
|
|
|
|
|
|
|
from weasyprint import import HTML, CSS
|
|
|
|
|
HTML('http://weasyprint.org/').write_pdf('/tmp/weasyprint-website.pdf',
|
|
|
|
|
stylesheets=[CSS(string='body { font-family: serif !important }')])
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Instantiating HTML and CSS objects
|
|
|
|
|
..................................
|
|
|
|
|
|
|
|
|
|
If you have a file name, an absolute URL or a readable file-like object,
|
|
|
|
|
you can just pass it to :class:`HTML` or :class:`CSS` to create an instance.
|
|
|
|
|
Alternatively, use a named argument so that no guessing is involved:
|
|
|
|
|
|
|
|
|
|
.. code-block:: python
|
|
|
|
|
|
|
|
|
|
HTML('../foo.html') # Same as …
|
|
|
|
|
HTML(filename='../foo.html')
|
2011-10-28 19:14:31 +04:00
|
|
|
|
|
2012-10-08 21:51:18 +04:00
|
|
|
|
HTML('http://weasyprint.org') # Same as …
|
|
|
|
|
HTML(url='http://weasyprint.org')
|
2011-10-28 19:14:31 +04:00
|
|
|
|
|
2012-10-08 21:51:18 +04:00
|
|
|
|
HTML(sys.stdin) # Same as …
|
|
|
|
|
HTML(file_obj=sys.stdin)
|
|
|
|
|
|
|
|
|
|
If you have a byte string, Unicode string, or lxml tree already in memory
|
|
|
|
|
you can also pass that, although the argument must be named:
|
|
|
|
|
|
|
|
|
|
.. code-block:: python
|
|
|
|
|
|
2012-10-09 19:48:36 +04:00
|
|
|
|
# HTML('<h1>foo') would be filename
|
2012-10-08 21:51:18 +04:00
|
|
|
|
HTML(string='''
|
|
|
|
|
<h1>The title</h1>
|
|
|
|
|
<p>Content goes here
|
|
|
|
|
''')
|
|
|
|
|
CSS(string='@page { size: A3; margin: 1cm }')
|
|
|
|
|
|
2012-10-09 19:48:36 +04:00
|
|
|
|
# Use a different lxml parser:
|
2012-10-08 21:51:18 +04:00
|
|
|
|
HTML(tree=lxml.html.html5parser.parse('../foo.html'))
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Rendering to a single file
|
|
|
|
|
..........................
|
|
|
|
|
|
|
|
|
|
Once you have a :class:`HTML` object, call its :meth:`~HTML.write_pdf` or
|
|
|
|
|
:meth:`~HTML.write_png` method to get the rendered document in a single
|
|
|
|
|
PDF or PNG file.
|
|
|
|
|
|
|
|
|
|
Without arguments, these methods return a byte string in memory. If you
|
|
|
|
|
pass a file name or a writable file-like object, they will write there
|
|
|
|
|
directly instead. (**Warning**: with a filename, these methods will
|
|
|
|
|
overwrite existing files silently.)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Individual pages, meta-data, other output formats, …
|
|
|
|
|
....................................................
|
|
|
|
|
|
|
|
|
|
.. currentmodule:: weasyprint.document
|
|
|
|
|
|
|
|
|
|
If you want more than a single PDF, the :meth:`~weasyprint.HTML.render`
|
|
|
|
|
method gives you a :class:`Document` object with access to individual
|
|
|
|
|
:class:`Page` objects. Thus you can get the number of pages, their size\ [#]_,
|
|
|
|
|
the details of hyperlinks and bookmarks, etc.
|
|
|
|
|
Documents also have :meth:`~Document.write_pdf` and :meth:`~Document.write_png`
|
|
|
|
|
methods, and you can get a subset of the pages with :meth:`~Document.copy()`.
|
|
|
|
|
Finally, for ultimate control, :meth:`~Page.paint` individual pages anywhere
|
|
|
|
|
on any type of cairo surface.
|
|
|
|
|
|
|
|
|
|
.. [#] Pages in the same document do not always have the same size.
|
|
|
|
|
|
|
|
|
|
See the :ref:`python-api` for details. A few random example:
|
|
|
|
|
|
|
|
|
|
.. code-block:: python
|
|
|
|
|
|
|
|
|
|
# Write odd and even pages separately:
|
|
|
|
|
# Lists count from 0 but page numbers usually from 1
|
|
|
|
|
# [::2] is a slice of even list indexes but odd-numbered pages.
|
|
|
|
|
document.copy(document.pages[::2]).write_pdf('odd_pages.pdf')
|
|
|
|
|
document.copy(document.pages[1::2]).write_pdf('even_pages.pdf')
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
# Write one PNG image per page:
|
|
|
|
|
for i, page in enumerate(document.pages):
|
|
|
|
|
document.copy(page).write_png('page_%s.png' % i)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
# Print the outline of the document
|
|
|
|
|
def print_outline(bookmarks, indent=0):
|
|
|
|
|
for i, (label, (page, _, _), children) in enumerate(bookmarks, 1):
|
|
|
|
|
print('%s%d. %s (page %d)' % (
|
|
|
|
|
' ' * indent, i, label.lstrip('0123456789. '), page))
|
|
|
|
|
print_outline(children, indent + 2)
|
|
|
|
|
print_outline(document.make_bookmark_tree())
|
|
|
|
|
# Output on http://www.w3.org/TR/CSS21/intro.html
|
|
|
|
|
# 1. Introduction to CSS 2.1 (page 2)
|
|
|
|
|
# 1. A brief CSS 2.1 tutorial for HTML (page 2)
|
|
|
|
|
# 2. A brief CSS 2.1 tutorial for XML (page 5)
|
|
|
|
|
# 3. The CSS 2.1 processing model (page 6)
|
|
|
|
|
# 1. The canvas (page 7)
|
|
|
|
|
# 2. CSS 2.1 addressing model (page 7)
|
|
|
|
|
# 4. CSS design principles (page 8)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
# PostScript on standard output:
|
|
|
|
|
surface = cairo.PSSurface(sys.stdout, 1, 1)
|
|
|
|
|
context = cairo.Context(surface)
|
|
|
|
|
for page in document.pages:
|
|
|
|
|
# 0.75 = 72 PostScript point per inch / 96 CSS pixel per inch
|
|
|
|
|
surface.set_size(page.width * 0.75, page.height * 0.75)
|
|
|
|
|
page.paint(context, scale=0.75)
|
|
|
|
|
surface.show_page()
|
|
|
|
|
surface.finish()
|
2011-10-31 19:45:22 +04:00
|
|
|
|
|
2012-02-29 23:00:19 +04:00
|
|
|
|
|
2012-09-19 19:37:52 +04:00
|
|
|
|
.. _url-fetchers:
|
2011-12-12 18:22:24 +04:00
|
|
|
|
|
2012-07-18 18:40:33 +04:00
|
|
|
|
URL fetchers
|
|
|
|
|
............
|
|
|
|
|
|
2012-09-19 19:37:52 +04:00
|
|
|
|
WeasyPrint goes through an *URL fetcher* to fetch external resources such as
|
2012-10-08 21:51:18 +04:00
|
|
|
|
images or CSS stylesheets. The default fetcher can natively open file and
|
|
|
|
|
HTTP URLs, but the HTTP client does not support advanced features like cookies
|
2012-09-19 19:37:52 +04:00
|
|
|
|
or authentication. This can be worked-around by passing a custom
|
|
|
|
|
``url_fetcher`` callable to the :class:`HTML` or :class:`CSS` classes.
|
2012-10-08 21:51:18 +04:00
|
|
|
|
It must have the same signature as :func:`~weasyprint.default_url_fetcher`.
|
2012-07-18 18:40:33 +04:00
|
|
|
|
|
2012-09-20 18:29:33 +04:00
|
|
|
|
Custom fetchers can choose to handle some URLs and defer others
|
|
|
|
|
to the default fetcher:
|
2012-07-18 18:40:33 +04:00
|
|
|
|
|
|
|
|
|
.. code-block:: python
|
|
|
|
|
|
|
|
|
|
from weasyprint import default_url_fetcher, HTML
|
|
|
|
|
|
|
|
|
|
def my_fetcher(url):
|
|
|
|
|
if url.startswith('graph:')
|
|
|
|
|
graph_data = map(float, url[6:].split(','))
|
|
|
|
|
return dict(string=generate_graph(graph_data),
|
|
|
|
|
mime_type='image/png')
|
|
|
|
|
else:
|
|
|
|
|
return weasyprint.default_url_fetcher(url)
|
|
|
|
|
|
|
|
|
|
source = '<img src="graph:42,10.3,87">'
|
|
|
|
|
HTML(string=source, url_fetcher=my_fetcher).write_pdf('out.pdf')
|
|
|
|
|
|
2012-09-19 19:37:52 +04:00
|
|
|
|
Flask-WeasyPrint_ makes use of a custom URL fetcher to integrate WeasyPrint
|
2012-10-08 21:51:18 +04:00
|
|
|
|
with a Flask_ application and short-cut the network for resources that are
|
|
|
|
|
within the same application.
|
2012-09-19 19:37:52 +04:00
|
|
|
|
|
|
|
|
|
.. _Flask-WeasyPrint: http://packages.python.org/Flask-WeasyPrint/
|
|
|
|
|
.. _Flask: http://flask.pocoo.org/
|
|
|
|
|
|
2012-07-18 18:40:33 +04:00
|
|
|
|
|
2011-12-12 18:22:24 +04:00
|
|
|
|
Logging
|
2012-08-11 11:07:54 +04:00
|
|
|
|
.......
|
2011-12-12 18:22:24 +04:00
|
|
|
|
|
2012-10-08 21:51:18 +04:00
|
|
|
|
Most errors (unsupported CSS property, missing image, ...)
|
2011-12-12 18:22:24 +04:00
|
|
|
|
are not fatal and will not prevent a document from being rendered.
|
|
|
|
|
|
2012-10-08 21:51:18 +04:00
|
|
|
|
WeasyPrint uses the :mod:`logging` module from the Python standard library
|
2012-05-24 19:28:42 +04:00
|
|
|
|
to log these errors and let you know about them.
|
2012-09-20 18:29:33 +04:00
|
|
|
|
Logged messaged will go to *stderr* by default. You can change that by
|
2012-05-24 19:28:42 +04:00
|
|
|
|
configuring the ``weasyprint`` logger object:
|
2011-12-12 18:22:24 +04:00
|
|
|
|
|
|
|
|
|
.. code-block:: python
|
|
|
|
|
|
|
|
|
|
import logging
|
2012-05-24 19:28:42 +04:00
|
|
|
|
logger = logging.getLogger('weasyprint')
|
|
|
|
|
logger.handlers = [] # Remove the default stderr handler
|
|
|
|
|
logger.addHandler(logging.FileHandler('/path/to/weasyprint.log'))
|
2011-12-12 18:22:24 +04:00
|
|
|
|
|
2012-10-08 21:51:18 +04:00
|
|
|
|
See the documentation of the :mod:`logging` module for details.
|
2012-02-29 23:00:19 +04:00
|
|
|
|
|
|
|
|
|
|
2012-09-20 18:29:33 +04:00
|
|
|
|
.. _navigator:
|
|
|
|
|
|
|
|
|
|
WeasyPrint Navigator
|
|
|
|
|
--------------------
|
2012-08-11 11:07:54 +04:00
|
|
|
|
|
2012-08-11 11:09:13 +04:00
|
|
|
|
*WeasyPrint Navigator* is a very limited web browser, running
|
|
|
|
|
in your web browser. Start it with:
|
2012-08-11 11:07:54 +04:00
|
|
|
|
|
|
|
|
|
.. code-block:: sh
|
|
|
|
|
|
|
|
|
|
python -m weasyprint.navigator
|
|
|
|
|
|
|
|
|
|
… and open your browser at http://127.0.0.1:5000/.
|
|
|
|
|
|
|
|
|
|
It does not support cookies, forms, or many other things that you would
|
|
|
|
|
expect from a “real” browser. It only shows the PNG output from WeasyPrint
|
|
|
|
|
with overlaid clickable hyperlinks. It is mostly useful for playing and testing.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Errors
|
|
|
|
|
------
|
|
|
|
|
|
|
|
|
|
If you get an exception during rendering, it is probably a bug in WeasyPrint.
|
|
|
|
|
Please copy the full traceback and report it on our `issue tracker`_.
|
|
|
|
|
|
2012-09-20 18:29:33 +04:00
|
|
|
|
.. _issue tracker: https://github.com/Kozea/WeasyPrint/issues
|
2012-10-08 21:51:18 +04:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
.. _stylesheet-origins:
|
|
|
|
|
|
|
|
|
|
Stylesheet origins
|
|
|
|
|
------------------
|
|
|
|
|
|
|
|
|
|
HTML documents are rendered with stylesheets from three *origins*:
|
|
|
|
|
|
|
|
|
|
* The HTML5 `user agent stylesheet`_ (defines the default appearance
|
|
|
|
|
of HTML elements);
|
|
|
|
|
* Author stylesheets embedded in the document in ``<style>`` elements
|
|
|
|
|
or linked by ``<link rel=stylesheet>`` elements;
|
|
|
|
|
* User stylesheets provided in the API.
|
|
|
|
|
|
|
|
|
|
Keep in mind that *user* stylesheets have a lower priority than *author*
|
|
|
|
|
stylesheets in the cascade_, unless you use `!important`_ in declarations
|
|
|
|
|
to raise their priority.
|
|
|
|
|
|
|
|
|
|
.. _user agent stylesheet: https://github.com/Kozea/WeasyPrint/blob/master/weasyprint/css/html5_ua.css
|
|
|
|
|
.. _cascade: http://www.w3.org/TR/CSS21/cascade.html#cascading-order
|
|
|
|
|
.. _!important: http://www.w3.org/TR/CSS21/cascade.html#important-rules
|