# pup

> Command-line HTML parsing tool.
> More information: <https://github.com/ericchiang/pup>.

- Transform a raw HTML file into a cleaned, indented, and colored format:

`cat {{index.html}} | pup --color`

- Filter HTML by element tag name:

`cat {{index.html}} | pup '{{tag}}'`

- Filter HTML by id:

`cat {{index.html}} | pup '{{div#id}}'`

- Filter HTML by attribute value:

`cat {{index.html}} | pup '{{input[type="text"]}}'`

- Print all text from the filtered HTML elements and their children:

`cat {{index.html}} | pup '{{div}} text{}'`

- Print HTML as JSON:

`cat {{index.html}} | pup '{{div}} json{}'`