2020-04-21 02:56:46 +03:00

# Scraping and verification

#### Contents

- [Evaluating JavaScript](#evaluating-javascript)
- [Capturing screenshot](#capturing-screenshot)
- [Page events](#page-events)
- [Handling exceptions](#handling-exceptions)

<br/>

## Evaluating JavaScript

Execute a JavaScript function in the page:

```js
const href = await page.evaluate(() => document.location.href);
```

If the function is asynchronous or returns a Promise, `evaluate` automatically waits until it resolves:

```js
const status = await page.evaluate(async () => {
  const response = await fetch(location.href);
  return response.status;
});
```

Get an object handle and use it in multiple evaluations:

```js
// Create a new array in the page, write a reference to it in
// window.myArray and get a handle to it.
const myArrayHandle = await page.evaluateHandle(() => {
  window.myArray = [1];
  return myArray;
});

// Get current length of the array using the handle.
const length = await page.evaluate(
  (arg) => arg.myArray.length,
  { myArray: myArrayHandle }
);

// Add one more element to the array using the handle.
await page.evaluate((arg) => arg.myArray.push(arg.newElement), {
  myArray: myArrayHandle,
  newElement: 2
});

// Get current length of the array using window.myArray reference.
const newLength = await page.evaluate(() => window.myArray.length);

// Release the object when it's no longer needed.
await myArrayHandle.dispose();
```
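
For scraping, `page.$$eval` runs a function in the page against every element that matches a selector. The following is a minimal sketch under stated assumptions: `collectHrefs` is a hypothetical helper name, and the `a[href]` default selector is only an illustration.

```js
// Hypothetical helper (not part of Playwright): returns the href of every
// element matching `selector`. `page` is assumed to be a Playwright Page.
async function collectHrefs(page, selector = 'a[href]') {
  // The callback runs inside the page and receives the matched elements.
  return page.$$eval(selector, elements => elements.map(el => el.href));
}
```

It could be called as `const links = await collectHrefs(page);` once the page has been navigated.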

#### API reference

- [page.$(selector)](./api.md#pageselector)
- [page.$$(selector)](./api.md#pageselector-1)
- [page.$eval(selector, pageFunction[, arg])](./api.md#pageevalselector-pagefunction-arg)
- [page.$$eval(selector, pageFunction[, arg])](./api.md#pageevalselector-pagefunction-arg-1)
- [page.evaluate(pageFunction[, arg])](./api.md#pageevaluatepagefunction-arg)
- [page.evaluateHandle(pageFunction[, arg])](./api.md#pageevaluatehandlepagefunction-arg)

<br/>

## Capturing screenshot

```js
// Save to file
await page.screenshot({path: 'screenshot.png'});

// Capture full page
await page.screenshot({path: 'screenshot.png', fullPage: true});

// Capture into buffer
const buffer = await page.screenshot();
console.log(buffer.toString('base64'));

// Capture given element
const elementHandle = await page.$('.header');
await elementHandle.screenshot({ path: 'screenshot.png' });
```
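
A screenshot captured into a buffer can be embedded directly in a report instead of being written to disk. The sketch below assumes the default PNG format; `screenshotToDataUrl` is a hypothetical helper name, not a Playwright API.

```js
// Hypothetical helper: turn a screenshot Buffer into a data URL.
// Assumes PNG output unless another MIME type is passed in.
function screenshotToDataUrl(buffer, mimeType = 'image/png') {
  return `data:${mimeType};base64,${buffer.toString('base64')}`;
}
```

For example, `screenshotToDataUrl(await page.screenshot())` yields a string that can be used as the `src` of an `<img>` tag.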

#### API reference

- [page.screenshot([options])](./api.md#pagescreenshotoptions)
- [elementHandle.screenshot([options])](./api.md#elementhandlescreenshotoptions)

<br/>

## Page events

You can listen for various events on the `page` object. The following are some examples of events you can assert on and handle:

#### `"console"` - get all console messages from the page

```js
page.on('console', msg => {
  // Handle only errors.
  if (msg.type() !== 'error')
    return;
  console.log(`text: "${msg.text()}"`);
});
```
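
For verification it can be handy to collect console errors rather than print them, so a test can assert on them later. This is a minimal sketch; `collectConsoleErrors` is a hypothetical helper name, and it relies only on `page.on`, `msg.type()` and `msg.text()`.

```js
// Hypothetical helper: records the text of every console error.
// Returns a live array that keeps growing as new errors arrive.
function collectConsoleErrors(page) {
  const errors = [];
  page.on('console', msg => {
    if (msg.type() === 'error')
      errors.push(msg.text());
  });
  return errors;
}
```

Register the listener before navigating so that errors logged during page load are captured as well.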

#### `"dialog"` - handle alert, confirm, prompt

```js
page.on('dialog', dialog => {
  dialog.accept();
});
```
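
A handler can also branch on the kind of dialog. The sketch below is one possible policy, not a recommendation: it answers prompts with a given text, dismisses confirms and accepts everything else. `handleDialogs` and its `promptText` option are hypothetical names.

```js
// Hypothetical helper: respond to dialogs according to their type.
// The policy (dismiss confirms, accept the rest) is an assumption.
function handleDialogs(page, { promptText = '' } = {}) {
  page.on('dialog', async dialog => {
    if (dialog.type() === 'prompt')
      await dialog.accept(promptText); // Answer the prompt with text.
    else if (dialog.type() === 'confirm')
      await dialog.dismiss(); // Equivalent to pressing "Cancel".
    else
      await dialog.accept(); // Alerts and other dialogs.
  });
}
```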

#### `"popup"` - handle popup windows

```js
const [popup] = await Promise.all([
  page.waitForEvent('popup'),
  page.click('#open')
]);
```
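
The wait is started before the click so that the popup event cannot fire unobserved. The same pattern can be wrapped in a small function, sketched here under the assumption that the caller wants the popup only after it has loaded; `openPopup` is a hypothetical helper name.

```js
// Hypothetical helper: click `selector` and return the popup it opens.
async function openPopup(page, selector) {
  // Start waiting for the event before clicking so the popup is not missed.
  const [popup] = await Promise.all([
    page.waitForEvent('popup'),
    page.click(selector),
  ]);
  // Wait for the popup's load event before handing it back.
  await popup.waitForLoadState();
  return popup;
}
```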

#### API reference

- [class: ConsoleMessage](./api.md#class-consolemessage)
- [class: Page](./api.md#class-page)
- [event: 'console'](./api.md#event-console)
- [event: 'dialog'](./api.md#event-dialog)
- [event: 'popup'](./api.md#event-popup)

<br/>

## Handling exceptions

Listen for uncaught exceptions in the page:

```js
// Log all uncaught errors to the terminal
page.on('pageerror', exception => {
  console.log(`Uncaught exception: "${exception}"`);
});

// Navigate to a page with an exception.
await page.goto('data:text/html,<script>throw new Error("Test")</script>');
```
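
In a verification script, uncaught exceptions are often better treated as failures than merely logged. The following sketch collects them and exposes a check; `trackPageErrors` and `assertNone` are hypothetical names, and only the `pageerror` event comes from the API.

```js
// Hypothetical helper: collect uncaught page exceptions for later checks.
function trackPageErrors(page) {
  const errors = [];
  page.on('pageerror', exception => errors.push(exception));
  return {
    errors,
    // Throws if any uncaught exception has been recorded so far.
    assertNone() {
      if (errors.length > 0)
        throw new Error(`Page threw ${errors.length} uncaught exception(s): ${errors.join('; ')}`);
    },
  };
}
```

Call `trackPageErrors(page)` before `page.goto(...)` and `assertNone()` at the end of the scenario.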

#### API reference

- [event: 'pageerror'](./api.md#event-pageerror)

<br/>