hurl/docs/capturing-response.md
2024-01-10 17:13:46 +01:00

341 lines
8.7 KiB
Markdown

# Capturing Response
## Captures
Captures are optional values that are __extracted from the HTTP response__ and stored in a named variable.
These captures may be the response status code, part of or the entire the body, and response headers.
Captured variables can be accessed through a run session; each new value of a given variable overrides the last value.
Captures can be useful for using data from one request in another request, such as when working with [CSRF tokens].
Variables in a Hurl file can be created from captures or [injected into the session].
```hurl
# An example to show how to pass a CSRF token
# from one request to another:
# First GET request to get CSRF token value:
GET https://example.org
HTTP 200
# Capture the CSRF token value from html body.
[Captures]
csrf_token: xpath "normalize-space(//meta[@name='_csrf_token']/@content)"
# Do the login !
POST https://acmecorp.net/login?user=toto&password=1234
X-CSRF-TOKEN: {{csrf_token}}
HTTP 302
```
Structure of a capture:
<div class="schema-container schema-container u-font-size-2 u-font-size-3-sm">
<div class="schema">
<span class="schema-token schema-color-1">my_var<span class="schema-label">variable</span></span>
<span> : </span>
<span class="schema-token schema-color-2">xpath "string(//h1)"<span class="schema-label">query</span></span>
</div>
</div>
A capture consists of a variable name, followed by `:` and a query. Captures
section starts with `[Captures]`.
### Query
Queries are used to extract data from an HTTP response.
A query can be of the following type:
- [`status`](#status-capture)
- [`header`](#header-capture)
- [`url`](#url-capture)
- [`cookie`](#cookie-capture)
- [`body`](#body-capture)
- [`bytes`](#bytes-capture)
- [`xpath`](#xpath-capture)
- [`jsonpath`](#jsonpath-capture)
- [`regex`](#regex-capture)
- [`variable`](#variable-capture)
- [`duration`](#duration-capture)
- [`certificate`](#certificate-capture)
Extracted data can then be further refined using [filters].
### Status capture
Capture the received HTTP response status code. Status capture consists of a variable name, followed by a `:`, and the
keyword `status`.
```hurl
GET https://example.org
HTTP 200
[Captures]
my_status: status
```
### Header capture
Capture a header from the received HTTP response headers. Header capture consists of a variable name, followed by a `:`,
then the keyword `header` and a header name.
```hurl
POST https://example.org/login
[FormParams]
user: toto
password: 12345678
HTTP 302
[Captures]
next_url: header "Location"
```
### URL capture
Capture the last fetched URL. This is most meaningful if you have told Hurl to follow redirection (see [`[Options]`section][options] or
[`--location` option]). URL capture consists of a variable name, followed by a `:`, and the keyword `url`.
```hurl
GET https://example.org/redirecting
[Options]
location: true
HTTP 200
[Captures]
landing_url: url
```
### Cookie capture
Capture a [`Set-Cookie`] header from the received HTTP response headers. Cookie
capture consists of a variable name, followed by a `:`, then the keyword `cookie`
and a cookie name.
```hurl
GET https://example.org/cookies/set
HTTP 200
[Captures]
session-id: cookie "LSID"
```
Cookie attributes value can also be captured by using the following format:
`<cookie-name>[cookie-attribute]`. The following attributes are supported:
`Value`, `Expires`, `Max-Age`, `Domain`, `Path`, `Secure`, `HttpOnly` and `SameSite`.
```hurl
GET https://example.org/cookies/set
HTTP 200
[Captures]
value1: cookie "LSID"
value2: cookie "LSID[Value]" # Equivalent to the previous capture
expires: cookie "LSID[Expires]"
max-age: cookie "LSID[Max-Age]"
domain: cookie "LSID[Domain]"
path: cookie "LSID[Path]"
secure: cookie "LSID[Secure]"
http-only: cookie "LSID[HttpOnly]"
same-site: cookie "LSID[SameSite]"
```
### Body capture
Capture the entire body (decoded as text) from the received HTTP response. The encoding used to decode the body
is based on the `charset` value in the `Content-Type` header response.
```hurl
GET https://example.org/home
HTTP 200
[Captures]
my_body: body
```
If the `Content-Type` doesn't include any encoding hint, a [`decode` filter] can be used to explicitly decode the body response
bytes.
```hurl
# Our HTML response is encoded using GB 2312.
# But, the 'Content-Type' HTTP response header doesn't precise any charset,
# so we decode explicitly the bytes.
GET https://example.org/cn
HTTP 200
[Captures]
my_body: bytes decode "gb2312"
```
### Bytes capture
Capture the entire body (as a raw bytestream) from the received HTTP response
```hurl
GET https://example.org/data.bin
HTTP 200
[Captures]
my_data: bytes
```
### XPath capture
Capture a [XPath] query from the received HTTP body decoded as a string.
Currently, only XPath 1.0 expression can be used.
```hurl
GET https://example.org/home
# Capture the identifier from the dom node <div id="pet0">5646eaf23</div
HTTP 200
[Captures]
ped-id: xpath "normalize-space(//div[@id='pet0'])"
# Open the captured page.
GET https://example.org/home/pets/{{pet-id}}
HTTP 200
```
XPath captures are not limited to node values (like string, or boolean); any
valid XPath can be captured and asserted with variable asserts.
```hurl
# Test that the XML endpoint return 200 pets
GET https://example.org/api/pets
HTTP 200
[Captures]
pets: xpath "//pets"
[Asserts]
variable "pets" count == 200
```
XPath expression can also be evaluated against part of the body with a [`xpath` filter]:
```hurl
GET https://example.org/home_cn
HTTP 200
[Captures]
ped-id: bytes decode "gb2312" xpath "normalize-space(//div[@id='pet0'])"
```
### JSONPath capture
Capture a [JSONPath] query from the received HTTP body.
```hurl
POST https://example.org/api/contact
[FormParams]
token: {{token}}
email: toto@rookie.net
HTTP 200
[Captures]
contact-id: jsonpath "$['id']"
```
> Explain that the value selected by the JSONPath is coerced to a string when only one node is selected.
As with [XPath captures], JSONPath captures can be anything from string, number, to object and collections.
For instance, if we have a JSON endpoint that returns the following JSON:
```
{
"a_null": null,
"an_object": {
"id": "123"
},
"a_list": [
1,
2,
3
],
"an_integer": 1,
"a float": 1.1,
"a_bool": true,
"a_string": "hello"
}
```
We can capture the following paths:
```hurl
GET https://example.org/captures-json
HTTP 200
[Captures]
an_object: jsonpath "$['an_object']"
a_list: jsonpath "$['a_list']"
a_null: jsonpath "$['a_null']"
an_integer: jsonpath "$['an_integer']"
a_float: jsonpath "$['a_float']"
a_bool: jsonpath "$['a_bool']"
a_string: jsonpath "$['a_string']"
all: jsonpath "$"
```
### Regex capture
Capture a regex pattern from the HTTP received body, decoded as text.
```hurl
GET https://example.org/helloworld
HTTP 200
[Captures]
id_a: regex "id_a:([0-9]+)"
id_b: regex "id_b:(\\d+)" # pattern using double quote
id_c: regex /id_c:(\d+)/ # pattern using forward slash
name: regex "Hello ([a-zA-Z]+)"
```
The regex pattern must have at least one capture group, otherwise the
capture will fail. When the pattern is a double-quoted string, metacharacters beginning with a backslash in the pattern
(like `\d`, `\s`) must be escaped; literal pattern enclosed by `/` can also be used to avoid metacharacters escaping.
### Variable capture
Capture the value of a variable into another.
```hurl
GET https://example.org/helloworld
HTTP 200
[Captures]
in: body
name: variable "in"
```
### Duration capture
Capture the response time of the request in ms.
```hurl
GET https://example.org/helloworld
HTTP 200
[Captures]
duration_in_ms: duration
```
### SSL certificate capture
Capture the SSL certificate properties. Certificate capture consists of the keyword `certificate`, followed by the certificate attribute value.
The following attributes are supported: `Subject`, `Issuer`, `Start-Date`, `Expire-Date` and `Serial-Number`.
```hurl
GET https://example.org
HTTP 200
[Captures]
cert_subject: certificate "Subject"
cert_issuer: certificate "Issuer"
cert_expire_date: certificate "Expire-Date"
cert_serial_number: certificate "Serial-Number"
```
[CSRF tokens]: https://en.wikipedia.org/wiki/Cross-site_request_forgery
[injected into the session]: /docs/templates.md#injecting-variables
[`Set-Cookie`]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Set-Cookie
[XPath]: https://en.wikipedia.org/wiki/XPath
[JSONPath]: https://goessner.net/articles/JsonPath/
[XPath captures]: #xpath-capture
[JavaScript-like Regular expression syntax]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions
[options]: /docs/request.md#options
[`--location` option]: /docs/manual.md#location
[filters]: /docs/filters.md
[`xpath` filter]: /docs/filters.md#xpath
[`decode` filter]: /docs/filters.md#decode