improve additional request docs
BurnzZ committed Apr 25, 2022
1 parent 5d82a37 commit 753e6ad
Showing 2 changed files with 69 additions and 21 deletions.
85 changes: 65 additions & 20 deletions docs/advanced/additional-requests.rst
@@ -106,7 +106,8 @@ The key take aways are:
Now that we know how :class:`~.HttpRequest` instances are structured, note that
defining them doesn't execute any request. To do so, we'll need to feed them into
the :class:`~.HttpClient` which is defined in the next section.
the :class:`~.HttpClient` which is defined in the next section (see
:ref:`httpclient` tutorial section).
HttpResponse
============
@@ -117,6 +118,12 @@ executed. It's typically returned by the methods from :class:`~.HttpClient` (see
It's also the required input for Page Objects inheriting from the :class:`~.ItemWebPage`
class as explained from the :ref:`from-ground-up` tutorial.
.. note::
The additional requests are expected to perform redirections except when the
method is ``HEAD``. This means that the :class:`~.HttpResponse` that you'll
be receiving is already the end of the redirection trail.
Let's check out an example to see its internals:
.. code-block:: python
@@ -155,10 +162,11 @@ Let's check out an example to see its internals:
print(response.text) # {"data": "value 👍"}
print(response.json()) # {'data': 'value 👍'}
Despite what the example showcases, you won't be typically defining :class:`~.HttpResponse`
yourself as it's the implementing framework that's responsible for it (see
:ref:`advanced-downloader-impl`). Nonetheless, it's important to understand its
underlying structure in order to better access its methods.
Despite what the example above showcases, you typically won't be defining
:class:`~.HttpResponse` yourself as it's the implementing framework that's
responsible for it (see :ref:`advanced-downloader-impl`). Nonetheless, it's
important to understand its underlying structure in order to better access its
methods.
Here are the key take aways from the example above:
@@ -194,7 +202,7 @@ Here are the key take aways from the example above:
* body encodings
* Instead of accessing the raw bytes values `(which doesn't represent the
underlying content properly like the 👍 emoji)`, the :meth:`~.HttpResponse.text`
underlying content properly like the` 👍 `emoji)`, the :meth:`~.HttpResponse.text`
property method can be used which takes into account the derived **encoding**
when decoding the bytes value.
* The :meth:`~.HttpResponse.json` method is available as a shortcut to
@@ -246,7 +254,7 @@ The key take aways for this example are:
parsed from it.
* The :meth:`~.HttpResponse.selector` property method returns an instance of
:external:py:class:`parsel.selector.Selector` which allows parsing via
``css()`` and ``xpath()`` calls.
:meth:`~.HttpResponse.css` and :meth:`~.HttpResponse.xpath` calls.
* At the same time, there's no need to call :meth:`~.HttpResponse.selector`
each time as the :meth:`~.HttpResponse.css` and :meth:`~.HttpResponse.xpath`
@@ -260,9 +268,9 @@ HttpClient
The main interface for executing additional requests would be :class:`~.HttpClient`.
It also has full support for :mod:`asyncio` enabling developers to perform
additional requests asynchronously using ``asyncio.gather()``, ``asyncio.wait()``,
etc. This means that ``asyncio`` could be used anywhere inside the Page Object,
including the ``to_item()`` method.
additional requests asynchronously using :py:func:`asyncio.gather`,
:py:func:`asyncio.wait`, etc. This means that :mod:`asyncio` could be used anywhere
inside the Page Object, including the :meth:`~.ItemPage.to_item` method.
In the previous section, we've explored how :class:`~.HttpRequest` is defined.
Let's see a few quick examples to see how to execute additional requests using
@@ -296,21 +304,23 @@ Executing a HttpRequest instance
return item
As the example suggests, we're performing an additional request that allows us
to extract more images in a product page that might not otherwise be possible.
to extract more images in a product page that might not otherwise be possible.
This is because in order to do so, an additional button needs to be clicked
which fetches the complete set of product images via AJAX.
There are a few things to take note of in this example:
* Recall from the :ref:`httprequest-example` tutorial section that the
default method is ``GET``.
* We're now using the ``async/await`` syntax inside the ``to_item()`` method.
default method is ``GET``. Thus, the ``method`` parameter can be omitted
for simple ``GET`` requests.
* We're now using the ``async/await`` syntax inside the :meth:`~.ItemPage.to_item`
method.
* The response from the additional request is of type :class:`~.HttpResponse`.
.. tip::
See the :ref:`http-batch-request-example` tutorial section to see how to
execute a group of :class:`~.HttpRequest` in batch.
Check out the :ref:`http-batch-request-example` tutorial section to see how
to execute a group of :class:`~.HttpRequest` in batch.
Fortunately, there are already some quick shortcuts on how to perform single
additional requests using the :meth:`~.HttpClient.request`, :meth:`~.HttpClient.get`,
@@ -414,6 +424,31 @@ Here's the key takeaway in this example:
a :meth:`~.HttpClient.post` method is also available that's
typically used to submit forms.
Other Single Requests
---------------------
The :meth:`~.HttpClient.get` and :meth:`~.HttpClient.post` methods are merely
quick shortcuts for :meth:`~.HttpClient.request`:
.. code-block:: python
client = HttpClient()
url = "https://api.example.com/v1/data"
headers = {"Content-Type": "application/json;charset=UTF-8"}
body = b'{"data": "value"}'
# These are the same:
client.get(url)
client.request(url, method="GET")
# The same goes for these:
client.post(url, headers=headers, body=body)
client.request(url, method="POST", headers=headers, body=body)
Thus, for HTTP methods other than the common ``GET`` and ``POST`` (`e.g.`
``HEAD``, ``PUT``, ``DELETE``), you can use :meth:`~.HttpClient.request` directly.
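As a rough sketch of the shortcut relationship described above, the stand-in
class below has ``get`` and ``post`` delegate to ``request``. ``SketchClient``
and the dictionary it returns are hypothetical and exist only for illustration;
they are not web-poet's implementation. Since ``request`` is declared with
``async def`` in ``web_poet/requests.py``, the sketch awaits each call.

```python
import asyncio
from typing import Dict, Optional


class SketchClient:
    """Hypothetical stand-in for illustration; not web-poet's HttpClient."""

    async def request(
        self,
        url: str,
        *,
        method: str = "GET",
        headers: Optional[Dict[str, str]] = None,
        body: Optional[bytes] = None,
    ) -> dict:
        # A real client would hand the request over to a downloader. This
        # sketch only echoes back what it would have sent.
        return {"url": url, "method": method, "headers": headers or {}, "body": body}

    async def get(self, url: str, *, headers=None) -> dict:
        # Shortcut: the same as request(..., method="GET").
        return await self.request(url, method="GET", headers=headers)

    async def post(self, url: str, *, headers=None, body=None) -> dict:
        # Shortcut: the same as request(..., method="POST").
        return await self.request(url, method="POST", headers=headers, body=body)


async def main() -> dict:
    client = SketchClient()
    url = "https://api.example.com/v1/data"
    return {
        "get": await client.get(url),
        "request_get": await client.request(url, method="GET"),
        # Methods without a dedicated shortcut go through request().
        "head": await client.request(url, method="HEAD"),
    }


RESULTS = asyncio.run(main())
```

In this sketch the shortcut and the explicit ``request`` call produce the same
outgoing request, which is the equivalence the snippet above describes.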
.. _`http-batch-request-example`:
Batch requests
@@ -512,16 +547,26 @@ The key takeaways for this example are:
Nonetheless, you can still use the :meth:`~.HttpClient.batch_execute` method
to execute a single :class:`~.HttpRequest` instance.
.. note::
The :meth:`~.HttpClient.batch_execute` method is a simple wrapper over
:py:func:`asyncio.gather`. Developers are free to use other functionalities
available inside :mod:`asyncio` to handle multiple requests.
For example, :py:func:`asyncio.as_completed` can be used to process each
response from a group of requests as soon as it arrives. However, the
responses are then yielded in completion order, which may differ from the
order of the requests.
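The ordering difference mentioned in the note can be sketched with
:mod:`asyncio` alone. In this minimal example, ``fake_request`` is a
hypothetical stand-in for an additional request (it only sleeps and returns a
label), so nothing here calls web-poet itself.

```python
import asyncio
from typing import List


async def fake_request(name: str, delay: float) -> str:
    # Hypothetical stand-in for an additional request: wait, then return a label.
    await asyncio.sleep(delay)
    return name


async def gathered() -> List[str]:
    # asyncio.gather returns results in the order the awaitables were passed,
    # regardless of which one finishes first.
    return list(
        await asyncio.gather(
            fake_request("slow", 0.1),
            fake_request("fast", 0.01),
        )
    )


async def completed() -> List[str]:
    # asyncio.as_completed yields awaitables as they finish, so the faster
    # request is seen first even though it was passed second.
    pending = [fake_request("slow", 0.1), fake_request("fast", 0.01)]
    return [await finished for finished in asyncio.as_completed(pending)]


GATHERED = asyncio.run(gathered())
COMPLETED = asyncio.run(completed())
```

With these delays, ``GATHERED`` keeps the input order while ``COMPLETED``
follows completion order, so code that relies on matching responses to
requests by position should prefer the ``gather`` style.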
Exception Handling
==================
Overview
--------
Let's a look at how we could handle exceptions when performing additional requests
in Page Objects. For this example, let's improve the code snippet from the previous
subsection named: :ref:`httpclient-get-example`.
Let's have a look at how we could handle exceptions when performing additional
requests inside Page Objects. For this example, let's improve the code snippet
from the previous subsection named: :ref:`httpclient-get-example`.
.. code-block:: python
@@ -573,7 +618,7 @@ due to anything like `SSL errors`, `connection errors`, etc.
This should enable developers writing Page Objects to properly identify what
went wrong and act specifically based on the problem.
Let's take another example when performing batch requests as opposed to using
Let's take another example when executing requests in batch as opposed to using
single requests via these methods of the :class:`~.HttpClient`:
:meth:`~.HttpClient.request`, :meth:`~.HttpClient.get`, and :meth:`~.HttpClient.post`.
@@ -778,7 +823,7 @@ Downloader Implementation
Please note that :class:`~.HttpClient` doesn't do anything on its own: it doesn't
know how to execute a request. Thus, for frameworks or projects
wanting to use additional requests in Page Objects, they need to set the
implementation of how to download :class:`~.Request`.
implementation on how to execute an :class:`~.HttpRequest`.
For more info on this, kindly read the API Specifications for :class:`~.HttpClient`.
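The general idea of supplying the execution logic from the outside can be
sketched as follows. Everything in this example is hypothetical
(``SketchClient``, its ``downloader`` argument, ``fake_downloader``); it only
illustrates the pattern of a client that delegates to an injected
implementation. How web-poet actually wires this up is covered by the API
specification and may differ.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Tuple

# Hypothetical downloader type: takes a URL and eventually returns a body.
Downloader = Callable[[str], Awaitable[str]]


class SketchClient:
    """Hypothetical client that can't execute anything without a downloader."""

    def __init__(self, downloader: Optional[Downloader] = None) -> None:
        self._downloader = downloader

    async def execute(self, url: str) -> str:
        if self._downloader is None:
            # On its own, the client has no way to perform the request.
            raise RuntimeError("no downloader implementation was provided")
        return await self._downloader(url)


async def fake_downloader(url: str) -> str:
    # A framework would perform real network I/O here.
    return f"response for {url}"


async def main() -> Tuple[str, Optional[str]]:
    body = await SketchClient(fake_downloader).execute("https://example.com")
    try:
        await SketchClient().execute("https://example.com")
        error = None
    except RuntimeError as exc:
        error = str(exc)
    return body, error


BODY, ERROR = asyncio.run(main())
```

Keeping the downloader separate from the client means the same Page Object
code could run under different frameworks, each providing its own way of
performing the request.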
5 changes: 4 additions & 1 deletion web_poet/requests.py
@@ -90,7 +90,7 @@ async def request(
headers: Optional[_Headers] = None,
body: Optional[_Body] = None,
) -> HttpResponse:
"""This is a shortcut for creating a :class:`HttpRequest` instance and executing
"""This is a shortcut for creating a :class:`~.HttpRequest` instance and executing
that request.
A :class:`~.HttpResponse` instance should then be returned.
@@ -139,6 +139,9 @@ async def batch_execute(
"""Similar to :meth:`~.HttpClient.execute` but accepts a collection of
:class:`~.HttpRequest` instances that would be batch executed.
The order of the :class:`~.HttpResponse` instances would correspond to the
order of the :class:`~.HttpRequest` instances passed.
If any of the :class:`~.HttpRequest` raises an exception upon execution,
the exception is raised.