- Add argument `$normalizeWhitespace` to `Crawler::innerText()`
- Add argument `$default` to `Crawler::attr()`
- Add `CrawlerAnySelectorTextContains` test constraint
- Add `CrawlerAnySelectorTextSame` test constraint
- Add argument `$default` to `Crawler::attr()`
- Add `$useHtml5Parser` argument to `Crawler`
- Add `CrawlerSelectorCount` test constraint
- Add argument `$normalizeWhitespace` to `Crawler::innerText()`
- Make `Crawler::innerText()` return the first non-empty text
- Remove `Crawler::parents()` method, use `ancestors()` instead
- Add `Crawler::innerText` method.
- The `parents()` method is deprecated. Use `ancestors()` instead.
- Marked the `containsOption()`, `availableOptionValues()`, and `disableValidation()` methods of the `ChoiceFormField` class as internal
- Added an internal cache layer on top of the CssSelectorConverter
- Added `UriResolver` to resolve a URI according to a base URI
- Added argument `$selector` to `Crawler::children()`
- Added argument `$default` to `Crawler::text()` and `html()`
- Added `Form::getName()` method.
- Added `Crawler::matches()` method.
- Added `Crawler::closest()` method.
- Added `Crawler::outerHtml()` method.
- Added an argument to the `Crawler::text()` method to opt in to whitespace normalization.
- Added PHPUnit constraints: `CrawlerSelectorAttributeValueSame`, `CrawlerSelectorExists`, `CrawlerSelectorTextContains` and `CrawlerSelectorTextSame`
- Added return of element name (`_name`) in `extract()` method.
- Added ability to return a default value in `text()` and `html()` instead of throwing an exception when node is empty.
- When available, the html5-php library is used to parse HTML added to a Crawler for better support of HTML5 tags.
- The `$currentUri` constructor argument of the `AbstractUriElement`, `Link` and `Image` classes is now optional.
- The `Crawler::children()` method will have a new `$selector` argument in version 5.0, not defining it is deprecated.
- All the URI parsing logic has been abstracted in the `AbstractUriElement` class. The `Link` class is now a child of `AbstractUriElement`.
- Added an `Image` class to crawl images and parse their `src` attribute, and `selectImage`, `image`, `images` methods in the `Crawler` (the image version of the equivalent `link` methods).
- [BC BREAK] The default value for checkbox and radio inputs without a value attribute has changed from '1' to 'on' to match the HTML specification.
- [BC BREAK] The typehints on the `Link`, `Form` and `FormField` classes have been changed from `\DOMNode` to `DOMElement`. Using any other type of `DOMNode` was triggering fatal errors in previous versions. Code extending these classes will need to update the typehints when overriding these methods.
- `Crawler::addXmlContent()` removes the default document namespace again if it is the only namespace.
- added support for automatic discovery and explicit registration of document namespaces for `Crawler::filterXPath()` and `Crawler::filter()`
- improved content type guessing in `Crawler::addContent()`
- [BC BREAK] `Crawler::addXmlContent()` no longer removes the default document namespace
- added Crawler::html()
- [BC BREAK] Crawler::each() and Crawler::reduce() now return Crawler instances instead of DomElement instances
- added schema relative URL support to links
- added support for HTML5 'form' attribute
- added a way to set raw path to the file in FileFormField - necessary for simulating HTTP requests
- added support for the HTTP PATCH method
- refactored the Form class internals to support multi-dimensional fields (the public API is backward compatible)
- added a way to get parsing errors for Crawler::addHtmlContent() and Crawler::addXmlContent() via libxml functions
- added support for submitting a form without a submit button
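
Several of the entries above are easier to picture with a short script. The following is a minimal sketch, not an official example. It assumes `symfony/dom-crawler` and `symfony/css-selector` version 6.4 or later are installed through Composer and that `vendor/autoload.php` sits next to the script. The HTML fragment, the `https://example.com/...` base URI, and all variable names are made up for illustration.

```php
<?php
// Assumption: installed with `composer require symfony/dom-crawler symfony/css-selector`
// (6.4 or later); the autoload path below is hypothetical.
require __DIR__.'/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;
use Symfony\Component\DomCrawler\UriResolver;

// Hypothetical markup used only for this sketch.
$html = <<<'HTML'
<html><body>
  <div id="menu">
    <ul class="nav">
      <li class="item"><a href="../about">About   us</a></li>
      <li class="item active"><a href="/contact">Contact</a></li>
    </ul>
  </div>
</body></html>
HTML;

// The second argument is the base URI used when resolving links.
$crawler = new Crawler($html, 'https://example.com/docs/index.html');

// text() with a default: no exception when the node list is empty.
$fallbackText = $crawler->filter('p.missing')->text('n/a');

// attr() with a default (second argument, available from 6.4).
$fallbackAttr = $crawler->filter('p.missing')->attr('href', '#');

// text() with whitespace normalization requested explicitly.
$normalized = $crawler->filter('a')->first()->text(null, true);

// children() with a CSS selector.
$activeItems = $crawler->filter('ul.nav')->children('li.active')->count();

// closest() walks up from the node to the nearest match; matches() tests the node itself.
$menuId = $crawler->filter('a')->first()->closest('div')?->attr('id');
$isActive = $crawler->filter('li')->last()->matches('.active');

// UriResolver resolves a relative URI against a base URI.
$resolved = UriResolver::resolve('../about', 'https://example.com/docs/index.html');

// Link objects resolve their href against the Crawler's base URI.
$contactUri = $crawler->selectLink('Contact')->link()->getUri();

echo $fallbackText, "\n", $normalized, "\n", $resolved, "\n", $contactUri, "\n";
```

On older versions some of these calls behave differently or do not exist (for example, the `$default` argument of `attr()`), so the sketch should be read against the entries above rather than as a compatibility guarantee.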