2.0.2

@benoit74 released this 18 Jun 13:26
· 93 commits to main since this release
1452298

Added

  • Add --ignore-content-header-charsets option to disable automatic retrieval of content charsets from the first bytes of the content (#318)
  • Add --content-header-bytes-length option to specify how many leading bytes to inspect when searching the content header for charsets (#320)
  • Add --ignore-http-header-charsets option to disable automatic retrieval of content charsets from the HTTP Content-Type header (#318)
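
To illustrate what the three options above control, here is a minimal Python sketch of charset detection without guessing. It is not the tool's actual implementation: the function names, the regular expressions, the default of 1024 bytes, the order in which the two sources are tried, and the UTF-8 fallback are all assumptions made for this example.

```python
import re

# Hypothetical helpers for illustration only; not the tool's real API.
_CHARSET_IN_HTTP = re.compile(r"charset=\s*\"?([\w.:-]+)", re.IGNORECASE)
_CHARSET_IN_CONTENT = re.compile(
    rb"charset\s*=\s*[\"']?\s*([\w.:-]+)", re.IGNORECASE
)


def candidate_charsets(
    content,
    content_type,
    *,
    ignore_http_header=False,     # mirrors --ignore-http-header-charsets
    ignore_content_header=False,  # mirrors --ignore-content-header-charsets
    header_bytes_length=1024,     # mirrors --content-header-bytes-length (default assumed)
):
    """Return declared charsets, in the (assumed) order they should be tried."""
    candidates = []
    if not ignore_content_header:
        # Only the first `header_bytes_length` bytes are searched.
        match = _CHARSET_IN_CONTENT.search(content[:header_bytes_length])
        if match:
            candidates.append(match.group(1).decode("ascii"))
    if not ignore_http_header and content_type:
        match = _CHARSET_IN_HTTP.search(content_type)
        if match:
            candidates.append(match.group(1))
    return candidates


def decode(content, content_type=None, fallback="utf-8", **options):
    """Decode with the first declared charset that works; never guess."""
    for charset in candidate_charsets(content, content_type, **options):
        try:
            return content.decode(charset)
        except (LookupError, UnicodeDecodeError):
            continue  # unknown or wrong charset: try the next declaration
    # Assumed fallback: decode leniently instead of guessing with chardet.
    return content.decode(fallback, errors="replace")
```

For example, a page declaring `iso-8859-1` in a meta tag but served with `charset=utf-8` would decode correctly from the in-content declaration in this sketch. Passing `ignore_content_header=True` would leave only the HTTP header value as a candidate.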

Changed

  • Simplify the logic that decides the content charset and stop guessing with chardet (#312)

Fixed

  • Rewrite only content with mimetype text/html when WARC-Resource-Type is html (#313)