Simplify inner_text implementation using lxml's text method
This replaces the previous approach of using regex to remove tags and attributes after the fact. See https://lxml.de/api/lxml.etree-module.html#tostring for the serialization method used. It also eliminates the need to perform HTML unescaping. On my local machine, this reduces the time spent in the inner_text method on Don Quixote from 1.5 seconds to 0.04 seconds.
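A minimal sketch of the idea, not the code from this commit: the helper name, its signature, and the sample markup below are assumptions made for illustration. It relies on lxml's `etree.tostring` with `method="text"`, which serializes only the text nodes, so tags and attributes never appear in the output and entities arrive already decoded by the parser.

```python
from lxml import etree, html


def inner_text(markup: str) -> str:
    """Hypothetical helper: return the text content of an HTML fragment."""
    # Parse the fragment; entities such as &amp; are decoded at parse time.
    element = html.fromstring(markup)
    # method="text" emits text nodes only (no tags, no attributes).
    # encoding="unicode" returns a str instead of bytes.
    # with_tail=False drops any text that follows the element's closing tag.
    return etree.tostring(
        element, method="text", encoding="unicode", with_tail=False
    )


if __name__ == "__main__":
    sample = '<p class="intro">Don Quixote &amp; <em>Sancho</em> Panza</p>'
    print(inner_text(sample))
```

Because the serializer walks the parsed tree once, there is no separate regex pass over the markup and no call to an HTML unescape function, which is consistent with the speedup reported above.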