summaryrefslogtreecommitdiff
path: root/README.md
diff options
context:
space:
mode:
Diffstat (limited to 'README.md')
-rw-r--r--README.md1
1 files changed, 1 insertions, 0 deletions
diff --git a/README.md b/README.md
index 5c98178..6495171 100644
--- a/README.md
+++ b/README.md
@@ -54,6 +54,7 @@ If the parsing process was unsuccessful the HTMLParser will return `false`
- **substituteEntities**: default value `false`, disables the `substituteEntities` flag of libxml. Will avoid substituting HTML entities. Like `á` to รก.
- **normalizeEntities**: default value `false`, converts UTF-8 characters to its HTML Entity equivalent. Useful to parse HTML with mixed encoding.
- **originalURL**: default value `http://fakehost`, original URL from the article used to fix relative URLs.
+- **summonCthulhu**: default value `false`, remove all <script> nodes via regex. This is not ideal as it might break things, but might be the only solution to [libxml problems with unescaped javascript](https://github.com/andreskrey/readability.php#known-issues).
## Limitations