author Andres Rey <[email protected]> 2017-03-26 11:34:29 +0100
committer Andres Rey <[email protected]> 2017-03-26 11:34:29 +0100
commit 01fb375746d7ef9178c4bf651774da67632b7454
tree 8d141fcefa0dc643d1188ee01ffbf1952ed9406a /README.md
parent 361a1f73048a3f68af539344b0c294c9633f5820
Added normalizeEntities flag.
Diffstat (limited to 'README.md')
-rw-r--r-- README.md 1
1 files changed, 1 insertions, 0 deletions
diff --git a/README.md b/README.md
index b9c877d..ee3bfd9 100644
--- a/README.md
+++ b/README.md
@@ -52,6 +52,7 @@ If the parsing process was unsuccessful the HTMLParser will return `false`
- **removeReadabilityTags**: default value `true`, remove the data-readability tags inside the nodes that are added during the rating phase.
- **fixRelativeURLs**: default value `false`, convert relative URLs to absolute. Like `/test` to `http://host/test`.
- **substituteEntities**: default value `false`, disables the `substituteEntities` flag of libxml. Will avoid substituting HTML entities. Like `&aacute;` to á.
+- **normalizeEntities**: default value `false`, converts UTF-8 characters to its HTML Entity equivalent. Useful to parse HTML with mixed encoding.
- **originalURL**: default value `http://fakehost`, original URL from the article used to fix relative URLs.
## Limitations
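
As a rough illustration of the behaviour the new `normalizeEntities` line describes (non-ASCII UTF-8 characters rewritten as HTML entities), here is a minimal Python sketch. It is not the library's implementation: the helper name `normalize_entities` is hypothetical, and the choice to prefer named entities and fall back to numeric references is an assumption made for this example only.

```python
from html.entities import codepoint2name


def normalize_entities(text: str) -> str:
    """Hypothetical helper: replace each non-ASCII character with an HTML entity."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            # ASCII (including markup such as < and >) passes through untouched.
            out.append(ch)
        elif cp in codepoint2name:
            # Use the named entity when HTML 4 defines one, e.g. é -> &eacute;
            out.append(f"&{codepoint2name[cp]};")
        else:
            # Assumed fallback: a decimal numeric character reference.
            out.append(f"&#{cp};")
    return "".join(out)


if __name__ == "__main__":
    print(normalize_entities("<p>café</p>"))
```

Once every non-ASCII character is expressed as an ASCII entity, the markup no longer depends on the parser guessing the byte encoding correctly, which is presumably why the option is described as useful for HTML with mixed encoding.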