author | Andres Rey <[email protected]> | 2017-03-26 11:34:29 +0100 |
---|---|---|
committer | Andres Rey <[email protected]> | 2017-03-26 11:34:29 +0100 |
commit | 01fb375746d7ef9178c4bf651774da67632b7454 (patch) | |
tree | 8d141fcefa0dc643d1188ee01ffbf1952ed9406a /README.md | |
parent | 361a1f73048a3f68af539344b0c294c9633f5820 (diff) |
Added normalizeEntities flag.
Diffstat (limited to 'README.md')
-rw-r--r-- | README.md | 1 |
1 files changed, 1 insertions, 0 deletions
@@ -52,6 +52,7 @@ If the parsing process was unsuccessful the HTMLParser will return `false`
 - **removeReadabilityTags**: default value `true`, remove the data-readability tags inside the nodes that are added during the rating phase.
 - **fixRelativeURLs**: default value `false`, convert relative URLs to absolute. Like `/test` to `http://host/test`.
 - **substituteEntities**: default value `false`, disables the `substituteEntities` flag of libxml. Will avoid substituting HTML entities. Like `&aacute;` to á.
+- **normalizeEntities**: default value `false`, converts UTF-8 characters to its HTML Entity equivalent. Useful to parse HTML with mixed encoding.
 - **originalURL**: default value `http://fakehost`, original URL from the article used to fix relative URLs.
 
 ## Limitations
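A note on what the added `normalizeEntities` line describes: converting UTF-8 characters to their HTML entity equivalents leaves the markup pure ASCII, so a later parser cannot misread it under the wrong encoding. The Python sketch below only illustrates that idea. It is not taken from the library, and the function name `normalize_entities` is hypothetical.

```python
from html.entities import codepoint2name


def normalize_entities(text: str) -> str:
    """Replace every non-ASCII character with an HTML entity.

    Illustrative only: a named entity is used when the standard
    HTML 4 table has one, otherwise a decimal numeric reference.
    ASCII characters, including markup such as < and >, are untouched.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)  # plain ASCII passes through unchanged
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")  # e.g. á becomes &aacute;
        else:
            out.append(f"&#{cp};")  # no named entity, use a numeric one
    return "".join(out)


print(normalize_entities("<p>caf\u00e9</p>"))
```

Because the result contains only ASCII, it reads the same whether the consumer assumes UTF-8, Latin-1, or another ASCII-compatible encoding, which is presumably why the flag is described as useful for HTML with mixed encoding.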