author | Andres Rey <[email protected]> | 2017-11-12 18:54:57 +0000 |
---|---|---|
committer | Andres Rey <[email protected]> | 2017-11-12 18:54:57 +0000 |
commit | adf7970f5daf324e51176fdd9600494e598627ea (patch) | |
tree | 19391e3c2c324c0410371477df5274e45d4e8986 /README.md | |
parent | ebf579008890032d8a4280c3442729576930d191 (diff) |
Update readme and changelog
Diffstat (limited to 'README.md')
-rw-r--r-- | README.md | 1 |
1 files changed, 1 insertions, 0 deletions
@@ -45,6 +45,7 @@ If the parsing process was unsuccessful the HTMLParser will return `false`
 ## Options
 
 - **maxTopCandidates**: default value `5`, max amount of top level candidates.
+- **wordThreshold**: default value `500`, minimum amount of characters to consider that the article was parsed successful.
 - **articleByLine**: default value `false`, search for the article byline and remove it from the text. It will be moved to the article metadata.
 - **stripUnlikelyCandidates**: default value `true`, remove nodes that are unlikely to have relevant information. Useful for debugging or parsing complex or non-standard articles.
 - **cleanConditionally**: default value `true`, remove certain nodes after parsing to return a cleaner result.