path: root/src
Age | Commit message | Author
2018-10-21 | Fix incorrect tagName check | Andres Rey
2018-10-17 | Update comment of hasSingleTagInsideElement | Andres Rey
2018-10-17 | Improve script node removing function | Andres Rey
2018-09-11 | Import new metadata search "algorithm" | Andres Rey
2018-09-05 | Remove experimental if | Andres Rey
2018-09-05 | Update initial parsing and add isWhitespace trait function. | Andres Rey
2018-09-02 | Add hasAttribute override | Andres Rey
2018-09-02 | Check for visible nodes before parsing | Andres Rey
2018-09-02 | Add isProbablyVisible function | Andres Rey
2018-09-02 | Remove DOMComments before anything else | Andres Rey
2018-09-02 | Remove single cell tables | Andres Rey
2018-09-02 | Rename hasSinglePNode to hasSingleTagInsideElement and accept tag as parameter | Andres Rey
2018-09-01 | Avoid nesting paragraphs | Andres Rey
2018-09-01 | Import the isPhrasingContent function. Might want to check whether the recursive loop there is actually doing what it should and whether there's a better way to optimize it | Andres Rey
2018-09-01 | Update logic when accessing next element | Andres Rey
2018-09-01 | Add unlikely candidate | Andres Rey
2018-09-01 | Rename wordThreshold to charThreshold and throw deprecation notices | Andres Rey
2018-05-05 | Issue #63: Avoid dividing by zero + test case | Andres Rey
2018-04-26 | Remove $parseSuccessful flag | Andres Rey
2018-04-10 | Remove extra check for DOMDocument nodes + add comment | Andres Rey
2018-04-10 | Merge pull request #58 from PedroAmorim/noticeparentOfTopCandidate2 | Andres Rey
    Fix notice non-object on $parentOfTopCandidate for tumblr.com
2018-04-09 | Fix notice non-object on $parentOfTopCandidate for tumblr.com | Pedro Amorim
    PHP notice on DOMElement $parentOfTopCandidate.
    Trying to get property of non-object in src/Readability.php line 1000
    Trying to get property of non-object in src/Readability.php line 1009
    Reproduced with this URL: https://clipartx.tumblr.com/post/172752750628/orange-swirl-burnt-orange-orange
    Config:
        $config = new Configuration;
        $config->setWordThreshold(5)
            ->setSummonCthulhu(true)
            ->setFixRelativeURLs(true)
            ->setOriginalURL($url);
2018-03-21 | Clean <aside> tags on prepArticle | Andres Rey
2018-03-19 | Apply StyleCI diff | Andres Rey
2018-03-18 | Merge remote-tracking branch 'origin/development' into development | Andres Rey
    # Conflicts:
    #   CHANGELOG.md
    #   composer.json
2018-03-18 | Check for base urls before generating paths for the URL resolver | Andres Rey
2018-03-18 | Merge branch 'master' into update-to-8525c6a | Andres Rey
2018-03-18 | Use all the article text to determine how many characters were extracted. | Andres Rey
2018-03-15 | Override setLogger function to be able to return configuration object | Andres Rey
2018-03-14 | Use instanceof DOMDocument | Pedro Amorim
2018-03-14 | Fix error C14N | Pedro Amorim
    I have the error: "Call to a member function C14N() on null"
    You can reproduce it like this:
    - try to parse a URL such as http://www.dailymotion.com/video/x6ga6qi that doesn't return any content
    - this throws an exception and the logs show "[Parsing] Could not parse text, giving up :("
    - now call ->getContent() on the same Readability object
    Previously, getContent would return "null", but now it calls ->C14N() on a NULL object.
2018-03-12 | removed class doc-block + method-name-builder switched to sprintf | topot
2018-03-10 | Add log messages | Andres Rey
2018-03-10 | Save attempts across different runs and try to return at least something before giving up. | Andres Rey
2018-03-10 | Clean link tags | Andres Rey
2018-03-10 | Failsafe for weird titles | Andres Rey
2018-03-10 | Add _cleanClasses function | Andres Rey
2018-03-10 | Add missing DOMEntity class | Andres Rey
2018-03-10 | StyleCI diff applied | topot
2018-03-09 | Added: Configuration parameters array constructor injection | topot
2018-03-06 | Rename getContentObject to getDOMDocument | Andres Rey
2018-03-06 | Save the full DOMDocument when processing finishes + pull images of the article from the processed object, not the original one | Andres Rey
2018-03-06 | Add data-src as an image path source | Andres Rey
2018-01-27 | Make sure that we do not allow the DOMDocument to reach the parsing algorithm (because we use/abuse the parentNode call, and a DOMDocument does not have a parent) | Andres Rey
2018-01-11 | Merge remote-tracking branch 'origin/logging' into logging | Andres Rey
2018-01-11 | Merge branch 'master' into logging | Andres Rey
    # Conflicts:
    #   CHANGELOG.md
2018-01-11 | Apply fixes from StyleCI | Andres Rey
2018-01-11 | Merge pull request #38 from PedroAmorim/domEntityReference | Andres Rey
    Add missing DOM classes
2018-01-11 | Add missing DOM class DOMEntityReference. | Pedro Amorim
    Fix error: Uncaught Error: Call to undefined method DOMEntityReference::getAttribute() in vendor/andreskrey/readability.php/src/Readability.php:528
2018-01-11 | Remove the data-readability references | Andres Rey