path: root/src/Readability.php
Age | Commit message | Author
2018-09-02 | Remove DOMComments before anything else | Andres Rey
2018-09-02 | Remove single cell tables | Andres Rey
2018-09-01 | Avoid nesting paragraphs | Andres Rey
2018-09-01 | Import the isPhrasingContent function. Might want to check whether the recursive loop there is actually doing what it should and whether there's a better way to optimize it | Andres Rey
2018-09-01 | Update logic when accessing next element | Andres Rey
2018-09-01 | Rename wordThreshold to charThreshold and throw deprecation notices | Andres Rey
2018-05-05 | Issue #63: Avoid dividing by zero + test case | Andres Rey
2018-04-26 | Remove $parseSuccessful flag | Andres Rey
2018-04-10 | Remove extra check for DOMDocument nodes + add comment | Andres Rey
2018-04-10 | Merge pull request #58 from PedroAmorim/noticeparentOfTopCandidate2 | Andres Rey
    Fix notice non-object on $parentOfTopCandidate for tumblr.com
2018-04-09 | Fix notice non-object on $parentOfTopCandidate for tumblr.com | Pedro Amorim
    PHP notice on DOMElement $parentOfTopCandidate:
    Trying to get property of non-object in src/Readability.php line 1000
    Trying to get property of non-object in src/Readability.php line 1009
    Reproduced with this URL: https://clipartx.tumblr.com/post/172752750628/orange-swirl-burnt-orange-orange
    Config:
        $config = new Configuration;
        $config->setWordThreshold(5)
            ->setSummonCthulhu(true)
            ->setFixRelativeURLs(true)
            ->setOriginalURL($url);
2018-03-21 | Clean <aside> tags on prepArticle | Andres Rey
2018-03-19 | Apply StyleCI diff | Andres Rey
2018-03-18 | Merge remote-tracking branch 'origin/development' into development | Andres Rey
    # Conflicts:
    #   CHANGELOG.md
    #   composer.json
2018-03-18 | Check for base urls before generating paths for the URL resolver | Andres Rey
2018-03-18 | Use all the article text to determine how many characters were extracted. | Andres Rey
2018-03-14 | Use instanceof DOMDocument | Pedro Amorim
2018-03-14 | Fix error C14N | Pedro Amorim
    I have the error: "Call to a member function C14N() on null"
    You can reproduce it like this:
    - Try to parse a URL like http://www.dailymotion.com/video/x6ga6qi that doesn't return any content.
    - This throws an exception and the logs show "[Parsing] Could not parse text, giving up :(".
    - Now call ->getContent() on the same Readability object.
    Previously, getContent would return null, but now it calls ->C14N() on a null object.
2018-03-10 | Add log messages | Andres Rey
2018-03-10 | Save attempts across different runs and try to return at least something before giving up. | Andres Rey
2018-03-10 | Clean link tags | Andres Rey
2018-03-10 | Failsafe for weird titles | Andres Rey
2018-03-10 | Add _cleanClasses function | Andres Rey
2018-03-06 | Rename getContentObject to getDOMDocument | Andres Rey
2018-03-06 | Save the full DOMDocument when finished processing + pull images of the article from the processed object, not the original one | Andres Rey
2018-03-06 | Add data-src as an image path source | Andres Rey
2018-01-11 | Merge remote-tracking branch 'origin/logging' into logging | Andres Rey
2018-01-11 | Merge branch 'master' into logging | Andres Rey
    # Conflicts:
    #   CHANGELOG.md
2018-01-11 | Apply fixes from StyleCI | Andres Rey
2017-12-22 | Check for node type when scanning for better topCandidates | Andres Rey
2017-12-10 | Improve logging messages | Andres Rey
2017-12-10 | Adding comments everywhere | Andres Rey
2017-12-10 | Check for minimum html before parsing metadata | Andres Rey
2017-12-10 | Initial approach to logger injection | Andres Rey
2017-12-05 | Search for 'data-orig' in image urls | Andres Rey
2017-12-03 | Add function to extract img srcs from other tags that might be used for lazy loading or other types of post-load processing. | Andres Rey
2017-12-02 | Apply fixes from StyleCI | Andres Rey
2017-12-02 | Add small template on __toString magic method | Andres Rey
2017-12-02 | Search for excerpt in case it's not found on HTML metadata | Andres Rey
2017-12-01 | Clean up | Andres Rey
2017-12-01 | Clean up | Andres Rey
2017-12-01 | Move load function below parse function | Andres Rey
2017-12-01 | Move the DOM classes to their own namespace | Andres Rey
2017-12-01 | Merge remote-tracking branch 'origin/v1.0' into v1.0 | Andres Rey
    # Conflicts:
    #   src/NodeClass/DOMNode.php
    #   src/Readability.php
2017-12-01 | Add readabilityDataTable param with getters and setters | Andres Rey
2017-12-01 | Minor cleanup | Andres Rey
2017-12-01 | Fix phpdoc on setters | Andres Rey
2017-12-01 | Apply fixes from StyleCI | Andres Rey
2017-12-01 | Add ParseException | Andres Rey
2017-11-30 | Remove configuration setting from the __construct | Andres Rey