path: root/src
Age | Commit message | Author
2017-11-23 | Add getImages() | Pedro Amorim
    Get all image URLs of the current DOM at once.
2017-11-22 | Merge pull request #31 from PedroAmorim/fixUnsupportedOperandTypes | Andres Rey
    Fix "Unsupported operand types"
2017-11-22 | Add node to the elementsToScore array after converting it from a div to a p + test case | Andres Rey
2017-11-22 | Fix "Unsupported operand types" | Pedro Amorim
    Missing method "count" in if statement. PHP Fatal error: Uncaught Error: Unsupported operand types in vendor/andreskrey/readability.php/src/HTMLParser.php:622
2017-11-14 | Trim strings when detecting hierarchical separators | Andres Rey
2017-11-12 | Apply fixes from StyleCI | Andres Rey
2017-11-12 | Clean specific attributes during _cleanStyles | Andres Rey
2017-11-12 | Refactor title matching in H2s | Andres Rey
2017-11-12 | Add wordThreshold option | Andres Rey
2017-11-12 | Remove empty or just whitespace P elements during rating | Andres Rey
2017-11-12 | Add new regexp to check for whitespace, including the Unicode version of &nbsp; | Andres Rey
2017-11-12 | Minor fix when pushing results to the $alternativeCandidateAncestors array | Andres Rey
2017-11-11 | Minor fix when getting alternative top candidate ancestors + Remove DOMComments | Andres Rey
2017-11-11 | Allow getting all node ancestors | Andres Rey
2017-11-11 | Filter empty children nodes when scanning for single P nodes | Andres Rey
2017-11-11 | Switch to array iterator when replacing links on prepArticle | Andres Rey
2017-11-11 | Remove nodes when there's only one DOMText node with no text | Andres Rey
2017-11-11 | _cleanMatchedNodes to remove nodes based on regex during final cleanup | Andres Rey
2017-11-09 | Remove extra brs between p nodes after processing the article | Andres Rey
2017-11-09 | Remove reverse traversing when scanning for brs and convert the DOMNodeList to an array before looping over it | Andres Rey
2017-11-09 | Scan nodes in reverse in removing functions. | Andres Rey
    In other words: Node shifting is a bitch
2017-11-09 | Better detection of empty paragraphs | Andres Rey
2017-11-08 | Remove BR cleaning on text nodes temporarily | Andres Rey
2017-11-07 | Clean style attributes inside tags | Andres Rey
2017-11-07 | Merge branch 'master' into development-update-to-f0edc77cb58ef52890e3065cf2b0e334d940feb2 | Andres Rey
2017-11-07 | Add article text direction to response | Andres Rey
2017-11-07 | Update logic to remove nodes when cleaning conditionally | Andres Rey
2017-11-07 | Mark datatables and avoid removing them during cleaning | Andres Rey
2017-11-06 | Check for maxDepth before continuing | Andres Rey
2017-11-06 | Get the article direction | Andres Rey
    TODO: Make the metadata array an object with getters and setters
2017-11-06 | Keep potential top candidate's parent node to try to get text direction of it later. | Andres Rey
2017-11-05 | If the top candidate is the only child, use parent instead. This will help sibling joining logic when adjacent content is actually located in parent's sibling node. | Andres Rey
2017-11-05 | Find a better top candidate node if it contains (at least three) nodes which belong to `topCandidates` array and whose scores are quite close to the current `topCandidate` node. | Andres Rey
2017-11-05 | Cleanup | Andres Rey
2017-11-05 | Check for text node contents before converting them to P tags | Andres Rey
2017-11-05 | Add isElementWithoutContent function | Andres Rey
2017-11-05 | Clean extra fields when prepping the article | Andres Rey
2017-11-04 | Add hierarchical separators detection on titles | Andres Rey
2017-11-02 | Minor cleanup | Andres Rey
2017-11-02 | Update the unlikelyCandidates regex | Andres Rey
2017-10-05 | Apply fixes from StyleCI | Andres Rey
2017-10-05 | Merge pull request #24 from jagermesh/UndefinedIndex | Andres Rey
    Fix for Notice: Undefined index: og:image
2017-10-05 | fix for | Sergiy Lavryk
    Notice: Undefined index: og:image in /andreskrey/readability.php/src/HTMLParser.php, line 469
2017-09-14 | Add summonCthulhu config option + test cases | Andres Rey
2017-06-15 | Safecheck for really bad HTML | Andres Rey
2017-05-31 | Apply fixes from StyleCI | Andres Rey
2017-05-30 | Minor fix | Andres Rey
2017-05-21 | Minor fix | Andres Rey
2017-05-20 | Move the removeScripts and prepDocument functions inside the loadHTML function. Performance will suffer (as the system has to reparse the HTML every time it cycles) but it is the only solution AFAIK. | Andres Rey
2017-05-20 | Merge remote-tracking branch 'origin/pr-20-new-backup-approach' into pr-20-new-backup-approach | Andres Rey
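
Note on the 2017-11-09 entries: scanning nodes in reverse and converting the DOMNodeList to an array both appear to address the same hazard. Removing items from a live collection while walking it forward shifts the later items down by one, so the loop skips some of them. The following is a minimal sketch of that effect, not the library's PHP code: it is written in Python, a plain list stands in for a live node list, and all function names are hypothetical.

```python
def remove_forward(nodes, should_remove):
    """Buggy: deletes while walking forward by index, so neighbours get skipped."""
    nodes = list(nodes)
    i = 0
    while i < len(nodes):
        if should_remove(nodes[i]):
            del nodes[i]  # later items shift left by one...
        i += 1            # ...and this step jumps over the item now in slot i
    return nodes


def remove_reverse(nodes, should_remove):
    """Walk from the end: a deletion only shifts items that were already visited."""
    nodes = list(nodes)
    for i in range(len(nodes) - 1, -1, -1):
        if should_remove(nodes[i]):
            del nodes[i]
    return nodes


def remove_via_snapshot(nodes, should_remove):
    """Iterate over a copy (the 'convert to an array' approach), mutate the original."""
    live = list(nodes)
    for node in list(live):  # the snapshot is unaffected by the removals below
        if should_remove(node):
            live.remove(node)
    return live


if __name__ == "__main__":
    tags = ["p", "br", "br", "p", "br"]
    is_br = lambda tag: tag == "br"
    print(remove_forward(tags, is_br))       # one "br" survives
    print(remove_reverse(tags, is_br))       # all "br" entries removed
    print(remove_via_snapshot(tags, is_br))  # all "br" entries removed
```

With two adjacent `br` entries, the forward loop deletes the first and then steps past the second, which is the "node shifting" the commit message complains about. Either reverse traversal or a snapshot avoids it; the snapshot costs one extra copy but keeps the natural forward order.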