author | Andres Rey <[email protected]> | 2018-11-29 18:37:00 +0000 |
---|---|---|
committer | Andres Rey <[email protected]> | 2018-11-29 18:37:00 +0000 |
commit | 31059dd083d840a5054f726a2b6df03826fcf718 (patch) | |
tree | c8d76964483f9ce673652e6af2d237f627866c38 /src/Readability.php | |
parent | b04fa7769790d87ce9d557babd984a5c81dd2eb7 (diff) |
Update regex property extractor to avoid matching og:image tags multiple times and overwriting its value (e.g. og:image:width overwriting og:image)
Diffstat (limited to 'src/Readability.php')
-rw-r--r-- | src/Readability.php | 4 |
1 files changed, 2 insertions, 2 deletions
diff --git a/src/Readability.php b/src/Readability.php
index 7b7eed6..8c55e69 100644
--- a/src/Readability.php
+++ b/src/Readability.php
@@ -287,10 +287,10 @@ class Readability
         $values = [];

         // property is a space-separated list of values
-        $propertyPattern = '/\s*(dc|dcterm|og|twitter)\s*:\s*(author|creator|description|title|image)\s*/i';
+        $propertyPattern = '/\s*(dc|dcterm|og|twitter)\s*:\s*(author|creator|description|title|image)(?!:)\s*/i';

         // name is a single value
-        $namePattern = '/^\s*(?:(dc|dcterm|og|twitter|weibo:(article|webpage))\s*[\.:]\s*)?(author|creator|description|title|image)\s*$/i';
+        $namePattern = '/^\s*(?:(dc|dcterm|og|twitter|weibo:(article|webpage))\s*[\.:]\s*)?(author|creator|description|title|image)(?!:)\s*$/i';

         // Find description tags.
         foreach ($this->dom->getElementsByTagName('meta') as $meta) {
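The effect of the added `(?!:)` negative lookahead can be checked in isolation. The following is a minimal sketch, not code from the repository: the two patterns are copied from the diff, while the `$properties` list and the `matchingProperties()` helper are hypothetical and exist only for illustration.

```php
<?php
// Patterns copied from the diff: before and after the change.
$oldPattern = '/\s*(dc|dcterm|og|twitter)\s*:\s*(author|creator|description|title|image)\s*/i';
$newPattern = '/\s*(dc|dcterm|og|twitter)\s*:\s*(author|creator|description|title|image)(?!:)\s*/i';

// Hypothetical values of the "property" attribute on <meta> tags.
$properties = ['og:image', 'og:image:width', 'og:image:height', 'og:title'];

// Hypothetical helper: returns the property strings that the pattern matches.
function matchingProperties(string $pattern, array $properties): array
{
    return array_values(array_filter($properties, function ($property) use ($pattern) {
        return preg_match($pattern, $property) === 1;
    }));
}

$old = matchingProperties($oldPattern, $properties);
$new = matchingProperties($newPattern, $properties);

// The old pattern also accepts og:image:width and og:image:height, because
// "og:image" is a prefix of both. The lookahead rejects any match that is
// immediately followed by another colon.
echo 'old: ' . implode(', ', $old) . PHP_EOL;
echo 'new: ' . implode(', ', $new) . PHP_EOL;
```

With the old pattern, a structured property such as `og:image:width` is treated as `og:image`, so a later tag can overwrite the image URL collected earlier. With the lookahead in place, only the bare `og:image` property matches.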