# Readability.php

## News (August 2021)

Andres Rey, the [original developer](https://github.com/andreskrey/readability.php) of Readability.php, has kindly let us take over maintenance and development of the project. Please bear with us while we catch up with [Readability.js](https://github.com/mozilla/readability) changes. There'll be a new release (2.2.0) when we're ready. For the changes we've made so far in this repository, please see our [blog post](https://www.fivefilters.org/2021/readability/).

## About

[![Latest Stable Version](https://poser.pugx.org/andreskrey/readability.php/v/stable)](https://packagist.org/packages/andreskrey/readability.php) [![Tests](https://github.com/fivefilters/readability.php/actions/workflows/main.yml/badge.svg?branch=master)](https://github.com/fivefilters/readability.php/actions/workflows/main.yml) [![Coverage Status](https://coveralls.io/repos/github/andreskrey/readability.php/badge.svg?branch=master)](https://coveralls.io/github/andreskrey/readability.php/?branch=master) [![StyleCI](https://styleci.io/repos/71042668/shield?branch=master)](https://styleci.io/repos/71042668) [![Total Downloads](https://poser.pugx.org/fivefilters/readability.php/downloads)](https://packagist.org/packages/fivefilters/readability.php) [![Monthly Downloads](https://poser.pugx.org/fivefilters/readability.php/d/monthly)](https://packagist.org/packages/fivefilters/readability.php)

PHP port of *Mozilla's* **[Readability.js](https://github.com/mozilla/readability)**. Parses HTML text (usually news and other articles) and returns **title**, **author**, **main image** and **text content** without nav bars, ads, footers, or anything that isn't the main body of the text. Analyzes each node, gives it a score, and determines what's relevant and what can be discarded.
![Screenshot](https://raw.githubusercontent.com/andreskrey/readability.php/assets/screenshot.png)

The project aims to be a 1:1 port of Mozilla's version and to closely follow all changes introduced there, but there are some major differences in the structure. Most of the code is a 1:1 copy (even the comments were imported), but some functions and structures were adapted to better suit the PHP language.

**Original Developer**: Andres Rey

**Developer/Maintainer**: FiveFilters.org

## Requirements

PHP 7.0+, ext-dom, ext-xml, and ext-mbstring. To install all these dependencies (in the rare case your system does not have them already), you could try something like this in *nix-like environments:

`$ sudo apt-get install php7.1-xml php7.1-mbstring`

## How to use it

First, require the library using Composer:

`composer require fivefilters/readability.php`

Then create a `Readability` object, passing it a `Configuration` object, feed your HTML to the `parse()` function, and echo the object:

```php
use fivefilters\Readability\Readability;
use fivefilters\Readability\Configuration;
use fivefilters\Readability\ParseException;

$readability = new Readability(new Configuration());

$html = file_get_contents('http://your.favorite.newspaper/article.html');

try {
    $readability->parse($html);
    echo $readability;
} catch (ParseException $e) {
    echo sprintf('Error processing text: %s', $e->getMessage());
}
```

Your script will output the parsed text or report any errors. You should always wrap the `->parse` call in a try/catch block, because a `ParseException` will be thrown if the HTML cannot be parsed correctly.

If you want finer control over the output, just call the properties one by one, wrapping them in your own HTML.

```php
'; // I should not appear on the result