From 8c1aae6285323ecc2eeab7c05770c9a4772340cf Mon Sep 17 00:00:00 2001
From: fox
Date: Tue, 26 Feb 2019 05:40:49 +0000
Subject: Update page 'ContentFilters'

---
 ContentFilters.md | 224 ++++++++++++++++++++++++++----------------------------
 1 file changed, 106 insertions(+), 118 deletions(-)

diff --git a/ContentFilters.md b/ContentFilters.md
index c5d3eda..6414cf3 100644
--- a/ContentFilters.md
+++ b/ContentFilters.md
@@ -1,118 +1,106 @@
-Filters
-=======
-
-Filters are a very powerful and flexible tool which may significantly
-ease the task of extracting useful information from the sea of data that
-is RSS feeds. Filters are applied to articles based on [regular
-expression](http://en.wikipedia.org/wiki/Regular_expression) match
-against specified fields. After the match had been found, configured
-actions are taken. Matching is case-insensitive,
-[PCRE](http://php.net/manual/en/reference.pcre.pattern.syntax.php)
-pattern syntax is used.
-
-Exact filtering algorithm has changed several times during tt-rss
-development. Most current one is described in the section below.
-
-Filters in version:1.7.6 (current)
-----------------------------------
-
-### Order of loading and processing
-
-Filters are loaded in user-specified order and applied sequentially. It
-is possible to reorder filters using drag and drop. If no manual sorting
-is specified, filters are sorted alphabetically according to user
-configured caption. If no caption is specified for any filter, loading
-order is not guaranteed.
-
-### Filter objects
-
-Each filter object may contain an arbitrary amount of regular expression
-rules and actions. Each expression may have inverse flag set, which
-inverts matching result. On top of that, filter may also have an inverse
-flag, which inverts the final matching.
-
-Filter object may be configured to successfully match when either one or
-all rules match.
-
-Regular expressions may be matched against several article fields, such
-as, title, content, author, etc.
-
-Do not include delimiters (e.g. /) when defining a rule's
-regular expression.
-
-### Matching articles and applying actions
-
-Filter matching is performed during feed processing.
-
-Some actions may be applied only when the article is initially imported
-from the feed. Other actions may be applied every time article is seen
-in the originating feed. It is suggested to only rely on filters
-applying to articles imported after the filter had been created - they will
-not retroactively apply to your article database.
-
-Several actions are available:
-
-1. Delete article - do not import article from the feed, does not
-actually delete anything from the database
-2. Mark as read - imports article automatically marked as read
-3. Set starred - sets article starred automatically on import
-4. Assign tags - assigns custom tags on import
-5. Publish article - sets article published automatically on import
-6. [Modify score](wiki/Scoring) - modifies article overall score based on
-the parameter, a signed integer number. Final article score is calculated after all filters had been applied and is a sum of all
-matched scoring actions. I.e. if 3 actions matched the article: -5, +5, +10 the overall score will be 10.
-7. Assign label - assigns specified label to the article on import
-8. Stop / Do nothing - stops further filter processing for this article, no following filters will be checked nor rules applied.
-
-After all matching filters had been computed for the article, it is
-either imported with modifications as specified by the rules, or dropped
-if Delete article action has been found.
-
-### Filter testing
-
-Filter testing routine uses database regular expression engine which may
-not be completely compatible with (usually supporting a subset of)
-[PCRE](http://php.net/manual/en/reference.pcre.pattern.syntax.php)
-library used for actually applying filters when tt-rss is processing
-articles. This can cause filters not returning valid results while
-testing but otherwise working correctly or vice versa.
-
-Filters in version:1.6.0 up to version:1.7.5
---------------------------------------------
-
-In version:1.6.0 filters have been improved to support multiple
-arbitrary match rules and actions per filter. You can migrate old
-filters to new format using update.php —convert-filters
-after updating the database schema to current version. It is also
-possible to combine filters together.
-
-Also, date filtering and inverse filters have been removed.
-
-Please note that order of filters and rules being loaded is not
-guaranteed, do not make filters depending on the output of previous
-filter. You will run into problems.
-
-Filters in older versions
--------------------------
-
-One match and one action per filter are supported. Supported actions:
-delete (do not import) article, mark article as read, set starred,
-assign tag(s). Filters can be defined globally and for some specific
-feed.
-
-Exact filtering algorithm is different between versions:
-
-### 1.2.7 and older
-
-First matching filter is used for the article. If two filters match one
-regular expression, only one filter will be applied. Behaviour for
-selecting the filter is undefined, thus creation of overlapping filters
-is strongly not recommended.
-
-### 1.2.8 and newer
-
-Since 1.2.8, multiple and inverse matching are supported. All matching
-filters are considered when article is being imported and all actions
-executed in sequence. Inverse matching reverts matching result, e.g.
-filter matching XYZZY in title with inverse flag will match all
-articles, except those containing string XYZZY in title.
+Filters are a very powerful and flexible tool which may significantly
+ease the task of extracting useful information from the sea of data that
+is RSS feeds. Filters are applied to articles based on [regular
+expression](http://en.wikipedia.org/wiki/Regular_expression) match
+against specified fields. After the match had been found, configured
+actions are taken. Matching is case-insensitive,
+[PCRE](http://php.net/manual/en/reference.pcre.pattern.syntax.php)
+pattern syntax is used.
+
+Exact filtering algorithm has changed several times during tt-rss
+development. Most current one is described in the section below.
+
+### Load and processing order
+
+Filters are loaded in user-specified order and applied sequentially. It
+is possible to reorder filters using drag and drop. If no manual sorting
+is specified, filters are sorted alphabetically according to user
+configured caption. If no caption is specified for any filter, loading
+order is not guaranteed.
+
+### Filter objects
+
+Each filter object may contain an arbitrary amount of regular expression
+rules and actions. Each expression may have inverse flag set, which
+inverts matching result. On top of that, filter may also have an inverse
+flag, which inverts the final matching.
+
+Filter object may be configured to successfully match when either one or
+all rules match.
+
+Regular expressions may be matched against several article fields, such
+as, title, content, author, etc.
+
+Do not include delimiters (e.g. /) when writing regular
+expressions.
+
+### Matching articles and applying actions
+
+Filter matching is performed during feed processing.
+
+Some actions may be applied only when the article is initially imported
+from the feed. Other actions may be applied every time article is seen
+in the originating feed. It is suggested to only rely on filters
+applying to articles imported after the filter had been created - they will
+not retroactively apply to your article database.
+
+Several actions are available:
+
+1. ``Delete article`` - do not import article from the feed, does not
+actually delete anything from the database
+2. ``Mark as read`` - imports article automatically marked as read
+3. ``Set starred`` - sets article starred automatically on import
+4. ``Assign tags`` - assigns custom tags on import
+5. ``Publish article`` - sets article published automatically on import
+6. ``[Modify score](wiki/Scoring)`` - modifies article overall score based on
+the parameter, a signed integer number. Final article score is calculated after all filters had been applied and is a sum of all
+matched scoring actions. I.e. if 3 actions matched the article: -5, +5, +10 the overall score will be 10.
+7. ``Assign label`` - assigns specified label to the article on import
+8. ``Stop / Do nothing`` - stops further filter processing for this article, no following filters will be checked nor rules applied.
+
+After all matching filters had been computed for the article, it is
+either imported with modifications as specified by the rules, or dropped
+if Delete article action has been found.
+
+### Testing filters
+
+Filter test dialog may not give entirely accurate results, especially for complex filters. It is suggested to
+test filters using ``Feed debugger`` if you feel that some filter is somehow misfiring on a specific feed.
+
+## (deprecated) Filters in version:1.6.0 up to version:1.7.5
+
+In version:1.6.0 filters have been improved to support multiple
+arbitrary match rules and actions per filter. You can migrate old
+filters to new format using update.php —convert-filters
+after updating the database schema to current version. It is also
+possible to combine filters together.
+
+Also, date filtering and inverse filters have been removed.
+
+Please note that order of filters and rules being loaded is not
+guaranteed, do not make filters depending on the output of previous
+filter. You will run into problems.
+
+## (deprecated) Filters in older versions
+
+One match and one action per filter are supported. Supported actions:
+delete (do not import) article, mark article as read, set starred,
+assign tag(s). Filters can be defined globally and for some specific
+feed.
+
+Exact filtering algorithm is different between versions:
+
+### 1.2.7 and older
+
+First matching filter is used for the article. If two filters match one
+regular expression, only one filter will be applied. Behaviour for
+selecting the filter is undefined, thus creation of overlapping filters
+is strongly not recommended.
+
+### 1.2.8 and newer
+
+Since 1.2.8, multiple and inverse matching are supported. All matching
+filters are considered when article is being imported and all actions
+executed in sequence. Inverse matching reverts matching result, e.g.
+filter matching XYZZY in title with inverse flag will match all
+articles, except those containing string XYZZY in title.
-- 
cgit v1.2.3
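
Note on the behaviour the patched page describes: the following is a minimal Python sketch of current-version filter matching (per-rule inverse flags, any/all rule matching, a filter-level inverse flag, summed score actions, and the stop action). It is not tt-rss code. The names `Rule`, `Filter`, and `collect_actions` and the `(name, parameter)` action tuples are hypothetical, and Python's `re` module stands in for PCRE, so pattern support may differ.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    pattern: str           # regular expression, written without delimiters
    target: str            # article field to match against, e.g. "title"
    inverse: bool = False  # per-rule inverse flag

    def matches(self, article: dict) -> bool:
        text = article.get(self.target, "")
        found = re.search(self.pattern, text, re.IGNORECASE) is not None
        return found != self.inverse  # inverse flag flips the result


@dataclass
class Filter:
    rules: list
    actions: list            # hypothetical (name, parameter) pairs
    match_any: bool = False  # True: one rule suffices; False: all must match
    inverse: bool = False    # filter-level inverse flag

    def matches(self, article: dict) -> bool:
        results = [rule.matches(article) for rule in self.rules]
        matched = any(results) if self.match_any else all(results)
        return matched != self.inverse


def collect_actions(filters, article):
    """Apply filters in order; return (other actions, summed score)."""
    actions, score = [], 0
    for flt in filters:
        if not flt.matches(article):
            continue
        stop = False
        for name, param in flt.actions:
            if name == "score":
                score += param  # final score is the sum of matched scores
            elif name == "stop":
                stop = True     # no following filters are checked
            else:
                actions.append((name, param))
        if stop:
            break
    return actions, score


article = {"title": "XYZZY release notes"}
filters = [
    Filter([Rule("xyzzy", "title")], [("score", -5)]),
    Filter([Rule("release", "title")], [("score", 5)]),
    Filter([Rule("notes$", "title")], [("score", 10), ("stop", None)]),
    Filter([Rule(".", "title")], [("tag", "never-reached")]),
]
print(collect_actions(filters, article))  # ([], 10)
```

The three score actions mirror the page's own example (-5, +5, +10 giving 10); the fourth filter is skipped because the third one carries a stop action.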