Feed Creator

Feed Control

For more control over the feed, and to increase how often the source site is checked for updates, preview the feed, open 'service shortcuts' and choose Feed Control.

Premium access

We also offer a premium plan for Feed Creator, which raises the limits of the free plan. The premium access key is also valid for our Full-Text RSS premium service.

Limit	Free	Premium
Max number of items returned in each request	5	10
Feed URLs to merge	3	10
Max number of values in multi-value parameters (e.g. text filters)	3	10
Cache time	2 hours	5 minutes

Individual - $10/month (paid annually)
Business - $25/month (paid annually)

The access key will be sent automatically in the order confirmation email.

Host it yourself

Don't want to rely on us? Comfortable running a PHP application on your own? You can host this yourself on your own server (one with at least PHP 7.2 installed).

Request parameters

When making HTTP requests, you can pass the following parameters to extract.php (either in a GET request or POST request). For most uses, supplying the page URL (url) and a selector (either in_id_or_class or item) is enough.

Example:

extract.php?url=https%3A%2F%2Fchomsky.info%2Farticles%2F&in_id_or_class=main_container

The form above uses the same parameters, so if you're unsure how to construct a URL, use the form and then examine the RSS feed URL that's produced.
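
If you would rather build the request URL in code than copy it from the form, the sketch below shows one way to do it with Python's standard library. The host in BASE is a placeholder assumption, not a real endpoint; point it at wherever extract.php is served for you.

```python
from urllib.parse import urlencode

# Hypothetical base URL: replace with wherever extract.php is hosted.
BASE = "https://example.org/extract.php"


def feed_url(page_url, in_id_or_class, base=BASE):
    """Return an extract.php request URL for a page and an id/class value."""
    # urlencode percent-encodes reserved characters such as ':' and '/'.
    query = urlencode({"url": page_url, "in_id_or_class": in_id_or_class})
    return f"{base}?{query}"


example = feed_url("https://chomsky.info/articles/", "main_container")
print(example)
```

The query string this produces is the same as the one in the example above.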

Parameter	Value	Description
url	string (URL)	Web page URL containing links which we want to extract for our RSS feed. This is required.
in_id_or_class	string (attribute value)	Find links inside elements whose id or class attribute matches this value. This translates to the following XPath: `//a[@href and ancestor-or-self::*[@id="string" or contains(concat(" ",normalize-space(@class)," "), " string ")]]`
item	string (CSS selector)	Look inside element(s) matching this CSS expression (for example: `div.news .item`). Cannot be used in combination with `in_id_or_class`.
item_title	string (CSS selector) or 0	Extract item title from element matching CSS selector. This is applied within the context of elements selected by `item`. If omitted, the text of the first matching <a> element will be used. If set to 0, titles will not be included in the output. To use an element's attribute value rather than text content, use @attr, for example: 'img @alt'. To select the context element itself (element selected by `item`), pass ':scope'.
item_url	string (CSS selector), 0 or 1	Extract item URL from element matching CSS selector. This is applied within the context of elements selected by `item`. If omitted, the URL of the first matching <a> element will be used. If set to 0, URLs will not be included in the output. If set to 1, all item URLs will point to the input URL. To use an attribute value other than 'href', use @attr, for example: 'img @src'.
item_desc	string (CSS selector)	Extract item description from element matching CSS selector. This is applied within the context of elements selected by `item`. If omitted, the generated feed will not include item descriptions. To use an element's attribute value rather than text content, use @attr, for example: 'img @alt'. To select the context element itself (element selected by `item`), pass ':scope'.
item_date	string (CSS selector)	Extract item date from element matching CSS selector. This is applied within the context of elements selected by `item`. If omitted, the generated feed will not include item date. To get the date from an attribute value rather than text content of the element, use @attr, for example: 'time @datetime'.
item_date_format	string	If the date is not recognised correctly (e.g. treated as US format instead of European), you can specify the format it appears in here. This should follow the pattern used by PHP's DateTime::createFromFormat, e.g. 'j-M-Y'.
item_image	string (CSS selector)	Extract item image URL from element matching CSS selector. This is applied within the context of elements selected by `item`. If omitted, no image will be included in the output. Examples: 'img', 'img.main', 'img @data-src' (extracts URL from data-src attribute instead of src)
guid	0 (default), 'url_title', 'url', 'title'	No guid field is included in the output by default. Pass 'url' to include a guid generated from the item URL, 'title' to have it generated from the item title, or 'url_title' to use a combination of both.
url_contains[]	string (URL substring)	Filter out any item whose URL does not contain one of these substrings. This can appear multiple times. Example: `/article/`
text_contains[]	string (word or phrase)	Filter out any item whose title or description does not contain any of the supplied words or phrases. This can appear multiple times.
unique_url	1 (default), 0	If multiple matching items have the same URL, only the first encountered will be kept. Set to 0 to keep all matching URLs. Note: this is enabled by default, but always disabled when you ask for URLs to be excluded from output (`item_url=0`).
unique_title	1 (default), 0	If multiple matching items have the same title, only the first encountered will be kept. Set to 0 to keep all matching titles. Note: this is enabled by default, but always disabled when you ask for titles to be excluded from output (`item_title=0`).
strip	string (CSS selector)	Remove elements matching CSS selector. This will be processed before we start looking for items. To strip multiple elements, use commas. For example: `.header,.footer`
keep_qs_params	1 (default), 0, param1,param2	Determines the query string parameters that get preserved in item URLs. 1 = all preserved, 0 = none preserved. You can also pass individual parameters by name, e.g. 'articleId' to only preserve that parameter. Use a comma to supply more than one, e.g. 'id,category'.
strip_if_url[]	string	Filter out any item whose URL contains any of the supplied words or phrases. This can appear multiple times.
strip_if_text[]	string	Filter out any item whose title or description contains any of the supplied words or phrases. This can appear multiple times.
feed_title	string	The feed title to use in the generated feed. If omitted, we'll use whatever's in the <title> element of the web page requested. Note: this should be the actual title, not a selector.
max	number (limit: 20)	The maximum number of items to return. The limit can be changed in the config file.
format	rss (default), json	By default the output will be formatted as RSS 2.0. Set this to json if you prefer JSON output.
order	document (default), reverse	Results are returned in document order by default (i.e. in the order they appear in the source HTML). Set this to 'reverse' to reverse it.
ua	string	HTTP User-Agent header. How Feed Creator should present itself when fetching content. If a site only produces content for certain browsers, you can use this field to identify as that browser. This is sent in the HTTP request in a 'User-Agent' header. For example: "Mozilla/5.0 (x64; rv:77.0) Gecko/20100101 Firefox/77.0". (If self hosting, the default can be changed in the config file.)
referer	string, 0 or 1	HTTP Referer header. Set to 0 to disable sending a referer header. Set to 1 to use the page URL passed in `url`. Or specify a custom Referer header, e.g. "https://www.google.com". (If self hosting, the default can be changed in the config file.)
cookie	string	HTTP Cookie header. This should be used if the site needs you to be logged in to view content, or to bypass GDPR and cookie walls. For example: "euconsent=1; name2=value2;"
parser	html5 (default), libxml	Libxml is a slightly faster parser, but might not handle all pages correctly. If speed is important for you, try changing the parser to libxml - in most cases the results should be the same.
proxy	string	Self-hosted use only. If you add proxy servers to the config file, you can pass the name of the proxy server you want to use for this request, e.g. 'rotating-proxy-us'.
saved[]	string	Saved request parameters. In the config file (when self hosting), you can associate a set of request parameters with a given name. You can then pass that name in the request instead of the request parameters. This is useful if you have one set of parameters you want to apply to a number of generated feeds, or if you want to update parameters later without changing the generated feed URL.
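
The XPath shown for in_id_or_class can look opaque, so here is a rough Python illustration of the test it applies to a single element. This is only a sketch of the matching rule, not the code Feed Creator runs; the function name is made up for the example. A link is picked up when the link itself or any of its ancestors passes this test.

```python
def matches_id_or_class(id_attr, class_attr, value):
    """Mirror the XPath test: @id equals value, or value is one whole class token."""
    # normalize-space(): trim the class attribute and collapse whitespace runs.
    # (str.split() treats a few more characters as whitespace than XPath does.)
    normalized = " ".join((class_attr or "").split())
    # Padding both sides with spaces means only complete class names match,
    # so "item" does not match an element whose class is "news-item".
    return id_attr == value or f" {value} " in f" {normalized} "


print(matches_id_or_class(None, "news  item", "item"))                # True
print(matches_id_or_class(None, "news-item", "item"))                 # False
print(matches_id_or_class("main_container", None, "main_container"))  # True
```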

Required parameters: url must be supplied.
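
Parameters whose names end in [] can be repeated in one request. The sketch below builds such a URL in Python by passing a list of pairs, so the same name can appear more than once. The host, the page URL and the 'h2' title selector are illustrative assumptions; the parameter names come from the table above.

```python
from urllib.parse import urlencode

BASE = "https://example.org/extract.php"  # hypothetical host

params = [
    ("url", "https://example.com/news"),   # hypothetical page to read
    ("item", "div.news .item"),            # context element for each item
    ("item_title", "h2"),                  # assumed selector, relative to item
    ("item_date", "time @datetime"),       # read the date from an attribute
    ("url_contains[]", "/article/"),       # multi-value: repeat the name
    ("url_contains[]", "/story/"),
    ("strip", ".header,.footer"),          # removed before items are searched
    ("format", "json"),
]

# urlencode escapes '[', ']', '@' and '/', and writes spaces as '+'.
request_url = f"{BASE}?{urlencode(params)}"
print(request_url)
```

Keep the limits in mind when repeating a parameter: the free plan accepts up to 3 values per multi-value parameter.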

Quick start

Check server compatibility.
Enter a page URL in the form above and click 'Preview' (or try one of the examples).
If the preview looks okay, use the RSS feed in your news reader or application.
The form only provides fields for basic parameters - to use the others you will have to modify the URL in your browser (try the examples for an idea).
That's it! (Although see below if you'd like to customise further.)
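
Instead of editing the feed URL by hand in the browser, you can append the extra parameters programmatically. The following is a minimal sketch in Python; the host in form_url is a placeholder, and add_params is a helper invented for this example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_params(feed_url, extra):
    """Return feed_url with the (name, value) pairs in extra appended to its query."""
    parts = urlsplit(feed_url)
    # Decode the existing query into pairs, keeping order and repeated names.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    pairs.extend(extra)
    return urlunsplit(parts._replace(query=urlencode(pairs)))


# A feed URL as the form might produce it (hypothetical host).
form_url = (
    "https://example.org/extract.php"
    "?url=https%3A%2F%2Fchomsky.info%2Farticles%2F&in_id_or_class=main_container"
)
updated = add_params(form_url, [("order", "reverse"), ("guid", "url")])
print(updated)
```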

Configure

In addition to the parameters above, there's a configuration file which you can edit.

Features include:

Saved parameters - to avoid having to change the parameters in the URL if you have to update selectors.
Restrict access to saved parameters - this will disable access to the service except for sites you specify in the config file.
Setting maximum title and description lengths.
Caching
Proxy server support

To change the configuration, edit custom/config.php and make any changes you like.

Customise this page

If everything works fine, feel free to modify this page by following the steps below:

Save a copy of index.php inside the custom/ folder
Edit custom/index.php

Next time you load this page, it will automatically load custom/index.php instead.

Documentation and Support

We have documentation if you need help. You can also check out the forum or email us at help@fivefilters.org.
