RRSS is an RSS/RDF/Atom feed reader written in Ruby, using Sinatra and SQLite.
It reads simple configuration files in YAML format, downloads and stores items in per-feed SQLite databases, and sports a nice web GUI to read and manage them (modify, comment, mark, etc.).
RRSS is also able to:
- download gzipped files
- use scripts for scraping a web page (placed in ./scrapes)
- use scripts for manipulating the downloaded file (placed in ./scripts)
- use regular expressions for manipulating the downloaded file (:regexp: option)
- export stored items in JSON or Atom format (including their mark status)
- run in batch mode without starting the GUI
- GUI: set custom feed favicon
- GUI: use a custom skin (CSS)
To install and run RRSS:
- check the Ruby version you have with `ruby -v` and make sure it is >= 1.9
- obtain a copy of the project from GitHub, you can either:
  - download the ZIP: `wget https://github.com/acavalin/rrss/archive/master.zip && unzip master.zip && cd rrss-master`
  - clone the repository: `git clone https://github.com/acavalin/rrss.git && cd rrss`
- install the bundler gem: `gem install bundler`
- now you can install all required gems with `bundle install`
- edit config.yml as you prefer
- create a feeds.yml config file (see feeds.yml.example)
- run the application with `./rrss.rb`
- point your browser to http://localhost:3333
- ????
- profit! ;^)
Configuration for the feed downloader and parser:
| Key | Descr |
|---|---|
| hash_keys | Item properties to be hashed for generating its unique id. Available keys are: :id, :link, :title, :descr, :date |
| max_item_days | Number of days after which an item is marked as old |
| max_old_items | Maximum number of old items to keep |
| parse_timeout | RSS parsing timeout (in seconds) |
| period | Feed check default interval time (in minutes) |
| timeout | Download/scrape timeout (in seconds) |
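
To illustrate what hash_keys controls, here is a minimal sketch of how an item id could be derived from the selected properties. The helper name `item_uid`, the SHA-1 digest, and the `|` separator are assumptions made for this example, not the actual RRSS implementation.

```ruby
require 'digest'

# Hypothetical helper: hash the properties listed in hash_keys into one id.
# The digest algorithm and the separator are assumptions.
def item_uid(item, hash_keys = [:id, :link, :title])
  payload = hash_keys.map { |key| item[key].to_s }.join('|')
  Digest::SHA1.hexdigest(payload)
end

item = {
  id:    'tag:foo.com,2013:1',
  link:  'http://www.foo.com/1',
  title: 'Hello',
  descr: 'text',
  date:  '2013-01-01'
}

puts item_uid(item) # id built from :id, :link and :title

# With hash_keys limited to :link, editing the title keeps the same id.
same = item_uid(item, [:link]) == item_uid(item.merge(title: 'Edited'), [:link])
puts same # true
```

Choosing fewer keys makes the id more stable when a publisher edits an item; choosing more keys makes an edited item show up as a new one.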
Configuration for the feed manager (the web GUI):
| Key | Descr |
|---|---|
| check_interval | periodic feed check interval time (in minutes) |
| exit_grace_time | feed download grace time on exit (in minutes) |
| layout | layout css file name |
| port | webserver (GUI) listening port |
The feeds.yml file represents the feeds tree as an array of key-value options. Here is an example (see also feeds.yml.example):

    ---
    # this item is on the root folder
    - :name: example1
      :link: http://www.foo.com
      :period: 10
      :enabled: true
      :regexp: [['HELLO', 'Hello'], ['hi', 'HI']]
    - :name: example2
      :url: http://bar.org/rss.xml
      :period: 60
      :enabled: true
    # an open folder (w/o ':' at the beginning)
    - folder:
        # a collapsed subfolder (w/ ':' at the beginning)
        - :subfolder:
            # these two items are children of "subfolder"
            - :name: example3A
              :url: http://www.foo2.com/rss.php
              :period: 720
              :enabled: true
            - :name: example3B
              :url: http://www.bar2.com/atom.xml
              :period: 720
              :enabled: true
        # these two items are children of "folder"
        - :name: example4A
          :url: http://www.fb.org/en/feeds/news.rss
          :period: 1440
          :enabled: true
        - :name: example4B
          :url: http://www.xyz.net/news.xml
          :period: 1440
          :enabled: true

Every feed has a set of options you can use to customize it:
| Key | Descr |
|---|---|
| :name: | feed feedname ([a-z_]) |
| :enabled: | enable the periodic download for this feed (default false) |
| :hash_keys: | array of item properties to be hashed for generating the unique id |
| :limit: | only consider this quantity of most recent downloaded items |
| :link: | clickable link on the feeds tree |
| :parse_timeout: | override the default parse timeout (in seconds) |
| :period: | periodic download interval (in minutes) |
| :regexp: | array of pairs [regexp, replace_string] to manipulate items |
| :summary: | save and show the summary of the item (default true) |
| :timeout: | override the default download timeout (in seconds) |
| :url: | url of the xml file to download |
| :validation: | apply feed validation during parsing (default true) |
As you can see in the previous example, a folder comes in two flavors:
- `name:` renders an expanded folder on the feeds tree
- `:name:` renders a collapsed folder on the feeds tree
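
The two folder flavors map naturally onto YAML key types: a plain key loads as a String, while a key with a leading `:` loads as a Symbol. The following sketch walks a small tree and reports each node. The `walk` helper and the rule "a folder is a single-key hash whose value is a list" are assumptions for illustration, not necessarily how RRSS parses the file.

```ruby
require 'yaml'

FEEDS_YAML = <<~YAML
  ---
  - :name: example1
    :url: http://bar.org/rss.xml
  - folder:
      - :subfolder:
          - :name: example3a
            :url: http://www.foo2.com/rss.php
      - :name: example4a
        :url: http://www.xyz.net/news.xml
YAML

# Hypothetical walker: String key => expanded folder, Symbol key => collapsed.
def walk(nodes, depth = 0, out = [])
  nodes.each do |node|
    key, value = node.first
    if node.size == 1 && (value.nil? || value.is_a?(Array))
      state = key.is_a?(Symbol) ? 'collapsed' : 'expanded'
      out << "#{'  ' * depth}[#{state} folder] #{key}"
      walk(value || [], depth + 1, out)
    else
      out << "#{'  ' * depth}feed #{node[:name]}"
    end
  end
  out
end

# Symbol must be explicitly permitted when loading with safe_load.
tree  = YAML.safe_load(FEEDS_YAML, permitted_classes: [Symbol])
lines = walk(tree)
puts lines
```

Running it lists `example1` at the root, then `folder` as expanded and `subfolder` as collapsed, with their feeds indented beneath them.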
Here is a useful command line combo to perform an easy OPML (indented XML) to YAML conversion:

    cat feedlist.opml | \
      sed 's/<outline title="\(.*\)" text=".*">/- :\L\1:/' | \
      sed 's/\( \+\)<outline text="\([^"]*\)".*htmlUrl="\([^"]*\)" xmlUrl="\([^"]*\)".*\/>/\1- :name: \L\2\E\n\1 :url: \4\n\1 :link: \3\n\1 :enabled: true\n/' | \
      grep -v "<.outline>" > feeds.yml

When adding a new feed, keep in mind the retrieval/manipulation steps the application performs on the downloaded file:
- if ./scrapes/feedname exists and is executable then run it and capture its output
- otherwise download the file specified in :url:
- if ./scripts/feedname exists and is executable then use it to convert the previous output (it must read the input from stdin and print output to stdout)
- sequentially apply every regexp specified in :regexp:, if any
- convert contents to UTF-8, parse and store them to ./db/feedname.db
- autopurge old items (only read and unkept ones)
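
The retrieval steps above can be sketched roughly as follows. Only the ./scrapes and ./scripts directories and the :regexp: pair format come from this README. The method names, the injected `download` lambda, and the encoding options are assumptions. Parsing, storing, and purging are left out.

```ruby
require 'open3'

# Return the path of ./dir/feed_name if it is an executable file, else nil.
def executable_script(dir, feed_name)
  path = File.join(dir, feed_name)
  File.file?(path) && File.executable?(path) ? path : nil
end

# Hypothetical pipeline; `download` is any callable taking a URL.
def retrieve(feed, download)
  name = feed[:name]

  # Run the scraper if present, otherwise download :url:
  scraper = executable_script('./scrapes', name)
  data = scraper ? Open3.capture2(scraper).first : download.call(feed[:url])

  # Optional filter script: reads stdin, writes stdout.
  filter = executable_script('./scripts', name)
  data = Open3.capture2(filter, stdin_data: data).first if filter

  # Apply each [regexp, replace_string] pair in order.
  Array(feed[:regexp]).each do |pattern, replacement|
    data = data.gsub(Regexp.new(pattern), replacement)
  end

  # Transcode to UTF-8 (assumed options) before parsing and storing.
  data.encode('UTF-8', invalid: :replace, undef: :replace)
end

feed = {
  name:   'example1',
  url:    'http://www.foo.com/rss.xml',
  regexp: [['HELLO', 'Hello'], ['hi', 'HI']]
}
fake_download = ->(_url) { '<title>HELLO, hi there</title>' }

puts retrieve(feed, fake_download) # <title>Hello, HI there</title>
```

Because the pairs are applied sequentially, a later regexp sees the output of the earlier ones, so their order in :regexp: matters.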
| Key | Function |
|---|---|
| h | show help |
| n | select next unread item |
| down | select next item |
| m/up | select previous item |
| home | select first item |
| end | select last item |
| u | toggle unread on selected item |
| k | toggle kept on selected item |
| esc | close/reset view |
| v | change view filter |
| l | show linear view in list mode |
| L | show linear view in thumbs mode |
| r | refresh feeds tree |
| R | mark all feed items as read |
| s | search items in current feed/folder |
You can download all desired feed items by using the following urls:
- http://ip_address:port/dump/feed_name.xml (atom feed)
- http://ip_address:port/dump/feed_name.json (json object)
Note: Feed/item preferences are included in the XML/Atom file within dc_type tags.
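
As a small illustration, the dump URLs can be built and fetched from a script. The helper names below are hypothetical, and the structure of the returned JSON is not documented here, so the sketch only parses it without assuming any fields.

```ruby
require 'net/http'
require 'json'
require 'uri'

# Build the dump URL for a feed; format is "xml" (Atom) or "json".
def dump_uri(host, port, feed_name, format = 'json')
  raise ArgumentError, "unknown format: #{format}" unless %w[xml json].include?(format)
  URI("http://#{host}:#{port}/dump/#{feed_name}.#{format}")
end

# Fetch and decode the JSON dump (needs a running RRSS instance).
def fetch_dump(host, port, feed_name)
  JSON.parse(Net::HTTP.get(dump_uri(host, port, feed_name, 'json')))
end

puts dump_uri('localhost', 3333, 'example1', 'xml')
# fetch_dump('localhost', 3333, 'example1') would return the parsed JSON object.
```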
To set a custom favicon for a specific feed use the script set_favicon.rb:

    set_favicon.rb feed_name favicon_uri

where feed_name is the name specified in :name: and favicon_uri can be either a URL or a local file path.
You can run the download process of your feeds in batch mode using the script check_feeds.rb:

    check_feeds.rb [dump_dir [format]]

If you supply a dump directory, the processed feeds are dumped there. The format can be either xml or json.
Webservers tend to block a localhost referrer for feeds that rely on external resources like images :'(
If you use Firefox, you can bypass this problem by installing the Referrer Control extension; you can find the full documentation on its wiki page.
You just need to add a custom rule:

    *localhost*, <any>, <remove>

and remember to set the default rule to Skip if you wish to preserve the browser's default behaviour.
Here is a list of the specs, libraries and tools used to develop RRSS:
- RSS
- Ruby libs
  - http://ruby-doc.org/stdlib-1.9.3/libdoc/timeout/rdoc/Timeout.html
  - http://ruby-doc.org/stdlib-1.9.3/libdoc/open-uri/rdoc/OpenURI.html
  - http://ruby-doc.org/stdlib-1.9.3/libdoc/digest/rdoc/Digest.html
  - http://ruby-doc.org/stdlib-1.9.3/libdoc/fileutils/rdoc/FileUtils.html
  - http://ruby-doc.org/stdlib-1.9.3/libdoc/open3/rdoc/Open3.html
  - http://ruby-doc.com/stdlib-1.9.2/libdoc/zlib/rdoc/Zlib.html
  - http://ruby-doc.org/stdlib-1.9.3/libdoc/base64/rdoc/Base64.html
  - http://ruby-doc.org/stdlib-1.9.3/libdoc/stringio/rdoc/StringIO.html
  - http://ruby-doc.org/stdlib-1.9.3/libdoc/logger/rdoc/Logger.html
  - http://www.ruby-doc.org/stdlib-1.9.3/libdoc/json/rdoc/JSON.html
- Sinatra
- SQLite
- YAML
- Firefox referrer control