The Artima Developer Community
Python Buzz Forum
Plagg: The Good-Enough Aggregator

Patrick Lioi

Posts: 45
Nickname: plioi
Registered: Aug, 2003

Patrick Lioi is a software developer in Austin, Tx.
Plagg: The Good-Enough Aggregator Posted: Sep 26, 2003 11:24 AM

This post originated from an RSS feed registered with Python Buzz by Patrick Lioi.
Original Post: Plagg: The Good-Enough Aggregator
Feed Title: Patrick Lioi on Python
Feed URL: http://patrick.lioi.net/syndicate/Python/RSS2.xml
Feed Description: Entries from the Python category of my personal site.

I've written an aggregator in hour-long spurts over the summer, and even though it isn't complete enough to satisfy most people, I'm making it public anyway. Plagg is a CGI script that displays only the aggregated items you haven't seen before. There is no "history" because I'd never use one. You click the "rebuild" link to request a fetch, and every item that is new since the last fetch is displayed.

plagg.zip contains a plagg/ directory, which contains plagg.css, plagg.html, plagg.py, WebEnvironment.py, and subscriptions.txt. You'll also need Mark Pilgrim's feedparser and Timothy O'Malley's timeoutsocket.py.

Be sure to change the first line of plagg.py to point to your own Python interpreter. Place the plagg/ directory somewhere under your server's document root. Create a bookmark to plagg/plagg.html. View the page and click the "rebuild" link to check for new items.

The config file is currently very picky. subscriptions.txt consists of a series of subscriptions, each of which looks like this:

title: Patrick Lioi
rss: http://patrick.lioi.net/syndicate/RSS2.xml
link: http://patrick.lioi.net/
recent: http://patrick.lioi.net/archive/2003/09/24/104557
wait: 86400
pinged: 1064521292.0

Here, "title", "rss", and "link" have the obvious meanings. "recent" is the link of the most recently aggregated item, and is used during a fetch to determine which items are new. "wait" is the minimum number of seconds that must pass between successive fetches of this feed. It keeps you from pounding a feed, and keeps plagg from taking too long to run during a fetch. "pinged" is the time of the last fetch, in seconds.
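
The roles of "wait", "pinged", and "recent" can be illustrated with a minimal sketch. This is not plagg's actual code: the function names `is_due` and `new_items` are hypothetical, "pinged" is assumed to be a Unix timestamp of the kind `time.time()` returns, and feed items are assumed to arrive newest first.

```python
import time


def is_due(wait, pinged, now=None):
    """Return True if at least `wait` seconds have passed since `pinged`.

    Assumption: `pinged` is a Unix timestamp in seconds.
    """
    if now is None:
        now = time.time()
    return now - pinged >= wait


def new_items(links, recent):
    """Return the links that come before `recent` in the feed.

    Assumption: `links` is ordered newest first. If `recent` is not
    found at all, every link is treated as new.
    """
    fresh = []
    for link in links:
        if link == recent:
            break
        fresh.append(link)
    return fresh
```

With the example subscription above, `is_due(86400, 1064521292.0)` asks whether a full day has passed since the last fetch.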

When adding a new feed to subscriptions.txt, "recent" and "pinged" must be given dummy values, like so:

title: Patrick Lioi
rss: http://patrick.lioi.net/syndicate/RSS2.xml
link: http://patrick.lioi.net/
recent: This line to be set by plagg on the next run!
wait: 86400
pinged: 1
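
A small helper can produce an entry in this shape so the dummy values are never forgotten. This is only a sketch: `new_subscription` is a hypothetical name, not part of plagg. The comment on "pinged" reflects a presumption that a value of 1, read as a time in seconds, is so far in the past that the wait period has certainly elapsed.

```python
def new_subscription(title, rss, link, wait=86400):
    """Build the text of a new subscriptions.txt entry (hypothetical helper)."""
    lines = [
        "title: " + title,
        "rss: " + rss,
        "link: " + link,
        # Dummy value; plagg overwrites it on the next run.
        "recent: This line to be set by plagg on the next run!",
        "wait: %d" % wait,
        # Dummy value; presumably old enough that the feed is fetched right away.
        "pinged: 1",
    ]
    return "\n".join(lines) + "\n"
```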

Subscriptions must be separated from each other by exactly one blank line. If plagg reports any errors, the most likely cause is an extra newline somewhere in your subscriptions.txt.
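
To show why a stray blank line is enough to break things, here is a minimal sketch of a strict parser for this format. It is an assumption about how such a file could be read, not plagg's real parser, and `parse_subscriptions` is a hypothetical name.

```python
def parse_subscriptions(text):
    """Parse subscriptions.txt content into a list of dicts.

    Entries are split on exactly one blank line; any extra blank line
    leaves an empty line inside a block and raises ValueError.
    """
    subscriptions = []
    for block in text.strip("\n").split("\n\n"):
        entry = {}
        for line in block.split("\n"):
            # Split on the first ": " only, so URLs keep their colons.
            key, sep, value = line.partition(": ")
            if not sep:
                raise ValueError("malformed line: %r" % line)
            entry[key] = value
        entry["wait"] = int(entry["wait"])
        entry["pinged"] = float(entry["pinged"])
        subscriptions.append(entry)
    return subscriptions
```

Two entries separated by one blank line parse cleanly. Separating them by two blank lines raises ValueError in this sketch, which mirrors the pickiness described above.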

One unfortunate side effect of the pinged/wait fields is that, over time, all subscriptions with the same wait period end up grouped together. You might hit rebuild all day with no new items displayed, and then when you rebuild at 7:34 it takes 30 seconds and presents you with 40 new items. I occasionally go through the subscriptions file and set a few of the pinged values back to 1, to keep them from piling up too much. This will change when I decide to implement a better timing mechanism.
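
The manual workaround of setting a few pinged values back to 1 could be automated along these lines. This is a sketch only, and it is not the better timing mechanism mentioned above. It assumes the subscriptions have already been loaded as a list of dicts with a "pinged" key, and `reset_some_pinged` is a hypothetical name.

```python
import random


def reset_some_pinged(subscriptions, count, rng=random):
    """Set "pinged" back to 1 on `count` randomly chosen subscriptions.

    Returns the sorted indexes of the entries that were reset, so the
    caller can see which feeds will be fetched on the next rebuild.
    """
    count = min(count, len(subscriptions))
    chosen = rng.sample(range(len(subscriptions)), count)
    for index in chosen:
        subscriptions[index]["pinged"] = 1.0
    return sorted(chosen)
```

The modified list would still have to be written back to subscriptions.txt in the format described earlier.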


Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use