Shortcomings of today's RSS systems

By gpoul in Internet
Mon Nov 11, 2002 at 03:37:41 AM EST
Tags: Internet

RSS (Really Simple Syndication) is a web content syndication format based on XML 1.0. As currently implemented it is a capable format for aggregating content from multiple news sources. The problem is that the architecture it is deployed on hasn't changed in a while and places too much load on the publishing infrastructure (e.g. Joel's RSS problem).

In this article I point out different approaches to solving these problems and making RSS better suited to mobile devices, which are not always-on.


Status quo

RSS is specified by Dave Winer, and its adoption is rising because the RSS specification is simpler to use and implement than competing solutions like RDF (Resource Description Framework). Today, news feeds are based on a polling architecture. The publisher keeps a fixed number of the latest news items in an RSS file on his web server. Whenever the user wants updated items he has to fetch the complete file and work out which items in it are actually new and which were already included in one of the chunks he fetched earlier. RSS 2.0 started to address this by introducing a recommended unique-identifier field, "guid" (globally unique identifier), on news items. If the interval between requests for the RSS file is too long, it is possible to lose news items, because the RSS file is rotated as new news stories are added on the server.

In a centralized server environment it might not seem like a big deal that the file is rotated as new stories are added. Bandwidth is plentiful today and it is not a big problem for an always-on server to fetch news items every few minutes. This all looks different if you deploy such a personal news aggregator on a notebook, because these devices are not always-on.

Proposed Solution

I focus only on the architecture surrounding the RSS specification, and discuss only the RSS fields that have a direct impact on it.

To create reliable aggregators, the "guid" field has to become required in the RSS file, and publishers have to be required to keep it unique within their channel. "guid" fields are used to identify news items in an RSS file that have already been received. Aggregators should not rely on guids being globally unique; they should only assume that they're unique within a channel, so that different channels cannot conflict with each other if they happen to provide the same guids.
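
As a rough illustration (not part of any RSS spec), the per-channel de-duplication an aggregator needs could be as small as this Python sketch; all names are illustrative only:

    # Sketch of guid-based de-duplication, keyed per channel so that identical
    # guids appearing in different channels cannot collide.
    seen = {}  # channel URL -> set of guids already delivered to the user

    def new_items(channel_url, items):
        """Return only the items whose guid has not been seen in this channel."""
        known = seen.setdefault(channel_url, set())
        fresh = [item for item in items if item["guid"] not in known]
        known.update(item["guid"] for item in fresh)
        return fresh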

We can now specify these requirements:

  • Users should receive news only once.
  • Users should not lose news items, even if they're offline for extended periods (e.g. a three-week vacation).

The following sections describe three different architectures that all satisfy these requirements:

  • Generating dynamic RSS

    In this scenario the user's session has to be saved on the server and the user has to be identified with basic authentication or through a cookie. The web server would then return only the news items that are new since the last request. It might also be possible to request all news items since a specified date. The upside is that this is a completely distributed solution. The main problem is that it requires work on the publisher's side and requires the server to serve dynamic pages; at the moment systems like Radio Userland use static hosting without any dynamic components, because everything is uploaded from the client.

    Another solution that should be mentioned here uses a conditional HTTP GET to fetch RSS. This only retrieves the RSS file if it has changed since the last request.

  • Using NNTP to distribute RSS

    Another solution to this problem is to use NNTP to distribute RSS. RFC 977, the Network News Transfer Protocol, published in February 1986, offers a NEWNEWS command that lists all news items (called articles in NNTP terminology) posted since a specified date and time. Publishers would simply publish their RSS to NNTP newsgroups, and the existing NNTP infrastructure would distribute it to all news servers on the network. (A minimal client sketch follows this list.)

    This solution relies on an already deployed, reliable, and distributed infrastructure for delivering these items. It would also take load off the publishers' web servers and spread it across the many news servers deployed by ISPs.

  • Using a Messaging Architecture

    The "coolest" solution of all is to use a messaging system to distribute RSS snippets. In this environment a user is able to register with a service and request his news channels. As long as the user is online his news aggregator will be fed news items as they're posted. When the user is no longer available the service has to recognize this event and save the user's feed status. As soon as the user is available again the service has to send all news items that were posted while the user was offline and continue sending new items as long as the user is online.

    Jabber is one of the most likely messaging systems to be used by such a system because it's an open standard and is easy and cheap to deploy. Content publishers would have to distribute their news items to their subscribers as RSS snippets.
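
To make the NNTP idea above more concrete, here is a minimal client-side sketch using Python's standard nntplib module. The server name and group are hypothetical, and it assumes the news server permits the NEWNEWS command (many operators disable it):

    from datetime import datetime, timedelta
    import nntplib

    # Hypothetical server and group; each article body is assumed to carry one RSS snippet.
    with nntplib.NNTP("news.example.com") as server:
        since = datetime.utcnow() - timedelta(days=21)   # e.g. back from a three-week vacation
        resp, ids = server.newnews("alt.rss.news.international", since)
        for message_id in ids:
            resp, info = server.article(message_id)
            snippet = b"\n".join(info.lines).decode("utf-8", "replace")
            # hand the snippet to the aggregator, de-duplicating by guid as sketched earlier

Because NEWNEWS takes the date from the client, the aggregator only has to remember when it last asked; the news server keeps no per-client state.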

I hope you find these ideas interesting and improve upon them. In essence I have only addressed architectural issues in the distribution of RSS to users, because the current system of publishing RSS as static files on web servers is not convenient or useful to anyone except publishers.


Related Links
o Joel's RSS problem
o Dave Winer
o RSS specification
o RDF
o conditional HTTP GET to fetch RSS


Shortcomings of today's RSS systems | 46 comments (33 topical, 13 editorial, 0 hidden)
guid (4.00 / 1) (#11)
by marx on Sun Nov 10, 2002 at 02:45:55 PM EST

You introduce the abbreviation "guid" without defining it. Even if you explain how it's used, it's very irritating. What does it mean?

A quick search tells me it's probably "globally unique identifier" from the Win32 API. Since you explicitly say it may not be globally unique, it seems a poor and confusing name.

Join me in the War on Torture: help eradicate torture from the world by holding torturers accountable.

I'm not responsible for anything (4.00 / 2) (#12)
by gpoul on Sun Nov 10, 2002 at 02:58:16 PM EST

You know, I'm not responsible for anything that happens somewhere in the world.

At the top of the article I link to the RSS spec at http://backend.userland.com/rss and Dave Winer's the one who names his tags in his file format.

Please don't blame me for Dave Winer's poor choice of tag names :)

[ Parent ]

-1 Dump it. (1.23 / 13) (#13)
by dvchaos on Sun Nov 10, 2002 at 02:58:37 PM EST

Just cos I felt like it. :P

--
RAR.to - anonymous proxy server!
cool, eh? (none / 0) (#14)
by gpoul on Sun Nov 10, 2002 at 03:00:35 PM EST

Nothing more destructive to do right now fellow? ;-)

[ Parent ]
NNTP/USENET propagation (4.00 / 1) (#15)
by imrdkl on Sun Nov 10, 2002 at 03:17:50 PM EST

While the logical process and mechanisms of reading and processing news on USENET may map onto the distribution of webpage headlines, and possibly even the web articles themselves, there would likely be some hoops to jump through to make it fly. Perhaps you'd care to flesh out your proposal a bit? Do you intend to create a new root usenet hierarchy? That's a pretty big deal, as I recall.

The way I see it, some news is important, and deserves full and instantaneous propagation insofar as possible. But my guess is that an awful lot of RSS/RDF messages are just repeats of what comes out of one of the major feeds. How are they going to feel about not just being "deeply linked", but also being regularly and fully propagated? Granted, there are likely sites out there which have a real, honest-to-goodness scoop once in a while, but a lot of what comes off the RDF feeds is just linked memes, or less.

Finally, I have to wonder about the USENET possibilities in terms of readership and availability. I haven't looked at the statistics lately, but it seems like USENET just isn't as widely read these days, although it is propagated to places where webpages can't be reached, such as UUCP-based feeds to remote locations. But then, without the full article (and not just the RSS object), it's unclear how much information could be gleaned by a non-networked node.

All in all, my guess is that RSS/RDF over USENET is a non-starter, although it has appeal in some ways. Would you not agree?

Maybe over NNTP, but not USENET (none / 0) (#18)
by GGardner on Sun Nov 10, 2002 at 06:08:41 PM EST

You could set up a bunch of servers connected via NNTP that propagate RSS feeds, but are completely unconnected to the rest of Usenet. I doubt that there are any significant Usenet feeds over UUCP these days; the bandwidth is just too much. Don't throw out NNTP just because Usenet is a big cesspool.

[ Parent ]
Sounds kind of far-fetched (4.00 / 1) (#22)
by imrdkl on Sun Nov 10, 2002 at 07:47:20 PM EST

One of the primary benefits of using NNTP would be the existing implementation, imho, although the protocol might also be adaptable to propagation of web news. How would you create the group hierarchy?

[ Parent ]
Simply use some alt.* group to distribute it (none / 0) (#37)
by gpoul on Mon Nov 11, 2002 at 02:15:20 PM EST

You could use something like alt.rss.news.international to distribute RSS files for a specific topic.

If there are thousands of groups for explicit pictures I don't think getting groups to test with would be a major hurdle.

[ Parent ]

Usenet is a big cesspool? (none / 0) (#29)
by Bill Godfrey on Mon Nov 11, 2002 at 09:06:25 AM EST

Someone forgot to tell Usenet.

[ Parent ]
Re: Usenet is a big cesspool? (4.00 / 1) (#40)
by WWWWolf on Mon Nov 11, 2002 at 03:49:14 PM EST

Someone forgot to tell Usenet.
Yeah, we did have some trolls there, but then the trolls discovered this "web" thing when these "discussion sites" showed up. =)

-- Weyfour WWWWolf, a lupine technomancer from the cold north...


[ Parent ]
It's worse than that (4.00 / 2) (#19)
by ChazR on Sun Nov 10, 2002 at 06:21:51 PM EST

I strongly agree with your reservations about NNTP.

When I read the article, the proposal to use NNTP screamed "BAD IDEA" to all my design and architecture senses. I've abused protocols enough to trust my senses without really thinking.

Now I've spent 15 seconds thinking, and I've come up with a lot of reasons why NNTP is a stupid idea for this.

NNTP is essentially a security disaster. Anybody can say anything, while pretending to be someone completely different. For Usenet, this is a huge problem, but most articles don't matter, so we live with it. RDF/RSS needs some authentication. All it needs is for a muppet to start posting update notices randomly, and we have a nasty DoS attack on the whole system.

NNTP supports cancel notices. What happens when Dodgy Denzil's News Emporium starts cancelling notices from CNN, then posting first to get a "scoop"?

Lots of other reasons that I'm too bored to mention.

Before you subvert a protocol to do something new, please make sure that your new purpose is isomorphic to a subset of the original purpose. Otherwise, you will get bitten in the arse.

Using a message-passing system would be possible. A subscribe/publish system would do almost everything needed. You just need good authentication at the subscribe step.

[ Parent ]

Funny how lots of ideas these days (none / 0) (#21)
by imrdkl on Sun Nov 10, 2002 at 07:14:18 PM EST

Can only get as far as reliable identity. People don't want just a guid anymore.

[ Parent ]
Problems with RSS over NNTP (none / 0) (#31)
by ajf on Mon Nov 11, 2002 at 09:51:37 AM EST

I'm not too worried about authentication or cancel messages. Cancel messages are already distrusted for the reasons you describe, even with messages that "don't matter". And I don't see why RSS over NNTP can't be signed the way source tarballs are on download servers; if the signature doesn't check out, you ignore the message. People could sign their Usenet messages today — it's just not that common because nobody thinks it's important enough most of the time. (Impersonators in human conversations are usually easy to spot because, once somebody notices something fishy is going on, their puerile behaviour gives them away.) For RSS it would be more important, obviously, and I think signed messages would work well.

I strongly disagree that this is "subverting" NNTP — apart from the fact that the messages aren't intended for human consumption, this is the reason NNTP exists.

The real problem with this proposal is lost messages. Usenet, contrary to what gpoul says in this part of the story, is not reliable at all. There's a pretty good chance RSS over Usenet would be as badly supported by the average ISP's news server as alt.binaries.* can be (ie, thousands of messages silently lost every day). On the other hand, if you want to run your own dedicated servers for this (whether they're NNTP servers or Jabber or any other protocol) you're just shifting the problem from the many bloggers to these few new aggregator proxies (for want of a better name). That will only make bandwidth expenses become a problem sooner.

There are definitely some advantages to the NNTP proposal. There is already a widespread network in place which effectively spreads out the bandwidth costs, and the server does not need to know anything about its clients (whether they're content producers or consumers). In principle the idea has a lot of merit, and I'd like to see somebody make the attempt, but there are definitely hurdles.


"I have no idea if it is true or not, but given what you read on the Web, it seems to be a valid concern." -jjayson
[ Parent ]

It's already been done (4.00 / 2) (#20)
by Talez on Sun Nov 10, 2002 at 06:51:57 PM EST

It also used good old HTTP rather than having to use NNTP commands.

I can't remember the exact name of it but here's the gist of it:

  • Each version of a news page has a version number
  • Each new item of news increases the version number of the RSS feed
  • The news agent sends back the latest version of the news it has
  • The server sends back the news that's missing
Reasons why your standard is bad:
  • It uses NNTP, which means an RSS agent has to support it
  • It uses NNTP, which will result in all content being returned in that stupid ASCII NNTP format. We now have a great majority of publishers using an XML standard delivered by HTTP. Why in the world would you even THINK about giving that away?
  • Your solution involves tracking EVERY user. Simply putting the two versions through diff and sending the difference makes much more sense and lowers the stress placed on the server.
I appreciate your thought and innovation but please don't reinvent the wheel with what appears to be a complicated and messy solution.

Si in Googlis non est, ergo non est
Client Side (4.00 / 2) (#23)
by chigaze on Sun Nov 10, 2002 at 11:03:08 PM EST

I think this matches with what I was thinking when reading this article: Why does the server have to keep track of the clients?

All the server should have to do is keep track of the news items. Each client should send a request to the server saying: I have everything up to 'here', give me everything after that.


-- Stop Global Whining
[ Parent ]
Problems (5.00 / 3) (#33)
by ajf on Mon Nov 11, 2002 at 10:18:09 AM EST

The solution you're talking about requires code executing on the web server. A lot of RSS feeds are generated by an application running on some computer other than the web server where the site lives. A number of free and subscription-based blogging services don't support anything other than static files uploaded at content publishing time. More to the point, your solution essentially reinvents what NNTP offers here, while sacrificing the advantages of distributing the load.

I really don't know what you mean when you contrast "that stupid NNTP ASCII format" with XML, since there's absolutely no reason you couldn't send XML over NNTP. The reason you would "think about giving that away" (whatever "that" may be) is simple: sending one copy of your RSS feed to a news server using NNTP is a lot cheaper than sending one thousand copies of it to each of your subscribers via HTTP.

Finally, as chigaze mentioned, the NNTP server doesn't have to track anything about the clients at all.

"I have no idea if it is true or not, but given what you read on the Web, it seems to be a valid concern." -jjayson
[ Parent ]

Using HTTP (in)correctly (none / 0) (#24)
by ukryule on Sun Nov 10, 2002 at 11:10:59 PM EST

Another solution that should be mentioned here uses a conditional HTTP GET to fetch RSS. This only retrieves the RSS file if it has changed since the last request.
Why are you even mentioning cookies and authentication, when the basic features of HTTP seem to solve the problem? (i.e. using the Last-Modified and If-Modified-Since headers).

As the article you link to mentions, for static RSS files it requires no extra work on the server side, and isn't too much work on the client side.

Incidentally, if you dynamically generate your RSS you could extend the concept by only returning the RSS data dated after the If-Modified-Since header you receive (e.g. if the file is made up of 3 headlines, you only return the 2 newest ones which haven't been viewed before). This is mild abuse of the HTTP protocol (the header implies you should either return all the content or a not-modified response), and it also requires all the RSS clients to expect this behaviour (i.e. they don't forget the older stories just because you haven't returned them). However, as long as it is expected behaviour in the RSS spec, I don't see a problem with it.
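
For completeness, a minimal polling loop that honours Last-Modified/If-Modified-Since might look like the following Python sketch (the feed URL is hypothetical):

    import urllib.request
    import urllib.error

    FEED_URL = "http://example.com/index.rss"   # hypothetical feed
    last_modified = None                        # remembered between polls

    def poll():
        global last_modified
        req = urllib.request.Request(FEED_URL)
        if last_modified:
            req.add_header("If-Modified-Since", last_modified)
        try:
            with urllib.request.urlopen(req) as resp:
                last_modified = resp.headers.get("Last-Modified", last_modified)
                return resp.read()              # feed changed: parse the new RSS
        except urllib.error.HTTPError as err:
            if err.code == 304:
                return None                     # 304 Not Modified: nothing new to fetch
            raise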

Doesn't meet requirement 1 (none / 0) (#25)
by gpoul on Mon Nov 11, 2002 at 12:07:06 AM EST

Using the conditional HTTP GET doesn't meet my requirement 1 that news should only be received once.

Your last paragraph, however, about using the headers to calculate what to return, seems interesting and worth thinking about further.

[ Parent ]

Fair enough (none / 0) (#26)
by ukryule on Mon Nov 11, 2002 at 05:38:22 AM EST

However, if you're restricting yourself to HTTP and statically generated pages, then your requirement 1 is going to be tricky to solve.

If you have got dynamically generated RSS though, HTTP headers must be a better way to go. Setting cookies, or requiring authentication seems like overkill (not only will you be required to keep a database of who you've told what, but it can also require multiple requests, thus increasing bandwidth). If you don't like abusing the if-modified-since header, you could always create a new one (e.g. changes-since), although you obviously want to be careful about adding stuff like that ...

If you wanted to get *really* cunning, you could even respond with the output of 'diff' from the previous request to the current one - a content type of 'diff-to-old-file'. Needs a bit of thought on the server side, but if you're already using a content management system then it's just a question of them implementing the feature.

[ Parent ]

Solutions (4.75 / 4) (#27)
by carlfish on Mon Nov 11, 2002 at 06:56:24 AM EST

(I wrote the Conditional GET guide linked above)

Conditional GET was a good first step towards solving the problem. Most news aggregators and content-management systems are moving quickly to support this (even Movable Type, which already generates static files, has been patched to make sure that during updates, it won't change the Last-Modified date on files that have changed.) I checked my access logs, and the day before I posted the article above, only 10% of requests for my RSS files were getting "304 Not Modified" responses. Ten days later, that was up to 44%. I suspect that figure will improve; many aggregator users haven't upgraded yet.

(My article didn't have much to do with the adoption, by the way, it was just a convenient summary of what lots of people were saying)

Gzip Content-Encoding (compressing files during transfer) can save more bandwidth, but it requires additional client/server negotiation, and most of us aren't friendly enough with our web-hosts to get them to install arbitrary Apache modules.

NNTP has the advantage that it's an existing protocol, but it has the disadvantage that the existing news infrastructure is totally unsuitable for secure, noise-free article distribution.

Dave Winer has implemented a server-level cache for RSS feeds, which works with his Radio Userland client base. This is an RSS concentrator: it focuses the bandwidth costs on the cache rather than on the individual RSS-producing sites. Which is fine if, like Userland, you are charging your users for the privilege of access, but is probably less workable as a general solution.

Instant messaging is probably the most workable solution. The correct fix is to change the mechanism for update notification so that it is server-triggered instead of client-triggered, removing the need for polling entirely, and to ensure delivery of the update information is multicast rather than pointcast. It wouldn't be hard to write a module for Jabber that would allow anyone to subscribe to a newsfeed, so that when a site updates, it automagically pushes the details of the updated article to all subscribed clients via the network of Jabber servers.

An additional advantage of the Jabber-based system is that it allows immediate notification as news comes in, rather than having to wait an hour for your RSS aggregator to cycle.
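
Leaving the Jabber wire protocol aside, the bookkeeping such a push service needs is fairly small. A rough sketch of the subscribe/queue/replay logic in plain Python (illustrative names only; a real deployment would sit behind a Jabber/XMPP server rather than print to the console):

    from collections import defaultdict

    online = set()                  # users currently connected
    pending = defaultdict(list)     # user -> items queued while that user was offline
    subscribers = defaultdict(set)  # channel -> users subscribed to it

    def publish(channel, item):
        """Called when a publisher posts a new item: push now or queue for later."""
        for user in subscribers[channel]:
            if user in online:
                deliver(user, item)             # immediate push while connected
            else:
                pending[user].append(item)      # save the user's feed status

    def connect(user):
        """On reconnect, replay everything missed, then resume live delivery."""
        online.add(user)
        for item in pending.pop(user, []):
            deliver(user, item)

    def deliver(user, item):
        print(f"deliver to {user}: {item['title']}")  # stand-in for the real transport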

Charles Miller

--

The more I learn about the Internet, the more amazed I am that it works at all.

Correction: "on files that -haven't- changed& (none / 0) (#28)
by carlfish on Mon Nov 11, 2002 at 07:17:45 AM EST

Oops.

[ Parent ]
Already done (none / 0) (#36)
by gehrehmee on Mon Nov 11, 2002 at 02:06:51 PM EST

Janchor. See also my ideas on what goals I want to pursue with this project: VISION.html

[ Parent ]
Sign them (5.00 / 1) (#38)
by gpoul on Mon Nov 11, 2002 at 02:27:17 PM EST

This was already discussed in another thread here but to use NNTP you just have to sign all RSS files you distribute over NNTP. This way you could filter out junk and all that useless stuff.

The added benefit would be that even if the web site were hacked, the attackers wouldn't get the private key, because you distribute RSS messages from your own home or mobile PC, which holds the key.

I'm glad that you like the Jabber idea. - It seems that there are many people here who at least like one of my ideas. But it seems that there isn't a particular one that everyone likes :)

That's why I'm glad I wrote this article.

[ Parent ]

RSS aggregators for the masses (4.33 / 3) (#30)
by Echo5ive on Mon Nov 11, 2002 at 09:33:48 AM EST

For those who want to try out RSS feeds, here are a bunch of aggregators for various operating systems.

  • Feedreader: the one I currently use in Windows. It still lacks a bunch of features, but looks promising. Sits in your system tray and blinks when a feed is updated - you can set individual update intervals for each feed, but currently not higher than 60 minutes, which will be changed in the next version.
  • AmphetaDesk: another Windows client; this one acts as a small web server that you access with your browser. It could probably be configured to be accessed from other computers as well, but I haven't tried that.
  • NetNewsWire Lite: a client for OS X. I don't have a Mac, but I've heard lots of good things about this client.
  • Straw: a client for Linux/Unix using GNOME 2. It depends heavily on libraries that are still unstable and themselves have lots of dependencies, so I haven't bothered to try this one yet. The screenshots look very nice, though.


--
Frozen Skies: mental masturbation.

Keeping state on the server (none / 0) (#34)
by mnot on Mon Nov 11, 2002 at 11:11:31 AM EST

Except for NNTP (which has its own issues), your solutions require the server to keep state about its clients. Although some sites do this with cookies, it ultimately doesn't scale to the scope of the Internet, which, to me at least, is a major goal for RSS.

What you may want to do is consider defining a mechanism like NNTP NEWNEWS for RSS. This would probably be most appropriate as a mechanism in the URI, e.g., http://www.example.com/feed.rss?since=thursday (trivial example)

Note that it's important not to dictate the URI structure in the format's specification; that's too rigid, and doesn't leave the choice to Webmasters, the people who should be making choices about URI structure (to accommodate local conventions, how the site is structured, etc.)

So, there should be an RSS module that states what URI convention should be in use, e.g., <rss_uri:convention>mnots_way_of_doing_it</rss_uri:convention> (again, a trivial example; it would be better to include a URI so that people can find out what mnot's way *is*, and it might just be better to use an element, e.g., <mnot:rss_uri_convention/>)
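
A toy version of such a since-based endpoint, using only Python's standard http.server, might look like the sketch below; the query parameter name, date format, and item structure are all hypothetical, exactly the kind of convention that should be left to the webmaster:

    # Sketch of a "send me everything newer than X" feed endpoint, in the spirit of NEWNEWS.
    from http.server import BaseHTTPRequestHandler, HTTPServer
    from urllib.parse import urlparse, parse_qs

    ITEMS = [  # newest first; in reality this would come from the content system
        {"guid": "3", "pubdate": 1037000000, "xml": "<item>...</item>"},
        {"guid": "2", "pubdate": 1036900000, "xml": "<item>...</item>"},
    ]

    class FeedHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            query = parse_qs(urlparse(self.path).query)
            since = int(query.get("since", ["0"])[0])   # e.g. /feed.rss?since=1036950000
            new = [i["xml"] for i in ITEMS if i["pubdate"] > since]
            body = ('<rss version="2.0"><channel>' + "".join(new) +
                    "</channel></rss>").encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/xml")
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("", 8080), FeedHandler).serve_forever()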

RSS flavours and RDF (5.00 / 4) (#35)
by Cato on Mon Nov 11, 2002 at 11:36:21 AM EST

RSS is more than just the Dave Winer spec (RSS 0.91 I think), and was originally defined (version 0.90) by Netscape, not Dave Winer. Other versions such as RSS 1.0 are also quite popular, because they are easily extensible to carry extra information for specific applications.

RDF (Resource Description Framework) is an entirely separate standard that is used by some RSS flavours as well as by other XML-based specifications; however, some RSS flavours don't use RDF at all, and use DTDs instead. Unfortunately the use of .rdf in filenames for RSS data means that some people talk about RDF as if it were RSS.

The difference between the flavours is not that great - once you have code that generates one type of RSS it's easy enough to generate another.  For an overview of the RSS specs and links to various RSS sites, including Ben Hammersley's blog, see http://twiki.org/cgi-bin/view/Codev/RichSiteSummary.

The cool thing about RSS is that you can apply it to blogs, Wikis, news sites, newsgroups (PHP code available at http://donkin.org/ ), email lists, and so on.  The not so cool thing is that it's not very widely used and I've yet to find a really nice desktop RSS reader - ideally, RSS would be just another protocol/format supported by POP3/IMAP email clients, and NNTP news readers.  That way you could just check your email/news and pick up your RSS feed updates at the same time.

Using other protocols such as NNTP or Jabber to deliver RSS feeds has some advantages, but in some cases it may be beneficial to lose old messages - just like reading blogs, I may not want to have 3 weeks of old messages when I get back from a holiday...

Just make a catch-up feature in reader (2.00 / 1) (#39)
by gpoul on Mon Nov 11, 2002 at 02:32:35 PM EST

As with your favorite usenet news reader you could make a catch-up feature in your RSS reader that would just mark everything as read.

This way you could have both things. Some people don't want to lose news items. - Or maybe you want to lose items in some channels (e.g. slashdot, freshmeat, ...) but not in others (e.g. CNN, k5, ...)

[ Parent ]

RSS for end users (none / 0) (#41)
by rompe on Tue Nov 12, 2002 at 07:52:21 AM EST

Two things nobody else pointed out:

First, people who use Gnus as their mail or news reader can also read RSS feeds like every other news source. No need for just another software.

Second, Abe Fettig is working on a project called HEP. At the moment it is a basic RSS<->Mail gateway, but there are plans to involve several other layers including Jabber. Very cool.

And the third of the two things is that I thought about feeding and getting fed in my blog recently. My conclusion for now was that as long as RSS feeds are widespread (they are relatively widespread) it won't be easy to get programmers and users to use another syndication "standard". So my idea regarding the bandwidth problem generated by all the unnecessary RSS polls is that the feeders should ping sites like weblogs.com or glo.gs (many blogs are already doing this), and as an aggregator you just have to poll one of these sites, grep for your wanted feeds, and only poll the positive matches.

That would be at least a short term solution.

Sorry, I have the feeling that my english is a bit broken - hope you read it anyway.

RSS reader list; simple solution (none / 0) (#42)
by Rademir on Tue Nov 12, 2002 at 08:52:07 PM EST

You may find this list of RSS readers interesting or useful. Please add any good options you see missing (it's a wiki page).

Okay, here's a simple solution that works for static files:

1) Use guids that are valid filenames.

2) Server maintains multiple static RSS files. One contains all N latest items. The second one has the latest N-1 items, the third N-2, etc. and the last has no items at all. The name of each is the guid of the most recent item *not* included in it. The longest one is also given a generic name like index.rss

3) The client requests the file named as the guid of the last item it got (with conditional GET of course). If the request returns a 404 error then the client grabs the generic file.

This does not solve the problem of missing items if you are offline for a while. Imho, the best solution to that is to maintain an always-online robot that archives them. In general, I think having such always-on robots should become as normal as having an e-mail address.
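
A client-side sketch of step 3 (plain Python; the base URL is hypothetical, and guids are assumed to be usable directly as filenames, per step 1):

    import urllib.request
    import urllib.error

    BASE_URL = "http://example.com/feeds/"   # hypothetical feed directory

    def fetch_new_items(last_guid):
        """Fetch the file named after the last guid we saw; fall back to the full
        feed if that file has already been rotated away (404)."""
        url = BASE_URL + (last_guid or "index.rss")
        try:
            with urllib.request.urlopen(url) as resp:
                return resp.read()            # contains only items newer than last_guid
        except urllib.error.HTTPError as err:
            if err.code == 404:
                with urllib.request.urlopen(BASE_URL + "index.rss") as resp:
                    return resp.read()        # may have missed items; take the full feed
            raise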


Consciousness is our Oxygen Challenge


just base it on time (none / 0) (#43)
by bolthole on Sun Nov 24, 2002 at 02:17:24 AM EST

What the heck is so complicated... Just have a

"gimme articles seen since [date in UTC]"

And have the server return ITS current timestamp, with the list of articles.

It doesn't matter if the user's device is set to the correct time. It just has to record what the SERVER thinks was the correct time, the last time the device polled it.

Even old NNTP has some kind of equivalent, if I recall correctly, although I don't remember the specifics.


Content of RSS feed headers? (none / 0) (#44)
by swisswuff on Thu Dec 26, 2002 at 08:59:33 AM EST

Why do I think RSS is cool? Because it is extremely fast. I get the 'seemingly relevant bits' - the headers - and I decide what I want to read by going on the web to see the whole thing. Using RSS feeds both ways - serving and consuming them - I find that one-liner titles with, say, up to 160 characters, would be perfectly enough. No articles need to be transmitted with this. The total information contained should be kept to a formal minimum.

The necessary requirement is something not (yet) examined: What is an intelligent title? What is a title that conveys what the content is about? Coding content must be the result of a compromise, a resolution, maybe an agreement. Medical database information applies keywords out of a list called MeSH (Medical Subject Headings). It would surely help if there was an RSSSH (RSS Subject Headings) directory that tells you how to spell, how to encode, what format to use, what questions mean, what short statements mean. Anything beyond that would be better served by a content indexing mechanism such as Google. A number of 5-8 of these titles is also enough. The rest of the article should be found by browsing around, using search engines and website layout with 'topics' and 'full website text search'.

Hence, a proposed 'maximum content' title of this feed here would be:

[Technical] Shortcomings and proposed solutions of RSS systems

Other high-information-content ways to format RSS feed titles could be:

[Cynical] How to ruin American Enterprise
[1984] Coffee, Tea or Should We Feel...

The Inquirer (none / 0) (#45)
by muirhead on Fri Aug 22, 2003 at 04:31:12 AM EST

There's a reciprocal link here to this story over at the Inquirer.

Whoo, does my head in.



FeedDemon (none / 0) (#46)
by rdenny on Thu Sep 11, 2003 at 09:34:23 PM EST

Nick Bradbury (of HomeSite and TopStyle fame) has developed an RSS aggregator/reader for Windows, FeedDemon. Though still in beta, it rocks. It can handle RDF (so-called RSS 1.0) and RSS 2.0 formats, and is extremely user friendly and well-integrated with the Windows desktop.

