[Peerpress-xml] Re: [Peerpress-main] Anyone alive out there?

Tue Jan 2 11:54:40 CET 2001

It's interesting to see that people are still interested in Peer Press. As for
what happened to it back in spring 2k, I'm not 100% sure, except for the fact
that I suddenly got a lot less free time, and the list discussion died.

I'd like to get back to doing *something* with PP, and I have some ideas on
how we can accomplish things quickly. Some of the things I propose might be
different from what I thought 8-9 months ago. Let's just say I've grown wiser
about what's required to make things work. My current goal would be to work
out something that would let people exchange news stories reasonably quickly.
The rest can wait.

I've been looking at existing protocols and formats again today, to try to
see where we stand and what we'd need to create ourselves. Happily, it seems
the situation is a lot better now than it was the last time we discussed
this. Specifically, RSS 1.0 has done a lot of the work for us.

With this in mind, I suggest taking the following steps. I can do most of
these, although I'd prefer it if I didn't have to do all of them by myself:

1) Define an XML format for a response to a query of the type "What new items
do you have since <date>". It should list all the news items in a format that
specifies the date, headline, taxonomy, licensing terms, URL to get the full
item, and possibly other data (to be discussed). It seems to me that RSS 1.0
fills this role very well, although I haven't finished reading up on it. The
good thing is that RSS 1.0 supports extension modules (they already have a
proposed one for Slash-specific data). So whatever we need in the summary
format that RSS 1.0 doesn't support out of the box, we can add in a module.
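
As a rough illustration of step 1, here is a minimal Python sketch of what a
"new items since <date>" response could look like and how a client might read
it. The RSS 1.0, RDF and Dublin Core namespaces are the real ones; the "pp"
namespace URI, the pp:license element, the example.org URLs and the
items_since() helper are all assumptions made up for this sketch, not an
agreed format.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
    # Hypothetical namespace for a Peer Press extension module.
    "pp": "http://example.org/peerpress/1.0/",
}

# Hypothetical response document: plain RSS 1.0 plus dc: and pp: elements.
SAMPLE = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:pp="http://example.org/peerpress/1.0/">
  <channel rdf:about="http://news.example.org/pp/articles">
    <title>Example News</title>
    <link>http://news.example.org/</link>
    <description>Items available for syndication</description>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://news.example.org/pp/articles/1"/>
        <rdf:li rdf:resource="http://news.example.org/pp/articles/2"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://news.example.org/pp/articles/1">
    <title>First story</title>
    <link>http://news.example.org/pp/articles/1</link>
    <dc:date>2000-12-30</dc:date>
    <dc:subject>software/free</dc:subject>
    <pp:license>free</pp:license>
  </item>
  <item rdf:about="http://news.example.org/pp/articles/2">
    <title>Second story</title>
    <link>http://news.example.org/pp/articles/2</link>
    <dc:date>2001-01-02</dc:date>
    <dc:subject>politics</dc:subject>
    <pp:license>pp-exclusive</pp:license>
  </item>
</rdf:RDF>
"""


def items_since(xml_text, since):
    """Return summary dicts for items dated on or after `since` (YYYY-MM-DD)."""
    root = ET.fromstring(xml_text)
    found = []
    for item in root.findall("rss:item", NS):
        date = item.findtext("dc:date", default="", namespaces=NS)
        # Dates in the same YYYY-MM-DD form compare correctly as strings.
        if date >= since:
            found.append({
                "title": item.findtext("rss:title", default="", namespaces=NS),
                "url": item.findtext("rss:link", default="", namespaces=NS),
                "date": date,
                "taxonomy": item.findtext("dc:subject", default="", namespaces=NS),
                "license": item.findtext("pp:license", default="", namespaces=NS),
            })
    return found


for entry in items_since(SAMPLE, "2001-01-01"):
    print(entry["date"], entry["license"], entry["title"])
```

Keeping the Peer Press fields in their own namespace means an ordinary RSS 1.0
reader can still use the list and simply ignore what it doesn't understand.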

Note: To do this completely, we need to resolve the licensing issues. I
remember discussing this with Raph at GUADEC, and what we came up with was to
basically have three levels of licensing:

* PP-exclusive, which means that only PP member sites can republish this (fair
  use of course still applies).
* For-pay relicensing, which means that PP member sites can republish this as
  above, and additionally, the author can be contacted for republishing outside
  of PP, presumably for a negotiated compensation.
* Free, which means the content can be republished freely inside and outside
  PP, needing only an attribution line.

We should discuss this more. The levels do need to be clearly defined, so
that people can filter on them and auto-publish.
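
To show why clearly defined levels matter for filtering and auto-publishing,
here is a small Python sketch of the three levels described above. The
identifier strings ("pp-exclusive", "for-pay", "free") and the function names
are placeholders invented for this example; the real names and exact rules
still need to be agreed on.

```python
from enum import Enum


class License(Enum):
    # Hypothetical identifiers for the three proposed levels.
    PP_EXCLUSIVE = "pp-exclusive"
    FOR_PAY = "for-pay"
    FREE = "free"


def may_auto_publish(license, is_pp_member):
    """True if a site may republish without first contacting the author."""
    if license is License.FREE:
        # Free content may be republished by anyone (with attribution).
        return True
    # PP-exclusive and for-pay content is open to PP member sites only;
    # for-pay content needs a negotiated deal for anyone outside PP.
    return is_pp_member


def auto_publishable(items, is_pp_member):
    """Keep only the items a site could publish automatically."""
    return [i for i in items if may_auto_publish(i["license"], is_pp_member)]


stories = [
    {"title": "First story", "license": License.FREE},
    {"title": "Second story", "license": License.PP_EXCLUSIVE},
    {"title": "Third story", "license": License.FOR_PAY},
]
print([s["title"] for s in auto_publishable(stories, is_pp_member=False)])
```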

2) Define an XML format for a single item's full data. This might be a little
more work, since RSS 1.0 doesn't currently do this in a very good way. We
might want to use XHTML 1.0, or possibly a subset of it. It needs to include
roughly the text markup functionality of XHTML, some paragraph-level markup,
possibly simple tables, and have functions for linking to related media (with
anchor points in the text, much like the <fig> element from the HTML 3.0
draft). It would be fairly easy to draw up our own format here, since we can
just borrow bits from other formats (or, alternately, just use the different
formats through namespaces, but that would be a little bit too patchwork-like
for my tastes).

3) Figure out the optimal way to use HTTP to transfer stuff, and write some
prototype scripts that implement this. Yes, I was opposed to HTTP the last
time. No, I'm not any more. We should make a well-defined, standardized set
of URLs for fetching content from a site. This would involve specifying a
base URL, under which one could assume the URL schema to follow the standard.
The HTTP transfer should only be one of several possible ways to participate
in Peer Press, and it's the first just because it's the simplest to
implement. In the future, I'd like to see something like SOAP used for the
same purpose.

The functionality that should be present in the first generation scripts
should be the following:

* Get a list of articles (filter on at least date, license, author and taxonomy).
* Get an individual article.
* Get the individual media bits that are linked from an article (this isn't
  really a part of the functionality, since these are just URLs).
* Send feedback on an article (editorial feedback, which should go to the site
  owner and the author).
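
The list above could map onto URLs under a site's base URL roughly as in the
following Python sketch. The path names ("articles", "feedback"), the query
parameter names and the example.org base URL are assumptions for illustration
only; the actual schema is exactly what needs to be standardized.

```python
from urllib.parse import quote, urlencode


def _join(base_url, path):
    """Append a relative path to a base URL, with exactly one slash between."""
    return base_url.rstrip("/") + "/" + path


def article_list_url(base_url, since=None, license=None, author=None,
                     taxonomy=None):
    """URL for the article list, filtered on any of the given fields."""
    filters = [("since", since), ("license", license),
               ("author", author), ("taxonomy", taxonomy)]
    params = [(name, value) for name, value in filters if value is not None]
    url = _join(base_url, "articles")
    return url + "?" + urlencode(params) if params else url


def article_url(base_url, article_id):
    """URL for a single article's full data."""
    return _join(base_url, "articles/" + quote(str(article_id), safe=""))


def feedback_url(base_url, article_id):
    """URL that editorial feedback on an article would be sent to."""
    return article_url(base_url, article_id) + "/feedback"


base = "http://news.example.org/pp"
print(article_list_url(base, since="2001-01-01", license="free"))
print(article_url(base, 42))
print(feedback_url(base, 42))
```

Media linked from an article needs no helper here, since those are just plain
URLs inside the article itself.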

Optional functionality (that will need to be thought about, at least):

* Standard URL schema for proxying news from other sites. I'd like to set up a
  core Peer Press site at some point that will proxy all the registered
  members. Members should be encouraged to proxy for each other. This would
  be no more difficult than to specify a standard add-on to the base URL,
  under which the hierarchy would be the same.

* Please add necessary functionality here.
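
For the proxying idea, a minimal Python sketch of the "standard add-on to the
base URL" could look like this. The "proxy" path segment, the use of the
member's host name as its identifier, and the example.org URLs are all
assumptions; the only point is that the hierarchy under the proxied base is
the same as on the original site.

```python
from urllib.parse import quote


def proxied_base_url(proxy_base_url, member_id):
    """Base URL under which a proxying site re-serves another member's content.

    Everything below the returned URL is assumed to follow the same schema
    as the member's own base URL.
    """
    return (proxy_base_url.rstrip("/") + "/proxy/"
            + quote(member_id, safe="") + "/")


core = "http://core.example.org/pp/"
member_base = proxied_base_url(core, "news.example.org")
print(member_base)
# A client appends the usual relative path to the proxied base.
print(member_base + "articles")
```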

4) Create an XML format for listing sites. This would be used by the main PP
   site (and whoever wants to mirror it, as well as for pushing into Freenet
   or whatever) to list all sites that are PP members, and their base URLs.
   This should probably also contain stuff like a one-line description of
   the site, a taxonomy list, what types of licenses the content originating
   at the site uses, contact addresses, etc.
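
A possible shape for such a site list, and a way to read it, is sketched below
in Python. Every element name (sites, site, name, base-url, description,
taxonomy, license, contact) and all the sample data are invented for this
example; nothing here is a settled DTD.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical site list document.
SAMPLE_SITES = """\
<sites>
  <site>
    <name>Example News</name>
    <base-url>http://news.example.org/pp/</base-url>
    <description>General news with a free software slant</description>
    <taxonomy>software/free</taxonomy>
    <taxonomy>politics</taxonomy>
    <license>free</license>
    <license>pp-exclusive</license>
    <contact>editor@news.example.org</contact>
  </site>
  <site>
    <name>Example Science</name>
    <base-url>http://science.example.org/pp/</base-url>
    <description>Science reporting</description>
    <taxonomy>science</taxonomy>
    <license>for-pay</license>
    <contact>editor@science.example.org</contact>
  </site>
</sites>
"""


@dataclass
class Site:
    name: str
    base_url: str
    description: str
    taxonomy: list
    licenses: list
    contacts: list


def parse_sites(xml_text):
    """Turn a site list document into a list of Site records."""
    root = ET.fromstring(xml_text)
    sites = []
    for node in root.findall("site"):
        sites.append(Site(
            name=node.findtext("name", default=""),
            base_url=node.findtext("base-url", default=""),
            description=node.findtext("description", default=""),
            taxonomy=[t.text or "" for t in node.findall("taxonomy")],
            licenses=[l.text or "" for l in node.findall("license")],
            contacts=[c.text or "" for c in node.findall("contact")],
        ))
    return sites


for site in parse_sites(SAMPLE_SITES):
    print(site.name, site.base_url, site.licenses)
```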

5) Create a standard HTTP access method for the site lists. This would be
   similar to the story list method, except filtering on sites'
   characteristics instead.
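
On the server side, filtering the site list from a query string could be as
simple as the Python sketch below. The parameter names ("taxonomy",
"license"), the exact-match rule and the sample records are assumptions made
for this example, not part of any agreed access method.

```python
from urllib.parse import parse_qs

# Hypothetical in-memory site records.
SITES = [
    {"name": "Example News",
     "base_url": "http://news.example.org/pp/",
     "taxonomy": ["software/free", "politics"],
     "licenses": ["free", "pp-exclusive"]},
    {"name": "Example Science",
     "base_url": "http://science.example.org/pp/",
     "taxonomy": ["science"],
     "licenses": ["for-pay"]},
]


def filter_sites(sites, query_string):
    """Return the sites matching every filter present in the query string.

    A site matches a filter if it lists at least one of the requested
    values; a filter that is absent from the query matches every site.
    """
    query = parse_qs(query_string)
    wanted_taxonomy = set(query.get("taxonomy", []))
    wanted_licenses = set(query.get("license", []))
    matches = []
    for site in sites:
        if wanted_taxonomy and not wanted_taxonomy & set(site["taxonomy"]):
            continue
        if wanted_licenses and not wanted_licenses & set(site["licenses"]):
            continue
        matches.append(site)
    return matches


print([s["name"] for s in filter_sites(SITES, "license=free")])
```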

And that's it. With these five tasks finished, we have a functioning, basic
Peer Press network. Sites can syndicate content from each other, and users
can get a total list of all sites in PP (so that one can create nice custom
portals for people to build their own PP frontpages, etc.)

Basically, I just dropped all the fancy functionality for this proposal. It'd
go into 2.0, in my opinion. The advantage is that we could get a number of
member sites and show the utility, fast. This will in turn get more people
available for adding the fancy features for 2.0.

Please give me your comments. I'm especially interested in what features
should go into the XML DTDs for 1.0. I can crank out the XML DTDs myself,
probably, and if needed, I can also code up a small PHP implementation of the
HTTP transfer system, backed by MySQL or Postgres. Feedback is welcome, help
even more so. When the proof of concept is working, I'd love to see Perl and
Python modules, Apache modules in C, etc. all implementing the same scheme,
as well as support patches for Slash, Squishdot, Scoop, and mod_virgule, so
it'll be easy to deploy this.

-- 
Joakim Ziegler - Helix Code web monkey - joakim at helixcode.com - Radagast at IRC
      FIX sysop - free software coder - FIDEL & Conglomerate developer
            http://www.avmaria.com/ - http://www.helixcode.com/