[Peerpress-xml] Taxonomy.
Joakim Ziegler
joakim at helixcode.com
Sun Jan 7 06:58:18 CET 2001
As I started looking at the data schemas, I realized the need for a good
taxonomy system. Both sites and individual stories should have one or several
taxonomy keys attached to them, so people can filter easily.
I've looked at different taxonomy and archival systems, but haven't really
found any that seem to cover all our needs. The ones I've found are either
for archiving books (Dewey Decimal) or websites (the Dmoz.org hierarchy).
So it seems we're going to have to make our own. This doesn't have to be as
huge a task as it seems. We can create a base structure as v0.1, grow it
organically for a few iterations, then go over it, revise it, and release
the result as 1.0.
There are actually a couple of hierarchies I'd like to use for Peer Press:
* Topical.
The main hierarchy for most purposes. I'm creating a draft for this. So
far, I've gone over and recorded the categorization used by a lot of
largish sites: sites directly relevant to us (Kuro5hin.org), general news
sites (CNN.com, BBC News), and the Dmoz.org "News by subject"
subhierarchy. I'm seeing quite a few patterns here, and I'm hoping that
when I'm done recording this, I can pull them together and synthesize
something like a three-level hierarchy for categorizing news stories by
subject. Since it's often difficult to choose just one category, it
should be possible to assign at least two or three different categories
per story.
* Geographical.
We should create a geographical hierarchy to place stories in. Not all
stories would use this, since general technical stories, for instance,
don't have a geographical association. However, a lot of stories do, and
this will be a practical filter. I'm hoping something along the lines of:
Planet->Continent/large scale region->Country->State/administrative
region->County->City/Town will be satisfactory. I'm not sure we need to
actually specify the hierarchy for anything below the State level, though,
as that would make it very large. Planet is somewhat ridiculous, but since
space news seems very popular, I thought it'd be a good idea to include it.
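
To make the two hierarchies above a bit more concrete, here is a minimal
sketch of how taxonomy keys might be attached to stories and used for
filtering. It is only an illustration under stated assumptions: the names
TaxonomyKey, Story and filter_stories, as well as every category and place
name in the sample data, are hypothetical and not part of any existing Peer
Press schema. A key is modelled as a path from the top of one hierarchy
downwards, and a story matches a filter if any of its keys lies at or below
the requested path.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class TaxonomyKey:
    """A path through one hierarchy, from the top level downwards."""
    hierarchy: str           # assumed values: "topical" or "geographical"
    path: Tuple[str, ...]    # e.g. ("Earth", "Europe", "Norway")

    def is_under(self, other: "TaxonomyKey") -> bool:
        # True if this key equals `other` or sits somewhere below it
        # in the same hierarchy.
        return (self.hierarchy == other.hierarchy
                and self.path[:len(other.path)] == other.path)


@dataclass
class Story:
    title: str
    # A story may carry several keys, e.g. two or three topical ones,
    # and a geographical one only when it has a geographical association.
    keys: List[TaxonomyKey] = field(default_factory=list)

    def matches(self, wanted: TaxonomyKey) -> bool:
        return any(key.is_under(wanted) for key in self.keys)


def filter_stories(stories: List[Story], wanted: TaxonomyKey) -> List[Story]:
    """Return the stories filed at or below the wanted key."""
    return [story for story in stories if story.matches(wanted)]


# Hypothetical sample data; none of these categories come from a real draft.
stories = [
    Story("New kernel release", [
        TaxonomyKey("topical", ("Technology", "Software", "Operating systems")),
    ]),
    Story("Oslo election", [
        TaxonomyKey("topical", ("Politics", "Elections", "Local")),
        TaxonomyKey("geographical", ("Earth", "Europe", "Norway")),
    ]),
    Story("Mars probe", [
        TaxonomyKey("topical", ("Science", "Space", "Exploration")),
        TaxonomyKey("topical", ("Technology", "Aerospace", "Probes")),
        TaxonomyKey("geographical", ("Mars",)),
    ]),
]

european = filter_stories(stories, TaxonomyKey("geographical", ("Earth", "Europe")))
technology = filter_stories(stories, TaxonomyKey("topical", ("Technology",)))
```

Storing keys as paths means the geographical hierarchy does not have to be
spelled out in full below the State level: a story can simply carry a
deeper path, and filters at higher levels will still match it.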
That's about it, I suppose. Comments? Suggestions? Criticism?
--
Joakim Ziegler - Helix Code web monkey - joakim at helixcode.com - Radagast at IRC
FIX sysop - free software coder - FIDEL & Conglomerate developer
http://www.avmaria.com/ - http://www.helixcode.com/