[PP-main] categories - solved?

Joakim Ziegler joakim at simplemente.net
Fri Mar 31 20:42:41 CEST 2000

On Thu, Mar 16, 2000 at 12:58:19PM +1000, lists at itsg.net.au wrote:

> dmoz.org (the open directory project) has an extremely large set
> of well defined categories, which are available for free use
> under the open content license. they (the categories) also
> happen to be in XML (RDF), which is very convenient 8^)

> http://dmoz.org/rdf.html

> for more information, full downloads and samples.

After taking a look, I'm a little doubtful that categories from a Yahoo-style
directory will map directly onto news item categorization. I'd be happy to be
proven wrong, though.

Anyway, I'm looking a bit at how AP and other established agencies do this,
and it seems there are a few things they have in common. All stories tend to
get categorized by at least geographical area and one or two general levels
of category (like politics->defense). The number of categories is
surprisingly low, and tends to be at its most detailed in the areas of
economy and finance (which is also the area where timeliness and the ability
to sort and filter automatically are most important).
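
To make that concrete, here's a rough sketch of what such a scheme could look
like as data: one geographical area plus a short category path. This is only
my illustration, not how AP or anyone else actually does it. The class name,
the region codes and the category names are all made up for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class StoryCategory:
    """Hypothetical tag for one story: a region plus a shallow category path."""
    region: str              # assumed region code, e.g. "EU" (invented here)
    path: Tuple[str, ...]    # one or two levels, e.g. ("politics", "defense")

    def label(self) -> str:
        # Render the path the way it is written in the text: politics->defense
        return "->".join(self.path)


def matches(cat: StoryCategory,
            region: Optional[str] = None,
            prefix: Tuple[str, ...] = ()) -> bool:
    """Filter helper: True if the story is in the region (when given) and
    its category path starts with the given prefix."""
    if region is not None and cat.region != region:
        return False
    return cat.path[:len(prefix)] == tuple(prefix)


story = StoryCategory(region="EU", path=("politics", "defense"))
```

With something like this, filtering on a top-level category such as
("politics",) also picks up everything filed under politics->defense, which is
the kind of automatic sorting and filtering that matters for the finance feeds.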

I'm a bit afraid of overcategorizing. There's no point in having several
hundred categories if each one gets a story every three weeks. They need
to be meaningful, and there need to be enough stories in each one to form a
pattern. Of course, sub-categories are always a handy thing to have.
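
One way to check for this once there's some real traffic is to count stories
per category over a period and flag the ones that are too thin. A minimal
sketch, assuming stories are tagged with category paths as tuples; the
function name and the one-story-per-week threshold are arbitrary choices of
mine, not something taken from any agency.

```python
from collections import Counter


def sparse_categories(story_paths, weeks, min_per_week=1.0):
    """Return the category paths that received fewer than min_per_week
    stories per week, on average, over the given number of weeks.

    story_paths: iterable of category paths, one per story,
                 e.g. ("politics", "defense").
    The threshold is an assumption for illustration only.
    """
    counts = Counter(story_paths)
    return sorted(path for path, n in counts.items()
                  if n / weeks < min_per_week)


# Invented sample: six weeks of stories in two categories.
sample = [("politics", "defense")] * 2 + [("economy", "markets")] * 30
thin = sparse_categories(sample, weeks=6)
```

Categories that show up in the result are candidates for merging back into
their parent category.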

Joakim Ziegler - simplemente r&d director - joakim at simplemente.net
 FIX sysop - free software coder - FIDEL & Conglomerate developer
      http://www.avmaria.com/ - http://www.simplemente.net/
