Wednesday, February 09, 2005

Lessons Learned: Getting data about the "nptech" tag

After poking around and convincing myself I couldn't find the answer on the web, I posted a request for tag analysis help to the del.icio.us discussion list. I wanted the help I'd asked for here.

I found that I couldn't get what I wanted, at least not in some highly elegant, organized way. I can port the HTML of the nptech pages into CSV and go from there with some initial analysis.
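For anyone wanting to try the HTML-to-CSV step, here is a minimal sketch in Python. It assumes only that each bookmark shows up as an ordinary link in a saved copy of the page; the real markup of the nptech pages may need more specific handling, and the names LinkCollector and html_to_csv are made up for this illustration.

```python
import csv
import io
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs from every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the link currently being read, if any
        self._text = []     # pieces of text seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def html_to_csv(html_text):
    """Return CSV text with one row per link found in html_text."""
    parser = LinkCollector()
    parser.feed(html_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["url", "description"])
    writer.writerows(parser.links)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical snippet standing in for a saved tag page.
    sample = '<li><a href="http://example.org/a">Example A</a></li>'
    print(html_to_csv(sample))
```

This grabs every link on the page, navigation included, so the resulting CSV would still need filtering by hand or by URL pattern before any counting.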

But I got a lot more.

Through the discussion (I have 30 emails in the thread, though some of them were off-list), I learned a lot about a big flaw in the experiment and ways to think about getting past that flaw.*

The flaw, at least as I read it, is that the relative obscurity and arbitrary nature of the "nptech" tag excludes any users who don't find out about the tag directly (via listserv or blog posts, for example) or who don't tag surf to it and then begin to use it.

So the initial project goals:

  1. collect bookmarks having to do with nonprofit technology;
  2. find unknown users;
  3. provide a weighting, represented by the number of users bookmarking an item, for bookmarks; and,
  4. collect descriptive words to inform a more formal taxonomy project.

are undermined by the exclusive nature of the proposed unifying tag.

This leads me to the conclusion that, once some tag analysis has been done, we should do some analysis on the most bookmarked URLs to stretch beyond the users who use "nptech." That will certainly inform the first three goals. But it will really inform the last. We will be able to collect additional words and word groups for the taxonomy process.
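As a rough sketch of that second pass, assume the bookmarks have already been gathered into (user, url, tags) records. That record shape, the function names, and the sample data are all assumptions for illustration, not anything del.icio.us provides in this form. The idea is to weight each URL by its number of distinct users, take the most bookmarked URLs that carry "nptech," and then count the other tags that anyone attached to those same URLs.

```python
from collections import Counter, defaultdict


def users_per_url(bookmarks):
    """Weight each URL by the number of distinct users who bookmarked it."""
    users = defaultdict(set)
    for user, url, _tags in bookmarks:
        users[url].add(user)
    return {url: len(names) for url, names in users.items()}


def tags_beyond_seed(bookmarks, seed_tag="nptech", top_n=10):
    """Return the top seed-tagged URLs and counts of the other tags on them."""
    # URLs that at least one user marked with the seed tag.
    seed_urls = {url for _user, url, tags in bookmarks if seed_tag in tags}
    weights = users_per_url(bookmarks)
    # Most bookmarked first; ties broken alphabetically for stable output.
    top = sorted(seed_urls, key=lambda u: (-weights[u], u))[:top_n]
    top_set = set(top)
    counts = Counter()
    for _user, url, tags in bookmarks:
        if url in top_set:
            # Include tags from users who never used the seed tag at all.
            counts.update(t for t in tags if t != seed_tag)
    return top, counts


if __name__ == "__main__":
    # Hypothetical records: (user, url, set of tags).
    bookmarks = [
        ("alice", "http://example.org/a", {"nptech", "nonprofit"}),
        ("bob", "http://example.org/a", {"ngo", "software"}),
        ("carol", "http://example.org/a", {"nonprofit"}),
        ("alice", "http://example.org/b", {"nptech", "fundraising"}),
    ]
    top, counts = tags_beyond_seed(bookmarks, top_n=1)
    print(top, counts.most_common())
```

In the sample, bob and carol never use "nptech," yet their tags are counted because they bookmarked a URL that someone else did tag that way. That is the stretch beyond the original tag users.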

Today was, in my book, an absolutely fabulous learning day.

*It's my policy not to quote anything that I haven't been given express permission to quote and/or can't be linked to on the web.
Update: Thanks to hybernautics, I found the delicious-discuss archives. You can view the thread "tag analysis help" that provided much of the thought behind this post.

(in: nptech_about, tagging, classification)