aboutness

A lot of writing in general, and a lot of web pages specifically, concerns some topic. That writing is about a topic. There are a bunch of different ways of figuring out what something is about, and many of these are hilariously wrong. But that isn’t what this post is about. This post is about ‘aboutness assertions,’ which are how you say what things are about once you’ve decided that something is about something.

Sounds confusing? It gets worse, but more interesting…

[note that this post will get re-edited from time to time based on other posts in this category.]

A topic is a concept. It can be a simple concept like ‘whales’ or ‘Wales’ that physically exists in space-time, or it can be a complex thing like ‘disintermediation’ or ‘velocity of light’ that describes properties of matter or ways that people understand the universe. There is a set of people who are skilled in figuring out how to make these assertions, and they are generally called things like Information Architects or Taxonomists. People also have an intuitive understanding of what things are about, and this is the basis of folksonomy.

An aboutness assertion is a link between some resource and one of these topics. A ‘resource’ can be virtually anything that can be identified, but for the sake of argument let’s say that it is a URL.1 The assertion is then the link between the topic and the resource.
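To make that concrete, here is a minimal sketch of an assertion as a data structure. The class name, the field names, the example.org URLs, and the two lookup helpers are all made up for illustration; the only thing taken from the definition above is that an assertion pairs a resource with a topic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AboutnessAssertion:
    """One link between a resource and a topic (hypothetical structure)."""
    resource: str  # a URL (strictly a URI) identifying the thing described
    topic: str     # the concept the resource is asserted to be about


# Illustrative data only; these URLs do not refer to real pages.
assertions = {
    AboutnessAssertion("http://example.org/cetaceans", "whales"),
    AboutnessAssertion("http://example.org/cetaceans", "oceans"),
    AboutnessAssertion("http://example.org/cardiff", "Wales"),
}


def topics_for(resource, assertions):
    """Return every topic the resource is asserted to be about."""
    return sorted(a.topic for a in assertions if a.resource == resource)


def resources_about(topic, assertions):
    """Return every resource asserted to be about the topic."""
    return sorted(a.resource for a in assertions if a.topic == topic)


print(topics_for("http://example.org/cetaceans", assertions))
print(resources_about("Wales", assertions))
```

Because the link is symmetric, the same set of assertions answers both questions: what is this page about, and which pages are about this topic.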

There are a bunch of different things that you can derive aboutness assertions from, but the three that I’m going to talk about for the next couple of posts in this blog are content, taxonomy, and folksonomy. There’s also a fourth interesting source of aboutness assertions, which is re-mining existing aboutness assertions.

Content

A grossly simplified view of what a search engine does is that it takes words in and returns links to documents that it asserts are about those words. Now, the kind of assertion it’s making (that the document contains those words) is very weak. The user had something in mind when they made that query, though. For the most part people aren’t just looking for documents that contain random words; they have a query formulated in their mind that they then turn into words when they enter it into an engine. This is as close to aboutness as you can get from the simplest search engines (things like Google and MSN Search are beyond this).
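A rough sketch of that weakest kind of assertion, "the document contains those words," might look like the following. The sample documents, URLs, and function names are invented for the example, and real engines do a great deal more than this.

```python
from collections import defaultdict

# Illustrative documents only; the URLs are placeholders.
documents = {
    "http://example.org/a": "whales are large marine mammals",
    "http://example.org/b": "Wales is a country in Great Britain",
    "http://example.org/c": "watching whales off the coast of Wales",
}


def build_index(docs):
    """Map each lowercased word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in docs.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return URLs containing every word of the query (a plain AND match)."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


index = build_index(documents)
print(sorted(search(index, "whales")))
print(sorted(search(index, "whales Wales")))
```

Note that the match says nothing about what the searcher meant. A page that mentions whales once in passing is returned just as readily as a page that is entirely about them.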

Taxonomy

A Taxonomy (or Controlled Vocabulary generally) is an organized system of topics that people can use to make assertions about resources. This is ‘taxonomy’ as the term is used in information architecture, not ‘Taxonomy’ in the purer Information Science sense. These taxonomies can be general (like the Library of Congress Subject Headings or the Dewey Decimal System, which are both used in libraries), large and specific to a focus area (like the NIH’s MeSH headings for describing health issues), or small and specific to a single company or organization’s concerns or line of business. These taxonomies are generally maintained by skilled professionals called taxonomists, and what they do is analyze the concept areas that the vocabulary needs to cover, and then build a system. When people work with these systems, they use particular terms from the vocabulary to index a document.
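Here is a small sketch of what indexing against a controlled vocabulary could look like. The tiny taxonomy, its broader-term hierarchy, and the function names are all assumptions made for the example; real vocabularies such as LCSH or MeSH are far larger and have richer relationships.

```python
# Hypothetical taxonomy: each term maps to its broader term (None at the top).
TAXONOMY = {
    "animals": None,
    "mammals": "animals",
    "whales": "mammals",
    "places": None,
    "Wales": "places",
}


def index_document(url, terms, taxonomy=TAXONOMY):
    """Return (url, term) assertions, rejecting terms outside the vocabulary."""
    unknown = [t for t in terms if t not in taxonomy]
    if unknown:
        raise ValueError(f"terms not in controlled vocabulary: {unknown}")
    return {(url, t) for t in terms}


def broader_terms(term, taxonomy=TAXONOMY):
    """Walk up the hierarchy from a term to its top-level ancestor."""
    chain = []
    parent = taxonomy[term]
    while parent is not None:
        chain.append(parent)
        parent = taxonomy[parent]
    return chain


print(index_document("http://example.org/a", ["whales"]))
print(broader_terms("whales"))
```

The point of the restriction is consistency: everyone indexing a document about whales ends up using the same term, and the hierarchy lets a search on a broader term pick it up as well.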

Folksonomy

A folksonomy, in contrast, lets people make aboutness assertions using whatever terms they want. For that reason, it’s frequently called ‘democratic classification’ or ‘open source classification.’ As far as I can tell, it actually has no relation to open source as a concept whatsoever. Folksonomy is useful for two main reasons. First, an individual person is usually able to figure out why they made their own assertions. Second, if you have enough people making assertions, the words they use to describe any particular topic form big ‘clouds’ of tags that people can navigate.
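Both of those uses can be sketched in a few lines. The users, URLs, and tags below are invented, and counting how often each tag was applied is just one simple way to build a cloud.

```python
from collections import Counter

# Hypothetical (user, resource, tag) triples; anyone can use any word.
taggings = [
    ("alice", "http://example.org/c", "whales"),
    ("bob", "http://example.org/c", "whales"),
    ("bob", "http://example.org/c", "wales"),
    ("carol", "http://example.org/c", "whalewatching"),
    ("alice", "http://example.org/a", "whales"),
]


def tag_cloud(taggings, resource):
    """Count how many times each tag was applied to one resource."""
    return Counter(tag for _, res, tag in taggings if res == resource)


def tags_by_user(taggings, user):
    """List one person's own tags, which is what they navigate by."""
    return sorted({tag for u, _, tag in taggings if u == user})


cloud = tag_cloud(taggings, "http://example.org/c")
print(cloud.most_common(1))
print(tags_by_user(taggings, "bob"))
```

Nothing stops ‘wales’ and ‘whalewatching’ from sitting alongside ‘whales’ on the same page. The counts are what let the commonly agreed terms float to the top of the cloud.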

folksonomy
classification
aboutness

1 Strictly speaking, it’s a URI and not a URL, but for these purposes the distinction between the two doesn’t matter.