Another ugly blaaahg

November 20, 2009

How do I deal with information? — A non-tech Zeitgeist background

Filed under: Computers, Gnome, Linux, Ubuntu, zeitgeist — Ketil @ 18:18

Due to various circumstances, it has been way too long without a blog post about the hackfest I was lucky enough to attend, thanks to Seif and the super nice people at the GNOME Foundation who sponsored me. There were some super talented people there, and I got to see first hand the skills of these guys.

During the hackfest I wrote up the start of an overview of what Zeitgeist and related software will do for the user, from a non-technical point of view. That document is still a work in progress, but I decided to post the first part.

So, how do I deal with information on my computer?
First off, we’re narrowing down “information” to be “activities I’m doing at my computer”. Expanding on this, “activities” is not limited to “having produced document x”, but also includes:

  • looking at web page y for 2 minutes
  • writing something in a chat
  • watching a movie, etc.

As it is right now, information is highly fragmented. Most applications keep their own logs, so you can browse your history. In a word processor, for example, my history is usually limited to the last couple of documents. Furthermore, the history does not give me a structured view of the time aspect of my work.

Information is also fragmented in the sense that traditional files are generally viewed as the primary items.

In an age that values sharing of information, many of the activities I do on my computer are not stored in traditional files, but rather scattered around the web. The history items in my web browser are just as important as a random file on my computer! So are my chat logs, my future and past tasks in my preferred task manager, my search history, etc.

All these things are important because of several aspects:

  • They could have historical significance
  • They could show me “how did I go from point A to point B?”
  • They could provide other people with insight into my work flow
  • They could show me exactly what I did when I last stopped working

In addition to this information being fragmented, it is also accessed in highly different ways, usually with a set of options for each application. In the end, it seems better to take notes, or rely on your memory to remember where you were, what you did, and how you did it.

Different approaches to categorising

When we sort our information in categories, we employ different approaches, depending on the nature of the information. It basically comes down to structure, context and content.

1) STRUCTURE

Most of our traditional pieces of information are sorted structurally. Most commonly, we decide on a structure by using our set of directories (commonly called “folders”). This approach has some pitfalls:

  • When I start working, how do I know what kind of categories will be useful a week or a month from now? I generally don’t know for sure, so I put up a directory tree using broad categories, and then I have to carefully revise those directories while I work. With large numbers of files, this work gets increasingly difficult.
  • Many files also belong in several categories. To solve this, what do I do? Do I make symbolic links to files, do I keep several copies of the files or do I move them around? If documents are often updated, it is important that I only deal with the latest version. Having several copies makes this a difficult task.
  • What about all the information about web history, chat logs, multimedia player history, etc? Keeping a basic file tree will often exclude important information that is not a traditional file.

The most common solution to problems with poor planning and execution of file trees is a desktop search engine, which is able to return files that fit your search query. The search engine can look only in the name of the file, or it can also look into the contents of the file (like in a text document) and metadata (like camera information belonging to an image file, or information about the artist in an audio file).
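To make the difference between searching names, contents and metadata concrete, here is a minimal sketch in Python. It is not how Tracker or any real desktop search engine works; the `Item` class, the `search` function and the sample items are all made up for illustration, and a real engine would use an index rather than scanning every item.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """A hypothetical searchable item: a name plus optional text and metadata."""
    name: str
    content: str = ""
    metadata: dict = field(default_factory=dict)


def search(items, query, *, in_content=False, in_metadata=False):
    """Return items whose name (and optionally content/metadata) contains query."""
    q = query.lower()
    hits = []
    for item in items:
        # The file name is always searched; the other fields are opt-in.
        haystacks = [item.name]
        if in_content:
            haystacks.append(item.content)
        if in_metadata:
            haystacks.extend(str(value) for value in item.metadata.values())
        if any(q in text.lower() for text in haystacks):
            hits.append(item)
    return hits


# Made-up sample data.
ITEMS = [
    Item("report.odt", content="Quarterly budget for the zoo project"),
    Item("IMG_0042.jpg", metadata={"camera": "Canon EOS 450D"}),
    Item("track01.ogg", metadata={"artist": "Zoo Keepers"}),
]

# Searching names only finds nothing for "zoo"; widening the search does.
print([i.name for i in search(ITEMS, "zoo")])
print([i.name for i in search(ITEMS, "zoo", in_content=True, in_metadata=True)])
```

The point of the sketch is only that the same query gives different answers depending on how deep the search is allowed to look.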

Tracker is an advanced search tool that provides much of this, and can assist the user in finding the location of the files s/he is looking for.

Basic file search and directories are old news; most people who are used to computers use them all the time.

2) CONTEXT

Information context is basically: In what situation did I use this information? To answer this I need to look at different aspects:

  • How did I use it? (Which application did I use?)
  • For how long did I use it?
  • What did I use it together with? (Which files/items did I use together with this piece of information?)

Context revolves around time, and around relationships to other pieces of information and to applications.

A question that structure in itself cannot answer is: “what did I do at the same time I was doing this?” Expanding on this, it also can’t answer something like: “What items of information do I usually use at the same time I’m using this piece?”

Zeitgeist is providing users with a way of viewing their work in a context, answering these types of questions, and interacting with the answers.
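For the curious, the context questions above can be sketched with a simple activity log. This is only an illustration of the idea, not Zeitgeist's actual data model or API; the `Event` record, the helper functions and the sample log are all assumptions made up for this example.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    """One hypothetical logged activity."""
    item: str        # what was used (file, URL, chat, ...)
    app: str         # which application it was used with
    start: datetime
    end: datetime


def overlaps(a, b):
    """True if the two events share any span of time."""
    return a.start < b.end and b.start < a.end


def time_spent(events, item):
    """For how long did I use this item, summed over all sessions?"""
    return sum((e.end - e.start for e in events if e.item == item), timedelta())


def used_together(events, item):
    """Count in how many sessions of `item` each other item was also in use."""
    counts = Counter()
    for session in (e for e in events if e.item == item):
        others = {e.item for e in events
                  if e.item != item and overlaps(session, e)}
        counts.update(others)
    return counts


def at(hour, minute=0):
    """Helper for building made-up timestamps on one day."""
    return datetime(2009, 11, 20, hour, minute)


# Made-up sample log.
LOG = [
    Event("report.odt", "word processor", at(9), at(10)),
    Event("http://example.org/stats", "web browser", at(9, 30), at(9, 32)),
    Event("chat with a colleague", "chat client", at(9, 45), at(9, 50)),
    Event("report.odt", "word processor", at(14), at(15)),
    Event("http://example.org/stats", "web browser", at(14, 10), at(14, 20)),
    Event("movie.ogv", "video player", at(20), at(22)),
]

print(time_spent(LOG, "report.odt"))
print(used_together(LOG, "report.odt").most_common())
```

With a log like this, "what do I usually use together with the report?" becomes a matter of counting overlapping sessions, which is something a directory tree alone cannot tell you.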

3) CONTENT

Content is both easy and hard. A human being has no problem taking a brief glance at a piece of familiar information and quickly deciding that it belongs in this or that category.

For computer programs, this is not so easy. Computer programs aren’t good at deciding from the start what’s relevant and what’s irrelevant. On some pieces of information the topic is the giveaway, but in others it might be in the document structure. Getting more into detail, a computer will often have a hard time deciding what are the keywords that are worth looking for.

This is the reason that Google gives you hundreds or maybe thousands of results when you search the web. The search takes only a fraction of a second, but depending on how good the search query is, the average person will still have to manually sort out if this is the answer to the question that is being asked.

Searching the web involves a small amount of time in which the computers search, followed by a much larger amount of manual work deciding between the search results.

Making a query on your computer asking: “List all items on my computer that relate to this?” is like doing hundreds of Google searches at once, all requiring some manual sorting afterwards.

Keeping track of content has partly been done by pre-made file trees, as I discussed in the structure part.

Since the content approach still cannot be solved by computer programs by themselves, there is need for some user interaction while work is being done.

Tracker is able to tag items in the search results. Many tags can be applied to the same item, giving them relevance in several categories.
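As a rough illustration of why tags handle "belongs in several categories" better than a directory tree, here is a minimal tag index in Python. It says nothing about how Tracker stores tags internally; the `TagIndex` class and the sample items and tag names are hypothetical.

```python
from collections import defaultdict


class TagIndex:
    """A hypothetical in-memory index mapping tags to items and back."""

    def __init__(self):
        self._items_by_tag = defaultdict(set)
        self._tags_by_item = defaultdict(set)

    def tag(self, item, *tags):
        """Attach one or more tags to an item; nothing is copied or moved."""
        for name in tags:
            self._items_by_tag[name].add(item)
            self._tags_by_item[item].add(name)

    def items(self, *tags):
        """Return the items that carry all of the given tags."""
        if not tags:
            return set()
        sets = [self._items_by_tag.get(name, set()) for name in tags]
        return set.intersection(*sets)

    def tags(self, item):
        """Return the tags attached to an item."""
        return set(self._tags_by_item.get(item, set()))


# Made-up sample data: files and a web page can share the same tags.
index = TagIndex()
index.tag("budget.ods", "project-x", "finance")
index.tag("http://example.org/flights", "trip")
index.tag("notes.txt", "project-x", "trip")

print(sorted(index.items("project-x")))
print(sorted(index.items("project-x", "trip")))
```

Note that `notes.txt` shows up under both categories without being copied, linked or moved, which is exactly the problem the directory approach struggles with.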

Use cases

  • Jeff is working on a project for his job. One day, Jeff needs to assist on another project for two weeks, so he needs to stop his current work immediately. After the two weeks have passed, how should he resume his work flow?
  • Michaela gets tons of e-mail. She is trying to remember who sent her an e-mail for Christmas, so she can reply to them. How would she do this?
  • George downloads a file on his computer, but not being very good with computers, he can’t find the file afterwards. How should he look for it?
  • Judy is working on a project at work. At the same time she is planning a trip with a friend. The traditional files would likely be put in separate directories in her computer’s file tree, but what about the various browser tabs she has open? How does she differentiate which belongs to which category? The same applies to her various e-mails, chat logs, etc.
  • Jill has finished her report on a project, and her boss asks her questions about how the project came about. What information led to this report, and what pieces of information led to the conclusion?

Stay tuned for the rest!

6 Comments »

  1. What is the point of categorization, tags, or any other kind of controlled vocabulary when full-text search indexing is so prevalent? Just write about your subject enough and in a varied vocabulary, and you can find what you need by remembering the right keywords.

    Comment by jim — November 24, 2009 @ 22:59

    • Thanks for your comment!

      It sounds like you are proposing that I word my documents to fit with future searches for it? That sounds rather backwards…

      Also, how does that help with video, audio or images? Or text that you don’t really have control of yourself, like chats, web pages, etc.

      If you talk with people who work with search, they’ll quickly explain that it’s extremely difficult to have a search engine make *good* guesses based on content. The biggest search engines in the world with all their computing power are not even getting it right, so where does that leave a little old laptop or netbook?

      Of course, full text search and search in meta information are a large part of the desktop paradigm that Zeitgeist and Tracker are setting up to handle.

      The things I’m discussing are meant to be tools to help you. If you have a tool available, and choose not to use it, that’s totally your call. My view is that when I really need to find all the items that belong to project X, I don’t plan on refining my searches for half a day.

      A final note: There’s a lot more to Zeitgeist than what is addressed in this short first part of an overview, so you might find something that will appeal to you.

      Comment by Ketil — November 24, 2009 @ 23:25

    • But “varied vocabulary” is precisely the point. If you think the keyword is “directory” and I think it’s “folder,” our mutual full-text searches will be in vain.

      Comment by jrep — November 24, 2009 @ 23:25

  2. I am frequently searching for new infos in the internet about this matter. Thanks.

    Comment by Alkawnpaulk — November 25, 2009 @ 14:59

  3. Great site. Great information. Thank you

    Comment by jessy — December 21, 2009 @ 8:12

