When a terabyte is small

With Kamel and Owen, I am working on a paper involving database indexes. We had over a terabyte of space, and yet, in the middle of the production of the paper, we ran out of space. Only a year ago, I thought that one terabyte was large.

So, I ask our technician about getting a new drive. He comes back with a small 500 GB drive. I ask how much it costs, and he says “$200.”

This is a new frontier for me. Producing a simple research paper required us to generate more than one terabyte of data. Moreover, we will generate much more data before the paper is finished.

Assuming I write, say, 4 research papers a year, I will generate over 4 terabytes of data a year at my current rate, which is going to cost me about $1600 in storage.
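For anyone who wants to check the arithmetic, here is a rough sketch in Python. The figures are the ones quoted above ($200 for a 500 GB drive, about one terabyte per paper). The function name, the decimal definition of a terabyte, and the idea of buying only whole drives are my own assumptions for illustration.

```python
import math

# Figures quoted in the post.
DRIVE_CAPACITY_GB = 500   # size of the drive the technician brought
DRIVE_PRICE_USD = 200     # quoted price for that drive

# Assumption: decimal terabytes (1 TB = 1000 GB), as drive vendors count.
GB_PER_TB = 1000


def yearly_storage_cost(papers_per_year, tb_per_paper=1.0):
    """Return (terabytes per year, dollars per year), buying whole drives.

    tb_per_paper is a hypothetical parameter; the post only says that one
    paper needed "more than one terabyte".
    """
    total_tb = papers_per_year * tb_per_paper
    drives = math.ceil(total_tb * GB_PER_TB / DRIVE_CAPACITY_GB)
    return total_tb, drives * DRIVE_PRICE_USD


total_tb, cost = yearly_storage_cost(4)
print(f"{total_tb} TB per year costs about ${cost}")
```

Since each paper really takes more than one terabyte, this is a lower bound; raising tb_per_paper shows how quickly the bill grows.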

3 Comments

  1. I think this is one big obstacle for current research in IR. The time spent dealing with “infrastructure” keeps growing. This leaves less time for real research. I think that, in the broad field of IR, “industry research” is going to produce many more results in the coming years than “academic research”.

    Google’s Peter Norvig is quoted as saying that Google does not have the best minds, they have a great infrastructure that allows them to experiment much faster.

    How can academia deal with this?

    Comment by Sérgio Nunes — 21/2/2008 @ 15:00

  2. LOL!!!
    You are probably not old enough to know this rule:
    No matter the size of the drive, it is ALWAYS 95 to 98% full, so for the “next run” (whatever this is) you first have to upgrade.
    This is probably even more “solid” than Moore’s law.
    In the very early 70s a 5-megabyte drive was “large”…

    Comment by Kevembuangga — 21/2/2008 @ 15:59

  3. BTW, why not use outsourced storage and computing power?
    The NYT did it:
    http://open.blogs.nytimes.com/tag/hadoop/

    (via Lukas Biewald http://www.lukasbiewald.com/?p=134 )

    Comment by Kevembuangga — 21/2/2008 @ 16:20
