Tuesday, March 6th, 2007

One More Step Toward Infinite Storage

Filed under: Data Warehousing and OLAP — Daniel Lemire @ 19:03

According to a new IDC study reported in Wired, the world had 185 exabytes of storage available last year and will have 601 exabytes in 2010. Meanwhile, the amount of “digital information” generated will grow from 161 exabytes last year to 988 exabytes in 2010.

Their point is that we lack the capacity to store everything. This seems to contradict the theory that we have nearly infinite storage, but I do not think it does. How can they tell how much storage is available? Do they include the NSA? Do they include Echelon? Do they include all of the secret agencies in the world that store massive quantities of data?
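Taking the reported figures at face value, the arithmetic behind the claim can be sketched in a few lines. This is only a rough illustration: it assumes that “last year” means 2006 (the post dates from March 2007), and the names `storage`, `generated`, `surplus`, and `annual_growth` are made up for the example rather than taken from the IDC study.

```python
# Figures quoted from the IDC study, in exabytes.
# Assumption: "last year" in the post is 2006.
storage = {2006: 185, 2010: 601}
generated = {2006: 161, 2010: 988}


def surplus(year):
    """Storage available minus data generated; negative means a shortfall."""
    return storage[year] - generated[year]


def annual_growth(start, end, years):
    """Compound annual growth rate implied by two data points."""
    return (end / start) ** (1 / years) - 1


for year in sorted(storage):
    print(f"{year}: surplus of {surplus(year)} exabytes")

span = 2010 - 2006
print(f"storage growth per year: {annual_growth(storage[2006], storage[2010], span):.0%}")
print(f"data growth per year:    {annual_growth(generated[2006], generated[2010], span):.0%}")
```

By these numbers, a surplus of 24 exabytes in 2006 turns into a shortfall of 387 exabytes in 2010, because the data generated is projected to grow considerably faster each year than the storage available. None of this says anything about storage the study could not count.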

A more interesting observation is that few people store everything today, and I do not expect that to change soon. Storage costs must still come down a bit, and software must adapt. But in a few short years, everyone will store copies of everything. Managing all this data, whatever managing means, will become a big deal. And it will not be a nice database problem either, because this data will not follow nice database schemas.

1 Comment »

  1. As a footnote, Library and Archives Canada is also worried about this, of course. Their mandate is to archive (some) of these exabytes - the ones that matter or can be considered part of the “National Heritage” (http://www.collectionscanada.ca/cdis/index-e.html). So (one of) their problem(s) is - how do we tell what matters and what doesn’t? Given limited management ability / space / archivists etc., do we archive / annotate Daniel’s and Andre’s blogs or Nelly Furtado’s MySpace site?

    As far as absolute numbers of exabytes go, I don’t think that’s an especially good measure for anything. YouTube videos take up quite a lot of space but there aren’t more than a few million. It’s the “objects” and the information about them that matters.

    That said, what a “digital object” actually consists of is also an open question. Should it be the picture, the picture with the text, or the picture with the text in the blog…?

    Comment by Andre Vellino — 6/3/2007 @ 22:49
