Trading latency for quality in research

I am not opposed to the Publish or Perish mantra. I am an academic writer. I am what I publish. We all think of researchers as people wearing laboratory coats, working on exotic devices. And my own laboratory includes a one-million-dollar computer cluster with a SAN server as large as a fridge. I also generate much software. But you know what? The writing is what matters.

And publishing is easy. Write and submit many papers conforming to the expectations of the editors. Eventually, some of your work will be accepted. And there are thousands of journals, conferences and workshops. Just write a lot.

Yet, don’t publish everything you write, even when what you wrote looks like a research paper. Hold on to it, because publishing everything that looks like a research paper leads to what Feynman famously described as Cargo Cult Science. Indeed, there is a real danger that we become so good at faking science that we are no longer doing science at all! We become dishonest.

In our haste to be published…

  • we cut corners in our experiments, when we validate our ideas at all;
  • we pretend that our work is applicable in the real world, when it isn’t;
  • we don’t take the time to reproduce and reflect on known results;
  • we report the positive aspects of our research while omitting the negatives;
  • we complicate the issues so that our research looks fancier;
  • we get lost in abstract nonsense.

If you want your work to really matter, you should be honest. You should not fool yourself and others. So what do we do? Maybe we should publish carefully. While barely reducing our output rate as academic writers, we can introduce extra steps to keep us more honest. What do we need?

  • Diverse points of view: it is easy to fool a small group of like-minded experts, but comparatively more difficult to fool the readers of my blog.
  • Time to reflect: if you read what you wrote months ago, and you don’t feel the urgency to communicate it more broadly, maybe it wasn’t all that good to begin with?

The problem is that once a paper is published in a journal or a conference, we tend to move on. Anyhow, we cannot easily revise our published work. Are there other models? Economists regularly publish working papers—commonly known in Computer Science as technical reports. But the difference between computer scientists and economists is that economists revise their working papers. And only when their work has stood the test of time, that is, has been available freely for months or years, do they submit it to conventional peer review.

This year, I will try the following experiment. Both on this blog and on my publication page, I will “publish” working papers and specifically ask readers to be critical of my work. Only after a couple of months have passed (or more) will I submit my work to a journal or conference.

This will introduce some latency in my publication output. Can I trade latency for quality? I plan to report back in a year on this (very public) experiment.

Further reading: Time for computer science to grow up by Lance Fortnow.

Where to get your ebooks?

If you read my blog, you probably like to read in general. Thus, if you don’t own an ebook device, you will soon. The choice is growing: the Amazon Kindle, the Sony Reader, the Apple iPad, and so on. I bought a Kindle because my wife won’t let me fill the house with books. And I hate to throw away perfectly good paper books.

Amazon has most of the market for now. Yet using the Kindle store on the Kindle itself is painful. Moreover, Amazon ebooks are protected by Digital Rights Management (DRM). Amazon sells you crippled ebooks that can stop working if you copy them too often. There are often better alternatives elsewhere.

And, in Canada, there is a two-dollar surcharge for every wireless download using the Kindle. Since most ebooks are 0.5MB or less, the wireless transfer costs at least $4 per megabyte! This is insulting! Moreover, if you buy a book by mistake (which is annoyingly common), Amazon will reimburse the cost of the book itself, but not the fee for the wireless download.

Thankfully, you can grab books compatible with the Kindle (in Mobipocket format) elsewhere. Then you can copy the file to the Kindle using the USB port.

  • You can get nearly 2000 of the great French classics for free on ebookgratuits. This includes a large fraction of the work of Honoré de Balzac.
  • Project Gutenberg offers 30,000 free e-books in various languages (mostly English).
  • WebScription sells DRM-free ebooks in various formats. Most books fall into the scifi, young adult and fantasy genres.

I am currently reading You’re Not Fooling Anyone When You Take Your Laptop to a Coffee Shop by Scalzi. I bought it at WebScription for six dollars. It is a compilation of Scalzi’s blog posts on his life as a writer. I am fascinated by how much it resembles my own life. Well… Except for the fact that I don’t get paid when I publish a paper. Maybe I should put together a compilation of posts about my silly work life. Would anyone buy it for six dollars?

I am also reading Halting State by Stross, which I bought on Amazon for ten dollars. I haven’t yet gotten into the mood of the novel.

Getting serious about online teaching

Earlier this month, Michael Mitzenmacher told us about the record number of students attending his Harvard class online-only. Yesterday, Dick Lipton predicted that online learning will replace campus learning: “I see no reason that On [Online Universities] could not do as good a job as Un [Campus Universities] with this basic goal [Educate Students].” In the comments, Lipton questions the importance of credentials and whether social interactions really need the campus.

I have already written much on the topic but let me reiterate my message:

  • In this new online world, professors are not content providers. They provide structure and motivation. They are role models. And most importantly, by their reputation, professors can provide certification. If someone gets a reference letter from Michael Mitzenmacher or Dick Lipton, I trust they know something about Computer Science, because I trust Michael Mitzenmacher and Dick Lipton. I suspect it is not easy to get these fellows to write fake reference letters because they have a high degree of independence (job security, good money, and so on) and their greatest asset is their reputation.
  • Students are trained to expect classrooms. Many students need structure and constant attention. That is not a good thing! We are effectively training students to be good employees working in large organizations with much structure. Yet, this world made of large and stable organizations has already fallen apart. We urgently need to teach students to learn on their own, using the Web.
  • Yes, there will always be campus classes, the same way there will always be physical libraries with actual books, and newspapers printed on paper.

You know your research is original when…

Many consider Frank Herbert’s Dune the most important work of science fiction ever written. Consider that Star Wars is just a variation on Dune. Yet Dune was rejected by more than twenty publishers before finally being published. It is likely that publishers rejected Dune precisely because it was such a radical departure for the genre.

Of course, being rejected does not mean you are original. It could also mean that you are sloppy or uninteresting. However, there may be valid indications of your originality such as:

  • You have no competitor. Nobody quite does what you do.
  • You cannot be scooped. You read new issues of journals looking for fresh ideas, but without fear that someone has made you irrelevant.

As MacLeod put it: Don’t try to stand out from the crowd; avoid crowds altogether.

Further reading: The secret behind radical innovation and A recipe for interesting Computer Science research papers.

Writing tools to improve your research productivity

Researchers—at least in Computer Science—spend most of their days at a desk typing. Picking the right software for writing is important.

Most of my writing time is spent on LaTeX documents. I have tried typical word processors in the past, but they get in my way. Indeed, by mixing document content and document presentation, Microsoft Word makes it difficult to maintain consistency. Word is meant for short-lived (or throw-away) business documents. It is easy to get started and get 80% of the job done with Word. However, as the document gains complexity, as the number of revisions grows, as the number of collaborators expands, Microsoft Word becomes inadequate.

Oh! I still use OpenOffice or Google Docs to produce quick-and-dirty documents. But for anything that is meant to have lasting value, that is, research, I refuse to fall into the word processor trap. It causes some friction with colleagues, but it is a price I am willing to pay.

I believe every single graduate student should learn to write without a word processor. And serious science students should learn LaTeX. Even if you do not care for LaTeX, at least explore alternatives to Word such as Scrivener.

In any case, you are unlikely to need more than a text editor to write your prose: Charles Stross, one of the best scifi writers alive, wrote many of his novels using a primitive text editor (Vim). If you have never written without Microsoft Word, how do you know that Word is not holding you back?

Right now, I write using a regular text editor (Smultron for MacOS) and the TeX Live 2009 distribution. I save all my documents to a Subversion tree. Using a version control tool such as Subversion makes collaboration easy, and it allows me to go back years in time. It is a good setup.

Programming is also a form of writing. For my experimental work, I program in C++, Java or Python, often using Eclipse. I find it is slightly better for programming than my standard writing setup (using only a text editor). Eclipse has great qualities:

  • It stays out of the way. In particular, you can collaborate with people who are not using Eclipse without any problem. For example, it will happily let you use handcrafted makefiles to compile your C++ programs.
  • It offers incremental compilation of Java programs. Basically, it compiles as you type.
  • It suggests corrections for many common compilation errors.

Essentially, while Java is still an awful language, Java with Eclipse is almost fun. Eclipse proves that sophisticated software can be helpful to programmers and writers.

Writing is hard and it will always be hard, no matter the tool. But at least, ease your pain!

See also Physical tools to improve research productivity.

The fundamental properties of computing

Physics works with fundamental properties such as mass, speed, acceleration, energy, and so on. Quantum mechanics has a well-known trade-off between position and momentum: you can know where I am, or how fast I am going, but not both at the same time.

Algorithms (and their implementations) also have fundamental properties. Running time and memory usage are the obvious ones. In practice, there is often a trade-off between memory usage and running time: you can have low memory usage, or a short running time, but not both. Michael Mitzenmacher reminded me this morning of another: correctness. On some difficult problems, you can get both low memory usage and a short running time if you accept an approximate solution.
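To make the correctness trade-off concrete, here is a minimal Python sketch of a Bloom filter, a familiar structure that answers set-membership queries quickly and in little memory, at the price of occasional false positives. It is only an illustration I picked, not necessarily the kind of problem meant above. The class name, the default sizes and the use of salted SHA-256 digests are all assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """Approximate set membership: no false negatives, some false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        # Illustrative defaults; real deployments size these from the
        # expected number of items and the tolerated error rate.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive several bit positions from salted SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "probably present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )
```

The memory used is fixed by num_bits (128 bytes here) no matter how long the stored strings are. What we give up is exactness: an item that was never added may occasionally be reported as present.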

I believe there are other fundamental properties like latency. Consider problems where the volume of the solution and of the input is large: statistics, image processing, finding some subgraph or sublist, text compression, and so on. In such instances, the solution comes out as a stream. You can measure the delay between the input and the output. For example, a program that compresses text by first scanning the whole text might have high latency, even if the overall running time is not large. Similarly, we can give the illusion that a Web browser is faster by beginning the Web page rendering faster, even if the overall running time of the rendering is the same. As another example, I once wrote a paper on computing the running maximum/minimum of an array where latency was an issue.
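As a rough illustration of low-latency streaming output, here is a Python sketch of a running maximum over a sliding window. It uses the common monotonic-deque technique, which is an assumption on my part and not necessarily the algorithm from the paper mentioned above. The function name and parameters are hypothetical.

```python
from collections import deque


def running_maximum(values, width):
    """Yield the maximum of each window of `width` consecutive values.

    Results come out as a stream: the first maximum is available after
    only `width` inputs have been read, not after the whole input.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    candidates = deque()  # (index, value) pairs, values in decreasing order
    for i, v in enumerate(values):
        # Drop candidates that can never be a window maximum again.
        while candidates and candidates[-1][1] <= v:
            candidates.pop()
        candidates.append((i, v))
        # Drop the front candidate once it slides out of the window.
        if candidates[0][0] <= i - width:
            candidates.popleft()
        if i >= width - 1:
            yield candidates[0][1]
```

For example, list(running_maximum([1, 3, 2, 5, 4, 1], 3)) gives [3, 5, 5, 5]. Because this is a generator, the delay between reading an input and emitting the matching output stays bounded by the window width, even if the input is very long.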

It would be interesting to come up with a listing of all the fundamental properties of computing.
