Grounded versus Pure Theory

My previous blog post generated quite a number of comments and much criticism. Let me summarize the main objections:

  • What I describe is not pure theory but bad research.
  • Pure theory is useful: consider the n log n lower bound on sorting.

My replies:

  • Our brains are bandwidth-driven machines, not standalone computers. You will only thrive given sufficient feedback. And peer review is a low-bandwidth high-latency feedback system.
  • Pure theory is low-bandwidth Science: few results depend on it, and whether it is useful or powerful is entirely a matter of opinion. It is pure because it is not tainted by external feedback.
  • Theoretical results are the reason why we do Science.
  • Pure theorists are likely to describe themselves as engineers.
  • I have done and will do pure theory work. It is a very tempting trap.
  • If a new Engineering concept seems like a good idea, wait before you make a book out of it. Try it out in practice first.
  • If a theorem seems useful to you, wait before you make a career out of it. Can you relate it to anything in the world out there?

Why pure theory is wasteful

Pure theory is like exploring the universe by staying on Earth. Sure, it seems expensive at first to build space ships, but our brains are at their best when facing reality up close. Too many scientists work exclusively over models in their mind. Then they are surprised that nobody outside their clique finds what they do interesting.

And I am not thinking about Mathematics: Mathematics was founded by people who wanted to sell land by area, not perimeter… modern Mathematics came to be with Newton, who wanted to help the state manage its money better. I am thinking about Software Engineering researchers who never write software and never study people who write software. I am thinking about Semantic Web researchers who have been building models and ontologies for ten years, but who have never tested their ideas against harsh reality. I am thinking about Algorithm Design people who claim one algorithm is better than another, but who never bothered to implement it. I am thinking about Machine Learning researchers who never bother to test their schemes with the terabytes of data we find everywhere.

All topics warrant research, but pure theory is not an acceptable methodology.

It doesn’t matter how beautiful your theory is, it doesn’t matter how smart you are. If it doesn’t agree with experiment, it’s wrong. (Attributed to Feynman)

A short review of Collective Intelligence in Action


I was recently asked by the publisher to review Collective Intelligence in Action. The author is Satnam Alag, a Bay area engineer with a Ph.D. from the University of California, Berkeley. Dr. Alag is VP of NextBio, a specialized search engine.

The first chapter is free and so is the source code used in the book.

The book is for Java developers who want to implement “Collective Intelligence” applications in Java. It tells us about extracting and applying data from blogs, wikis and social network applications. People who read this blog know that I am not one to praise, but this book succeeds brilliantly. If you are a Java engineer and work with Web technologies, you must get this book. It covers topics such as computing similarity measures using vector models, Naïve Bayes Classifiers, inverse document frequency (idf), Machine Learning (using the Weka API), building a crawler with regular expressions, collaborative filtering (with links to open source tools), and so on.
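For readers who have not met two of the topics listed above, here is a minimal sketch of inverse document frequency (idf) weighting combined with a cosine similarity over a vector model. It is not taken from the book, which uses Java; it is written in Python for brevity, and the function names (`idf_weights`, `tfidf_vector`, `cosine_similarity`) and the toy documents are made up for illustration.

```python
import math
from collections import Counter


def idf_weights(docs):
    """Return idf = log(N / df) for every term, where df counts the documents containing it."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n / count) for term, count in df.items()}


def tfidf_vector(doc, idf):
    """Build a sparse vector mapping each term to (term frequency * idf)."""
    tf = Counter(doc)
    return {term: freq * idf.get(term, 0.0) for term, freq in tf.items()}


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus (hypothetical): each document is a list of tokens.
docs = [
    "java web crawler".split(),
    "java web search engine".split(),
    "marathon running stamina".split(),
]
idf = idf_weights(docs)
vectors = [tfidf_vector(doc, idf) for doc in docs]
```

With this toy corpus, the first two documents share terms and so have a positive similarity, while the first and third share none and score zero. A real application would add tokenization, stop-word removal and probably a smoothed idf.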

Even if you do not work with Java, if you care for high-end Web applications, this book is for you. It reminds me of Lyon’s Java Digital Signal Processing book. It offers the gist of what academia knows, but focuses on what people (engineers and researchers) do in practice.

The book is not meant for academia, however. There are references, but no theorems.

The book is available for preorder on Amazon for $30. Go order it.

Disclaimer. I did not get paid to review this book, and I do not stand to gain anything if you buy the book. I have no relationship with the publisher or the author.

Further reading. A competing book is Programming Collective Intelligence: Building Smart Web 2.0 Applications by Toby Segaran. It uses Python instead of Java.

The ten-minute rule for presentations

Mike gives us 3 rules to improve our presentations. Two of the rules I knew: you have to practice and you should present pictures, not text, on your slides. The other rule is the 10-minute rule: you have to insert a break in your presentation every 10 minutes to refresh the audience.

I must admit that I am really bad at attending presentations. I usually fall asleep within 5 minutes. But, at least, if you try to start fresh every 10 minutes, you may catch me when I randomly wake up. But do not mind me: I must be an outlier. For one thing, I really prefer to read your papers rather than listen to a 50-minute talk. I have this strange belief that lectures are leftovers from an era when paper and ink were expensive. But, yes, I know that talks reach many people who would not otherwise read the papers.

Research stamina

Running a research project has more in common with a marathon than with a sprint. Most good runners can run nearly forever if they avoid injuries and stay hydrated and motivated.

Similarly, a creative worker can work nearly forever on a topic. A novelist can write 10 books in a saga. A researcher can produce 25 papers on a narrow topic. How do they do it?

  • You must not lose focus. It is easy to get interested in a brand new idea and drop your current work. You should not change your focus without careful consideration.
  • You need a constant flow of new ideas. You should never focus exclusively on a narrow topic. You need the white noise. You need smart people making you think about alternatives. You need to draw analogies with what others are doing.
  • You must keep your job and sufficient funding to keep going. Obviously. Fortunately, many research topics require little more than your own salary.
  • You need to keep challenging yourself. The human mind degrades when subjected to routine tasks.
