Proof that I am a stubborn bastard

  • I have not used Microsoft Office in over 5 years. I use Mac OS and Linux.
  • I never use my employer’s email service. Prior to Google Mail, I used a private provider and forwarded my work email there.
  • I have not driven to work in the last 4 years.
  • As a researcher, I do not belong to any one community.
  • I keep teaching a university-level XML course, even though I have been ridiculed for teaching such lowly technical issues.

The Purity Scale in Science

This is how most people understand purity in Science:

As for myself, I measure purity on a bandwidth scale: the more feedback the researchers get, the less pure they are. I should maybe use another term.

(Thanks to Steven for pointing this comic out to me.)

Distractions make you dumb

Sufficient focus is necessary to be smart. The corollary is that distractions may turn your brain into mulch. There are several conditions for sufficient focus:

  • a sense of urgency: without a strong need to get the task done, long-term focus is difficult;
  • the dismissal of external stimuli: either you make sure not to be disturbed, or you can filter out the distractions;
  • mental readiness: sometimes your mind will simply not focus before you rest.

From Graph Drawing to Tag-Cloud drawing?

Tag clouds are an interesting visualization technique because, unlike bar charts, you can easily display 30 or 50 weights in a compact figure. Moreover, because they are a 2D structure, you can more easily cluster similar tags together. The Tag-Cloud Drawing problem is the optimization of the layout of the tag clouds. It is somewhat related to the Graph Drawing problem.
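To make the layout problem concrete, here is a minimal sketch in Python. It is only an illustration under assumptions of my own: the function names, the log scaling of weights, the point sizes, and the crude width estimate are all hypothetical, and the greedy row packing is the simplest possible baseline, not an optimized layout and not the method of any paper cited here.

```python
import math

def font_sizes(weights, min_pt=10.0, max_pt=36.0):
    """Map each tag's weight to a font size on a log scale.

    Hypothetical helper: weights must be positive numbers, and the
    10pt-36pt range is an arbitrary choice for illustration.
    """
    logs = {tag: math.log(w) for tag, w in weights.items()}
    lo, hi = min(logs.values()), max(logs.values())
    span = hi - lo
    sizes = {}
    for tag, lw in logs.items():
        # If all weights are equal, fall back to the midpoint size.
        frac = (lw - lo) / span if span > 0 else 0.5
        sizes[tag] = min_pt + frac * (max_pt - min_pt)
    return sizes

def pack_rows(sizes, row_width=400.0, char_width=0.6):
    """Greedily fill rows left to right, in the given tag order.

    A tag's width is estimated as len(tag) * font size * char_width,
    which is a rough assumption rather than a real text measurement.
    """
    rows, current, used = [], [], 0.0
    for tag, size in sizes.items():
        width = len(tag) * size * char_width
        # Start a new row when the tag does not fit in the current one.
        if current and used + width > row_width:
            rows.append(current)
            current, used = [], 0.0
        current.append(tag)
        used += width
    if current:
        rows.append(current)
    return rows

if __name__ == "__main__":
    sizes = font_sizes({"xml": 1, "olap": 10, "science": 100})
    for row in pack_rows(sizes, row_width=100.0):
        print(" ".join(f"{tag}({sizes[tag]:.0f}pt)" for tag in row))
```

The greedy packing ignores which tags are similar and how much white space each row wastes. Choosing the order and placement of tags so that related tags end up close together and the figure stays compact is exactly where the optimization problem begins.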

Recently, Fujimura et al. showed how to scale tag clouds further… up to 5,000 attributes!

We use a topographical image that helps users to grasp the relationship among tags intuitively as a background to the tag clouds. We apply this interface to a blog navigation system and show that the proposed method enables users to find the desired tags easily even if the tag clouds are very large, 5,000 and above tags. Our approach is also effective for understanding the overall structure of a large amount of tagged documents.

I really think that tag-cloud drawing is a topic deserving of more attention. It is both a fun and practical problem.

Grounded versus Pure Theory

My previous blog post generated quite a number of comments and much criticism. Let me summarize the main objections:

  • What I describe is not pure theory but bad research.
  • Pure theory is useful: consider the n log n lower bound on sorting.

My replies:

  • Our brains are bandwidth-driven machines, not standalone computers. You will only thrive given sufficient feedback. And peer review is a low-bandwidth, high-latency feedback system.
  • Pure theory is low-bandwidth Science: few results depend on it, and whether it is useful or powerful is entirely a matter of opinion. It is pure because it is not tainted by external feedback.
  • Theoretical results are the reason why we do Science.
  • Pure theorists are likely to describe themselves as engineers.
  • I have done and will do pure theory work. It is a very tempting trap.
  • If a new Engineering concept seems like a good idea, wait before you make a book out of it. Try it out in practice first.
  • If a theorem seems useful to you, wait before you make a career out of it. Can you relate it to anything in the world out there?

Why pure theory is wasteful

Pure theory is like exploring the universe by staying on Earth. Sure, it seems expensive at first to build spaceships, but our brains are at their best when facing reality up close. Too many scientists work exclusively on models in their minds. Then they are surprised that nobody outside their clique finds what they do interesting.

And I am not thinking about Mathematics: Mathematics was founded by people who wanted to sell land by area, not perimeter… modern Mathematics came to be with Newton, who wanted to help the state manage its money better. I am thinking about Software Engineering researchers who never write software and never study people who write software. I am thinking about Semantic Web researchers who have been building models and ontologies for ten years, but who have never tested their ideas against the harsh reality. I am thinking about Algorithm Design people who claim one algorithm is better than another, but who never bothered to implement either. I am thinking about Machine Learning researchers who never bother to test their schemes with the terabytes of data we find everywhere.

All topics warrant research, but pure theory is not an acceptable methodology.

It doesn’t matter how beautiful your theory is, it doesn’t matter how smart you are. If it doesn’t agree with experiment, it’s wrong. (Attributed to Feynman)
