Wednesday, July 23rd, 2008

Encouraging diversity in science

Filed under: Academia/Research — Daniel Lemire @ 9:04

Science follows a conservative process. It takes a long time for a fact or a law to be accepted. Several scientists must verify and reproduce the same results before acceptance is granted.

So goes the theory.

In practice, science is not such a clean process. Routinely, facts and theories become widely accepted quickly, without criticism, mostly because they are convenient. Other proposals get shot down immediately: perhaps for good reasons, perhaps not. Negative results, including any challenge to the convenient (but poorly reviewed) facts, are frowned upon.

In some fields, there is a bias against simplicity. If you show that a simple technique works well, even if it works better than more complicated or expensive techniques, people will dismiss your work as too easy. I believe we should have the opposite bias: we should try to steer away from complicated solutions. Complicated techniques should bear the burden of proof: do we need something so difficult? But complexity is often convenient: it raises the barrier to entry to a field. If anyone can do your work using simple techniques, then why are you getting paid?

I believe that to minimize the effects of such biases, we should encourage diversity in science. Here are a few ideas on how to get more diversity:

  • pick numerous and different reviewers: the composition of program committees should be different year after year;
  • encourage the multiplication of conferences, journals and workshops;
  • provide funding to more researchers (spread the money more evenly);
  • mix researchers from different organizations (universities, government, industry);
  • do not reward researchers who always publish in the same small set of conferences or journals (the same where they often act as reviewers);
  • mix researchers having different backgrounds.

Finally, I believe that we need to stress reproducibility a lot more. Researchers need to open up their data and their code. This will ensure that more people can check the facts. It should lead to better science and more diversity.

Monday, July 21st, 2008

We need a more negative culture

Filed under: Academia/Research — Daniel Lemire @ 7:58

There is a strong bias in science, at least in Computer Science, toward positive results. For example, showing that algorithm A is better than algorithm B will get you published. Reporting the opposite result is likely to get your paper rejected.

One justification for the value of positive results is that they give you more information. Indeed, there is an infinite number of possibilities. Listing all the cases that are of no interest would take too long. We had better focus on what works!

This argument is fallacious since it ignores one of the pillars of science: reproducibility. By taking away the possibility of publishing negative results, we basically throw away the most important reason why we require reproducibility: to verify what others have done.

Time and time again, I come across falsehoods in science. Typically, they occur when reporting experimental results that are either badly interpreted or badly implemented. Here is a typical scenario:

  • Researcher A publishes some paper where he makes some false statement.
  • The statement is compelling. It matches people’s intuition.
  • The work becomes well known and is repeatedly cited.
  • Other researchers build upon the falsehood. They either do not verify the statement (where is the profit in that?) or if they do, they avoid denouncing the falsehood.

Eventually, the statement becomes an accepted fact. Anyone who wants to challenge it has the burden of proof, and it is easy to cast doubts on any experimental procedure. I claim that this happens often. As someone who crafts my own experiments, I see it all the time. I am repeatedly unable to reproduce “accepted facts”. Yet, I never (or almost never) report these problems because trying to do so would ensure that whatever paper I produce is frowned upon. Moreover, I believe few people ever attempt to verify published results. What makes matters worse is that trying to reproduce experiments is never considered serious work in Computer Science. Often, it is quite a difficult task too: either the data or the code is missing or barely available.

What bothers me is not so much the falsehoods, but the fact that they tend to feed into the biases of entire communities. People expect certain things, and they filter out any “negative” result, and protect “positive” results even when such results are not solid. Entire fields are therefore being built on shaky foundations.

We have made some progress recently in Computer Science regarding reproducibility. There are more conferences and journals asking researchers to make their data and code available. However, I believe that culturally, we still have a long way to go.

Friday, July 11th, 2008

Do you think because you write, or write because you think?

Filed under: Academia/Research — Daniel Lemire @ 9:33

I used to believe that the pressure to publish what you did in research was inherently bad. About four years ago or so, I started to change my mind.

I now believe that the more you write, the more you think about the issues, and the more ideas you have. In short, productive researchers do not write a lot because they are brilliant; they are brilliant because they write a lot.

This statement has counterexamples, however. We all know of some researchers who produce paper after paper, all of them toying with the same set of narrow ideas, or all of them misguided. Hence, I will add a constraint. You must write a lot about different things.

But clearly, that is not enough. Many people who write textbooks, for example, happen to write a lot, and they write about different things, yet, they are not automatically brilliant researchers (though, I submit to you that they probably are brilliant individuals). Hence, I will add a final constraint: you must be ambitious and go where nobody has gone before.

So, let me summarize my recipe:

  • write a lot…
  • about different things…
  • and be bold.

My final point for the day: When I say that you must write a lot, I do not mean that you must publish a lot in peer-reviewed journals and conferences. Getting continual and high-quality feedback is essential, but I see no evidence that getting formally reviewed frequently is essential. In fact, it may even prove counterproductive as it may encourage you to become more conservative.

How do you get feedback, if not through peer review? For one thing, you can run experiments: nature will tell you whether you are wrong. For another, informal review of your work by friends or collaborators can be as good as or better than formal peer review.

I also think that posting your work on the Web might be a very valid form of publication, especially if you have job security. Sometimes you know that your work is correct. At the very least, you know as well as any reviewer might. Or sometimes, your result might just not warrant the process. Maybe we should all create our own personal journals.

Monday, July 7th, 2008

I still don’t have the multiplication tables memorized

Filed under: Academia/Research — Daniel Lemire @ 17:21

I read this on slashdot:

I have a PhD in math, and I still don’t have the multiplication tables memorized

Now I know I am not the only one!

In other news,

  • I still deduce my age from my birth date (takes me a minute or so each time);
  • I was identified as having a learning disability when I entered school (since I could not recite my phone number nor tie my shoes) and put in a special class;
  • I still don’t know my office phone number;
  • I don’t know my bank account number, nor how much money there is in it;
  • I don’t know my Social Insurance Number;
  • I get the birthdays of my sons mixed up.

But I know what a soliton is, I can solve nonlinear differential equations by multiscale methods, and I can program my very own bitmap index from scratch in C++. Oh! and I can grow coreopsis and echinacea from seeds.

Let us face it: the purpose of school should not be to teach specifics. And you should never judge kids by what you expect them to achieve. Let them surprise you!

Friday, July 4th, 2008

Classifying research projects by depth

Filed under: Academia/Research — Daniel Lemire @ 10:02

Everything else being equal, picking the right problems is the key factor determining your success as a researcher (no matter how you define success). In a previous post, I proposed three categories of research problems:

  1. explain a previously unexplained observation;
  2. perfect an existing technique;
  3. invent a new problem.

It appears that all 3 categories are equally valid. Which category you prefer is a matter of style.

Today, I would like to propose a new, orthogonal, categorization in terms of the depth of the problem you tackle. Some problems

  1. are narrow and well-defined, you can complete them in a few months;
  2. form a set of narrow and well-defined problems, likely to keep you busy for years.

I have tended myself toward the first category (see “my research process”). The benefit of a focused burst of research producing a distinct result should not be underestimated. The most obvious benefit is that you can quickly move on and thus, you can afford to try your hand at random problems. It is the equivalent of a hit-and-run. If you are the curious sort, it allows you to learn about a new topic, without investing your career in it. However, it makes applying for grants more difficult. You are also less likely to achieve some recognition because the depth of your contribution might be less.

The second category means that you must find yourself a niche and work over it for years. Indeed, preferably, not too many people in the world should be aware of the problems you have identified. The catch is: how can you know, ahead of time, that the topic and the problems you see now will still be interesting in two or three years? Are you investing in vain? Presumably, if you can follow this strategy, grant applications and recognition may come more easily. But what happens if you get bored?

The two categories relate to how you read papers. If you read papers thinking “maybe I could build on their work”, then you will naturally tend to the first category. Reading a lot of papers on different topics favors random hit-and-run research projects. Are you reading the list of accepted papers looking for clues as to what you will work on next? Are you attending talks to pick up random new ideas?

However, if you tend to “pull” research papers out of the (virtual) library based on your own ideas, then you will more likely gravitate toward the deeper research projects. In this case, your mental filters are much stronger: you tend to filter out everything that does not directly relate to your goals. You may still attend many conferences, and read lists of accepted papers, but your brain will filter most of the data out.

Tuesday, June 24th, 2008

Good research: invent new problems or explain mysteries

Filed under: Academia/Research — Daniel Lemire @ 19:12

It is a lot of work to grind through a research project and get an interesting paper out of it. Mostly, you have to be patient enough and work at it every day. If you follow a sane process, it is difficult to fail entirely.

Picking the right research question is very important, however: it is difficult to recover from a bad choice of topic. There are at least 3 types of good research questions: 1) explain with a theoretical model a (puzzling) experimental observation; 2) improve by at least an order of magnitude an existing technique; 3) make up a new problem and be the first to propose a solution (I call it Turney’s way).

I now believe that options 1 and 3 are far better than option 2. To illustrate my opinion, here is a little scenario:

  • read a paper;
  • think to yourself: I could improve this idea ten times over;
  • get excited, dream of fame, start crafting a paper;
  • late on Friday night, realize your contribution is tiny;
  • keep going (because you have invested so much);
  • months later, publish a weak paper.

So I submit to you Lemire’s first rule of good research: you must either be trying to explain puzzling experimental results, or be inventing new problems. In some sense, it amounts to discarding the “engineering way” which is to constantly perfect existing techniques.

Further reading: I have written much about how I think one can write a good paper and about my usual research process.

Monday, June 23rd, 2008

Lowly tasks you should do

Filed under: Academia/Research — Daniel Lemire @ 8:29

Many of my colleagues never mark assignments. I tend to mark papers on nearly a weekly basis. Why am I doing this? Because I believe that marking assignments is the best way to identify the weaknesses in my courses and learn from my students.

Many researchers never implement their ideas. They let their students do the lowly implementation work. I almost always do at least some of the implementation in all projects I work on. Why am I doing this? Because I believe that you never really understand an idea, even your own, until you have put it in practice. You never know how it feels to ride a bicycle until you have done it once, no matter how great your mind is.

On an unrelated note, my friend Yuhong came over during the week-end. She is a brand-new Software Engineering professor at Concordia University. She bought my wife some gorgeous flowers. Nice.
