Monday, April 21st, 2008

Collaboration in Science: Three models

Filed under: Academia/Research — Daniel Lemire @ 17:43

Scientists collaborate frequently. Most science articles have at least two authors.

Some collaborations work well, others fail. The first step to understanding what went wrong is to categorize the collaboration. I distinguish three types:

  • Hierarchical collaboration: the student collaborates with his supervisor, the researcher collaborates with his manager. This form of collaboration is usually long-lived. It usually depends on the available funding and is usually more conservative in nature. The lower you are in the hierarchy, the more you work, usually.
  • Symmetric collaboration: two mathematicians write papers by exchanging conjectures over email. This form of collaboration does not scale well to large numbers: the communication overhead grows quadratically.
  • Topical collaboration: a philosopher writes a paper with a software engineer to describe the philosophy of software engineering. This form of collaboration can suffer from communication problems. The collaboration is usually project-centered. It might be risky research. I would expect this form of collaboration to be especially fruitful. Oddly enough, I cannot think of any famous example of topical collaboration in science.
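The quadratic growth mentioned under symmetric collaboration can be made concrete with a small sketch. It assumes that every pair of collaborators needs its own communication channel, so a group of n people has n(n-1)/2 channels; the function name `pairwise_channels` is made up for this illustration.

```python
def pairwise_channels(n: int) -> int:
    """Number of distinct pairs among n collaborators (assumed: one channel per pair)."""
    if n < 0:
        raise ValueError("the number of collaborators cannot be negative")
    return n * (n - 1) // 2


if __name__ == "__main__":
    # Doubling the group size roughly quadruples the number of channels.
    for size in (2, 5, 10, 20):
        print(size, "collaborators:", pairwise_channels(size), "channels")
```

Two mathematicians share a single channel, but under this assumption a group of twenty would have to maintain 190 of them.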

See also The lonely researcher: a loser?

Friday, April 18th, 2008

The “e” prefix is obsolete

Filed under: Science and Technology — Daniel Lemire @ 8:21

Nicholas Carr asked whether IT departments mattered. What is IT all about? e-Collaboration, e-Mail, e-Learning, e-Health, e-Business, and so on. Does the “e-” matter?

I am working on a graduate program in e-collaboration. At some point, I had to stop and think… isn’t all collaboration electronic? Even the construction workers use cell phones and PDAs.

Does anyone who is seriously ill fail to look up their disease on Wikipedia and join related message boards to meet other people who have the same disease?

Do you know any student who fails to use the Web to help them in their classes?

Do you know any business that is not also an e-Business? Even the shops at my local market have computers on their stands so that you can pay with a debit card.

Source: This idea came in an e-discussion with Daniel Tunkelang.

Thursday, April 17th, 2008

What is academic blogging about?

Filed under: Academia/Research — Daniel Lemire @ 8:56

From the lowly Ph.D. student at a small school to the Harvard professor, researchers are blogging. Here are some of the reasons why they blog:

  • Research is a social activity. Blogging allows us to keep and create links with diverse researchers whose varied interests keep our minds open and fresh.
  • Blogging is a personal activity, whereas most of science is consensual. Hence, blogging helps to promote ideas that would not survive otherwise. It is easier to go against the grain in a blog than in a research journal.

My thesis is that blogging will ultimately be recognized as an activity encouraging true innovation.

Tuesday, April 15th, 2008

Why aren’t there more scientific breakthroughs?

Filed under: Academia/Research — Daniel Lemire @ 9:45

Most research papers are boring. They rehash existing work with almost no new insight. Mihai Pătraşcu blamed me for saying that big conferences were for people without imagination: what I actually wrote was that focusing on prestigious conferences tends to encourage self-reinforcing biases. The recipe is simple: a senior researcher defines whatever he does as “the right way,” the young researchers follow the senior researcher for selfish reasons, and finally, the community grows and whoever questions it is rejected. Surely, 10,000 bright people cannot be wrong? We invested millions of dollars into this field, so surely it cannot be wrong? Eventually, you get a catastrophe like classical AI or string theory. These fields are not bad in themselves, but they grab most of the attention and most of the funding for long periods of time. In effect, they work to prevent any competition from rising up. Any good gardener knows better: monocultures are weak; you need a diverse set of plants.

Science should be about fostering competing ideas. You should want many people to challenge your ideas. I believe that we should encourage diversity and true innovation.

Paul Graham tells us Why There Aren’t More Googles. Let me revisit his essay with my concern for the lack of scientific breakthroughs:

And yet it’s the bold ideas that generate the biggest returns. Any really good new idea will seem bad to most people; otherwise someone would already be doing it. And yet most program committees are driven by consensus. The biggest factor determining how a program committee will feel about your research idea is how other researchers feel about it. I doubt they realize it, but this algorithm guarantees they’ll miss all the very best ideas. The more people who have to like a new idea, the more outliers you lose.

I challenge program committees: the list of accepted papers should be diverse. If you see strong clusters of similar ideas, and most prestigious conferences have them, then you have failed to foster the next scientific breakthrough.

You may object: surely, if an idea is any good, it will survive rejection by a program committee or two? Of course it will, but if you do not encourage crazy research, it may come much later.

Friday, April 11th, 2008

Do you share and index your history?

Filed under: Science and Technology — Daniel Lemire @ 13:05

When you edit a document, some software will automatically generate a new version of the document and allow you to see what changed. If the software is sufficiently smart, you might even know when and by whom the change was made. Wikipedia is good at keeping traces of everything. Email and blogs leave traces. Videoconferencing does not usually leave traces: you cannot replay a Skype conference after it has concluded.

However, beyond keeping traces, software can share and index traces. Each email is a trace of a conversation, and it can be retrieved later, but your emails are not shared. Word processors allow you to send a document with recorded changes, but you cannot easily refer to a specific change in the document.

In any case, I made the following table:

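| Medium                                         | Keep history | Share history | Index history |
|------------------------------------------------|--------------|---------------|---------------|
| Word Processing                                | Sometimes    | Manually      | No            |
| Email                                          | Yes          | Manually      | Yes           |
| Phone, Videoconferencing (Skype), face-to-face | No           | No            | No            |
| Facebook, Twitter                              | Yes          | Yes           | No            |
| Blog                                           | Yes          | Yes           | Yes           |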

Credit: This idea came after a discussion with Sébastien Paquet.

Wednesday, April 9th, 2008

Automatic domain name generation?

Filed under: Science and Technology — Daniel Lemire @ 21:35

I do not usually link to random research papers, but this one is worth a look: Kwyjibo: automatic domain name generation. Here is the abstract:

Automatically generating good domain names that are random yet pronounceable is a problem harder than it first appears. The problem is related to random word generation, and we survey and categorize existing techniques before presenting our own syllable-based algorithm that produces higher-quality results. Our results are also applicable elsewhere, in areas such as password generation, username generation, and even computer-generated poetry.

This is fun research.
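To give a feel for the problem, here is a minimal sketch of a naive pronounceable-name generator. It is not the algorithm from the paper: it simply glues together random consonant-vowel syllables. The letter sets, the function names and the ".com" suffix are all assumptions made for this illustration.

```python
import random

# Assumed letter sets for this illustration; a real system would likely be
# trained on actual words rather than use fixed alphabets.
CONSONANTS = "bdfgklmnprstvz"
VOWELS = "aeiou"


def random_syllable(rng: random.Random) -> str:
    """Build one consonant-vowel syllable, e.g. two letters such as 'ka'."""
    return rng.choice(CONSONANTS) + rng.choice(VOWELS)


def pronounceable_name(rng: random.Random, syllables: int = 3) -> str:
    """Concatenate random syllables into a candidate name."""
    return "".join(random_syllable(rng) for _ in range(syllables))


if __name__ == "__main__":
    rng = random.Random(2008)  # fixed seed so that runs are repeatable
    for _ in range(5):
        print(pronounceable_name(rng) + ".com")
```

Names built this way are easy to say but tend to sound monotonous, which hints at why the authors call the problem harder than it first appears.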

Monday, April 7th, 2008

Second KDD Workshop on Large-Scale Recommender Systems (May 30, 2008 / August 24, 2008)

Filed under: Passed CFP — Daniel Lemire @ 13:04

The second Workshop on Large-Scale Recommender Systems will be held at KDD 2008 in Las Vegas. The topics of interest include:

  • Novel recommendation models, emphasizing accuracy, performance and asymptotic behavior
  • Scalability problems in recommender systems
  • Novel evaluation methodologies for recommendation quality
  • Efficient integration of multiple complementary predictors
  • Studies of content-filtering vs. collaborative filtering and their integration in large-scale environments
  • Explaining and presenting recommendations to end-users
  • Idiosyncrasies of the Netflix Prize dataset and lessons learned from its analysis
  • The Netflix Prize competition at large

I have written about the Netflix Prize competition before on this blog.
