AAAI 2008 (January 25, 2007 / July 13-17, 2008)

AAAI-08 will be held in Chicago. It is one of the largest annual conferences in AI. It also has a special AI and the Web track.

Research productivity: what matters?

I stumbled upon this nice paper Social-Organizational Characteristics of Work and Publication Productivity among Academic Scientists in Doctoral-Granting Departments (Journal of Higher Education, 2007). I skimmed it and here are some sketchy conclusions:

  • Being a man and having lots of male graduate students is highly correlated with productivity.
  • Collaboration is strongly correlated with productivity. However, it is difficult to say whether there is any causality. It could be that when you are productive, more people seek to work with you.
  • Working on several projects is strongly correlated with productivity. However, is there any causal relationship? Maybe when you are more productive, you take on more projects?
  • Work climate and location matter only moderately or not at all.

Storytelling and research papers

I often read that good research papers should tell a story. There should be a continuous flow. We should care about the story; we should be eager to learn what will happen in the next section.

I do not know about you, but I do not come across many such research papers. Mathemagenic points us to some references, going back to Plato, as to why storytelling is not taken seriously. It seems to me that our current approach to research papers assumes that knowledge is axiomatic: you can decompose your paper into facts that can be laid out in a formal language. Whether this is true or not, I do not care: the fact of the matter is… I do not write research papers to lay out facts. I wish it were that simple.

In any case, I decided to do some Googling on the topic to see if I could find new clever techniques to make my research papers more exciting (irrespective of the quality of the science), and I found this related piece of advice:

A good way [to describe your results] is to tell a story, an interesting one that puts everything into perspective re the existing literature and conveys how it is you succeeded where others failed. What was the key idea which nobody else spotted? It should not reflect the actual historical progress of your research (which may have been long and winding) but rather based on how your thinking should have gone with the benefit of hindsight. This is not quite the same as the shortest logical path (which would not be understood until after the paper is read), but rather involves an historical element with reference to works and ideas that the reader might already be familiar with.

What I could not find were good examples of storytelling in research papers. Does anyone have a pointer?

Reference: Hints for New PhD students on How to Write Papers (Shahn Majid)

Is PageRank just good marketing?

Web search engines such as Google look at which pages link to which pages to determine which Web pages are authoritative. A good algorithm in this context is one that is hard to fool: if you and your friends decide to link to each other, it should not make much of a difference. Sérgio commented earlier on this blog that PageRank is known to be just marketing. So I decided to go hunting. Up until now, I thought PageRank was a clever idea because it feels like it would be harder to fool than just counting how many inbound links a page has. It was not very long before I found a reference that supported Sérgio’s claim:

Log of indegree was highly correlated with Google-reported PageRank scores, and just as effective when predicting desirable company attributes. Further, we found that PageRank scores for sites within a known spam network were no lower than would be expected on the basis of their indegree. We encounter no compelling evidence to support the use of PageRank over indegree.

Reference: T. Upstill, N. Craswell, and D. Hawking, Predicting fame and fortune: PageRank or indegree?, ADCS 2003.

Does anyone know of any demonstrated benefit of PageRank over merely counting the number of inbound links? Is PageRank more resilient at all?
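
To make the intuition concrete, here is a minimal Python sketch comparing the two scores on a toy graph. Everything in it is an assumption made for illustration: the page names, the graph, the damping factor of 0.85, and the choice to spread the rank of pages without outgoing links uniformly. It is the textbook power iteration, not whatever Google actually runs.

```python
def indegree(graph):
    """Count inbound links for every page in an adjacency-list graph."""
    counts = {page: 0 for page in graph}
    for targets in graph.values():
        for target in targets:
            counts[target] += 1
    return counts


def pagerank(graph, damping=0.85, iterations=100):
    """Textbook PageRank by power iteration (illustrative, not Google's code)."""
    n = len(graph)
    rank = {page: 1.0 / n for page in graph}
    for _ in range(iterations):
        # Assumption: pages with no outgoing links share their rank with everyone.
        dangling = sum(rank[page] for page, targets in graph.items() if not targets)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {page: base for page in graph}
        for page, targets in graph.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank


# Hypothetical toy Web: "good" and "spam" both receive three inbound links,
# but the pages linking to "spam" (f1..f3) are themselves linked by nobody.
graph = {
    "good": [], "spam": [],
    "a": ["good"], "b": ["good"], "c": ["good"],
    "f1": ["spam"], "f2": ["spam"], "f3": ["spam"],
    "r1": ["a"], "r2": ["a", "b"], "r3": ["b", "c"], "r4": ["c"],
}

counts = indegree(graph)
ranks = pagerank(graph)
for page in ("good", "spam"):
    print(page, counts[page], round(ranks[page], 4))
```

On this graph the two pages tie on indegree, while PageRank scores "good" above "spam" because its linkers are themselves linked to. This only shows why PageRank feels harder to fool; it says nothing about real spam networks, where the farm pages can link to each other and close the gap, which is consistent with what the paper reports.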

Update: do read the comments! They are more interesting than my post.

Why bother with Google? Go straight to Wikipedia!

Véronis discovered something very interesting. About a third of the time, Google’s results include the Wikipedia link as the first link. His explanation is insightful:

How can this sudden interest in Wikipedia by both engines be explained? It is undoubtedly connected with the increasing difficulty engines have in calculating satisfactory ranking. The good old days of PageRank algorithms are over. (…) The explosion of blogs and news sites has changed the situation considerably.

If Web topology cannot cope anymore, this means we need to introduce time as a factor. Any takers for a hypergraph version of PageRank? What do you call a time-varying Markov process?

When has a problem been solved?

I have stated before that researchers should focus on new problems or on providing solutions that are at least an order of magnitude better than previous solutions. There is a catch to this statement: it says that if you are within an order of magnitude of the ultimate answer, then you should stop, unless, maybe, you can prove that you have achieved the ultimate solution. Proving you have the best possible solution where others were providing approximations does constitute a significant gain, certainly worth publishing, but this is rarely possible. Most real problems are too complex to allow our puny brains to prove that a solution is ultimate.

So do we just accept that being within an order of magnitude of the answer is good enough? If you are within an order of magnitude of perfection with respect to all indicators, simultaneously, then maybe you ought to stop. Yes?

Another catch to this is that you may not know exactly how far off you are from the best solution. It might be very difficult to study the characteristics of the ideal solution. What then? Do we still hold off on publishing incremental improvements to existing solutions? Do you call the problem solved if, over a long period of time, nobody was able to improve the state-of-the-art by an order of magnitude?

Food for thought: Recently, John Riedl asked on his blog whether we could tell when spam filters would get to be good enough. My immediate answer was to apply the Turing test: a spam filter is good enough when it has achieved a human level of performance. Yet, I know this is not the answer. Nothing is ever perfect, but my level of performance is far from the ultimate goal. I doubt spam filters will ever pass my Turing test, but even if they did, I am likely not to be satisfied. One false positive is still one too many.

Physical factors making you smarter: white noise, carbohydrates, music, alcohol, and coffee?

picture by Pete Barr-Watsn

Disclaimer: this is not meant to be a scientific survey. However, if you disagree with my survey, please do add a comment!

Disclaimer 2: I drink a lot of coffee. I almost certainly reach a point where it negatively impacts my performance because I get too tense to focus. However, I find it preferable to boredom.

Do not write like we taught you to!

picture by dullhunk

It is easy to think that the big deal these days has to do with multimedia (YouTube) or social networks (Facebook), but the written word is changing too! As someone who writes for a living, I am fascinated by how writing has changed drastically in recent years. Of course, the Web has changed the way we write in an obvious way: it has become less of a formal activity and more of a social one. However, even formal writing, such as the production of research papers, has changed a lot. We are in the middle of a revolution.

  • Documents are not standalone objects. Documents are commonly hyperlinked, and when they are not, it is increasingly easy to browse through the documents that reference a given document or that are referenced by it. PageRank is just one example of how links between documents are becoming as important as the documents themselves. I no longer read scientific papers on their own: I always read them as part of a stream of papers in a given area. The fact that I can download a dozen papers on the same topic in about 5 minutes makes a big difference. I very frequently look up the Web pages of the researchers I read, just to see what they worked on besides the paper I read. Most blog posts do not stand on their own: they are part of a worldwide exchange. Also, papers are no longer static objects: several times a year, I will write to an author I read and get feedback from him.
  • Transparency matters. As it becomes easier than ever to make information available, it becomes less acceptable to keep relevant matters secret or to lie. Bloggers are famous for sharing openly: lying on your blog is dangerous because so many people can check your facts. Increasingly, researchers are asked to make source code and data available. You can no longer write for a small community: people outside your little group are likely to stumble on your work as well.
  • Countries and organizations do not write, people do. A journal that accepted one of our papers objected yesterday that, in the reference section, we omitted the location of the publishers and where conferences were held. But I do not care about where the results first appeared! They also asked us which organization was behind each of the proceedings papers we cited: I do not care! Several years ago, I was asked if, as a researcher, I had international collaborations. The question does not even make sense to me.
  • Metadata is more about selling than about describing. Many people still write abstracts as if they had to summarize their work. But I can grab your paper in about 2 minutes and read its introduction in 5 minutes. We no longer mail order science papers. So the abstract should tell me why I need to read your paper. The same holds true for blog posts: your title is not there to describe the blog post, but to tell us why we should care.

How to become smarter

picture by tatianes
  • Work on projects you love doing, even if only part of the time. You can only be as smart as you are motivated. I will never be a smart electrician.
  • Reading and learning are important, but people learn by doing, by tinkering.
  • Carry a notebook or a PDA, and use it to record ideas. Periodically discard most of your ideas.
  • Having a blog can’t hurt.
  • This is probably the most important point: hang around with smart people. If you live among monkeys, you might have a good life, but you will not earn a Ph.D. (except if you are studying monkeys!). Happily, you can easily hang around with smart people wherever you live thanks to the Internet. This is important because if you hang around with people who do great work, you will be motivated by emulation: nobody likes to feel like a loser among his peers.
  • Push yourself: try daring projects and learn to fail. Be ambitious! Do not waste your time with things you know how to do well. Go beyond. Aim as high as you can, while trying to stay on track.
  • Context is important when solving problems. I found that offices are nearly the worst place to work for me. My home office is much better. Sometimes, a coffee place can be a decent alternative office (presumably because of the white noise effect). Sometimes, using a pen is better than a keyboard. Sometimes, working with a laptop in your bed is better than working at a desk. Change, try new contexts!
  • Come back to important projects regularly. Do not get lost in the small stuff.
  • Urgency is an important factor. Somehow, being too happy about what you achieved can slow you down. This suggests that you should be critical of your own work, and you should not underestimate your competitors. Of course, you need to stay motivated, so do not overestimate your competitors or underestimate your own work either!
  • You will not cure cancer in one day. You will not become a pro golfer in a week. You can only solve big problems by dividing them up into small chunks. Always stay focused on the next small step. Do not stare mindlessly at the big picture.

Be physically smart:

  • Omega-3 is good for you and might make you smarter. Eating fish seems like a good idea.
  • When you are tense, eat carbs (bread, cookies). Do not make things worse by drinking coffee.
  • Too much coffee tends to get your mind to speed up and you lose focus easily. You end up getting many things done, but you no longer have time for thinking about the hard problems.
  • When you need energy, eat proteins (cheese, meat, beans). Coffee alone will only help you temporarily; it does not get you through a lot of hard work.
  • Drink a lot of water: after all, your brain is mostly water.
  • Sleep a decent amount. Some people claim sleep-deprivation allows them to get more done, and it might be true, and I do not know of any evidence that sleep-deprivation hurts your brain, but being sleepy does slow you down and tends to get you to work on routine problems.
  • Taking long walks (at least 20 minutes) out in a quiet park, thinking about some deep issues, tends to set me up for good work for the rest of the day.

See also my post My research process.

For further reading and scientific evidence, read my posts Physical factors making you smarter: white noise, carbohydrates, music, alcohol, and coffee? and Thinking intelligence is innate makes you stupid.

Having scientific meetings with brilliant people… in your kitchen?

picture by rich_w

I had two important meetings today. One of them was with my good friend Harold Boley (of RuleML fame) and another well-known professor. The other meeting was with an infamous professor who shall remain nameless.

What is most amazing about these meetings is that they happened in my kitchen, using Skype and the built-in webcam of my MacBook. And these meetings were efficient, to the point, content-rich, and pleasant. Moreover, they were inexpensive. And I don’t mean financially. They required almost no time to prepare. They required no room, no building. They did not require any staff.

Of course, the bandwidth is not quite the same as a live meeting, but this can be a good thing: I do not care to smell your pheromones nor do I insist on seeing the details of your body posture. Moreover, the bandwidth is increasing at a crazy rate.

What does this mean for our future? It means that institutions are no longer required to get the system running. No vice-president, no staff. It means you can run the world from your kitchen. Or at least, get some research done.

See also my post Big schools are no longer giving researchers an edge?
