How to solve hard problems

Some people start out in life able to solve hard problems. Others cannot seem to do it. I believe that intelligence is not innate; rather, few people know how to work on hard problems. Some may learn by luck, or by observing smart people.

Here are a few things I was able to learn over the years:

  • Use your intuition, but keep it in check. Hard problems often require that you question every single assumption.
  • Start small and do not stare directly at the nasty problem. Always focus on an easy non-trivial next step.
    Trying repeatedly to solve a hard problem in one pass can be depressing, so get small victories. Try to learn something new about the problem every day.
  • Write a lot. Describe your false starts and explain why they are false starts. Doing so has benefits: you are less likely to go down these paths again and writing tends to bring forth new ideas. Do not worry about filling up notebooks: paper and ink are cheap.
  • Stand on the shoulders of giants: repeatedly go learn about related problems using Wikipedia or Google Scholar. Jot down any result that may help you later.
  • Computers are very powerful assistants: use them to plot your problem or to test out theories quickly. It is sometimes amazing how much you can understand by looking at a plot.

Blogging is and will remain a fringe effect in science?

My friend Sébastien Paquet got me upset. He sent me a link to a post by David Crotty. What David says is that Wikipedia and blogging, the whole Web 2.0 fad, are not having and will not have an impact on science. (Update: Not quite what David wrote.)

Ok David. I can respect your opinion on the matter. But it gets ugly when you bring Linux into the fold:

But when you step away from the enthusiasts and speak with the majority of scientists, you find out that they don’t have much interest in using many of these new technologies. The whole situation reminds me quite a bit of what one saw online regarding the Linux operating system 5 to 10 years ago. You saw great enthusiasm, and predictions that Linux was soon to take over the computing world. The rest of the world shrugged, and went back to their Windows computers to get their work done.

Back in 1998, using Linux required a fair amount of faith. Getting Linux running smoothly on a PC without help was a matter of days. Few companies used Linux for job-critical applications.

In 2008, the Linux market represents $35.7 billion. Unless you manage to avoid using Google, you use Linux every single day. Walmart sells Linux-based PCs.

David, you have chosen a terrible analogy. Linux has succeeded beyond any sensible expectation. Nobody predicted that Linux would kill Windows. Linux was not out to kill Windows.

Nobody is predicting that blogs will replace journals. Scientists do not blog because they think it is the new medium that will replace conferences and journals. However, blogging has had and will continue to have a serious impact on science.

Google has broken my roman numeral captcha

Maverick Woo sent me an email to let me know that Google does roman numeral arithmetic.
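
For the curious, here is a minimal sketch of what roman numeral arithmetic involves: parse each numeral to an integer, do the arithmetic, and convert the result back. This is only an illustration and says nothing about how Google implements the feature. The names `to_roman`, `from_roman` and `add_roman` are hypothetical, and the sketch assumes standard subtractive notation for values from 1 to 3999.

```python
# Value/symbol pairs in descending order, including the subtractive forms.
ROMAN_VALUES = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(n: int) -> str:
    """Convert an integer in 1..3999 to a roman numeral."""
    if not 0 < n < 4000:
        raise ValueError(f"cannot represent {n} as a roman numeral")
    parts = []
    for value, symbol in ROMAN_VALUES:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)


def from_roman(numeral: str) -> int:
    """Parse a roman numeral, rejecting non-canonical forms such as 'IIII'."""
    text = numeral.upper()
    total = 0
    position = 0
    for value, symbol in ROMAN_VALUES:
        while text.startswith(symbol, position):
            total += value
            position += len(symbol)
    # Reject leftover characters and spellings that do not round-trip.
    if position != len(text) or total == 0 or to_roman(total) != text:
        raise ValueError(f"not a valid roman numeral: {numeral!r}")
    return total


def add_roman(a: str, b: str) -> str:
    """Add two roman numerals and return the sum as a roman numeral."""
    return to_roman(from_roman(a) + from_roman(b))
```

For example, `add_roman("XIV", "VI")` parses 14 and 6 and converts their sum back to a numeral.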

I can’t help but imagine the discussion between the Google engineer and his boss:

  • (Engineer) Hi boss! I plan a new feature for our search engine… roman numeral arithmetic!
  • (Harvard MBA) What a great idea! (Thinking to himself: I need to replace this guy.)

As a basis for comparison, it seems that Yahoo! does not have this feature.

Besides my blog’s captcha, where else do you ever use roman numerals?

Multicore programming? Yawn!

It looks like Intel is trying to push parallel programming. No doubt many colleges are going to ride the parallel-programming hype and predict a new surge of interest in Computer Science. Alas, there is no upcoming multicore revolution in computer programming.

  • For a large fraction of enterprise problems, the bottleneck is at the database level. The ubiquity of Web servers and distributed databases (see CouchDB) implies that many such problems are already parallelized. Database techniques like partitioning have been around for years to help you parallelize your databases. This blog runs on a server with several processors, and it has done so for years. Nothing new on the horizon.
  • MapReduce and Hadoop help you parallelize many of the remaining hard data processing problems without having to mess with threads, locking and synchronization.
  • Many hard problems are memory-bound: they are hard because all of the data does not fit in memory. If your problem is memory-bound or IO-bound, throwing more processing cores at it may not help at all.
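
To make the partitioning and MapReduce points concrete, here is a minimal sketch of the pattern as a word count. It is not Hadoop's API: the function names (`partition`, `map_phase`, `reduce_phase`, `word_count`) are hypothetical, and the partitions are processed one after another in a single process. The point is only that each partition is handled independently, with no threads, locks or shared state, so a framework is free to hand each one to a different machine.

```python
import zlib
from collections import Counter
from functools import reduce


def partition(lines, n_partitions):
    """Hash-partition the input lines into n_partitions independent buckets."""
    buckets = [[] for _ in range(n_partitions)]
    for line in lines:
        # crc32 gives a stable hash, so a line always lands in the same bucket.
        index = zlib.crc32(line.encode("utf-8")) % n_partitions
        buckets[index].append(line)
    return buckets


def map_phase(bucket):
    """Count the words in one bucket; needs nothing from the other buckets."""
    counts = Counter()
    for line in bucket:
        counts.update(line.lower().split())
    return counts


def reduce_phase(partial_counts):
    """Merge the per-bucket counts into a single result."""
    return reduce(lambda left, right: left + right, partial_counts, Counter())


def word_count(lines, n_partitions=4):
    buckets = partition(lines, n_partitions)
    # A real framework would run these map calls on separate workers.
    partials = [map_phase(bucket) for bucket in buckets]
    return reduce_phase(partials)
```

The result does not depend on the number of partitions, which is what lets the same job scale from one machine to many.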

I have stated for a couple of years that storage, not processing power, is changing Information Technology. What is most amazing is our ability to record almost every single bit of information, and never have to delete or forget anything. On this topic, see my posts One More Step Toward Infinite Storage, Solid-state drives: when external memory becomes as fast as internal memory and What is infinite storage?

The truth is that we are not very good at dealing with large quantities of data. Does anyone know what to do when handed 50 terabytes of raw data? Few of us have the required skills to manage and leverage extremely large databases. Those will be the valuable skills in the future.

BIRTE 2008 (May 9, 2008 / August 24-30, 2008)

The Second International Workshop on Business Intelligence for the Real-Time Enterprise (BIRTE 08) will be held in conjunction with VLDB’08 in New Zealand. You can see the list of papers accepted at BIRTE 2006 on DBLP.

In today’s competitive and highly dynamic environment, analyzing data to understand how the business is performing, to predict outcomes and trends, and to improve the effectiveness of business processes underlying business operations has become critical. The traditional approach to reporting is no longer adequate: users now demand easy-to-use intelligent platforms and applications capable of analyzing real-time business data to provide insight and actionable information at the right time. The end goal is to improve the enterprise performance by better and timelier decision making, enabled by the availability of up-to-date, high quality information.

Reputation still holds in education… for how long?

Readers of this blog who think that I am a bit mad would do well to go read the latest Cringely:

(…) reputation still holds in education, though its grip is weakening. (…) MIT threw videos of all its lecture courses – ALL its lecture courses – up on the web for anyone to watch for free. This was precisely comparable to SGI (remember them?) licensing OpenGL to Microsoft. What is it, then, that makes an MIT education worth $34,986? Is it the seminars that aren’t on the web? Faculty guidance? Research experience? Getting drunk and falling in the Charles River without your pants?

Compare this quote with my posts The 2 myths that gets students into heavy-league schools, Who needs your lectures?, and It may not matter all that much where you go to college.

Yep. Not long ago people bought European electronics because they were supposedly better. Now? Those days are long gone.

Large groups in science

Paul Graham wrote an essay that will get people talking: You Weren’t Meant to Have a Boss. The gist of the argument is that large groups like research centers, companies and universities are inefficient artificial constructs limiting people’s freedom. I believe this sentence says it all:

It will always suck to work for large organizations, and the larger the organization, the more it will suck.

Anyone who has worked on a large research project knows that one of two things tends to happen. When the project is well structured, nothing interesting happens. Papers will get published, but no new insight will be produced. Sometimes the odd graduate student will have used his scholarship money to do something exciting, but he will have broken the rules by doing so. On the other hand, when the project has no strong leadership, people will tend to go work on their own and interesting results may come forth, but the big project will basically be an afterthought.

So, why do funding agencies keep on fostering large projects? Almost surely because it sounds good on paper, and it is easier to manage than a large number of small projects. I have yet to see any serious study showing that large projects are a better way to invest the taxpayers’ money. Another argument I sometimes hear is that below-average researchers do better under the guidance of visionaries. I wonder whether any study supports this claim.

Do people in large research centers or large universities produce more? I think that you will find that highly productive groups in large centers work in small units that are largely independent of each other. Hence, I do not think that large centers or large universities are more productive. However, they may attract better people, mostly because going to work at a smaller place is riskier. Paul pointed out this problem:

The average MIT graduate wants to work at Google or Microsoft, because it’s a recognized brand, it’s safe, and they’ll get paid a good salary right away. It’s the job equivalent of the pizza they had for lunch.

I would add that a job at a larger place looks better on your resume. If, as a scientist, you choose to go work at a tiny university, people will assume, sometimes rightly so, that you were not offered a job at a larger organization.

As someone who worked at the largest research institution in Canada (by the number of researchers), I can tell you that large size does not make you better. Having access to a lot of smart people is nice, except that sharing an employer is not a great way to ensure fruitful communication. My own productivity was low until I started saying no to large projects. I find that I am most productive when I work on my own small projects with a few hand-picked people. I find that fighting to keep my freedom and independence is key to producing higher quality research.

Disclaimer. I don’t measure my own productivity solely by the number of papers written. Working 6 months on a single paper — as I have been doing recently — is not being unproductive, because we got interesting results all through the process. I don’t feel compelled to submit ten variants of the same paper to feel productive.

Even a tiny amount of beer makes you less productive?

According to an article in the New York Times, drinking beer is correlated negatively with scientific productivity. What is surprising is that even small quantities of beer are correlated with decreases in productivity.

But correlation is not causality. They have not shown that drinking beer makes you less productive. They have shown that people drinking beer are less productive. (It is not the same!)
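
A small simulation shows how this can happen. The sketch below uses entirely synthetic data and says nothing about the actual study: the variable names (`hidden`, `habit`, `output`) are hypothetical, and the model simply assumes that an unobserved factor drives both a habit and an output while the habit itself has no effect on the output. The two still end up strongly negatively correlated. Once the habit is assigned at random, independently of the hidden factor, the correlation vanishes.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=2000, seed=42):
    """Return (observed correlation, correlation under random assignment)."""
    rng = random.Random(seed)
    # Unobserved factor that influences both quantities (an assumption).
    hidden = [rng.gauss(0, 1) for _ in range(n)]
    # Observational setting: the habit goes up with the hidden factor...
    habit = [h + 0.1 * rng.gauss(0, 1) for h in hidden]
    # ...while the output goes down with it. The habit never enters here.
    output = [-h + 0.1 * rng.gauss(0, 1) for h in hidden]
    # Experimental setting: the habit is assigned at random.
    assigned_habit = [rng.gauss(0, 1) for _ in range(n)]
    return pearson(habit, output), pearson(assigned_habit, output)
```

Calling `simulate()` returns the two correlations side by side: the first is strongly negative even though the habit causes nothing, and the second is close to zero.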

Source: Scott Flinn.

Why is there no new Einstein?

On my blog, the best content is in the comments. Sébastien reminded me of this fact today by offering a link to an article in Physics Today by Lee Smolin. The gist of the paper is that scientists feel a lot of pressure to follow the lead of powerful senior scientists. It is much easier to be productive when you follow established techniques. Any prospective Einstein is crushed by the system.

What are your two biggest accomplishments?

There are many reasons for rejecting a paper. The authors might have failed to communicate their results effectively. There may be a flaw in the science. Or the authors might have cheated. These flaws come from oversights, incompetence, and lack of ethics. But most importantly, they may all be motivated by the greed to publish more and faster.

Today is a bad day. I reviewed or re-reviewed 5 papers from 3 different sources today. The best of these papers is a case of self-plagiarism. Three of the five papers were written by inexperienced students, or so it appears, with minimal or no supervision from a senior researcher. The last one might have made a good blog post.

I believe that we are due for a revolution in science. We need to definitively stop counting the number of papers people produce. This game has run its course. If I interpret what I read correctly, it has become quite counterproductive.

I propose that people list their two biggest accomplishments. It could be an experiment or a theorem they proved. To improve your case, you need to outdo one of your two biggest accomplishments to date. It does not matter if you publish 50 papers a year: you only improve your status if you outdo yourself in a big way.

Students can get started quickly. Senior researchers will have a harder time making progress. I submit to you that industry already works this way. Senior engineers are only as reputable as their two biggest, most difficult projects. It does not matter if you completed 120 small projects.

C.V.s would now fit in a single page. Tell us where you work, where you got your Ph.D. and what the two big things you did are. That is it.
