Some consider the P = NP problem to be the most important Theoretical Computer Science problem. It asks whether all problems whose solution can be verified quickly, can also be solved quickly. If you can answer this question, you win one million dollars.

The catch is that quickly is defined as in polynomial time. Thus, if you can solve a problem in time O(nx) where x is the number of electrons in the universe, then you are quick.

This is an annoyingly silly definition. Bubble sort is in O(n2). Yet, try replacing all sorting within Google’s infrastructure by this quick algorithm. Google would die.

Lance Fortnow asks us to take for granted that proving P = NP implies we have fast algorithms for all NP problems. William Gasarch just predicted that proving P ≠ NP would also help real world of algorithms.

Unknowingly to them, Zeilberger proved that P=NP on April 1, 2009. Yet, nothing happened. (He was kidding. Or was he?) Anyhow, enough with the dogma! While intermediate steps in the solution of the problem might be critically important to our understanding of computation, knowing that P = NP is inconsequential technologically.

This is as silly as claiming that sending men to Mars will cure cancer. It might very well that the research necessary to send men to Mars might lead to major breakthroughs, but whether we go to Mars or not has nothing to do with cancer.

Yes, please send men on Mars. Yes, please work on the P = NP problem. But stop claiming the answer would change the world.

Further reading: See my older post Is P vs. NP a practical problem?. Dick Lipton wrote a related blog post as well.

Michael Stonebraker is predicting that the dominance of the generic relational database is coming to an end. Having recently founded several database companies, he has a vested interested in this prediction .

Here is Stonebraker logic: we can outperform relational databases with specialized solutions. Therefore, users will migrate to specialized engines. In effect, specialized players such as Vertica will grab market shares from Oracle Database and Microsoft SQL Server.

Unfortunately, Stonebraker’s arguments are misleading. As far as performance is concerned, Stonebraker is obviously right: we are undergoing major changes. As pointed out by Daniel Tunkelang, you can store a lot of data in 32GB of RAM. Solid-state drives can be used to wipe out some IO bottlenecks. Yet, these technological changes will not change the game for two reasons:

  • We have always been able to outperform generic relational databases: (1) column stores have been around since the seventies when they were called transposed files (2) search engines have always used their own indexes (3) lightweight key-value engines like Tokyo Cabinet have always been around. Generic relational databases did not achieve dominance due to their superior performance.
  • Generic relational databases are frequently catching up to specialized engines. In particular, they are not limited to row stores. Curt Monash’s blog post on Oracle’s hybrid columnar approach makes this obvious. Nicolas Bruno, in Teaching an Old Elephant New Tricks, predicted that the lessons learned by start-ups such as Vertica will be integrated into traditional relational engines.

Further reading: I was motivated by the latest StorageMojo blog post. See also my blog posts Trading compression for speed with vectorization, Changing your perspective: horizontal, vertical and hybrid data models, Column stores and row stores: should you care? and Native XML databases: have they taken the world over yet?

You must do many silly things to get a large research grant:

  • You must know precisely what you will do for the next five years. Yet, in my experience, good researchers only have a vague idea of where they will be in 5 years. If you know the promising research directions you will encounter, you are an astrologer, not a scientist.
  • Your plan must be infallible. Yet, in my experience, research ideas that never fail are not research ideas at all! They are mere applications of known principles. There is no research without risk. The more ambitious the research, the riskier it is.
  • Your work must revolutionize your field. Yet, as stated above, it must also be  infallible.
  • Your work must have crucial applications. Make it up if you must! When Einstein came up with E=mc2, he should have pointed out the possibility of dropping nasty bombs on Japan. Again, we expect scientists to know the future.
  • You must work in large teams, packing up a broad range of researchers. Yet, there is no clear evidence that correlation exists between the resort to extramural collaboration and overall research performance. The goal—once more—is to minimize risks. It is unclear whether it actually reduces the risk at all. And the cost is added bureaucracy.
  • You must promise to train many new Ph.D.s. Somehow, flooding the job market so that there are hundreds of qualified applicants for every research job is highly rewarded.

Thankfully, Peter A. Lawrence has many good ideas on how to improve the system:

  • Grant applications should be short. Quick to prepare. Quick to process. Currently, it is not uncommon for scientists to spend months writing grant applications, and weeks reviewing them. Wasteful!
  • You should be able to apply for grants based on your record alone. The requirement to submit a project or research program is silly.
  • We should be critical of large groups. For some large projects, such as building a multi-billion-dollar machine, you need a group. To investigate the properties of some algebraic structure, you don’t.
  • You should only offer a handful of research papers for review. Pick your top 3 papers and don’t mention the rest. In Canada, NSERC applied this idea, but they did not go far enough. They ask us for our top contributions, and then they ask for the full list. By reviewing researchers on only their best work, we would lift the incentive to produce many weak papers.

I conclude from a quote by Michael Nielsen:

Some years back I constructed a list of papers I especially admired, and was surprised to discover that with only a few exceptions they were produced from unfunded research. This was sobering, since it suggest that receiving research grants was (…) anticorrelated with doing work of the highest quality. Grants seem to be good at sustaining an established area, but not very good at all at producing the conceptual innovations that start new subfields. (Michael Nielsen)

(And before anyone jumps at this… Michael did not write that having a large grant was incompatible with highly original research.)

If you seek approval above all else, you are unlikely to innovate outside the rigid bounds of the current system:

  • You do not convince existing journals to give more respect to this new field you created. You go out and create your own journals and conferences. John von Neumann did not wait for his colleagues to approve of his work on Computers. In fact, he had to use threats to get what he wanted.
  • You do not convince libraries to embrace e-commerce. You create Amazon.com.
  • You do not convince university librarians to stop worrying about what the publishers need. You go out and create Google Scholar.
  • You do not wait for Amazon.com’s management to approve the use of a recommender system. You do what Greg Linden did and squeeze the feature in during a test.
  • If you can prove Poincaré conjecture, you don’t wait for a journal to approve your work, you just post it on arxiv.

Changing a system from the inside is inefficient and failure-prone. Stop wasting your time and make the old system irrelevant. Do not seek approval. Go out and test your ideas yourself. Listen respectfully to others, but make up your own mind. Work within a large company, scientific community or University if you must, but never fall under the illusion that you can change it by playing fair. Cheating is the fundamental mechanism by which change happens. Cheaters are innovators.

How are you planning to cheat today?

Further reading: Is scientific publishing about to be disrupted by Michael Nielsen

Americans think that the best scientists come from their best universities. To learn more, consider where the influential mathematicians from:

got their degree in the USA 33%
got their Ph.D. in the USA 58%
working in the USA 68%

The American scientific dominance relies partially on the ability of the USA to attract and retain the best and the brightest:

  • 33.6% of European PhDs were attracted to faculty or research positions in the US.
  • 55% of non-European foreign PhDs were attracted to faculty positions in the US.
  • Only 10% of American Ph.D.s go work abroad.

Reference: Influential Mathematicians: Birth, Education and Affiliation

Further reading: It may not matter all that much where you go to college and Big schools are no longer giving researchers an edge?

Next Page »

Powered by WordPress