Wednesday, December 31st, 2008

Where are the academic podcasts?

Filed under: Academia/Research — Daniel Lemire @ 11:42

This blog is also a podcast. Few people notice. I have not posted any audio in months. (If you have never listened to me: I am better at writing English than at speaking it.)

Where are the good academic/research podcasts? What I found so far fitted in these categories:

  • Promotional material for schools or research centers.
  • Capture of audio/video events (such as lectures).

I have never liked lectures. I was the kind of student to never pay any attention in class. Some live talks are good, but most are not.

Podcasting is different however because it can be edited afterhand. You can cut the parts where you are rambling. You can redo part of the podcast if the first take was not good.

But where are the researchers producing good podcasts?

What makes recommender systems work?

Filed under: Science and Technology — Daniel Lemire @ 9:08

Why can we predict tastes? There are several possible explanations:

  • Intrinsically, individuals have predictable tastes. To test this theory, we would need to isolate each individual. Collect their opinions. Then attempt to make predictions. (You also need to prevent the recommender system from giving feedback to the user!)
  • People are influenced by the perceived popularity of items. People tend to like a movie more when they think it is a blockbuster. A  more general statement is: Preferences are constructed in the process of elicitation.

Jon Dron linked to a 2007 New York Times article by Duncan J. Watts which claims that the second factor can easily dominate. The article is based on experiments. However, I have several arguments to support this claim.  Highly cited research papers are often no more interesting than regular papers. Except that you must also cite them, otherwise people will complain that you are missing a reference. The same is true with music or books. I often read novels based because many people read them: that is how, for example, I found la compagnie des glaces or Dune.

Hence,  collaborative filtering is a circular process:

  • People like certain items (mostly) because they perceive that similar people like them.
  • Collaborative filtering matches people with such items.

Of course, that is a bit pessimistic: people are also influenced by the intrinsic values. However, how can you differentiate the social perception from the intrinsic value of an item? Is this novel entertaining, or is it popular? Outside a laboratory, we cannot tell these factors apart!

How well would collaborative filtering work if individuals were isolated? It would work poorly. There are millions of books and research articles. Many are quite good. Without a social component to push us in some directions, we would have diverse tastes.

Monday, December 29th, 2008

Grabbing attention or building a reputation?

Filed under: Academia/Research — Daniel Lemire @ 11:44

Daniel Tunkelang has been writing on the attention economy (here and here for example): everyone is fighting to have your attention, and you only have so much to offer.

Attention is easy to measure:

  • You can record the number of people subscribing to your blog.
  • You can count the number of people citing your research papers.
  • You can point to your number of followers on Twitter or your number of friends on Facebook.

However, I do not blog or write research papers merely to grab attention. Instead, I seek to increase my reputation. While attention fluctuates depending on your current actions, reputation builds up over time based on your reliability, your honesty, and your transparency. To build a good reputation, you do not need to do anything extraordinary: you just need to be consistent over a long time.

Of course, you need to get some attention if you are building a reputation. However, on the long run, the saying build it and they will come, is true. Being present and doing good work is enough. You do not need flashy presentations. Remain lean and mean. Avoid high maintenance operations. Do good quality work.

Saturday, December 27th, 2008

We never invent anything new, yet progress is made!

Filed under: Academia/Research, Science and Technology — Daniel Lemire @ 15:50

Practical innovation explains how per-capita wealth increased eightfold during the last century. Yet, we are constantly reminded that we never invent anything new:

  • Most movies are remake or variations on older movies.
  • Most research papers are variation on a theme.
  • Most products and services are variations on existing products and services.

Even if I invent something recognized as drastically novel, I am sure someone will say “oh! but we did that 20 years ago.” A recent example are the tag clouds which have no equivalent in my pre-Web textbooks. Yet, I am sure we can show that they are variation on much older visualization techniques.

If nothing really new is ever invented, why do we observe so much innovation? As I watch the 1993-1993 X-Files (season 1), I am amazed at how backward these FBI agents appear:

  • They do not carry a cell phone and must run back to a land line to get help. (Update: toward the end of the season, we learn that Scully has a cell phone.)
  • Even though agent Scully uses a DOS word processor (probably Word Perfect), all their archives are on paper or microfilms.
  • While Scully has a modem, it is not clear that she uses any networked application. Agent Mulder does not use a computer? I found no sign of the Web.
  • Files are not shared. For example, Mulder has his own files (the X-files) and apparently, others must come to him to have (paper) copies.

The key? Progress is incremental. Human beings are gregarious for a reason: our innovation process is social. Innovation occurs when new ideas are put into action by society.

For researchers wanting to create innovation, there are several implications:

  • You can only be an effective researcher if you are an effective communicator.
  • You should be connected as much as possible to the rest of society. It is not sufficient to impress the ivory tower: you must reach out.

Thursday, December 25th, 2008

My (short) activity report for 2008

Filed under: Academia/Research — Daniel Lemire @ 18:59

I heard on radio today that the Christmas break should be used to review the past year, and decide where you want to go. Good idea!

What did I do?

  • I published the Lemur Bitmap Index C++ Library.
  • I published lbimproved, a C++ library for Fast Nearest-Neighbor Retrieval under the Dynamic Time Warping.
  • Owen presented our paper Histogram-Aware Sorting for Enhanced Word-Aligned Compression in Bitmap Indexes at DOLAP 2008 (arXiv:0808.2083).
  • I presented our paper Tri de la table de faits et compression des index bitmaps avec alignement sur les mots (French for Fact Table Sorting and Word-Aligned Compression for Bitmap Indexes) at BDA’08 (arXiv:0805.3339). It was my first time in ten years publishing in French.
  • Hazel presetned Pruning Attributes From Data Cubes with Diamond Dicing at IDEAS’08 (arXiv:0805.0747).
  • Our paper Hierarchical Bin Buffering: Online Local Moments for Dynamic External Memory Arrays finally appeared in ACM Transactions on Algorithms. It is available from arxiv: cs.DS/0610128. I published some years ago the C++ source code.
  • Kamel presented our paper Collaborative OLAP with Tag Clouds: Web 2.0 OLAP Formalism and Experimental Evaluation at WEBIST 2008 (arXiv:0710.2156).
  • I published my first online graduate course online (INF 6104) on Information Retrieval.
  • I taught Data Management in XML (INF 6450) and an undergraduate course on Information Retrieval (INF 6460).
  • I published hundreds of blog posts, and some of my post have been relatively popular.

Naturally, there was more to 2008, but you get the idea.

What are my plans for 2009?

  • I will keep experimenting with different types of blogging. My goal is to get better at connecting my main research activities and my blogging.
  • I will continue to work on bitmap indexes. I will also branch out from bitmap indexes to columnar databases in general. Expect a few research papers.  Expect my Lemur Bitmap Index C++ Library, to grow beyond bitmap indexes.
  • I will pursue my interest in collaborative data analysis.
  • I will publish an online course on Data Warehousing.

Again, there are a few more projects, but these are important goals.

My low-tech research tools

Filed under: Academia/Research — Daniel Lemire @ 11:14

I carry a pocketbook and a pen everywhere. At night, my pocketbook is by my bed. All creative workers should carry notebooks.

Organizing and collecting ideas are different tasks. My pocketbook is strictly for collection. Every few days,  I start a new page: a list of reminders on one side, and diagrams on the other side. Important ideas get processed and stored on my laptop. I throw away used pocketbooks.

It is difficult to find quality pocketbooks. Here are my recommendations:

  • My pocketbook must last a few months. Paper must be thick and of good quality. I prefer unlined paper. I need a ribbon marker to quickly find the current page. Paperblanks make some good and inexpensive Pocket Companions fitting the bill. Alas, Paperblanks does not sell directly to customers. The retail info on the Paperblanks’ web site is helpful.
  • I prefer black gel pens.  Rechargeable pens create less garbage and they are often of better quality. For about a year, I have had good luck with my Zebra gel pens.

Note: I do not profit in any way if you buy a Paperblanks pocketbook or Zebra pen.

Tuesday, December 23rd, 2008

Where do presidents and prime ministers go to school?

Filed under: Academia/Research, Science and Technology — Daniel Lemire @ 9:21

In his most recent essay, After the credentials, Paul Graham tells us that in South Korea where “college entrance exams determine 70 to 80 percent of a person’s future.” Fortunately, the Americans know better: “Where you go to college still matters, but not like it used to.”

Paul writes good essays, but they are thin on research. How much is your alma matter a predictor of your success? The research is available. For example, in Regression and Matching Estimates of the Effects of Elite College Attendance on Career Outcomes, Brand and Halaby write:

Our results suggest that in terms of college quality, there is not only no direct effect on mid- and late-career attainment, but no significant effect at all.  This study questions the consequential belief that an elite college education necessarily translates into privileged socioeconomic status throughout the life course.

To sum it up: If you are a privilege kid, you will do well even if you go to a local college.

Because my research budget for this blog is $0, I will do my own survey about a special job: the presidency in the USA and the office of prime minister in Canada. Do state leaders attend a small set of colleges?

Let us review where the American presidents got their first degree:

What about Canadian prime ministers?

Based on this evidence alone, if I were to coach a kid for a political career, I would ignore where he gets his degree. This makes sense. You  become president or prime minister several years after earning your degree. By the time you have the experience required for the job, any college premium is gone.

See also my post The 2 myths getting students into heavy-league schools.

Disclaimer: I am a graduate of the University of Toronto, maybe the most prestigious university in Canada.

Next Page »

26 queries. 1.351 seconds. Valid XHTML

Powered by WordPress

Subscribe to this blog in a reader or by Email.