Monday, May 21st, 2007

Bureaucracy is the enemy of Science, or is it?

Filed under: Academia/Research — Daniel Lemire @ 9:59

Wikipedia defines bureaucracy as having the following three properties:

  • well-defined division of labor;
  • consistent patterns of recruitment and stable linear careers;
  • authority and status are differentially distributed among actors.

Yet, we should all suspect, at least intuitively, that bureaucracy is the enemy of Science. Science is all about discovery and innovation. Bureaucracy is all about control and stability. At best, it is a strange mix.

I’d like to propose that, to foster better Science, we need this type of organization:

  • no formal division of labor;
  • varied recruitment and careers;
  • authority is emerging and ephemeral.

Most researchers I know spend a great deal of their time filling out forms to get grants, or managing students or assistants in order to sustain the linear growth of their careers. Recruitment follows very strict patterns, at least in academic circles. It seems that more and more researchers are actually managers. In 2007, Science mostly occurs inside bureaucracies! What does it mean? Does the innovation really occur inside these bureaucracies? Could the Chinese industry, with a much weaker recent history of academic research, prove more innovative than occidental industries? What am I going to eat for lunch?

Thursday, May 17th, 2007

My favorite Web 2.0 applications

Filed under: — Daniel Lemire @ 8:18

As Sylvie points out, it is difficult to keep track of all of the Web 2.0 applications out there. Maybe it is worth sharing our favorites?

  • Flickr for finding and sharing pictures. I also like Google’s Picasa Web. I use neither very much. I have used YouTube to share videos, however. I think that multimedia sharing sites are here to stay.
  • Del.icio.us is not bad for sharing bookmarks, but I think that everyone agrees that it could be better designed. Peter prefers StumbleUpon. Myself, I could never quite get any bookmarking service to work for me. The reason is simple: I see the action of bookmarking something as an “event”, not as the construction of a list. My set of favorite sites is a stream… not a list! So blogging is better suited for bookmarking. Plus, I like to explain why I like or do not like a certain site. Tagging is not enough.
  • Spresent is not bad as a PowerPoint replacement. It seems odd that Google Docs and Spreadsheets does not include something like Spresent.
  • Swivel is the best Web 2.0 data browsing site. Do check it out!

It seems there are still plenty of opportunities for Web 2.0 entrepreneurs. But the list of Web 2.0 applications is already quite large. There is no question, in my mind, that the Web is the platform now and for the future. Most companies focusing on desktop applications are missing the boat. The Web can do almost anything. Exceptions include:

  • Nontrivial programming. I am not expecting a Web 2.0 site where you can drop your Fortran or C++ code. However, I think that we could see far more programming out there. Why can’t I program in Python live in a Web 2.0 site? And maybe design my own applications? Part of the concern is resource hogging, and that is difficult, but not impossible, to manage. It is odd that Web 2.0 applications are designed on the desktop. Where are the good Web 2.0 text editors and IDEs?
  • Nontrivial drawing. Drawing and editing pictures is a demanding, memory-intensive task. This is not likely to move to the Web for now. However, I am surprised that we do not see more Web 2.0 drawing and image editing tools.
  • Games. Mostly, the Web failed at moving from 2D to 3D, so games remain desktop applications. This may change, eventually… especially now that almost all computers have fancy graphics cards.

Why doesn’t this surge of Web 2.0 foster more interest in Computer Science? Indeed, just as we are reinventing the software industry, Computer Science is becoming the new Physics. Maybe it is because there is very little Computer Science (in the strict sense of the term) involved in designing a Web 2.0 application? Another explanation is that the design of a good Web 2.0 application is just that, design. The main difficulty is in coming up with an elegant solution to problems. Algorithms, data structures, and so on, must be in the picture, but they are a very minor component of the work. Learning the programming skills is overall not difficult. Designing something beautiful is the whole trick. Also, you have to leverage the social network.

Wednesday, May 16th, 2007

AVI 2008 (December 5, 2007 / May 28-30, 2008)

Filed under: Passed CFP — Daniel Lemire @ 20:01

AVI 2008 will be held May 28-30, 2008 in Napoli, Italy. You could do much worse as far as locations go! They invite short (4 pages) and long (8 pages) papers (ACM style). The list of topics is just what you’d expect.

Amazon.com to Launch DRM-Free MP3 Music

Filed under: — Daniel Lemire @ 16:51

This was unavoidable. Amazon.com will sell DRM-Free MP3 Music.

If Amazon.com is good at one thing, that’s selling stuff over the Web. They are the best at it. They have brand recognition. They have excellent technology and solid engineering. They are an innovative, hard-to-catch company. Now? They go DRM-free. This means that you download the MP3 and you just copy and paste it wherever you want it. No messing around with copy protection.

Apple has been doing OK with iTunes despite their DRM approach, which I find repulsive, but their business is not selling music. So far, as far as I know, nobody has managed to sell music for a profit, with or without DRM. If anyone can do it, it is Amazon.com. And if they pull it off without DRM, this will be a major setback for DRM initiatives. Why would a customer ever accept DRM when he doesn’t have to?

Journal of Interesting Negative Results in Natural Language Processing and Machine Learning

Filed under: Academia/Research — Daniel Lemire @ 13:02

I often rant about the bias we find in modern science toward positive results. That is, the typical research paper in Computer Science is about some new technique that improves over the previous techniques. Not everyone focuses on this sort of paper, but such papers are the easiest to get accepted and they are often not very difficult to write. It is often easy to pick a problem and find some way to improve some existing technique. Is this worth your time though? And more importantly, why would a reader care?

(I am guilty of writing such papers myself too!)

Well, some people agree because I just found out about the Journal of Interesting Negative Results in Natural Language Processing and Machine Learning. What a cool title! It is also a pretty serious venture since I recognize a few names such as Guy Lapalme and Stan Matwin.

Tag-Cloud Drawing: Software Available for Download

Filed under: — Daniel Lemire @ 9:51

Owen made available the source code related to our recent WWW 2007 workshop paper Tag-Cloud Drawing: Algorithms for Cloud Visualization. It should run fine under MacOS and Linux; Firefox is needed. There is a mix of Java code and C code. Take your pick: lemur-tagging-0.0.zip or lemur-tagging-0.0.tar.gz. A port to Windows should not be difficult to achieve.

Monday, May 14th, 2007

Web Mining 2.0 (June 30, 2007 / September 21, 2007)

Filed under: Data Warehousing and OLAP, Passed CFP — Daniel Lemire @ 20:10

There is going to be a Web Mining 2.0 Workshop at ECML/PKDD 2007 (Poland). I like the three challenges set forth by the organizers of the workshop:

  1. New data types appear, for which there exist currently no out-of-the-box data mining solutions, for instance for the triadic hypergraph structure of folksonomies or for documents in wikis that permanently change over time.
  2. The majority of Web 2.0 users have no skills in knowledge engineering and data mining. Tool support targeted directly at the end user has thus to hide the complexity usually involved in the different data mining steps (eg, data cleaning, parameter settings).
  3. Mobile Web 2.0 applications have the potential to offer huge amounts of different types of data: localization is added to temporalization.