Tuesday, February 8th, 2005

The Geomblog: "To chop a tea kettle"

Filed under: Academia/Research — Daniel Lemire @ 9:10

Geomblog gives a rather pessimistic view of scientific conferences:

And so the implicit content of many a conference paper is not, as one might think, “Here is my research.” Rather, it is: “Here am I, qualified and capable, performing this role, which all of us here share, and none of us want to question too closely. So let’s get it over with, then go out for a drink afterwards.”

The main issue here is that attending conferences is simply not so useful anymore because communication appears to be one-way: you go to the conference to give your talk, but you have learned not to expect feedback because it is rarely given. At the very least, the question period after most talks is pretty shallow, and you rarely see interesting discussions arise.

A related issue is the quality of peer review, which is definitely very low in many cases: you get three hastily written lines as a review of your work. Whether these three lines are positive or negative is irrelevant; the point is that they were hastily written and often indicate that the reviewer didn’t have time to really read your paper, let alone check it for accuracy or do some research on the topic.

However, I don’t think this should be a major concern. Yes, conferences are becoming more of a ritual and less of a scientific communication hotspot. Yes, traditional peer review is falling apart. However, scientific communication is alive and well. Maybe it is even becoming more civilized in a way: flying a thousand kilometers to present some work, and being afraid someone might make fun of it, that’s not something we should fight for.

I remember a conference I attended a few years back. Several weeks before the conference, I read the abstracts and found out that one young student was presenting work that had already been done in a number of places. I sent a polite email pointing out additional references. As it turns out, the student decided to ignore my email and was publicly blasted… but the point is that email is one form of feedback that might have replaced the public humiliation some miss. I, and people I know, have been told of mistakes in our papers through email. This seems like a very efficient approach. If you find mistakes in my work, or want to question it, I think I would prefer it if you emailed me first… most people would.

Monday, February 7th, 2005

Die trackback, die!

Filed under: Science and Technology — Daniel Lemire @ 16:59

From now on, all trackbacks to this blog are moderated thanks to the moderate-trackbacks plugin. Spammers really have a lot of time to waste. Good thing the WordPress community is very strong and fighting back.

Now, the simplest thing is: do not use trackback. It is a weak protocol (in a spam-infested world) and I’ll probably not moderate these very often, especially if the queue gets very long. Ping my blog instead (pingback specs make spamming difficult).

Update: Downes has a recent post on a related topic: Trackback is Dead, Use PubSub (though I published this post before he published his!).

Update 2: about 2 hours after installing the plugin, 18 trackbacks have already been deleted. That is 9 spams an hour. And I run a low-traffic web site…

Friday, February 4th, 2005

How to change or modify your Linux kernel under gentoo

Filed under: — Daniel Lemire @ 14:29

Here’s a quick guide to upgrading or modifying your kernel under gentoo. I assume you have genkernel installed (do “emerge genkernel”).

First of all, if you only want to add or remove elements to your kernel, or change options, you can do this as root:

genkernel --no-clean --menuconfig all

Make your changes and reboot.

If you want to switch to a different kernel, look under /usr/src. Suppose the source code of your new kernel is in /usr/src/newkernel. Then do

genkernel --menuconfig --kerneldir=/usr/src/newkernel all

Configure your new kernel the way you like it. Then do

rm /usr/src/linux
ln -s /usr/src/newkernel /usr/src/linux

If you have nvidia (binary) drivers, do

emerge nvidia-glx nvidia-kernel

Finally, possibly after mounting /boot (“mount /boot”), edit /boot/grub/grub.conf, basically just changing the name of the kernel throughout. Reboot and you have a new kernel compiled to your needs for your machine.

If you think this is hard, think again! When was the last time you changed your Windows kernel? Adding stuff to the Windows kernel is relatively (too) easy, but it is just as easy with gentoo (a single command followed by a reboot). Actually changing your Windows kernel, on the other hand, is not so easy and is typically only done when you upgrade all of Windows: a task better left to experts.

KDD 2005 (February 18, 2005 / August 21-24, 2005)

Filed under: Science and Technology — Daniel Lemire @ 13:05

The KDD 2005 calls for papers are out. Among other things, there is an industry track.

XPath support in Java 1.5

Filed under: — Daniel Lemire @ 11:28

Things are getting somewhat better in Java land. You can now do some XPath work in Java; see this sample code I wrote this morning (it is not standalone though):

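    // Needs: import javax.xml.xpath.*; import org.w3c.dom.NodeList;
    //        import org.xml.sax.InputSource;
    // indexname, baseurl and datadir are defined elsewhere.

    // Select the file name of every xdoc element that has a non-empty dtd child.
    String xpathexpression = "//xdoc[dtd!='']/fname/text()";
    XPath xpath = XPathFactory.newInstance().newXPath();
    InputSource indexname_input = new InputSource(indexname);
    NodeList nl = (NodeList) xpath.evaluate(xpathexpression,
                              indexname_input, XPathConstants.NODESET);
    for (int i = 0; i < nl.getLength(); ++i) {
      System.out.println("loading document " + (i + 1) + " of " + nl.getLength());
      // From the text node, go up to fname, then to xdoc, then down to dtd.
      System.out.println("It uses DTD: "+xpath.evaluate("../../dtd",nl.item(i)));
      String xmlfile = nl.item(i).getNodeValue();
      String xmlPath = baseurl + datadir + xmlfile;
    }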

However, I was disappointed to see that the new “foreach” construct in Java doesn’t apply to NodeList objects… I’m sorry, but I am getting more and more convinced, with every version, that Java is an ugly hack. I mean, you have a collection of nodes, a standard one at that, and you can’t “foreach” it… what gives?

What is the “foreach” construct? Java 1.5 introduces the idea, well known in many languages, of the “foreach” construct. In effect, if you have a set of elements and want to go through them one at a time, using “for(int i=0; i < length; ++i)” is ugly and error-prone. It is much better to do “for (element in set)”. Java 1.5 now has this as “for (type element : set)”. This being said, I was under the impression that the Java people had been careful to make sure that the “foreach” construct would work with all standard collections of objects… not so, alas: it works with arrays and with anything that implements java.lang.Iterable, and NodeList does not.

Thursday, February 3rd, 2005

Why I have the best and most beautiful wife in the world!

Filed under: Family and Health — Daniel Lemire @ 21:02

Today was my birthday. I’m old, or at least, getting older.

Why is my wife so great? Well, she is beautiful, a great mom and very smart. Also, she gave me an MP3 player today: that’s right, I got a nice BenQ Joybee 110. I’m very happy.

Back to the serious stuff. When you get such a device, you have got to make it work with Linux. So I plug in the device… and it immediately shows up when I type “dmesg”: something like this appears…

USB Mass Storage device found at 3
usb 1-3: USB disconnect, address 3
ohci_hcd 0000:00:02.2: wakeup
usb 1-3: new full speed USB device using address 4
scsi2 : SCSI emulation for USB Mass Storage devices
  Vendor: BenQ      Model: Joybee 110        Rev: 1.00
  Type:   Direct-Access                      ANSI SCSI revision: 02
SCSI device sda: 506368 512-byte hdwr sectors (259 MB)
sda: assuming Write Enabled
sda: assuming drive cache: write through
 /dev/scsi/host2/bus0/target0/lun0: p1
Attached scsi removable disk sda at scsi2, channel 0, id 0, lun 0
Attached scsi generic sg0 at scsi2, channel 0, id 0, lun 0,  type 0

If this doesn’t work and you are using gentoo, make sure you have hotplug installed. If not, install it now:

emerge hotplug
rc-update add hotplug default

And while you are at it, install coldplug too so that USB devices are recognized during boot, not just when they are inserted:

emerge coldplug
rc-update add coldplug boot

Ok, back to the output of dmesg: it seems the device is at “/dev/scsi/host2/bus0/target0/lun0”. How do I mount this?

> ls /dev/scsi/host2/bus0/target0/lun0
disc  generic  part1

Aah! Ok… so maybe I should try mounting “/dev/scsi/host2/bus0/target0/lun0/disc”. Let’s see if I can get some info about it…

> fdisk /dev/scsi/host2/bus0/target0/lun0/disc
Commande (m pour l'aide): p
Disque /dev/scsi/host2/bus0/target0/lun0/disc: 259 Mo, 259260416 octets
16 têtes, 32 secteurs/piste, 989 cylindres
Unités = cylindres de 512 * 512 = 262144 octets
                           Périphérique Amorce    Début         Fin      Blocs    Id  Système
/dev/scsi/host2/bus0/target0/lun0/part1   *           1         989      253152+   6  FAT16

Ok, so, it is now telling me that “/dev/scsi/host2/bus0/target0/lun0/part1” is a FAT16 (Microsoft-style) disk. I suspect that I could actually reformat the disk to anything I want at this point. Fine, I go into /etc/fstab, and I add the following line:

/dev/scsi/host2/bus0/target0/lun0/part1 /mnt/joybee vfat defaults,noauto,users,sync 0 0

(See update below, using /dev/scsi/host2/bus0/target0/lun0/part1 is a bad idea!)

It seems to me the “sync” option is important: don’t delay writes in case the device is unplugged by accident. Then, after creating the directory “/mnt/joybee”, I mount it like so…

mount /mnt/joybee

Next, the following Python script can be used to copy the content of an m3u file to the device:

import shutil
# Optionally, I could clear /mnt/joybee/mp3 first
with open('indiscover.m3u') as playlist:  # only contains file paths
    for line in playlist:
        path = line.rstrip()
        print(path)
        shutil.copy(path, '/mnt/joybee/mp3')

Of course, the script could be a lot smarter, but I’ve got a wife to kiss. And voilà! Who said anything about Linux being hard to use?
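As a rough idea of what "smarter" could mean, here is a minimal sketch that skips blank lines and "#" lines (extended m3u files carry "#EXTM3U" and "#EXTINF" lines) and reports files it cannot find instead of crashing. The function names read_playlist and copy_playlist are made up for this illustration, and the sketch assumes the playlist entries are paths that are valid from the current directory.

```python
import os
import shutil


def read_playlist(m3u_path):
    """Return the entries of an m3u file, skipping blank lines and '#' lines."""
    entries = []
    with open(m3u_path) as playlist:
        for line in playlist:
            entry = line.strip()
            if entry and not entry.startswith('#'):
                entries.append(entry)
    return entries


def copy_playlist(m3u_path, destination):
    """Copy each listed file that exists into destination; return the missing ones."""
    missing = []
    for path in read_playlist(m3u_path):
        if os.path.isfile(path):
            shutil.copy(path, destination)
        else:
            missing.append(path)
    return missing
```

Calling copy_playlist('indiscover.m3u', '/mnt/joybee/mp3') would then copy what it can and hand back the list of entries it could not find.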

Am I done? Not really, my kernel has no support for either supermount or automount, so I’ll need to fix this (back in a few hours). The problem right now is that I need to type “mount /mnt/joybee” when I plug in the device and “umount /mnt/joybee” before I unplug it. A bit annoying.

In order to automount, make sure you compile your kernel with support for automount (with genkernel, go under File Systems>Kernel Automounter). You also need to install autofs:

emerge autofs
rc-update add autofs default

Then add the following line to the file /etc/autofs/auto.master (not /etc/auto.master!!!):

/misc /etc/autofs/auto.misc --timeout 1

and add the following line to /etc/autofs/auto.misc (not /etc/auto.misc!!!):

joybee -users,sync,fstype=vfat,rw :/dev/scsi/host0/bus0/target0/lun0/part1

That’s pretty much it, then you should be able to cd to /misc/joybee and see your files.

In practice though, I’m not sure it is so great to have automount. Maybe I can simply modify my script above so it mounts and unmounts as needed, since I’m unlikely to “cd” to my player very often. Indeed, there are problems with automount, at least on my machine. If I try to reload autofs because I’ve changed the configuration, it goes dead and can’t recover (short of rebooting, which I never do). I’ve read somewhere that I must make sure nothing is automounted before I play with autofs. That seems like a somewhat weak design. They claim that if nothing is automounted, you can safely stop the daemon: that seems to fail here. However, from the man page, it seems that reloading the daemon should rarely be needed. Anyhow, it seems like submount would be a better alternative?
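The "mount and unmount from the script" idea could look something like the sketch below. It is only an illustration: the name mounted is made up, and it assumes the mount point has an /etc/fstab entry with the "users" option (like the one earlier in this post), so that plain "mount <dir>" and "umount <dir>" work. The run parameter exists only so the commands can be swapped out when trying the code without a device.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def mounted(mount_point, run=subprocess.check_call):
    """Mount mount_point on entry and unmount it on exit, even after an error.

    Assumes mount_point is listed in /etc/fstab with the 'users' option.
    """
    run(['mount', mount_point])
    try:
        yield mount_point
    finally:
        # Always unmount, so the device can be unplugged safely afterwards.
        run(['umount', mount_point])


# Hypothetical usage around a copy:
# with mounted('/mnt/joybee'):
#     shutil.copy('song.mp3', '/mnt/joybee/mp3')
```

Since the unmount sits in a "finally" clause, a copy that fails halfway should still leave the player unmounted.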

Update: Using hard-coded paths like /dev/scsi/host2/bus0/target0/lun0/part1 is bad since they will change from time to time. On my machine, it can become /dev/scsi/host3/bus0/target0/lun0/part1 and so on. I believe that if you have “udev” installed (do “emerge udev”), then it gets mapped to /dev/sda1 “always” according to some magical rules I haven’t checked. So, use “/dev/sda1” throughout above for better results.

Update 2: On recent kernels with udev, you simply do “mount /dev/sda1 /mnt/usb” and you are in business. The following line should appear in all /etc/fstab files these days:

/dev/sda1 /mnt/usb auto noauto,user,umask=111 0 0

Also, it seems that a piece of software called hal is able to automount your devices in the /media directory.

Lemire speaks at the CRG (February 11th 2005)

Filed under: Data Warehousing and OLAP, Passed CFP — Daniel Lemire @ 16:05

I’ve been invited to speak at the Centre de recherche en géomatique on my work on OLAP. I’ll be speaking with Yvan Bédard (world-famous OLAP researcher) who recently became an NSERC chair. I’m very flattered.

Welcome to the next seminar
of the Centre de recherche en géomatique

FRIDAY, February 11, 2005
3:30 p.m. to 5:00 p.m.
Multimedia room 0170
Pavillon Louis-Jacques-Casault
Université Laval

We are pleased to welcome the following speakers:

1) Yvan Bédard, Ph.D.

Talk title: Marrying geomatics and business intelligence: the research program of the new NSERC Industrial Chair in geospatial databases for decision support

Abstract: Traditional GIS applications rely on the concepts of transactional systems, which prove ineffective for decision makers who need aggregated, summarized, and multi-temporal information. Yet solutions for producing this type of information from non-spatial data have existed for several years. They rely on so-called “multidimensional” databases, and they form the basis of business intelligence (e.g., data warehousing, data marts, OLAP, data mining). However, these solutions are ineffective for geospatial data and do not make it possible to exploit its full potential. The goal of this talk is to present a research program that aims to offer theoretical, methodological, and technological solutions to facilitate the deployment of geospatial databases for decision support, namely the research program of the new NSERC Industrial Chair in geospatial databases for decision support.

2) Daniel Lemire, Ph.D.

Talk title: Speeding up multidimensional databases with wavelets and attribute value sorting

Abstract: Multidimensional databases (OLAP) are now ubiquitous in industry, representing a worldwide market of more than 4 billion US dollars. Within the Lemur group, we are particularly interested in performance and scalability problems. On the one hand, we are interested in speeding up range queries using hierarchical methods inspired by oblique wavelets. On the other hand, we are interested in sorting attribute values in hybrid models (HOLAP) as a way to compress the data while speeding up queries.

Free admission
A “5 à 7” (early-evening reception) will follow the talks
in room 1306 of Pavillon Louis-Jacques-Casault
