Monday, April 4th, 2005

Lupy: Python Lucene

Filed under: — Daniel Lemire @ 19:26

Update: See PyLucene instead, which relies on Java Lucene.

Some crazy folks ported the famous search engine Lucene to Python and the result is called Lupy!

Lupy is a full-text indexer and search engine written in Python. It is a port of Jakarta Lucene 1.2 to Python. Specifically, it reads and writes indexes in Lucene binary format. Like Lucene, it is sophisticated and scalable. Lucene is a polished and mature project and you are encouraged to read the documentation found at the Lucene home page.

Who needs Java? No really, who needs Java?


American site leaks Jean Brault testimony

Filed under: — Daniel Lemire @ 16:50

According to the Toronto Sun, an American website breached the publication ban set forth by the Gomery commission (follow the Wikipedia link if you don’t know what this is about).

AN AMERICAN website has breached the publication ban protecting a Montreal ad exec’s explosive and damning testimony at the AdScam inquiry. The U.S. blogger raised the ire of the Gomery commission this weekend by publishing extracts from testimony given in secret by Jean Brault last Thursday.

It took me about 60 seconds to find and read the blog in question. I’m not going to help you in any way, except to tell you that it is on the Web out there, and what is on the Web can be found easily, most of the time.

Do publication bans even make sense in the Web era?

In this particular case, having this leak can prove very useful: what if the commission doesn’t lift the ban quickly? What’s the point of a censored public inquiry? I think that the most immediate consequence here is that you can’t keep the information from the public so easily. This is a good thing. Information is freedom.

However, it would have been better for the authors of the leak to keep quiet for a few weeks… maybe even a month or so. Individuals have a right to privacy and to a fair trial. That is why we had a ban: so that these people could go to trial without already being guilty by association.

That said, the judge should have made the inquiry private at this point. In the information age, you can’t have a secret public inquiry. The judge assumed that only the media can spread information quickly. That assumption is outdated: the blogosphere has far more bandwidth. And you can’t enforce a ban on the blogosphere. You just can’t.

Inexpensive ubiquitous mass storage is closer than you think!

Filed under: Data Warehousing and OLAP — Daniel Lemire @ 11:48

I started using Google Mail (GMail) last year because I want to be able to read my mail from everywhere, all the time. Google offered me 1000 MB of free storage and one of the best user interfaces I have seen in a mail client. Oh! Did I mention it has nearly perfect spam filtering, without any effort on my part?

I wondered what would happen when I reached 1000 MB of mail. For me, that’s about 2 years of incoming mail, maybe a bit less.

Well, my account now has 2057 MB of storage. That’s about 3 years’ worth of storage. It seems like Google increases your limit as the need arises.

DaWaK 2005 (April 15th 2005 / August 22 - 26 2005)

Filed under: Data Warehousing and OLAP, Passed CFP — Daniel Lemire @ 8:09

The Call for Papers for DaWaK 2005 (7th International Conference on Data Warehousing and Knowledge Discovery) is available. The conference will be held in Denmark.

The objective of the 7th International Conference on Data Warehousing and Knowledge Discovery (DaWaK 2005) is to bring together researchers, developers and practitioners to discuss research issues and experience in developing and deploying data warehousing and knowledge discovery systems, applications, and solutions. This year the conference will also focus on autonomic aspects of data warehousing and knowledge discovery. Moreover, the conference will be supplemented with invited talks, panel discussions and industrial papers.

Friday, April 1st, 2005

The fate of reduce() in Python 3000

Filed under: — Daniel Lemire @ 18:29

I learned from Will that Guido wants to take the functional programming functions out of Python:

About 12 years ago, Python acquired lambda, reduce(), filter() and map(), courtesy of (I believe) a Lisp hacker who missed them and submitted working patches. But, despite the PR value, I think these features should be cut from Python 3000.

This doesn’t say that functional programming will disappear, just that Guido is cleaning up the language. You’ll still be able to cleanly pass functions as arguments.
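To make this concrete, here is a minimal sketch of what is at stake. It assumes the layout Python 3 eventually shipped with, where reduce() lives in the functools module instead of being a built-in. The names numbers and apply_twice are made up for illustration.

```python
from functools import reduce  # in Python 3, reduce() is in functools, not a built-in
import operator

numbers = [1, 2, 3, 4, 5]  # hypothetical sample data

# reduce() folds a sequence into a single value with a two-argument function.
total_with_reduce = reduce(operator.add, numbers, 0)

# The same fold written as an explicit loop.
total_with_loop = 0
for n in numbers:
    total_with_loop += n

# A list comprehension can stand in for a map()/filter() combination.
squares_of_evens = [n * n for n in numbers if n % 2 == 0]


def apply_twice(func, value):
    """Apply func to value two times (hypothetical helper)."""
    return func(func(value))


# Functions are still ordinary values that can be passed as arguments.
doubled_twice = apply_twice(lambda x: x * 2, 3)

print(total_with_reduce, total_with_loop, squares_of_evens, doubled_twice)
```

The loop and the comprehension do the same work as reduce(), map() and filter(), and apply_twice shows that passing functions around does not depend on any of them.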
