Friday, September 16th, 2005

Career Swings

Filed under: — Daniel Lemire @ 18:27

Read in the latest Communications of the ACM (Sept. 2005, Vol. 48, No. 9, page 10):

Research firm Gartner Inc. predicts up to 15% of today’s tech workers will drop out of the profession in five years, not including those who retire or die. (…) demand for technology developers is forecast to shrink by 30%.

Repeat after me: long-term predictions about the job market are worthless. Basically, they often take first-order trends and extrapolate. Why? Because they are written by people with only a very basic understanding of numerical analysis and time series.

If the system is stationary, then the prediction might hold true. However, technology is hardly a stationary system. Right now, there isn’t much happening in technology and the oil industry looks like a safe bet if you need to invest. In two years, there might be a new technology as disruptive as the web has been, such as a new way to manage massive data sources or ultraefficient solar panels, requiring companies to reinvest massively in their IT architecture or factories.

Hint to students: if you are interested in technology, don’t go to a business, medical or law school. Get a CS, Math or Engineering degree. You may not earn as much initially, but you might very well have the last laugh. Don’t base your life on useless predictions. Back when I was a high school student, I was told that 75% of new jobs would be tech jobs by the time I graduated. I was then told that there would be a severe shortage of science Ph.D.s. Both of these predictions were overwhelmingly wrong. The truth is not that science and technology are a bad choice; the truth is that job market predictions are terribly inaccurate.

Myself, I cannot believe that in 2015, we’ll all be lawyers, business managers, salesmen, and medical doctors. I cannot believe that technology will stand still and mathematics beyond basic algebra will be a lost art. I cannot believe my two sons will have business degrees and make three times my salary by managing a bunch of underpaid Indian programmers.

Call me a fool, if you want, but I’m slightly more optimistic. If I’m proven wrong, then I’ll retire early and write science fiction novels describing the world as I think it should have been.

Wednesday, September 14th, 2005

Why C++ is a bad language: here’s how to convert floats and integers to a string

Filed under: — Daniel Lemire @ 14:16

What do you do when a student asks you a valid question and you think your answer ought to be wrong but that’s the only answer you have? You turn it into a blog post!

How do you convert an integer to a string in C++? I mean, it is a pretty basic operation, right?

Here’s what I wish it was…

string s = "some text = ";
s.append(2);

Well, no. The basic string object in C++ is just a container. The following code is the C++ way to convert numbers to strings.

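#include <iostream>
#include <sstream>
#include <string>
using namespace std;
int main() {
    stringstream sstr;  // in-memory stream that accumulates the text
    int integer = 2;
    float floater = 2.2f;
    // operator<< formats each number as text, just as it would for cout
    sstr << "La vie est belle" << integer << floater << endl;
    string result = sstr.str();  // copy the accumulated text into a string
    const char * cresult = result.c_str();  // C-style view, valid while result is unchanged
    cout << result << endl;
    return 0;
}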

Ok, I cheated a bit, but you get my point.
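One way to get closer to the wished-for one-liner is to hide the stream inside a small helper. The sketch below assumes a hypothetical function name, to_text; it is not part of the standard library, and it simply relies on whatever formatting operator<< applies by default.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not a standard function): formats any value that
// supports operator<< by routing it through an output string stream.
template <typename T>
std::string to_text(const T& value) {
    std::ostringstream out;
    out << value;      // same formatting rules as writing to cout
    return out.str();  // copy of the accumulated characters
}
```

With such a helper in scope, the earlier wish becomes s.append(to_text(2)); the stringstream is still there, only hidden.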

Google Blog Search

Filed under: — Daniel Lemire @ 10:09

Google Blog Search is out. You have Atom/RSS feeds for your favorite queries:

Can I subscribe to search results?

Yes. At the bottom of each page of search results you can find several links, offering the top 10 or 100 results as either Atom or RSS feeds. Just grab the links from here and subscribe to them in the news aggregator of your choice and you will get updates whenever new posts are made that match your query.

We are only visiting the shore of mathematics

Filed under: — Daniel Lemire @ 7:51

I like Doron Zeilberger’s 66th Opinion:

all what human mathematics does is apply implicit exponential-time algorithms, called “heuristics” to find some trivial pebbles on the shore of the (even decidable part!) of the mathematical ocean.

In short, a mathematician solves trivial problems, a mathematician with a computer solves semi-trivial problems, but we are only visiting the shore of mathematics.

It is very insightful. One could look at the current state of higher mathematics, observe that progress is slowing and conclude that we have pretty much covered the realm of useful mathematics. In truth, we have maybe covered the realm of mathematics we could handle with a human brain. And current computers probably can’t help us too much.

Tuesday, September 13th, 2005

What would you put in a Computer Science Curriculum?

Filed under: Science and Technology — Daniel Lemire @ 16:46

Dan Zambonini wrote a cool paper: What would you put in a Computer Science Curriculum?

His premise is as follows:

We get a number of resumés trickling through every week, with a fair proportion coming from Computer Science graduates. I look at the list of modules they’ve studied, and although they sound very interesting, there seems to be little relevance to the current jobs market.

Again, the problem is a confusion between Computer Science and either Information Technology or Software Engineering: degrees in either of those prepare you for the job market; a degree in Computer Science doesn’t. Or at least, not in the way Dan expects.

This aside, his list of requirements for what could be described as an Applied Computer Science degree is interesting:

  • The basics of Programming (variables, data types, references, pointers, scope, error handling, iteration, core algorithms - searching, sorting, etc.)
  • Basic mathematics, basic statistics
  • Patterns and Anti-Patterns (With real world examples, not just theory)
  • Real world Databases (Normalisation and De-normalisation, SQL, Indexing)
  • Basics of good code architecture: Loose Coupling, etc.
  • OO Design, Interfaces, etc.
  • The importance and tools of Planning: Spec’ing, UML etc.
  • Architectures: client/server, SOA, P2P, etc.
  • A ‘Big’ language or two (Java, C#, C/C++)
  • A scripting/’agile’ language or two (PHP, Perl, Python, Ruby)
  • XML (DOM/SAX, XSLT/XPath, etc.)
  • Economics, Business Studies, Costing Projects, Commercial pressures
  • Copyright, Privacy, Data Protection
  • Project/Time Management
  • Internationalisation, Localisation, Encoding, Unicode
  • Grammar, punctuation, concise and clear writing
  • Interface Design, Usability, Accessibility, HCI
  • Security
  • Code Reading
  • Common Protocols (TCP/IP, HTTP, SMTP, FTP)
  • Testing, Debugging, Performance, Re-factoring
  • Problem analysis
  • Source control, change management
  • The typical Software lifecycle
  • Metadata, Information Architecture, etc.
  • The basics of GIS
  • Touch typing
  • Health and safety (nutrition?)

I’m particularly fond of his mention of XML:

And, quite surprisingly, only 1 course mentioned XML, which was in an optional module. Is there any modern software these days that doesn’t use XML? So why can’t computer science graduates tell me when to use SAX and when to use DOM?

Well, Dan, because a) their professors don’t know when to use SAX and when to use DOM, and b) Computer Science is not concerned with the difference between SAX and DOM, but rather with the difference between building a buffer and not building a buffer when parsing a tree. However, software engineers and information technologists should know when to use DOM and when to use SAX.
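To make the buffer versus no-buffer distinction concrete, here is a minimal sketch in C++. It does not use any real SAX or DOM API: the input format (a well-formed string of parentheses, one pair per tree node) and all the function names are assumptions made up for illustration. Both functions compute the depth of the tree; one reacts to events as they stream by, the other first builds the whole tree in memory.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Streaming style (SAX-like): handle each open/close event as it is read
// and keep only two counters, so memory use does not grow with the input.
int max_depth_streaming(const std::string& text) {
    int depth = 0;
    int deepest = 0;
    for (char c : text) {
        if (c == '(') {
            ++depth;
            deepest = std::max(deepest, depth);
        } else if (c == ')') {
            --depth;
        }
    }
    return deepest;
}

// Tree style (DOM-like): buffer the whole document as nodes first,
// then answer questions by walking the tree.
struct Node {
    std::vector<std::unique_ptr<Node>> children;
};

// Parses the children of `parent` starting at text[pos]; stops at the ")"
// that closes `parent`, or at the end of the input. Assumes well-formed input.
void parse_children(const std::string& text, std::size_t& pos, Node& parent) {
    while (pos < text.size() && text[pos] == '(') {
        ++pos;  // consume "("
        parent.children.push_back(std::unique_ptr<Node>(new Node()));
        parse_children(text, pos, *parent.children.back());
        ++pos;  // consume the matching ")"
    }
}

// Number of nesting levels below `node`.
int depth_of(const Node& node) {
    int deepest = 0;
    for (const auto& child : node.children) {
        deepest = std::max(deepest, 1 + depth_of(*child));
    }
    return deepest;
}

int max_depth_tree(const std::string& text) {
    Node root;  // artificial root that holds the top-level nodes
    std::size_t pos = 0;
    parse_children(text, pos, root);
    return depth_of(root);
}
```

Both functions return the same answer; the difference is that the tree version keeps every node around, which pays off only if you need to revisit the document several times or navigate it in an order other than the one in which it arrives.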

(I got this through Ed Bilodeau’s blog.)

Disclaimer: I am one of the few university professors to offer a university-level course on XML (in French).

ET-WBC 2006 (November 25 2005 / February 25, 2006)

Filed under: Passed CFP — Daniel Lemire @ 11:51

The ET-WBC 2006 CFP is out. The Emerging Technologies for Web-based Communities (ET-WBC) Workshop will be held at Mondragon University (Spain).

The topics include, but are not limited to:

  • Trust, privacy, security in Web-based communities;
  • Emerging technologies in eCommerce and eGovernment communities;
  • Artificial intelligence methods for Web-based communities;
  • Distributed Learning Communities;
  • Semantic Web and related technologies for Web-based communities;
  • Grid Computing for Web-based communities;
  • Emerging eLearning technologies;
  • Ubiquitous computing for Web-based communities;
  • Software agents technologies for Web-based communities;
  • Social software and collaborative filtering tools;
  • Visualizing and modeling Web-based communities;
  • Emerging devices, tools, media and virtual environments;

Monday, September 12th, 2005

Attribute Value Reordering For Efficient Hybrid OLAP

Filed under: Abstracts, Data Warehousing and OLAP — Daniel Lemire @ 12:50

Our paper Attribute Value Reordering For Efficient Hybrid OLAP was accepted by Information Sciences a few days ago. It should appear next year, I imagine, but I am making the preprint available now. It is an extended version of an earlier paper presented at DOLAP; the journal version is considerably longer and well worth the read.

It takes a very mathematical approach to multidimensional databases (OLAP), linking some OLAP problems to graph theory (an equivalence with graph isomorphism is shown), and there are some probabilistic results as well.

There are also other, less mathematical, novel results, like the concept of the normalization of a data cube, which is quite distinct from the normalization of a relational database.

The normalization of a data cube is the ordering of the attribute values. For large multidimensional arrays where dense and sparse chunks are stored differently, proper normalization can lead to improved storage efficiency. We show that it is NP-hard to compute an optimal normalization even for 1×3 chunks, although we find an exact algorithm for 1×2 chunks. When dimensions are nearly statistically independent, we show that dimension-wise attribute frequency sorting is an optimal normalization and takes time O(d n log(n)) for data cubes of size n^d. When dimensions are not independent, we propose and evaluate several heuristics. The hybrid OLAP (HOLAP) storage mechanism is already 19%-30% more efficient than ROLAP, but normalization can improve it further by 9%-13% for a total gain of 29%-44% over ROLAP.
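For readers who want a feel for dimension-wise attribute frequency sorting, here is a rough C++ sketch. It is not the code from the paper: the representation (a list of non-empty cells, each a vector of attribute-value indexes), the reading of "frequency" as the number of non-empty cells using a value, and the names Cell, frequency_order and normalize are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// A non-empty cell, identified by one attribute-value index per dimension.
using Cell = std::vector<int>;

// For one dimension, returns a relabeling old_index -> new_index so that
// the most frequently used attribute values get the smallest new indexes.
// `cardinality` is the number of distinct attribute values in the dimension.
std::vector<int> frequency_order(const std::vector<Cell>& cells,
                                 std::size_t dim, int cardinality) {
    std::vector<int> count(cardinality, 0);
    for (const Cell& cell : cells) {
        ++count[cell[dim]];
    }

    std::vector<int> by_rank(cardinality);
    std::iota(by_rank.begin(), by_rank.end(), 0);
    // stable_sort keeps the original order among equally frequent values.
    std::stable_sort(by_rank.begin(), by_rank.end(),
                     [&count](int a, int b) { return count[a] > count[b]; });

    std::vector<int> new_index(cardinality);
    for (int rank = 0; rank < cardinality; ++rank) {
        new_index[by_rank[rank]] = rank;
    }
    return new_index;
}

// Applies frequency sorting to every dimension independently and returns
// the relabeled cells.
std::vector<Cell> normalize(std::vector<Cell> cells,
                            const std::vector<int>& cardinalities) {
    for (std::size_t dim = 0; dim < cardinalities.size(); ++dim) {
        const std::vector<int> new_index =
            frequency_order(cells, dim, cardinalities[dim]);
        for (Cell& cell : cells) {
            cell[dim] = new_index[cell[dim]];
        }
    }
    return cells;
}
```

The effect is to push the frequently used attribute values toward one corner of the array in each dimension, which is where one would hope the dense chunks end up. Each dimension is handled on its own, which is why this only makes sense as a strategy when the dimensions are close to independent.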
