AI requires huge volumes of data to exist: what about learning?

This has been around for quite some time, but it keeps on popping up left and right. “Google (…) believes that strong AI requires huge data volumes to really exist.”

Since nobody knows what is required for strong AI to exist, this is currently a non-falsifiable conjecture. One thing is for sure: it takes several years before a human being can hope to pass the famous Turing test. My 4-month-old baby can’t pass the Turing test.

So, there is strong evidence that you need lots and lots of data before intelligence, as defined by the Turing test, can emerge.

Now, what does it say about “learning”? It seems to imply that to “learn”, you need to be exposed to lots and lots of data. This suggests, maybe, that the web is the real future of learning because, face it, there is only so much an instructor can convey to a group while spending hours in front of a blackboard. I can look up facts and theories much faster through the web, though this is recent: until a few years ago, the blackboard was still a more efficient way to gather data and, in some instances, like mathematics, it still is.

One interesting conclusion though is that broadband ought to be very useful to learning. If being exposed to lots and lots of data is required, then you need broadband. What am I doing here, in my basement, with my cable modem? I need a T1 stat! Oh! Right! I’d still be limited by how fast others can deliver the information.

Theorem. A large data output is necessary for having a rich learning experience.

So, if you have online content for a given course, the relative performance of the server does matter. Multimedia content does matter.

Or does it? Notice I didn’t attempt to prove my theorem. So, let’s call it the “Lemire conjecture” for now.

We need better text forms on the web

Web forms are evil. You know these things where you enter text in a text box and then click submit? Yes, I know there are better, more XMLish, ways of coding them, but my beef is with the current user model of a text form and I don’t see this changing any time soon unless the browser people start paying attention.

  • There is no built-in protection, at the browser, against a crashed server. So, you can fill out very long forms and lose all of your work because the server crashed. No, the back button may not work. Conceptually, there is no way to tell how the back button will behave with respect to web forms, and it is a poor substitute for a saved copy of your work.
  • Spell checking is still not supported by default by most browsers. Why?
  • Most standard text editing functions are not supported by most browsers (such as “search and replace”). Why?

Tools like Gmail, using AJAX, manage to get most of these functions right, but why aren’t they supported at the browser level, at least as an option? For example, when you submit a form, the browser could save a copy of all text content in a local folder. Security you say? Well, the security people can handle the issues this would create, I’m sure.

We need better. We need it badly.

Thoughts on Software Complexity

Kurt shares with us his thoughts on software complexity:

Over the years, I’ve noticed that in programming, as in other systems, there seems to be a fairly invariant rule out there:

You can never eliminate complexity from a system, you can only move it from place to place.

Yep. This is yet another instance of the No-Free-Lunch Theorem. It basically says that while you can find more accurate algorithms, very often, all you are doing is specializing your algorithm to perform better in some conditions, but worse in others.

Of course, specializing is good. Some cases are more important than others. But be skeptical if someone says that X is better in every respect than Y. There is, usually, a catch.

The same must be true in software. Fancier platforms make it easier to do some things, but harder to do other things. What you have to worry about is whether these cases are important for you.

J2EE, at least the early versions, is a beautiful example where the designers did a great job at making some cases very easy, while making others, very important cases, much harder, leaving J2EE developers in tough spots.

Opening lots and lots of files under Linux

Suppose you want a program, or a process to be precise, to open 10,000 files simultaneously. For some reason, I thought that, by default, this would be possible, but it seems that Linux sets the default limit to 1024 files on most distributions we checked.

First of all, check how many files your system allows you to open simultaneously:

# cat /proc/sys/fs/file-max
101066

This number is the system-wide ceiling on open files. On my system, as you can see, it is about 100,000, so opening 10,000 files simultaneously should not be a problem. If your number is much lower, you may need to do some extra work.
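The same value can also be read from inside a program. Here is a minimal sketch, assuming a Linux system where /proc is mounted; the function name read_system_file_limit is made up for this illustration, and it returns -1 when the file is missing or does not contain a number.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (not part of any library): reads one integer from a
// /proc-style file such as /proc/sys/fs/file-max.
// Returns -1 if the file cannot be opened or does not start with a number.
long long read_system_file_limit(const std::string& path = "/proc/sys/fs/file-max") {
    std::ifstream in(path.c_str());
    long long value = -1;
    if (!(in >> value)) {
        return -1;  // missing file, unreadable file, or not a number
    }
    return value;
}
```

Calling read_system_file_limit() with no argument reads the default path; passing another path lets you try the function on an ordinary text file.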

Unfortunately, there are security settings above and beyond this number. To get around them, add the following line to “/etc/security/limits.conf”:

* - nofile 100000

Then, you need to log in again (fresh, not within X). To make sure it worked, type

# ulimit -n
100000

As you can see, it worked for me.
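A program can also check its own per-process limit, and raise it up to the hard limit, through the POSIX getrlimit and setrlimit calls. The sketch below is only an illustration: the function name ensure_open_file_limit is my own invention, and it assumes a POSIX system that provides <sys/resource.h>.

```cpp
#include <sys/resource.h>

// Hypothetical helper: makes sure the soft limit on open files (RLIMIT_NOFILE)
// is at least `wanted`, raising it if the hard limit allows.
// Returns true if the soft limit is at least `wanted` afterwards.
bool ensure_open_file_limit(rlim_t wanted) {
    struct rlimit lim;
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0) {
        return false;  // could not query the limits
    }
    if (lim.rlim_cur >= wanted) {
        return true;   // already high enough, nothing to do
    }
    if (lim.rlim_max != RLIM_INFINITY && wanted > lim.rlim_max) {
        return false;  // an unprivileged process cannot exceed the hard limit
    }
    lim.rlim_cur = wanted;  // raise only the soft limit
    return setrlimit(RLIMIT_NOFILE, &lim) == 0;
}
```

For instance, a program about to open 10,000 files could call ensure_open_file_limit(10100) first, leaving a little room for standard input, standard output and friends, and bail out with a clear message if the call returns false.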

To make double sure it works, you might try to run the following C++ program:

#include <cassert>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
using namespace std;

// Builds the name of the k-th test file, e.g. "stupidtest42".
string filename(int k) {
    stringstream strs;
    strs << "stupidtest" << k;
    return strs.str();
}

int main() {
    const int N = 10000;
    // Kept in a vector (needs C++11) so that 10,000 streams do not sit on the stack.
    vector<fstream> lfs(N);
    // Open all the files and keep them open.
    for (int k = 0; k < N; ++k) {
        lfs[k].open(filename(k).c_str(), ios::out);
        assert(lfs[k].good());
        if (lfs[k].good()) {
            cout << "file created " << filename(k) << " ok!" << endl;
        }
    }
    // Write to each file while all the others are still open.
    for (int k = 0; k < N; ++k) {
        if (lfs[k].good()) {
            cout << "file " << filename(k) << " still ok!" << endl;
        }
        lfs[k] << k;
        if (lfs[k].good()) {
            cout << "file " << filename(k) << " still ok after write!" << endl;
        }
    }
    // Close everything.
    for (int k = 0; k < N; ++k) {
        lfs[k].close();
    }
}

China soon to export degrees?

According to the Guardian, we are on the brink of a revolution which includes China becoming an exporter of degrees:

David Graddol, an applied linguist, said China, which has traditionally been a major source of international students, was repositioning itself as a net exporter of higher education, poaching students from its Asian neighbours, such as India, Japan and Korea. China, says Graddol, will soon be able to offer cheaper degrees that are taught in English and come with the added incentive of Mandarin, a language that is becoming increasingly important to the international business community.

As anyone who has visited a North American English-speaking university knows, Chinese students make up a large proportion of the enrollment in many programs (including Computer Science). This is equivalent, for China, to “importing degrees”. Now, they claim that China could reverse this and start exporting degrees. I think China can pull it off, but what would happen to the degrees we offer in the West?

Some claim that, to survive in the education market, Westerners ought to offer more online degrees. At least in theory, it makes sense: there will always be a strong demand, in Asia and elsewhere, for international degrees, especially if they can be had cheaply (as is possible online).

But I’m somewhat skeptical. Naturally, online courses and degrees are a growing market and it will keep on growing for many years to come, especially as web technology becomes more capable and bandwidth grows cheaper. Smart people put their long-term money on online learning.

However, before online degrees become a distinct exportable good, you need to have your local students freely choosing the web instead of the classroom. Asians are not going to buy degrees your local population doesn’t value dearly.

I really think China can reverse its status as far as education goes and, by doing so, badly hurt Western universities in the long run. I don’t think online degrees are going to be a viable escape for Western universities. If China goes ahead with its plans to become an education provider, it will hurt, no matter what!
