Tuesday, May 2nd, 2006

Putting Labs Online with Web Services

Filed under: Abstracts — Daniel Lemire @ 17:16

I don’t normally repost IEEE Computing abstracts here, but Yuhong is one of my collaborators and Hamadou is a UQAM colleague, and I’ve known Ali for years. Moreover, with all the evil things I said about standard-heavy web services, it seems only fair that I would compensate somewhat with this post.

Putting Labs Online with Web Services
IEEE Computing
March/April 2006 (Vol. 8, No. 2) pp. 27-34

Yuhong Yan, University of New Brunswick
Yong Liang, University of New Brunswick
Xinge Du, University of New Brunswick
Hamadou Saliah-Hassane, Université du Québec à Montréal
Ali Ghorbani, University of New Brunswick

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/MITP.2006.45

Abstract
Web services provide a way to offer remote control of scattered scientific instruments, enabling online labs that students can use from anywhere, at any time.

In science and engineering education, experimentation plays a crucial role. The classic university science course entails lecture and lab: students’ active participation in experiments enhances their understanding of the principles described in the lectures. However, not every educational institution can afford all the experimental equipment it would like. Moreover, colleges and universities increasingly offer distance-learning programs, allowing students to attend lectures and seminars and complete coursework using the Internet. In situations such as these, access to online laboratories or experiment systems can greatly enhance student learning, increasing the range of experiments available at an institution and giving the distance learners hands-on, real-time experience.

Online laboratories, however, are not as mature as online courses. Current online experiment systems fall into two categories: virtual laboratories provide a simulation environment in which students conduct experiments; remote laboratories, our focus in this article, let students use a GUI to operate actual instruments via remote control.

The difficulty with creating an effective laboratory operated by remote control is making scattered computational resources and instruments operable across platforms. Existing online experiment systems commonly use a classic client-server architecture and off-the-shelf middleware for communication.

Normally, to ensure interoperability, these systems rely on instruments from a single company (such as National Instruments or Agilent) and Microsoft Windows as the common operating system. Users must then install additional software to operate the remote instruments. For a student using an old laptop or the computer at a public library, this could be difficult. So, online labs configured this way can’t achieve the ultimate goals of sharing heterogeneous resources among online laboratories and easy access via the Web.

Our solution to these shortcomings is to base online experiment systems on Web services, which are designed to support interoperable, machine-to-machine interaction over a network and can also integrate heterogeneous resources. We have devised a service-oriented architecture for online experiment systems, enabled by Web service protocols, and a methodology for wrapping the operations of the instruments into Web services.
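
To make the idea of wrapping instrument operations concrete, here is a minimal sketch of my own, not the authors' system. It uses XML-RPC from the Python standard library as a lightweight stand-in for the Web service protocols the paper relies on, and SimulatedOscilloscope, set_timebase, read_waveform and make_service are all hypothetical names.

```python
from xmlrpc.server import SimpleXMLRPCServer


class SimulatedOscilloscope:
    """Hypothetical stand-in for a real instrument driver."""

    def __init__(self):
        self.timebase_ms = 1.0

    def set_timebase(self, milliseconds):
        if milliseconds <= 0:
            raise ValueError("timebase must be positive")
        self.timebase_ms = float(milliseconds)
        return self.timebase_ms

    def read_waveform(self, samples):
        # A real driver would query the hardware; here we return a fake ramp.
        step = self.timebase_ms / samples
        return [round(i * step, 6) for i in range(samples)]


def make_service(instrument, host="127.0.0.1", port=0):
    """Expose the instrument's public methods as remote procedures.

    Port 0 asks the operating system for any free port (an assumption
    made for this sketch; a real lab would publish a fixed endpoint).
    """
    server = SimpleXMLRPCServer((host, port), logRequests=False, allow_none=True)
    server.register_instance(instrument)
    return server
```

Calling make_service(SimulatedOscilloscope()).serve_forever() would start the service, and a client on any platform with an XML-RPC library could then call set_timebase and read_waveform without installing vendor software.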

Thursday, January 19th, 2006

Quasi-Monotonic Segmentation Talk in Ottawa

Filed under: Abstracts — Daniel Lemire @ 10:49

I’m giving a talk next week at the Text Analysis and Machine Learning Group (TAMALE) seminar at the University of Ottawa. I will talk on Optimal Linear Time Algorithm for Quasi-Monotonic Segmentation. It is not directly related to text and machine learning, but many of the ideas from time series data mining port over to text processing. After all, a sequence is a sequence. I see Joel Martin will also give a talk there this spring on “Libminer”. Here’s the abstract for my talk:

Monotonicity is a simple yet significant qualitative characteristic. We consider the problem of segmenting an array in up to K segments. We want segments to be as monotonic as possible and to alternate signs. We propose a quality metric for this problem, present an optimal linear time algorithm based on novel formalism, and compare experimentally its performance to a linear time top-down regression algorithm. We show that our algorithm is faster and more accurate. Applications include pattern recognition and qualitative modeling.
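
To illustrate what segments that alternate signs look like, here is a small sketch that splits an array into maximal exactly monotone runs. It is only a baseline and not the algorithm from the talk: it does not limit the number of segments to K, and it does not tolerate noise. The name monotone_runs is hypothetical.

```python
def monotone_runs(values):
    """Split values into maximal runs that are non-decreasing (+1)
    or non-increasing (-1).

    Returns (start, end, sign) triples with inclusive indices.
    Adjacent runs share their turning point, and signs alternate.
    A sign of 0 means the input never changes value.
    """
    n = len(values)
    if n == 0:
        return []
    runs = []
    start = 0
    sign = 0  # direction not yet known
    for i in range(1, n):
        diff = values[i] - values[i - 1]
        step = (diff > 0) - (diff < 0)
        if step == 0:
            continue  # flat steps fit either direction
        if sign == 0:
            sign = step
        elif step != sign:
            # Direction flipped: close the run at the turning point.
            runs.append((start, i - 1, sign))
            start = i - 1
            sign = step
    runs.append((start, n - 1, sign))
    return runs
```

On noisy data this produces far too many runs, which is why a quality metric and a bound of K segments are needed.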

Tuesday, January 17th, 2006

Linear Time Algorithm for Approximating a Curve by a Single-Peaked Curve

Filed under: Abstracts — Daniel Lemire @ 22:34

Here is an interesting paper by Jinhee Chun, Kunihiko Sadakane, and Takeshi Tokuyama.

Given a function y = f(x) in one variable, we consider the problem of computing the single-peaked (unimodal) curve y=Φ(x) minimizing the L2-distance between them. If the input function f is a histogram with O(n) steps or a piecewise linear function with O(n) linear pieces, we design algorithms for computing Φ in linear time. We also give an algorithm to approximate f with a function consisting of the minimum number of unimodal pieces under the condition that each unimodal piece is within a fixed L2-distance from the corresponding portion of f.

It reminds me of this other paper:

N. Haiminen, A. Gionis, K. Laasonen, Algorithms for unimodal segmentation with applications to unimodality detection, to appear in the Journal of Knowledge and Information Systems (KAIS).

Tuesday, January 10th, 2006

Optimal Algorithms for Unimodal Regression

Filed under: Abstracts — Daniel Lemire @ 18:53

I don’t usually post abstracts of papers other than my own here, but this one is particularly significant to me though I don’t know the author.

This paper gives optimal algorithms for determining real-valued univariate unimodal regressions, that is, for determining the optimal regression which is increasing and then decreasing. Such regressions arise in a wide variety of applications. They are a form of shape-constrained nonparametric regression, closely related to isotonic regression. For the L2 metric our algorithm requires only O(n) time for regression on n points, while for the L1 metric it requires O(n log n) time. Previous algorithms only considered the L2 metric and required Ω(n^2) time. All previous algorithms used multiple calls to isotonic regression, and our major contribution is to organize these into a prefix isotonic regression, whereby one computes the regression on all initial segments. The prefix approach utilizes the solution for one initial segment to aid in the solution of the next, which considerably reduces the total time required. Our prefix isotonic regression algorithm for the L1 metric also supplies the first O(n log n) algorithm for L1 isotonic regression.
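
For readers who want to see what multiple calls to isotonic regression means, here is a sketch of the quadratic baseline for the L2 metric: try every split point, fit a non-decreasing curve on the left and a non-increasing curve on the right, and keep the best. This is the slow approach the abstract improves on, not the paper's prefix algorithm, and the function names are my own.

```python
def isotonic_fit(values):
    """Least-squares non-decreasing fit via pool-adjacent-violators."""
    blocks = []  # each block is [sum, count]
    for v in values:
        blocks.append([float(v), 1])
        # Merge while the previous block's mean exceeds the last block's mean
        # (compared by cross-multiplication to avoid division).
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit


def squared_error(values, fit):
    return sum((v - f) ** 2 for v, f in zip(values, fit))


def unimodal_fit_naive(values):
    """Quadratic-time unimodal regression: one pair of isotonic fits per split."""
    best_fit, best_err = None, float("inf")
    for split in range(len(values) + 1):
        left = isotonic_fit(values[:split])
        # A non-increasing fit is the reversed non-decreasing fit of reversed data.
        right = isotonic_fit(values[split:][::-1])[::-1]
        fit = left + right
        err = squared_error(values, fit)
        if err < best_err:
            best_fit, best_err = fit, err
    return best_fit, best_err
```

A non-decreasing piece followed by a non-increasing piece is always unimodal, so the best split gives an optimal fit. The waste is that each split recomputes the left fit from scratch, which is exactly what the prefix approach avoids.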

Source.

Sunday, October 16th, 2005

Analyzing Large Collections of Electronic Text Using OLAP

Filed under: Abstracts, Data Warehousing and OLAP — Daniel Lemire @ 10:27

Steven will be presenting our paper Analyzing Large Collections of Electronic Text Using OLAP at APICS 2005. This work is based on an idea by Owen Kaser: what happens if we apply multidimensional databases (OLAP) to literary research?

Data Mining and Information Retrieval techniques are used routinely for literary research or processing text in general, but decision support techniques commonly used in the business world (sometimes called “Business Intelligence”) have not seen much use yet in text processing. The main difference between decision support systems and data mining is the fact that in decision support, the user remains in control, thus simple yet extremely efficient algorithms are favoured over sophisticated, but possibly expensive algorithms. Ideally, all decision support algorithms should be O(1) after accounting for precomputations. With infinite storage almost available now, decision support research is due for a technological and scientific boom.

Computer-assisted reading and analysis of text has various applications in the humanities and social sciences. The increasing size of many electronic text archives has the advantage of a more complete analysis but the disadvantage of taking longer to obtain results. On-Line Analytical Processing is a method used to store and quickly analyze multidimensional data. By storing text analysis information in an OLAP system, a user can obtain solutions to inquiries in a matter of seconds as opposed to minutes, hours, or even days. This analysis is user-driven allowing various users the freedom to pursue their own direction of research.
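
As a rough illustration of answering queries in constant time after precomputation, here is a toy cube over word occurrences. The three dimensions (author, book, word) and the names build_cube and ALL are assumptions for this sketch and do not come from the paper.

```python
from collections import Counter
from itertools import product

ALL = "*"  # wildcard meaning "aggregated over this dimension"


def build_cube(records):
    """Precompute every roll-up of (author, book, word) occurrence counts.

    records: iterable of (author, book, word) facts, one per word occurrence.
    """
    cube = Counter()
    for author, book, word in records:
        # Each fact contributes to 2^3 = 8 cells, one per wildcard pattern.
        for key in product((author, ALL), (book, ALL), (word, ALL)):
            cube[key] += 1
    return cube
```

After the cube is built, a question such as how often an author uses a given word across all books is a single dictionary lookup, cube[(author, ALL, word)]. The price is storage, since every fact is counted in eight cells.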

Monday, September 12th, 2005

Attribute Value Reordering For Efficient Hybrid OLAP

Filed under: Abstracts, Data Warehousing and OLAP — Daniel Lemire @ 12:50

Our paper Attribute Value Reordering For Efficient Hybrid OLAP was accepted by Information Sciences a few days ago. It should appear next year, I imagine, but I am making the preprint available now. It is an extended version of an earlier paper presented at DOLAP. In this case, the journal version is considerably extended and well worth the read.

It shows a very mathematical approach to multidimensional databases (OLAP) linking some OLAP problems to graph theory (an equivalent to graph isomorphism is shown) and there are some probabilistic results there as well.

There are also other, less mathematical, novel results like the concept of the normalization of a data cube, which is quite distinct from the normalization of a relational database.

The normalization of a data cube is the ordering of the attribute values. For large multidimensional arrays where dense and sparse chunks are stored differently, proper normalization can lead to improved storage efficiency. We show that it is NP-hard to compute an optimal normalization even for 1×3 chunks, although we find an exact algorithm for 1×2 chunks. When dimensions are nearly statistically independent, we show that dimension-wise attribute frequency sorting is an optimal normalization and takes time O(d n log(n)) for data cubes of size n^d. When dimensions are not independent, we propose and evaluate several heuristics. The hybrid OLAP (HOLAP) storage mechanism is already 19%-30% more efficient than ROLAP, but normalization can improve it further by 9%-13% for a total gain of 29%-44% over ROLAP.
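
Here is a small sketch of dimension-wise attribute frequency sorting on a two-dimensional cube. Counting non-empty chunks is a simplified stand-in for the storage cost model in the paper, and both function names are hypothetical.

```python
from collections import Counter


def count_nonempty_chunks(cells, chunk):
    """cells: set of (row, col) positions holding a value.
    chunk: (height, width) of each storage chunk."""
    return len({(r // chunk[0], c // chunk[1]) for r, c in cells})


def frequency_sort_normalization(cells):
    """Relabel rows and columns so the most frequent attribute values come first.

    Rows and columns with no values are dropped here; in a full cube
    they would simply be placed last.
    """
    row_freq = Counter(r for r, _ in cells)
    col_freq = Counter(c for _, c in cells)
    # Ties are broken by the original label to keep the result deterministic.
    rows = sorted(row_freq, key=lambda r: (-row_freq[r], r))
    cols = sorted(col_freq, key=lambda c: (-col_freq[c], c))
    row_rank = {r: i for i, r in enumerate(rows)}
    col_rank = {c: i for i, c in enumerate(cols)}
    return {(row_rank[r], col_rank[c]) for r, c in cells}
```

For example, if only columns 0 and 2 hold data, 1×2 chunks each end up half empty. After sorting, the two busy columns sit side by side and the same values fit in half as many chunks.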

Tuesday, September 6th, 2005

Collaborative Filtering and Inference Rules for Context-Aware Learning Object Recommendation

Filed under: Abstracts — Daniel Lemire @ 11:40

I posted on the web a preprint of our paper Collaborative Filtering and Inference Rules for Context-Aware Learning Object Recommendation. Here’s the abstract:

Learning objects strive for reusability in e-Learning to reduce cost and allow personalization of content. We argue that learning objects require adapted Information Retrieval systems. In the spirit of the Semantic Web, we discuss the semantic description, discovery, and composition of learning objects using Web-based MP3 objects as examples. As part of our project, we tag learning objects with both objective and subjective metadata. We study the application of collaborative filtering as prototyped in the RACOFI (Rule-Applying Collaborative Filtering) Composer system, which consists of two libraries and their associated engines: a collaborative filtering system and an inference rule system. We are currently developing RACOFI to generate context-aware recommendation lists. Context is handled by multidimensional predictions produced from a database-driven scalable collaborative filtering algorithm. Rules are then applied to the predictions to customize the recommendations according to user profiles. The prototype is available at inDiscover.net.
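
The two-stage design, predictions first and rules second, can be sketched as follows. This is not RACOFI's code: the per-item average is a deliberately crude stand-in for the collaborative filtering algorithm, it ignores the multidimensional aspect, and every name here (including the prefer_language rule) is hypothetical.

```python
def predict_item_averages(ratings):
    """ratings: {user: {item: rating}} -> {item: mean rating}."""
    totals, counts = {}, {}
    for user_ratings in ratings.values():
        for item, rating in user_ratings.items():
            totals[item] = totals.get(item, 0.0) + rating
            counts[item] = counts.get(item, 0) + 1
    return {item: totals[item] / counts[item] for item in totals}


def apply_rules(predictions, metadata, profile, rules):
    """Adjust predicted scores with rules, then rank items best first.

    Each rule is a function (item_metadata, profile) -> score adjustment.
    """
    adjusted = {}
    for item, score in predictions.items():
        adjusted[item] = score + sum(rule(metadata[item], profile) for rule in rules)
    return sorted(adjusted, key=lambda item: (-adjusted[item], item))


def prefer_language(item_meta, profile):
    """Example rule: boost objects in the user's preferred language."""
    return 1.0 if item_meta.get("language") == profile.get("language") else 0.0
```

Keeping the rules separate from the predictor means the same predictions can be customized for different user profiles without recomputing them.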

You can download the preprint.
