Workshop on Approximation Algorithms

Data mining aims at finding interesting, useful or profitable information in very large databases. The enormous increase in the size of available scientific and commercial databases, together with the continuing exponential growth in the performance of present-day computers, makes data mining a very active field. Its main tasks, i.e., discrimination, clustering, and relation finding, have already been partially explored in Data Analysis, Pattern Recognition, Artificial Intelligence and other fields, in some cases for half a century. The challenge, however, lies in the much larger size of the problems considered. Indeed, instead of problem instances with a few tens or hundreds of entities, current ones often have thousands, tens of thousands and sometimes many more. Traditional methods are therefore being revised and streamlined, and complemented by many new ones.

Mathematical Programming plays a key role in this endeavor. It forces one to make precise the objective pursued (e.g., a clustering criterion or a measure of discrimination) as well as the constraints imposed on the solution (e.g., find a partition, a covering or a hierarchy in clustering). It also provides powerful mathematical tools for building high-performance exact or approximate algorithms.
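As an illustration of what making the objective and constraints precise can look like, here is one common formulation, minimum sum-of-squares clustering into a partition. It is offered only as a sketch: the symbols are assumptions introduced for this example and are not taken from the workshop material. The entities are points a_i in R^d (i = 1, ..., n), k is the number of clusters, c_j is the center of cluster j, and x_ij = 1 when entity i is assigned to cluster j.

```latex
\begin{aligned}
\min_{x,\,c} \quad & \sum_{i=1}^{n} \sum_{j=1}^{k} x_{ij}\, \lVert a_i - c_j \rVert^2 \\
\text{subject to} \quad & \sum_{j=1}^{k} x_{ij} = 1, \qquad i = 1, \dots, n, \\
& x_{ij} \in \{0, 1\}, \qquad i = 1, \dots, n,\; j = 1, \dots, k, \\
& c_j \in \mathbb{R}^d, \qquad j = 1, \dots, k.
\end{aligned}
```

Here the objective is the clustering criterion (homogeneity measured by squared distances to cluster centers), and the assignment constraints express the requirement that the solution be a partition. Relaxing the equality to "at least one" would describe a covering instead.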

The workshop will be an occasion for pioneers in the Data Mining and Mathematical Programming fields, as well as promising young scientists (and some researchers in between), to meet and discuss the main trends and developments. The three basic areas of data mining will be considered:

• Discrimination or supervised classification: build a function that best discriminates between good and bad entities of a given set, and classifies new entities as correctly as possible.

• Clustering or unsupervised classification: find subsets of a given set of entities that are homogeneous and well-separated.

• Relation finding: given a set of entities, together with measurements or observations on them, find relations satisfied by all, or by most, of them.
