-----------------------------------------------------------------------------
Filename: virmajoki.pdf
Title: Pairwise Nearest Neighbor Method Revisited
Author: Olli Virmajoki
Format: Adobe Acrobat PDF
Abstract: The pairwise nearest neighbor (PNN) method, also known as Ward's method, belongs to the class of agglomerative clustering methods. The PNN method generates a hierarchical clustering by a sequence of merge operations until the desired number of clusters is obtained. At each step, it selects the cluster pair whose merge increases the value of the given objective function least. The main drawback of the PNN method is its slowness: the time complexity of the fastest known exact implementation is lower bounded by Ω(N^2), where N is the number of data objects. In the first publication, we consider several speed-up methods for the PNN method that preserve the exactness of the method. Another method for speeding up the PNN method is investigated in the second publication, where we utilize a k-neighborhood graph to reduce the number of distance calculations. A remarkable speed-up is achieved at the cost of a slight increase in distortion. The PNN method can also be adapted for multilevel thresholding, which can be seen as a one-dimensional special case of the clustering problem. In the third publication, we show how this can be implemented efficiently in only O(N log N) time, in comparison to a straightforward approach that requires O(N^2). The merge philosophy is extended by the iterative shrinking method in the fourth publication. In the merge phase of the PNN method, the two nearest clusters are always joined. Instead of this, the data objects of a removed cluster are reassigned individually to the neighboring clusters nearest to them. In this way, we obtain better clustering results, although at the cost of an increase in the running time.
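The basic PNN merge loop described above can be sketched as follows. This is an illustrative sketch only, not the thesis's optimized implementation: the function names (pnn, ward_cost) and the one-dimensional scalar data are assumptions made for brevity, and the naive all-pairs search gives O(N^3) total time, which is precisely the cost the speed-up methods address.

```python
def ward_cost(c1, c2):
    """Increase in total squared error caused by merging clusters c1 and c2
    (Ward's criterion). Each cluster is a (count, centroid) pair; centroids
    are plain floats here for simplicity."""
    n1, m1 = c1
    n2, m2 = c2
    return (n1 * n2) / (n1 + n2) * (m1 - m2) ** 2

def pnn(points, k):
    """Naive PNN: start with one cluster per point and repeatedly merge the
    pair whose merge increases the objective function least, until k
    clusters remain. Returns a list of (count, centroid) pairs."""
    clusters = [(1, float(p)) for p in points]
    while len(clusters) > k:
        best = None
        # Exhaustive search over all cluster pairs for the cheapest merge.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost(clusters[i], clusters[j])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        (n1, m1), (n2, m2) = clusters[i], clusters[j]
        # The merged cluster's centroid is the weighted mean of the two.
        merged = (n1 + n2, (n1 * m1 + n2 * m2) / (n1 + n2))
        clusters = [c for t, c in enumerate(clusters) if t not in (i, j)]
        clusters.append(merged)
    return clusters
```

For example, pnn([1.0, 1.1, 5.0, 5.1], 2) merges the two close pairs first, leaving two clusters with centroids near 1.05 and 5.05.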
The proposed method is also used as a crossover method in a genetic algorithm, which produces the best clustering results with respect to minimizing intra-cluster variance. The PNN algorithm can also be applied to generating optimal clusterings. In the fifth publication, we use a branch-and-bound technique for finding the best possible clustering by generating a sequence of merge operations. Instead of using the local optimization strategy in the merge phase, we consider every possible merge by constructing a search tree in which each merge forms a branch. We are also able to reduce the search space under certain bounding conditions. In addition, we give two polynomial-time variants that utilize the proposed branch-and-bound technique but construct the search tree only to a limited depth.
Keywords: agglomerative clustering, codebook generation, clustering algorithms, pairwise nearest neighbor method, pattern recognition, unsupervised learning, vector quantization, Ward's method.
-----------------------------------------------------------------------------
Filename: kolesnikov.zip
Title: Efficient Algorithms for Vectorization and Polygonal Approximation
Author: Alexander Kolesnikov
Format: zipped postscript
Abstract:
-----------------------------------------------------------------------------
Filename: eriksson.ps.gz
Title: An Algebraic Theory of Multidimensional Arrays
Author: Stephen Eriksson-Bique
Format: gzipped postscript
Abstract: An algebra of programming for multidimensional arrays is presented. This new calculus enhances software development, as in the theory of lists, and provides a complete theory for the data type, as in More's Array Theory. An architecture-independent approach is taken that allows program derivation and optimization. Multidimensional arrays have a single type. Notation and terminology are introduced to facilitate reasoning about arrays. New definitions help to simplify some of the definitions and proofs.
A set of primitive operations is defined. This set represents the data type, as in abstract data types, and includes new functions that account for common program structures. Arrays are equipped with all of their known properties and features, and with appropriate tools to take advantage of all of their dimensions. Yet the theory is concise. A sound methodology is prescribed for defining primitive operations instead of freely defining functions ad hoc. Mainly, operations are defined in a structured way without using indices. A constructive theory is developed which includes useful identities, properties, and laws. The formulas lack many of the indices typically required. Array homomorphisms are explicitly classified. Generic programming is possible using templates for different computations. Programming techniques are explained. Case studies show that programs can be written at a low level using fine-grained parallelism.
Keywords: applicative programming, categorical data type, homomorphism, primitive
---------------------------------------------------------------------------
Filename: juvaste.ps.gz
Title: Modeling Parallel Shared Memory Computations
Author: Simo Juvaste
Format: gzipped postscript, 200 pages
Abstract: Interprocessor communication is the most difficult part of parallel computation on current parallel computers. Programmers find it difficult to correctly and reliably distribute and maintain the data of a parallel program. Most efficiency problems are due to excessive or inefficient communication. Parallel computer manufacturers find it difficult and expensive to build interprocessor communication networks that keep up with fast processors. In this thesis we present a new model of parallel computing, the F-PRAM model. The model characterizes parallel computers with a set of parameters, most of which model the limitations of the processors' shared memory access, i.e., the communication.
For the programmer, the new model offers a convenient abstraction of shared memory, but duly charges the machine-dependent costs of using the shared memory. For shared memory access, the model provides a new prefetching primitive. Using the model, the programmer can avoid overly expensive communication, and the parallel computer manufacturer can choose the most important features to improve. The new model was tested with a fully configurable emulator of an abstract parallel computer. Using the emulator, we implemented and analyzed a set of sample algorithms. These measurements revealed, e.g., the effects of insufficient shared memory access capabilities. Different algorithms tolerated different scarcities, such as insufficient bandwidth or high shared memory latency. We also estimated the values of the parameters on some existing parallel computers.
Keywords: parallel computing, shared memory, modeling, F-PRAM
---------------------------------------------------------------------------
Filename: kopponen.ps
Title: CAI in CS
Author: Marja Kopponen
Format: Postscript, 97 pages
Abstract: By computer-aided instruction (CAI) we mean the use of a computer application for teaching a specific subject of a domain. The essential question is which CAI applications are good, and why. To be able to evaluate a CAI application, it is necessary to fix the domain. The domain of this work is computer science at the university level. Evaluation of CAI applications is a complex process in which several perspectives have to be considered. We developed evaluation criteria that consist of four parts: domain-based criteria, instructional criteria, user interface criteria, and pragmatic criteria. Our domain-based criteria focus on evaluating the course contents and their relevance to the instructional aims of the CAI course.
We examined several human learning theories in order to find the most appropriate basis for our instructional criteria, which concentrate on evaluating the educational support. User interface criteria and pragmatic criteria focus on evaluating the implementation of the user interface and practical matters, such as hardware, software, and human resources, respectively. We designed an analysis method based on our criteria for testing the evaluation criteria in practice. The analysis method was applied to a collection of CAI courses on computer science called COSTOC (COmputer Supported Teaching Of Computer science). The analysis indicated that our criteria worked properly. The main results were that the contents of a CAI course have to be designed by a domain expert and that the instructional support should be based on human learning. Further, authoring tools should support CAI authors by offering instructional advice. In addition, authoring CAI courses on computer science requires special properties of the authoring tool, such as the ability to express mathematical notation, to animate abstract structures, and to write and execute pseudocode or program code.
Keywords: computer-aided instruction (CAI), use of CAI, evaluation of CAI, COSTOC
---------------------------------------------------------------------------
Filename: ahonen.ps
Title:
Author: Jarmo Ahonen
Format: Postscript
Abstract:
---------------------------------------------------------------------------
Filename: forsell.ps
Title: Implementation of Instruction-Level and Thread-Level Parallelism in Computers
Author: Martti Forsell
Format: PostScript
Abstract: There are many theoretical and practical problems that should be solved before parallel computing can become the mainstream of computing technology.
Among them are problems caused by architectures originally designed for sequential computing: the low utilization of functional units due to false dependencies between instructions, inefficient message passing due to send, receive, and thread-switch overheads, nondeterministic operation due to the dynamic scheduling of instructions, and long memory access delays due to high memory system latency. In this thesis we try to eliminate these problems by designing new processor, communication, and memory system architectures with parallel computation in mind. As a result, we outline a parallel computer architecture that uses a theoretically elegant shared memory programming model. The obvious VLSI implementation of a large machine using such a shared memory is shown to be impossible with current technology. There exists, however, an indirect implementation: one can simulate a machine using a shared memory by a machine using a physically distributed memory. Our proposal for such an indirect implementation, the Instruction-Level Parallel Shared-Memory Architecture (IPSM), combines elements from both instruction-level parallelism and thread-level parallelism. IPSM features static VLIW-style scheduling of instructions and the absence of message passing and thread-switch overheads. In the execution of parallel programs, IPSM exploits parallel slackness to hide the high memory system latency, and interthread instruction-level parallelism to eliminate delays caused by dependencies between instructions belonging to a single thread. In the execution of sequential programs, IPSM uses minimal pipelining to minimize the delays caused by dependencies between instructions.