WHAT IS DONE
· Worked on Network Clustering problems’ theoretical properties under different distance metrics (Euclidean and squared Euclidean) and objective functions (fuzzy assignment and hard assignment)
· Built a framework based on the similarities and differences of different Network Clustering problems and explored properties of objective functions
· Developed mathematical proofs for objective functions’ properties and cluster centroid locations at optimality
· Generated problem instances to test the developed solution method
· Built a Genetic Algorithm to solve the problem on MATLAB. The algorithm solved the models within the time limits and outperformed the problems in the literature.
ABSTRACT
In this thesis, Center-Based Clustering Problems on Networks are studied. Four different problems are considered differing in the assignment scheme of the data points and the objective function. Two different assignment schemes are considered, hard assignment and soft assignment. In hard assignment, data points (vertices) are strictly assigned to one cluster, while in soft assignment, vertices are assigned to the multiple clusters with a membership probability. Objective function of a clustering problem could be categorized as minimizing sum of distances or sum of squared distances between the vertices and the centers of clusters they are assigned to.
In this study, cluster centers are not restricted to vertices. They are allowed to be located on vertices or anywhere on the edges. The problems that are studied are analyzed in terms of properties of the cluster centers, and theoretical results are derived. Benefiting from these properties, a unified solution framework is developed which is named Hybrid Genetic Algorithm (HGA), a genetic algorithm with a Local Search operation which uses the theoretical results obtained about the cluster centers. Two versions of HGA, namely Node Based HGA (HGA-N) and Edge Based HGA (HGA-E) are developed by modifying HGA considering the derived properties. To test the performance of the proposed algorithms, numerical experiments are conducted on clustering of datasets from the literature and the simulated ones. Results are compared with the optimal or best solutions reported in the literature (if available). The proposed algorithms are also compared with the well-known heuristics used for the planar clustering problems. These heuristics are modified for the network problems. Computational results show that the proposed approach performs well in all clustering problems studied.
Keywords: Keywords: Clustering on Network, Genetic Algorithm, Hard Assignment, Soft Assignment, Local Search
Click to view the complete work >>