The procedure for an example dataset S1:

- In the beginning, there should be two empty folders, s1 and s1_train.
  When a subset with 20% of the objects is created, a second subset with the remaining 80% of the objects is also created; it is needed for the classification-based approach.
  The subsets are created as s1-01.ts, s1-02.ts, ... The indexes of the objects in the original dataset are also written, as idx-01.txt, idx-02.txt, ...
  All datasets are expected to be in the folder ../dataset/
  You just need to run sub2_disjoint.bat (note that the file permissions must allow execution).
  The bat file generates 100 subsets (look at sub1_disjoint.bat).
  The original dataset should exist in the folder s1.

- Generating the null reference and its subsets
  Generate the null reference of a dataset from its txt file with the MATLAB code generateNullReference.m, saving it as s1_null.txt.
  Then convert the txt file to s1_null.ts on the server using the program txt2ts.bat.
  After that, generate the subsets in the same way as for s1.
  Note: do not make s1_null_train. It is not needed for the s1_null reference, so it is not created. Index files like idx.txt are also not produced. The program reports an error because the folder and files do not exist, but this does not matter: the program still produces what we want.

- Running the clustering algorithms: KM, RS, and GA
  Explanation for KM (the same applies to RS and GA): run clustering_main.bat. If you want RS, replace km with rs in clustering_all_k.bat.
  clustering_main.bat can run clustering for several datasets that share the same range for the number of clusters, for example s1 to s4.
  For a dataset with a different range for the number of clusters (like birch1), modify clustering_all_k.bat.
  Note: check the required number of iterations for RS in rs_one_k.bat and rs_one_subset.bat.
  Note: check the required number of generations and the other parameters for GA in ga_one_k.bat and ga_one_subset.bat.
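
To make the subset step concrete, here is a minimal Python sketch of the 20% / 80% split. It is an illustration only, not the code behind sub2_disjoint.bat: the function names split_subset and make_subsets, the use of random shuffling, and the seeds are assumptions, and the sketch works on object indexes only because the .ts file format is not described here.

```python
import random


def split_subset(n_objects, fraction=0.2, seed=None):
    """Split the object indexes 0..n_objects-1 into two disjoint parts.

    Returns (subset_idx, rest_idx): a random `fraction` of the indexes
    and the remaining ones (the part used for the classification-based
    approach). Random shuffling is an assumption of this sketch.
    """
    rng = random.Random(seed)
    indexes = list(range(n_objects))
    rng.shuffle(indexes)
    cut = round(n_objects * fraction)
    subset_idx = sorted(indexes[:cut])
    rest_idx = sorted(indexes[cut:])
    return subset_idx, rest_idx


def make_subsets(n_objects, n_subsets=100, fraction=0.2, seed=0):
    """Yield (tag, subset_idx, rest_idx) for subsets 01, 02, ...

    The two-digit tag mirrors names such as s1-01.ts and idx-01.txt.
    """
    for i in range(1, n_subsets + 1):
        tag = f"{i:02d}"
        subset_idx, rest_idx = split_subset(n_objects, fraction, seed + i)
        yield tag, subset_idx, rest_idx


if __name__ == "__main__":
    # Hypothetical dataset size, chosen only for the demonstration.
    for tag, subset_idx, rest_idx in make_subsets(1000, n_subsets=3):
        print(f"idx-{tag}.txt: {len(subset_idx)} objects, "
              f"{len(rest_idx)} left for training")
```

Each pair returned by split_subset is disjoint and together covers every object, which matches the description of the 20% subset and its 80% complement.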