Have you been busy with some k-means clustering and struggling to calculate the silhouette coefficient to evaluate your clustering in RapidMiner? Or to find the best k for the algorithm? (I will come back to this subject in more detail later, including how to implement it in Python.)
Although RapidMiner itself does not include the silhouette coefficient, you can add it with a Java plugin from here.
The download worked for me with RapidMiner 6.5. If you have an older version, you can check the older plugin versions from my friend here!
So, what to do next?
The download was not so difficult for me, but figuring out what to do with this jar file was kind of tricky! My friend stated: Simply put the plugin (file CPPlugin-0.3.jar) into lib/plugins in RapidMiner home directory.
Simply! Ha! And where is this lib/plugins, please??
Anyway, it took me some time but I found it:)
Just go to the Terminal or cmd prompt and use this command:
cp /Users/username/Downloads/CPPlugin-0.3.jar /Users/username/.Rapidminer/extensions
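- cp is the Unix command for copy (in the Windows cmd prompt the equivalent command is copy, with Windows-style paths).
- The first path is the location of the downloaded jar file.
- The second path is the RapidMiner extensions folder, which I suppose is the lib/plugins my friend meant. At least it worked for me.
Oh, don’t forget to replace username with your own user name on your computer 🙂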
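If you are not comfortable with the terminal, the same copy step can be sketched in Python, which behaves the same on macOS, Linux and Windows. This is only a rough sketch: install_plugin is a helper name I made up for illustration, and the folder you pass in is assumed to be the same extensions folder as in the cp command above (it may be somewhere else on your machine).

```python
import shutil
from pathlib import Path


def install_plugin(jar_path, extensions_dir):
    """Copy a plugin jar into an extensions folder and return the new file path.

    Hypothetical helper for illustration; it is not part of RapidMiner.
    """
    jar = Path(jar_path).expanduser()
    target = Path(extensions_dir).expanduser()
    if not jar.is_file():
        raise FileNotFoundError(f"Plugin jar not found: {jar}")
    # Create the extensions folder if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    # copy2 keeps the file name and returns the path of the copied file.
    return Path(shutil.copy2(jar, target))


# Example call with the same paths as the cp command ("~" is your home folder):
# install_plugin("~/Downloads/CPPlugin-0.3.jar", "~/.Rapidminer/extensions")
```

The example call is commented out on purpose, so nothing gets copied until you check both paths for your own computer.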
Then, if we open RapidMiner and search for Silhouette, it is there. Magic!
A simple connection could look like this:
Attention! If you don’t use Euclidean distance for your clustering, you should also add a Data to Similarity operator.
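To see why the distance measure matters, here is a minimal pure-Python sketch of the silhouette coefficient with the distance passed in as a function. This is my own illustration of the standard definition, not the plugin's code; the function names and the tiny data set are made up for the example. For each point, a is the mean distance to the other points in its own cluster, b is the smallest mean distance to the points of any other cluster, and the score is (b - a) / max(a, b).

```python
import math


def euclidean(p, q):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def manhattan(p, q):
    return sum(abs(x - y) for x, y in zip(p, q))


def silhouette_score(points, labels, distance=euclidean):
    """Mean silhouette coefficient over all points (needs at least 2 clusters)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Common convention: a point alone in its cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other points of the same cluster.
        a = sum(distance(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the points of any other cluster.
        b = min(
            sum(distance(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data: two obvious groups, scored with two different distances.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels = [0, 0, 1, 1]
print(silhouette_score(points, labels))             # Euclidean
print(silhouette_score(points, labels, manhattan))  # Manhattan
```

The two printed values are not the same, which is the whole point: the score should be computed with the same distance your clustering used. To look for the best k, you would run the clustering for several values of k, score each result like this, and prefer the k with the highest score.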