Data mining is the process of discovering patterns in large amounts of data. Methods from machine learning, database systems, artificial intelligence and statistics help to extract information from the data in an understandable form, so that it can be used for business decisions.
RapidMiner
RapidMiner offers data integration and analysis, analytical ETL and reporting, combined in a community edition or an enterprise edition. It comes with a graphical user interface for designing analysis processes. The solution offers metadata transformation, which allows checking for errors at design time.
KNIME
KNIME [naim], the Konstanz Information Miner, is a user-friendly graphical workbench for the entire analysis process. It offers data access, initial investigation, data transformation, predictive analysis and reporting. It is available as an open-source tool and as commercial products. The open integration platform offers over 1,000 modules. KNIME has also been continuously ranked No. 1 in customer satisfaction among open-source data analytics platforms.
Mahout
Mahout is another Apache project. Its aim is to build machine learning libraries that scale to reasonably large data sets. It is especially suited to classification, clustering and batch-based collaborative filtering running on Hadoop. Non-Hadoop and single-node contributions can also be used, and the core libraries are also optimized for non-distributed algorithms.
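To make the term collaborative filtering more concrete, here is a minimal sketch of the item-based variant in plain Python. It does not use Mahout or its API. The ratings, the function names and the choice of cosine similarity are all assumptions made for illustration.

```python
from math import sqrt

# Hypothetical toy ratings: user -> {item: rating}
ratings = {
    "alice": {"book": 5.0, "film": 3.0, "game": 4.0},
    "bob": {"book": 4.0, "film": 1.0},
    "carol": {"film": 4.0, "game": 5.0},
}


def item_vectors(ratings):
    """Invert user -> item ratings into item -> {user: rating}."""
    vectors = {}
    for user, items in ratings.items():
        for item, score in items.items():
            vectors.setdefault(item, {})[user] = score
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    shared = set(a) & set(b)
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict(ratings, user, target):
    """Predict a rating as the similarity-weighted average of the
    user's ratings for the other items. Returns None if no estimate
    is possible."""
    vectors = item_vectors(ratings)
    if target not in vectors:
        return None
    numerator = 0.0
    denominator = 0.0
    for item, score in ratings[user].items():
        if item == target:
            continue
        sim = cosine(vectors[target], vectors[item])
        numerator += sim * score
        denominator += abs(sim)
    return numerator / denominator if denominator else None


# Estimate how bob would rate "game", which he has not rated yet.
print(predict(ratings, "bob", "game"))
```

A distributed system such as Mahout on Hadoop spreads the similarity computation across many machines, while this sketch keeps everything in memory on a single node.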
Orange
As the name indicates, Orange is an open-source tool that aims to make data mining ‘fruitful and fun’. It offers data mining through visual means, making data visualization and analysis easy for novice users as well as for experts. Users can design data analyses through visual programming and Python scripting.
WEKA
WEKA stands for ‘Waikato Environment for Knowledge Analysis’ and is a collection of machine learning algorithms for solving data mining problems. It is written in Java and thus runs on almost any modern computing platform. It supports different data mining tasks such as clustering, data pre-processing, regression, classification and feature selection, as well as visualization.
KEEL
KEEL is an abbreviation for Knowledge Extraction based on Evolutionary Learning. It is Java-based and can be used to assess the behaviour of evolutionary algorithms on data mining problems such as clustering, regression and classification. KEEL was also designed for research and educational purposes.
Togaware
Togaware has developed the ‘R Analytical Tool To Learn Easily’, also known as Rattle. This is a graphical user interface for data mining using the R language. It can present visual and statistical summaries of data, transform data into forms that are easily modelled, present the performance of models graphically and score new datasets.
SPMF
SPMF is also a Java-based data mining library. It includes 51 different algorithms, among others for sequential pattern mining, sequential rule mining, frequent itemset mining, clustering and association rule mining. The source code of each algorithm can be integrated into other Java programs. It can be used through a user interface or from the command line.
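As an illustration of what frequent itemset mining does, the following is a minimal Apriori-style sketch in plain Python. It is not SPMF code and does not reflect SPMF's API. The function name, the shopping baskets and the support threshold are assumptions chosen for the example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for every itemset that occurs in at
    least min_support transactions (level-wise Apriori search)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count the single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count step: keep the candidates that reach the threshold.
        current = {}
        for cand in candidates:
            count = sum(1 for t in transactions if cand <= t)
            if count >= min_support:
                current[cand] = count
        result.update(current)
        k += 1
    return result


# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

for itemset, count in frequent_itemsets(baskets, min_support=2).items():
    print(sorted(itemset), count)
```

With a minimum support of 2, the pair milk and butter is dropped because it appears in only one basket, so the three-item set is never counted at all. This pruning is what keeps the search tractable on larger data.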