We looooooove FREE!!! Here are some of the best free data mining tools - dance to that tune

Mo Data stashed this in Big Data Technologies

http://www.siliconafrica.com/the-best-data-minning-tools-you-can-use-for-free-in-your-company/#!

1. RapidMiner (http://rapid-i.com/content/view/181/190/lang,en/)
RapidMiner is billed as the world-leading open-source system for data mining. It is available as a stand-alone application for data analysis and as a data mining engine for integration into your own products. Thousands of applications of RapidMiner in more than 40 countries give their users a competitive edge.

2. RapidAnalytics (http://rapid-i.com/content/view/182/196/)
Built around RapidMiner as a powerful engine for analytical ETL, data analysis, and predictive reporting, the new business analytics server RapidAnalytics is presented as the key product for all business-critical data analysis tasks and a milestone for business analytics.

3. Weka (http://www.cs.waikato.ac.nz/ml/weka/)
Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes.

4. PSPP (https://www.gnu.org/software/pspp/)
PSPP is a program for statistical analysis of sampled data. It has a graphical user interface and a conventional command-line interface. It is written in C and uses the GNU Scientific Library for its mathematical routines and plotutils for generating graphs. It is a free replacement for the proprietary program SPSS (from IBM), which IBM markets as a way to predict with confidence what will happen next so that you can make smarter decisions, solve problems and improve outcomes.

5. KNIME (http://knime.org/)
KNIME is a user-friendly graphical workbench for the entire analysis process: data access, data transformation, initial investigation, powerful predictive analytics, visualisation and reporting. The open integration platform provides over 1000 modules (nodes).

6. Orange (http://orange.biolab.si/)
Orange is an open-source data visualization and analysis tool for novices and experts. It supports data mining through visual programming or Python scripting, includes components for machine learning, and offers add-ons for bioinformatics and text mining.

7. Apache Mahout (https://mahout.apache.org/)
Apache Mahout is an Apache project to produce free implementations of distributed or otherwise scalable machine learning algorithms on the Hadoop platform. Currently Mahout mainly supports four use cases. Recommendation mining takes users’ behavior and from that tries to find items users might like. Clustering takes, for example, text documents and groups them into sets of topically related documents. Classification learns from existing categorized documents what documents of a specific category look like and is able to assign unlabelled documents to the (hopefully) correct category. Frequent itemset mining takes a set of item groups (terms in a query session, shopping cart contents) and identifies which individual items usually appear together.
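
To make the frequent itemset idea a bit more concrete, here is a minimal single-machine sketch in plain Python. It is not Mahout's API and not its distributed implementation; the function name `frequent_pairs`, the `min_support` threshold and the shopping carts are all made up for illustration. It simply counts how often each pair of items shows up in the same cart and keeps the pairs that appear often enough.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support):
    """Return item pairs that occur together in at least min_support baskets.

    Hypothetical helper for illustration; not part of Mahout.
    """
    pair_counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so each pair is counted once per basket
        # and always appears in the same (alphabetical) order.
        items = sorted(set(basket))
        pair_counts.update(combinations(items, 2))
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


# Hypothetical shopping carts, invented for this example.
carts = [
    ["bread", "milk", "eggs"],
    ["bread", "milk"],
    ["milk", "eggs"],
    ["bread", "milk", "butter"],
]

pairs = frequent_pairs(carts, min_support=2)
for pair, count in sorted(pairs.items()):
    print(pair, count)
```

Counting every pair in memory like this only works for small data; spreading that counting across a Hadoop cluster is the kind of problem a project like Mahout is aimed at.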

8. jHepWork (http://jwork.org/jhepwork/)
jHepWork (or “jWork”) is an environment for scientific computation, data analysis and data visualization designed for scientists, engineers and students. The program incorporates many open-source software packages into a coherent interface built around scripting rather than a GUI-only or macro-based approach. jHepWork can be used wherever analysis of large numerical data volumes, data mining, statistical analysis and mathematics are essential (natural sciences, engineering, modeling and analysis of financial markets).

9. Rattle (https://code.google.com/p/rattle/)
Rattle (the R Analytical Tool To Learn Easily) presents statistical and visual summaries of data, transforms data into forms that can be readily modelled, builds both unsupervised and supervised models from the data, presents the performance of models graphically, and scores new datasets. It is a free and open source data mining toolkit written in the statistical language R using the GNOME graphical interface. It runs under GNU/Linux, Macintosh OS X, and MS Windows. Rattle is being used in business, government, research and for teaching data mining in Australia and internationally.

Mo Data
12:33 AM Nov 15 2013

Stashed in: Big Data!, Family Guy, Big Data

So much software to choose from! Feels like a person would be best served spending more time on evaluation before committing to any given software package.

Adam Rifkin
7:52 AM Nov 15 2013
