Friday 19 April 2013

Data Mining

What is Data Mining?

Data mining is the process of discovering interesting patterns and knowledge from large amounts of data.

The data sources can include databases, data warehouses, the Web, other information repositories, or data that are streamed into the system dynamically.

Example
Imagine that you are a sales manager at an electronics showroom.
A customer recently bought a computer from the store.
What should you recommend to her next?

Here, knowing which products your customers frequently purchase after buying a computer would be very helpful in making your recommendation.

This is where data mining comes into the picture.

If you have the data for all customer transactions at your store, you can use this transactional database to find out what people generally buy along with a computer.

Suppose you discover that in most of the transactions in which a computer was bought, a printer was bought as well.
We may then say that a customer who buys a computer is also likely to be interested in buying a printer.
So whenever a customer buys a computer from your store, you can recommend a printer to her and increase your sales.
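
The counting described above can be sketched in a few lines of Python. This is only a minimal illustration, not a full mining algorithm: the transactions are made-up sample data, and the function name co_purchase_rates is a hypothetical helper introduced here. For a given item, it computes the fraction of transactions containing that item that also contain each other item.

```python
from collections import Counter

# Made-up sample data: each set is the items bought in one transaction.
transactions = [
    {"computer", "printer", "mouse"},
    {"computer", "printer"},
    {"computer", "antivirus"},
    {"television", "speakers"},
    {"computer", "printer", "antivirus"},
]


def co_purchase_rates(transactions, item):
    """For each other item, return the fraction of transactions
    containing `item` that also contain that other item."""
    with_item = [t for t in transactions if item in t]
    if not with_item:
        return {}  # the item was never bought, so there is nothing to report
    counts = Counter(other for t in with_item for other in t if other != item)
    return {other: n / len(with_item) for other, n in counts.items()}


rates = co_purchase_rates(transactions, "computer")
best = max(rates, key=rates.get)  # item most often bought with a computer
print(best, rates[best])
```

With this sample data, a printer appears in three of the four transactions that contain a computer, so it would be the item to recommend.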

There are various data mining techniques:
1. Association Rule Mining
2. Classification
3. Clustering
4. Outlier Analysis

The example discussed above falls into the domain of Association Rule Mining.