Data Warehouse: Knowledge & Data Mining

But suppose there are 1000 other customers who also buy candles from you every Sunday (mostly, with some percentage of variation), and all of them are Christian by religion. So you can conclude that Alex, Jessica and Paul are, in all likelihood, also Christian.

Now, the religion of Alex, Jessica and Paul was not given to you as data, so it could not be retrieved from the database as information. But you learnt this piece of information indirectly. This is the “knowledge” that you discovered, and the process that led to this discovery is called “Data Mining”.
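To make the reasoning concrete, here is a minimal Python sketch of that inference. It is only an illustration under stated assumptions: the function name `infer_attribute`, the `min_support` threshold of 0.9 and the toy list standing in for the 1000 customers are all made up for this example, and real data mining tools use far more careful statistics.

```python
from collections import Counter


def infer_attribute(known_labels, min_support=0.9):
    """Return the most common label if its share reaches min_support, else None.

    Hypothetical helper for illustration only: min_support is an assumed
    threshold, not something prescribed by any particular tool.
    """
    if not known_labels:
        return None
    label, count = Counter(known_labels).most_common(1)[0]
    return label if count / len(known_labels) >= min_support else None


# Toy stand-in for the 1000 Sunday candle buyers whose religion is recorded.
sunday_buyers_known = ["Christian"] * 1000

# Sunday candle buyers whose religion was never stored in the database.
unknown = ["Alex", "Jessica", "Paul"]

# The "discovered knowledge": a label inferred from shared buying behaviour.
inferred = {name: infer_attribute(sunday_buyers_known) for name in unknown}
print(inferred)
```

If the known customers were evenly split between two religions, the same function would return `None`, meaning the buying pattern supports no conclusion.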

.. Unless you are doing predictive analysis or discovering “new” patterns in the existing data, you are not doing data mining.

.. Clustering

Clustering is the method of assigning a set of objects to groups (called clusters) so that the objects in the same cluster are more similar to each other, by some chosen measure, than to those in other clusters. Cluster analysis is widely used in market research when working with multivariate data. Market researchers often use it to create customer segments, product segments and so on.
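The idea can be sketched with k-means, which is one common clustering algorithm among many; the text above does not prescribe any particular one. The customer figures (visits per month, annual spend) and the choice of starting centroids below are invented for illustration, and similarity is assumed to mean plain Euclidean distance.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of the points assigned to it."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical customers: (visits per month, annual spend).
customers = [(2, 100), (3, 120), (2, 90), (12, 900), (11, 950), (13, 1000)]

# Assumed starting centroids: simply the first and the last customer.
centroids, clusters = kmeans(customers, [customers[0], customers[-1]])

for centre, members in zip(centroids, clusters):
    print(centre, members)
```

With this toy data the occasional low spenders end up in one segment and the frequent high spenders in the other. Because annual spend has a much larger numeric range than visits, it dominates the distance here; in practice the features would usually be scaled first.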