Data Generalization In Data Mining
From a data analysis perspective, data mining can be categorized into two classes:
- Descriptive mining.
- Predictive mining.
- Descriptive mining: It describes the data set in a concise, summarized manner and presents interesting general properties of the data.
- Predictive mining: It analyzes the data to construct one or a set of models, and attempts to predict the behavior of new data sets.
- Characterization: It provides a concise and succinct summarization of the given collection of data.
- Comparison: It provides descriptions comparing two or more collections of data.
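The difference between characterization and comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the class names, and the helper functions `characterize` and `compare` are hypothetical and do not come from any particular library.

```python
from collections import Counter

# Hypothetical toy records: (customer_class, age_group, region)
records = [
    ("graduate", "20-29", "West"),
    ("graduate", "20-29", "East"),
    ("graduate", "30-39", "West"),
    ("undergraduate", "under-20", "West"),
    ("undergraduate", "20-29", "East"),
    ("undergraduate", "under-20", "East"),
]

def characterize(rows, target_class):
    """Summarize one class: share of each age group among its members."""
    ages = Counter(age for cls, age, _ in rows if cls == target_class)
    total = sum(ages.values())
    return {age: count / total for age, count in ages.items()}

def compare(rows, class_a, class_b):
    """Put the summaries of two classes side by side for each age group."""
    a = characterize(rows, class_a)
    b = characterize(rows, class_b)
    return {age: (a.get(age, 0.0), b.get(age, 0.0)) for age in sorted(set(a) | set(b))}

print(characterize(records, "graduate"))
print(compare(records, "graduate", "undergraduate"))
```

Characterization looks at a single collection, while comparison needs at least two collections described over the same attributes so that the summaries line up.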
Data Generalization & Summarization
- A database usually stores low-level item information such as item_ID, name, brand, class, provider, place_made, and value.
- Summarizing such data at a higher conceptual level gives a general description, which can be very useful for sales and marketing managers.
Data Generalization in Data Mining
- A process that abstracts a large set of task-relevant data in a database from a low conceptual level to higher ones.
- Data generalization is a summarization of the general features of objects in a target class and produces what are called characteristic rules.
- The data associated with a user-specified class is usually retrieved by a database query and run through a summarization module to extract the essence of the data at different levels of abstraction.
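As a minimal sketch of moving from a low conceptual level to a higher one, the example below lifts a city-level attribute to country level and merges the tuples that become identical, keeping a count for each. The concept hierarchy `CITY_TO_COUNTRY`, the sample tuples, and the function `generalize` are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical concept hierarchy: city (low level) -> country (higher level)
CITY_TO_COUNTRY = {
    "Vancouver": "Canada",
    "Toronto": "Canada",
    "Chicago": "USA",
    "Seattle": "USA",
}

# Hypothetical task-relevant tuples: (item_ID, brand, place_made)
items = [
    (1, "Acme", "Vancouver"),
    (2, "Acme", "Toronto"),
    (3, "Acme", "Chicago"),
    (4, "Zenith", "Seattle"),
    (5, "Zenith", "Chicago"),
]

def generalize(rows):
    """Drop item_ID, lift place_made to country level, and merge equal tuples."""
    generalized = ((brand, CITY_TO_COUNTRY[city]) for _, brand, city in rows)
    return Counter(generalized)  # maps each generalized tuple to its count

summary = generalize(items)
for (brand, country), count in sorted(summary.items()):
    print(brand, country, count)
```

Dropping `item_ID` matters here: an attribute with a distinct value per tuple would prevent any tuples from merging, so no summarization would happen.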
Note that with a data cube containing a summarization of data,
simple OLAP operations fit the purpose of data characterization.
- Data cube approach (OLAP approach).
- Attribute-oriented induction approach.
Presentation Of Generalized Results
- Generalized relations: relations where some or all attributes are generalized, with counts or other aggregation values accumulated.
- Cross-tabulations: mapping results into cross-tabulation form (similar to contingency tables).
- Visualization: pie charts, bar charts, curves, cubes, and other visual forms.
- Quantitative characteristic rules: mapping generalized results into characteristic rules with quantitative information associated with them.
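Two of the presentation forms listed above, the cross-tabulation and the rule with quantitative information, can be sketched as follows. The generalized relation, the attribute values, and the helpers `crosstab` and `quantitative_rule` are hypothetical, and the percentage attached to each term is simply that term's share of the class's tuples.

```python
from collections import Counter

# Hypothetical generalized relation: (brand, country) -> count
generalized = Counter({
    ("Acme", "Canada"): 2,
    ("Acme", "USA"): 1,
    ("Zenith", "USA"): 2,
})

def crosstab(relation):
    """Arrange counts as rows (brand) by columns (country)."""
    table = {}
    for (brand, country), count in relation.items():
        table.setdefault(brand, {})[country] = count
    return table

def quantitative_rule(relation, brand):
    """Express one brand's tuples as 'country [share]' terms joined by 'or'."""
    rows = {c: n for (b, c), n in relation.items() if b == brand}
    total = sum(rows.values())
    terms = [f"place_made = {c} [{n / total:.0%}]" for c, n in sorted(rows.items())]
    return f"brand = {brand} => " + " or ".join(terms)

print(crosstab(generalized))
print(quantitative_rule(generalized, "Acme"))
```

Both views are derived from the same generalized relation, so the choice between them is about readability for the audience rather than about the underlying data.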
Data Cube Approach
- An efficient implementation of data generalization.
- Computation of various kinds of measures, e.g., count( ), sum( ), average( ), max( ).
- Generalization and specialization can be performed on a data cube by roll-up and drill-down.
- It handles only dimensions of simple non-numeric data and measures of simple aggregated numeric values.
- Lack of intelligent analysis: it cannot tell which dimensions should be used or what level the generalization should reach.
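The roll-up and drill-down operations of the data cube approach can be illustrated with a small sketch that aggregates the same rows at two levels of a location dimension. The sample rows, the `DIMENSIONS` mapping, and the `aggregate` function are assumptions for illustration; a real OLAP engine would typically precompute and store such aggregates rather than recompute them from the base rows on every request.

```python
from collections import defaultdict

# Hypothetical fact rows: (brand, city, country, sale_amount)
sales = [
    ("Acme", "Vancouver", "Canada", 100.0),
    ("Acme", "Toronto", "Canada", 150.0),
    ("Acme", "Chicago", "USA", 200.0),
    ("Zenith", "Chicago", "USA", 50.0),
]

# Position of each dimension inside a fact row
DIMENSIONS = {"brand": 0, "city": 1, "country": 2}

def aggregate(rows, dims):
    """Group rows by the chosen dimensions and compute count, sum, average, max."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[DIMENSIONS[d]] for d in dims)
        groups[key].append(row[3])
    return {
        key: {
            "count": len(vals),
            "sum": sum(vals),
            "average": sum(vals) / len(vals),
            "max": max(vals),
        }
        for key, vals in groups.items()
    }

detailed = aggregate(sales, ["brand", "city"])      # drill-down view: city level
rolled_up = aggregate(sales, ["brand", "country"])  # roll-up view: country level

print(rolled_up[("Acme", "Canada")])
```

Rolling up from city to country merges the two Canadian cells for the first brand into one, which is the generalization step; drilling down reverses it by returning to the finer grouping.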