After-Class Reading 1: Mastering Data Mining

Chapter 1 Data Mining in Context

What is Data Mining

Data mining is the process of exploration and analysis, by automatic or semiautomatic means, of large quantities of data in order to discover meaningful patterns and rules. (John Wiley)

What Can Data Mining Do?

Classification

Examine the features of a newly presented object and assign it to one of a set of predefined classes.
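A minimal sketch of classification, assuming a 1-nearest-neighbor rule (one of many possible classifiers, not one named in these notes). The training examples, the two features, and the class names "low risk" and "high risk" are hypothetical.

```python
import math

# Hypothetical labeled examples: (features, predefined class).
# Features are (annual income in thousands, number of late payments).
TRAINING = [
    ((85.0, 0.0), "low risk"),
    ((60.0, 1.0), "low risk"),
    ((30.0, 4.0), "high risk"),
    ((25.0, 6.0), "high risk"),
]


def classify(features):
    """Assign the class of the closest training example (1-nearest neighbor)."""
    nearest = min(TRAINING, key=lambda example: math.dist(example[0], features))
    return nearest[1]


# A newly presented object is assigned to one of the predefined classes.
print(classify((28.0, 5.0)))
print(classify((70.0, 0.0)))
```

The key point is that the set of classes is fixed in advance; the model only decides which one a new object belongs to.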

Estimation

Deals with continuously valued outcomes.
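A minimal sketch of estimation, assuming a straight-line least-squares fit as the estimator (an illustrative choice, not one prescribed by these notes). The data, years as a customer against annual spend, is made up.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: years as a customer -> annual spend.
years = [1, 2, 3, 4, 5]
spend = [120.0, 140.0, 160.0, 180.0, 200.0]

slope, intercept = fit_line(years, spend)

# The output is a continuous value, not a class label.
estimate = slope * 6 + intercept
print(estimate)
```

Unlike classification, the result here is a number on a continuous scale, so records can be ranked by their estimated value.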

Prediction

The accuracy of a prediction cannot be checked right away. We can only wait and see.

Affinity Grouping

Determine which things go together (cross-selling opportunities).
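A minimal sketch of affinity grouping, assuming the simplest approach of counting how often each pair of items appears in the same basket. The baskets and item names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets.
BASKETS = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "butter", "cereal"},
]


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


counts = pair_counts(BASKETS)
top_pair, top_count = counts.most_common(1)[0]

# Support: the fraction of all baskets that contain the pair.
support = top_count / len(BASKETS)
print(top_pair, top_count, support)
```

Pairs with high support are candidates for cross-selling, although a real analysis would also check whether the pair occurs more often than chance would suggest.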

Clustering

Segment a diverse group into more similar subgroups.
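A minimal sketch of clustering, assuming k-means on one-dimensional values (an illustrative algorithm choice). The values and the starting centers are hypothetical. Unlike classification, no classes are defined in advance.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group values around the nearest center, then move each center to its group's mean."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for value in values:
            # Index of the center closest to this value.
            idx = min(range(len(centers)), key=lambda i: abs(value - centers[i]))
            groups[idx].append(value)
        # Keep the old center if a group ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Hypothetical measurements with two visible concentrations.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, groups = kmeans_1d(values, centers=[1.0, 12.0])
print(centers)
print(groups)
```

The subgroups come out of the data itself; it is up to the analyst to decide what, if anything, each one means.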

Description and Visualization

data visualization

The Business Context for Data Mining

  1. large quantities of data
  2. worth learning

Research Tool

Process Improvement

Marketing

Customer Relationship Management

The Technical Context for Data Mining

  1. Algorithms
  2. Data
  3. Modeling practices

Machine Learning

Neural Networks
Decision Trees

Statistics

Decision Support

Data Warehouse

OLAP

Decision Support Fusion

Computer Technology

The Societal Context for Data Mining


Chapter 2 Why Master the Art?

Four Approaches to Data Mining

Purchasing Scores

Purchasing Software

Purchasing Models

Neural net models for predicting credit fraud, e.g. the product Falcon. Concern: false positives, where innocent people are flagged.
vertical application
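A minimal sketch of how the false-positive concern can be measured, assuming that each transaction has a known true outcome and a flag produced by the model. The two lists below are made-up data, not output from Falcon or any real product.

```python
def false_positive_rate(actual_fraud, flagged):
    """Fraction of legitimate transactions that were wrongly flagged as fraud."""
    false_positives = sum(
        1 for actual, flag in zip(actual_fraud, flagged) if flag and not actual
    )
    legitimate = sum(1 for actual in actual_fraud if not actual)
    return false_positives / legitimate


# Hypothetical outcomes for eight transactions.
actual = [False, False, False, False, True, False, True, False]
flagged = [False, True, False, False, True, False, False, True]

rate = false_positive_rate(actual, flagged)
print(rate)
```

Because fraud is rare, even a small false-positive rate can mean that most flagged customers are innocent, which is why this number matters when buying a model.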

Purchasing Model-Building Software

Quadstone Decisionhouse

What tools can and cannot automate.

The assumptions built into a tool are important.

Hiring Outside Experts

  1. one-time vs. ongoing
  2. source of data
  3. how the results will be employed
  4. availability and skill level

Lessons Learned

  1. understand the business problem
  2. select relevant data
  3. transform the data
  4. interpret the results