Rule induction is an important component of
data mining, and this course covers two main styles of generating rules.
One style of machine learning is association
learning. In association learning, the learning method searches for any
association between features; there is no specific target variable. An
example is the recommendation engines used by many online stores: "If
you bought X, then you may also like Y." We will look at the industry-standard
method: APRIORI.
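As a rough illustration of the "if you bought X, then you may also like Y" idea, the sketch below computes the two standard measures of an association rule, support and confidence, on a toy set of shopping baskets. The baskets, item names, and function names are all hypothetical and chosen only for illustration; this is not the APRIORI algorithm itself, just the quantities it sets thresholds on.

```python
# Hypothetical toy baskets; the item names are invented for illustration.
baskets = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Share of transactions containing `antecedent` that also contain `consequent`.

    Assumes the antecedent occurs at least once; otherwise this divides by zero.
    """
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)


# Rule under study: "if you bought bread, then you may also like milk".
rule_support = support(frozenset({"bread", "milk"}), baskets)
rule_confidence = confidence(frozenset({"bread"}), frozenset({"milk"}), baskets)

print(f"support = {rule_support:.2f}, confidence = {rule_confidence:.2f}")
```

Here bread and milk appear together in 2 of the 4 baskets, and bread appears in 3, so the rule has support 0.50 and confidence of about 0.67. A method such as APRIORI keeps only the rules whose support and confidence clear user-specified minimums.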
A second style of machine learning is
classification learning. In classification learning, a learning scheme takes a
set of classified examples from which it is expected to learn a way of
classifying unseen examples. This is a form of supervised learning, in which
there is a specific target variable. We will look at two decision tree methods:
C4.5 and CHAID. We will also look at rule learners based on covering
algorithms, such as PRISM and INDUCT.
Rule induction methods have several
strengths: they produce interpretable results, they are flexible
and make no strong assumptions about model form, and they perform well in practice.
Learn these methods in Anthony Babinec’s online
course “Decision Tree and Rule Base Segmentation” at Statistics.com. For
more details, please visit http://www.statistics.com/decisiontrees/.
Who Should Take This Course:
Analysts and researchers who need to know more
about automated machine learning methods for generating association and
decision rules: data miners, consultants, ecommerce analysts, market researchers,
direct marketers, diagnosticians.
Course Program:
Course outline: The course is structured as follows:
SESSION 1: Introduction
- Overview of rule induction methods
- APRIORI
- Find all associations with at least a specified support, confidence, and
number of elements
SESSION 2: Classification via CHAID
- CHAID
- Chi-square Automatic Interaction Detection - an exploratory method for
large numbers of categorical variables
- Response-based segmentation
- Finding profitable segments
- Understanding the quality of the CHAID solution
SESSION 3: Classification via C4.5 Decision Trees
- C4.5
- A public-domain machine learning method
- Decision trees versus rule sets
- Limitations of tree-based methods
- Successor methods to C4.5
SESSION 4: Rule Construction via Covering Algorithms
- Using covering algorithms to construct rules
- The PRISM rule learner
- INDUCT
- A modification of PRISM
Anthony Babinec is President of AB Analytics. For
over two decades, Tony Babinec has specialized in the application of
statistical and data mining methods to the solution of business problems.
Before forming AB Analytics, Babinec was Director of Advanced Products
Marketing at SPSS; he worked on the marketing of Clementine and introduced
CHAID, neural nets and other advanced technologies to SPSS users. He is on the
Board of Directors of the Chicago Chapter of the American Statistical
Association, where he has held various offices including President.
This course takes place online at
the Institute over 4 weeks. During each course week, you participate at times of
your own choosing; there are no set times when you must be online. Course
participants are given access to a private discussion board. In class
discussions led by the instructor, Anthony Babinec,
you can post questions, seek clarification, and interact with your fellow
students and the instructor.
The course typically requires 15 hours per
week. At the beginning of each week, you receive the relevant material, in
addition to answers to exercises from the previous session. During the week,
you are expected to go over the course materials, work through exercises, and
submit answers. Discussion among participants is encouraged. The instructor
will provide answers and comments, and at the end of the week, you will receive
individual feedback on your homework answers.
For Indian participants, Statistics.com accepts registrations for its courses
at special prices in Indian Rupees through its partner, the Center for
eLearning and Training (C-eLT), Pune.
For more details, call: 020 6600 9116