Monday 31 December 2012

Decision Tree and Rule Base Segmentation


Rule induction is an important component of data mining, and this course covers two main styles of generating rules.

One style of machine learning is association learning. In association learning, the learning method searches for any association between features. That is, there is no specific target variable. An example is the recommendation systems used in many online shopping systems - If you bought X, then you may also like Y. We will look at the industry standard method: APRIORI.

A second style of machine learning is classification learning. In classification learning, a learning scheme takes a set of classified examples and is expected to learn from them a way of classifying unseen examples. This is a form of supervised learning, in which there is a specific target variable. We will look at two decision tree methods: C4.5 and CHAID. We will also look at rule learners built on covering algorithms, such as PRISM and INDUCT.

Rule induction methods have several strengths: they produce interpretable results, they are flexible and make no strong assumptions about model form, and they perform well in practice.

Learn this in Anthony Babinec’s online course “Decision Tree and Rule Base Segmentation” at Statistics.com. For more details, please visit http://www.statistics.com/decisiontrees/.

Who Should Take This Course:
Analysts and researchers who need to know more about automated machine learning methods for generating association and decision rules: data miners, consultants, ecommerce analysts, market researchers, direct marketers, diagnosticians.

Course Program:

Course outline: The course is structured as follows
SESSION 1: Introduction
  • Overview of rule induction methods
  • APRIORI - Find all associations with at least a specified support, confidence, and number of elements
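
The two thresholds named above, support and confidence, can be illustrated with a small sketch. This is not the APRIORI search itself, only the two measures it filters on; the basket data and the function names are made up for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    return both / ante if ante > 0 else 0.0


# Hypothetical shopping baskets (illustrative data only).
baskets = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

print(support(baskets, {"bread", "milk"}))       # share of baskets with both items
print(confidence(baskets, {"bread"}, {"milk"}))  # "if you bought bread, then milk"
```

A rule such as "bread, then milk" would be reported only if both numbers clear the minimums the analyst specifies.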

SESSION 2: Classification via CHAID
  • CHAID - Chi-square Automatic Interaction Detection - an exploratory method for large numbers of categorical variables
  • Response-based segmentation
  • Finding profitable segments
  • Understanding the quality of the CHAID solution
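
As a rough illustration of the chi-square test that gives CHAID its name, the sketch below computes the Pearson chi-square statistic for a table of segment by response counts. It covers only the statistic, not CHAID's category merging or split selection, and the counts are hypothetical. It assumes every row and column total is nonzero.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if segment and response were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts: rows are two candidate segments,
# columns are responders and non-responders.
counts = [[30, 70], [60, 40]]
print(chi_square_statistic(counts))
```

A larger statistic suggests that response rates differ more strongly between the candidate segments.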

SESSION 3: Classification via C4.5 Decision Trees
  • C4.5 - A public-domain machine learning method
  • Decision trees versus rule sets
  • Limitations of tree-based methods
  • Successor methods to C4.5

SESSION 4: Rule Construction via Covering Algorithms
  • Using covering algorithms to construct rules
  • The PRISM rule learner
  • INDUCT - A modification of PRISM

Anthony Babinec is President of AB Analytics. For over two decades, Tony Babinec has specialized in the application of statistical and data mining methods to the solution of business problems. Before forming AB Analytics, Babinec was Director of Advanced Products Marketing at SPSS; he worked on the marketing of Clementine and introduced CHAID, neural nets and other advanced technologies to SPSS users. He is on the Board of Directors of the Chicago Chapter of the American Statistical Association, where he has held various offices including President.

This course takes place over the internet at the Institute for 4 weeks. During each course week, you participate at times of your own choosing - there are no set times when you must be online. Course participants will be given access to a private discussion board so that they will be able to ask questions and exchange comments with the instructor, Anthony Babinec. In class discussions led by the instructor, you can post questions, seek clarification, and interact with your fellow students and the instructor.

The course typically requires 15 hours per week. At the beginning of each week, you receive the relevant material, in addition to answers to exercises from the previous session. During the week, you are expected to go over the course materials, work through exercises, and submit answers. Discussion among participants is encouraged. The instructor will provide answers and comments, and at the end of the week, you will receive individual feedback on your homework answers.

For Indian participants, statistics.com accepts registration for its courses at special prices in Indian Rupees through its partner, the Center for eLearning and Training (C-eLT), Pune.

For India registration and pricing, please visit www.india.statistics.com.

For more details, call: 020 6600 9116

Thursday 27 December 2012

Advanced Structural Equation Modeling


The American Psychological Association is debating whether to drop narcissism as a personality disorder, largely because it is hard to measure. How do social scientists actually measure characteristics of personality? Part of the answer is to measure more concrete attributes (e.g. via responses on a survey) and to treat, say, narcissism as a derived "latent variable." Causal models of relationships involving latent variables are the domain of structural equation modeling (SEM). Randall Schumacker, author of a number of books in this area, will present his online course "Advanced Structural Equation Modeling" at statistics.com. For more details, please visit http://www.statistics.com/advancedsem.

Who can take this course:
Market researchers, educational researchers, sociologists and psychologists, political scientists, economists, and survey researchers.

Course Program:

Course outline: The course is structured as follows
SESSION 1: Multiple Indicators and Causes
  • Multiple Indicator and Multiple Causes (MIMIC) model
  • Multiple-Group model

SESSION 2: Multilevel Models
  • Multilevel (HLM) model
  • Mixture model
  • Structured Means model

SESSION 3: Multitrait, Multimethod, Interactions
  • Multitrait-Multimethod model
  • Second Order Factor model
  • Interaction models

SESSION 4: Latent Variable, Dynamic Factor
  • Latent Variable Growth Curve model
  • Dynamic Factor model
  • Power and Sample Size
  • Monte Carlo Methods

Dr. Schumacker is a professor at the University of Alabama and, in addition to "Advanced Structural Equation Modeling," he has co-authored "A Beginner's Guide to Structural Equation Modeling" (with Richard Lomax) and is the co-editor (with George Marcoulides) of "Advanced Structural Equation Modeling: Issues and Techniques" and "Interaction and Nonlinear Effects in Structural Equation Modeling." Dr. Schumacker was the founder, editor (1994-1998), and is the current emeritus editor of "Structural Equation Modeling: A Multidisciplinary Journal." He also founded the Structural Equation Modeling Special Interest Group at the American Educational Research Association.

This course takes place over the internet at the Institute for 4 weeks. During each course week, you participate at times of your own choosing - there are no set times when you must be online. Course participants will be given access to a private discussion board so that they will be able to ask questions and exchange comments with the instructor, Dr. Randall Schumacker. In class discussions led by the instructor, you can post questions, seek clarification, and interact with your fellow students and the instructor.

The course typically requires 15 hours per week. At the beginning of each week, you receive the relevant material, in addition to answers to exercises from the previous session. During the week, you are expected to go over the course materials, work through exercises, and submit answers. Discussion among participants is encouraged. The instructor will provide answers and comments, and at the end of the week, you will receive individual feedback on your homework answers.

For Indian participants, statistics.com accepts registration for its courses at special prices in Indian Rupees through its partner, the Center for eLearning and Training (C-eLT), Pune.

For India registration and pricing, please visit www.india.statistics.com.

For more details, call: 020 6600 9116

Thursday 29 November 2012

Statistics 2 – Inference and Association


The online course “Statistics 2 – Inference and Association” provides an easy introduction to inference and association through a series of practical applications, based on the resampling/simulation approach. Once you have completed this course you will be able to test hypotheses and compute confidence intervals regarding proportions or means, compute correlations, and fit simple linear regressions. Topics covered also include chi-square goodness-of-fit and paired comparisons. For more details, please visit http://www.statistics.com/statistics-2/.

Who Should Take This Course?
Anyone who encounters statistics in their work, or will need introductory statistics for later study. The only mathematics you need is arithmetic.

Course Program:

Course outline: The course is structured as follows
SESSION 1:  Confidence Intervals for Proportions and Means
  • CI Proportion
  • CI Mean
  • CI Difference in Means
  • CI Difference in Proportions
  • Hypothesis Tests vs. Confidence Intervals
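
To give a flavor of the resampling approach applied to a confidence interval for a mean, here is a minimal percentile-bootstrap sketch. It is one common resampling recipe and not necessarily the exact procedure taught in the course; the data values, the function name, and the default settings are all illustrative assumptions.

```python
import random
import statistics


def bootstrap_ci_mean(data, n_resamples=5000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `data`."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(data)
    # Resample with replacement, record the mean of each resample, then sort.
    means = sorted(
        statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    alpha = (1 - level) / 2
    lower = means[int(alpha * n_resamples)]
    upper = means[int((1 - alpha) * n_resamples) - 1]
    return lower, upper


# Hypothetical measurements (illustrative only).
sample = [4.1, 5.0, 3.8, 4.6, 5.2, 4.9, 4.4, 4.7]
print(bootstrap_ci_mean(sample))
```

The interval is read directly from the spread of the resampled means, so no formula beyond arithmetic is needed.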

SESSION 2:  Tests for Two Means, Proportions; Paired Comparisons
  • Test 2 Means
  • Test 2 Proportions
  • Paired Comparisons
  • 1-way and 2-way hypotheses

SESSION 3:  Chi-Square, Directional Hypotheses
  • More than 2 Samples
  • Chi Square
  • Goodness of Fit
  • Null and Alternative Hypotheses
  • 1-Way and 2-Way Hypotheses
  • Correlation

SESSION 4:  Simple Linear Regression
  • Simple Regression
  • Regression Inference

Mrs. Meena Badade has over 23 years of teaching experience at various levels of education and at different institutions nationally and internationally. She has taught various courses in Statistics. She has also published a number of research papers in various journals.

In addition to academic practice, she has considerable corporate experience at Metric Consultancy. She has worked as statistical consultant and data processing manager for international clients. She has applied various statistical techniques for statistical projects for industry.

This course takes place over the internet at the Institute for 4 weeks. During each course week, you participate at times of your own choosing - there are no set times when you must be online. Course participants will be given access to a private discussion board so that they will be able to ask questions and exchange comments with the instructor, Mrs. Meena Badade. In class discussions led by the instructor, you can post questions, seek clarification, and interact with your fellow students and the instructor.

The course typically requires 15 hours per week. At the beginning of each week, you receive the relevant material, in addition to answers to exercises from the previous session. During the week, you are expected to go over the course materials, work through exercises, and submit answers. Discussion among participants is encouraged. The instructor will provide answers and comments, and at the end of the week, you will receive individual feedback on your homework answers.

For Indian participants, statistics.com accepts registration for its courses at special prices in Indian Rupees through its partner, the Center for eLearning and Training (C-eLT), Pune.

For India registration and pricing, please visit www.india.statistics.com.

For more details, call: 020 6600 9116

Wednesday 28 November 2012

Spatial Analysis Techniques in R


Identifying transit corridors for elephants, analyzing the spread of cancer in time and space, modeling commercial interactions among firms - these are just a few of the recent applications I have seen of spatial statistics. Learn the basics in David Unwin’s online course “Spatial Analysis Techniques in R” at statistics.com. For more details, please visit http://www.statistics.com/Spatial-analysis-R/.

“Spatial Analysis Techniques in R” will teach users how to implement spatial statistical analysis procedures using R software. Topics covered include point pattern analysis, identifying clusters, measures of spatial association, geographically weighted regression, and surface processing. The course includes a concise introduction to R, but some prior familiarity with R will minimize time spent learning it, and allow you to focus more on the spatial statistics techniques.

Who Should Take This Course?
Analysts and researchers who are familiar with R and want to learn how to use it for analysis of spatial data.

Course Program:

Course outline: The course is structured as follows

SESSION 1: Using R with Spatial Data

SESSION 2: Practical Point Pattern Analysis: Dealing with Inhomogeneity and Locating Clusters

SESSION 3: Looking Closer at Areas: The Importance of Local Measures of Spatial Association and Geographically Weighted Regression

SESSION 4: Surface Processing

Dr. David Unwin is Emeritus Chair of Geography at Birkbeck College and Visiting Professor in the Department of Geomatic Engineering at University College, both in the University of London. He has authored over a hundred academic papers in the field, together with a series of texts, of which the most recent are his “Geographic Information Analysis, 2nd edition” (with D. O'Sullivan, 2010) and a series of edited collections at the interface between geography and computer science: “Visualization in GIS” (Hearnshaw and Unwin, 1994), “Spatial Analytical Perspectives on GIS” (Fischer, Scholten and Unwin, 1996), “Virtual Reality in Geography” (Fisher and Unwin, 2002) and, most recently, on representation issues, “Re-presenting GIS” (Fisher and Unwin, 2005). Participants can ask questions and exchange comments directly with Dr. Unwin via a private discussion board during the course.

This course takes place over the internet at the Institute for 4 weeks. During each course week, you participate at times of your own choosing - there are no set times when you must be online. Course participants will be given access to a private discussion board so that they will be able to ask questions and exchange comments with the instructor, Dr. David Unwin. In class discussions led by the instructor, you can post questions, seek clarification, and interact with your fellow students and the instructor.

The course typically requires 15 hours per week. At the beginning of each week, you receive the relevant material, in addition to answers to exercises from the previous session. During the week, you are expected to go over the course materials, work through exercises, and submit answers. Discussion among participants is encouraged. The instructor will provide answers and comments, and at the end of the week, you will receive individual feedback on your homework answers.

For Indian participants, statistics.com accepts registration for its courses at special prices in Indian Rupees through its partner, the Center for eLearning and Training (C-eLT), Pune.

For India registration and pricing, please visit www.india.statistics.com.

For more details, call: 020 6600 9116

Friday 23 November 2012

Introduction to Smoothing and P-spline Techniques using R


Smoothing helps you maintain a view of the data forest, while not losing sight of the trees. Or vice-versa, if excessive globalism is your weakness. Our smoothing course with Brian Marx uses R at Statistics.com. More on Introduction to Smoothing and P-Splines Using R.

Real-life data often are not well-described by a single simple function. Splines combine multiple functions differentially in a smooth fashion over different ranges. P-splines are widely applicable, effective, and popular: over 500 citations for the instructors' Statistical Science article that introduced P-splines. You will be introduced to P-splines via B-splines (basis splines), and learn how to balance the competing demands of fidelity to the data and smoothness, and how to optimize the smoothing.
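
The trade-off between fidelity to the data and smoothness can be sketched with a difference penalty alone. The example below is a deliberate simplification: it skips the B-spline basis and penalizes second differences of the fitted values directly (a Whittaker-style smoother), so it is not a full P-spline fit and not the course's R code. The function names, the data, and the value of `lam` are illustrative assumptions.

```python
def second_differences(n):
    """Rows of the (n - 2) x n second-order difference matrix D."""
    rows = []
    for i in range(n - 2):
        row = [0.0] * n
        row[i], row[i + 1], row[i + 2] = 1.0, -2.0, 1.0
        rows.append(row)
    return rows


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def penalized_smooth(y, lam):
    """Minimize sum (y - z)^2 + lam * sum (second differences of z)^2."""
    n = len(y)
    d = second_differences(n)
    # Normal equations: (I + lam * D^T D) z = y
    a = [
        [(1.0 if i == j else 0.0) + lam * sum(row[i] * row[j] for row in d)
         for j in range(n)]
        for i in range(n)
    ]
    return solve(a, y)


def roughness(z):
    """Sum of squared second differences, the quantity being penalized."""
    return sum((z[i] - 2 * z[i + 1] + z[i + 2]) ** 2 for i in range(len(z) - 2))


noisy = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]  # hypothetical zig-zag data
smooth = penalized_smooth(noisy, lam=10.0)
print(roughness(noisy), roughness(smooth))
```

Setting `lam` to 0 reproduces the data exactly, while larger values pull the fit toward a straight line; choosing a value in between is the optimization problem the course addresses.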

Who Should Take This Course?
Medical and social science researchers, data miners, environmental analysts;  any researcher who must develop statistical models with "messy" data.

Course Program:

Course outline: The course is structured as follows
SESSION 1:  Smoothing via Regression - Local vs Global Bases
  • Global bases can be ineffective
  • Local bases are attractive
  • B-splines
  • Difference penalties

SESSION 2: Introducing P-splines
  • Dealing with non-normal data
  • Moving from GLM to P-spline
  • Density estimation
  • Variance smoothing

SESSION 3: Optimizing the Smoothing
  • Fidelity to the data vs smooth curve
  • Cross-validation, AIC
  • Error bands

SESSION 4: Multidimensional Smoothing
  • Generalized Additive Models
  • Varying coefficient models
  • Tensor products

The instructors:

Brian Marx is Professor of Statistics at Louisiana State University, Chair of the Statistical Modeling Society, and the Coordinating Editor of "Statistical Modeling: An International Journal."

Paul Eilers is Professor of Genetical Statistics at the Erasmus University Medical Center (Netherlands). His research interests include high throughput genomic data analysis, chemometrics, smoothing, and filtering and smoothing of time series and signals from chemical instruments.


This course takes place over the internet at the Institute for 4 weeks. During each course week, you participate at times of your own choosing - there are no set times when you must be online. Course participants will be given access to a private discussion board so that they will be able to ask questions and exchange comments with the instructors, Dr. Brian Marx and Dr. Paul Eilers. In class discussions led by the instructors, you can post questions, seek clarification, and interact with your fellow students and the instructors.

The course typically requires 15 hours per week. At the beginning of each week, you receive the relevant material, in addition to answers to exercises from the previous session. During the week, you are expected to go over the course materials, work through exercises, and submit answers. Discussion among participants is encouraged. The instructor will provide answers and comments, and at the end of the week, you will receive individual feedback on your homework answers.

For Indian participants, statistics.com accepts registration for its courses at special prices in Indian Rupees through its partner, the Center for eLearning and Training (C-eLT), Pune.

For India registration and pricing, please visit www.india.statistics.com.

For more details, call: 020 6600 9116

Tuesday 20 November 2012

Risk Simulation and Queuing

If you want to learn all about children's fantasy literature, you'd jump at the chance to attend a private class with J.K. Rowling. 

In the world of operations research, Cliff Ragsdale stands at the top with his optimization and simulation textbook, "Spreadsheet Modeling and Decision Analysis," #1 in the world. Cliff is Bank of America Professor of Business Information Technology at Virginia Tech, and you can meet him in his online course "Risk Simulation and Queuing" at Statistics.com. For more details, please visit http://www.statistics.com/queueing.

This course begins with the specification of risk analysis simulations and interpretation of their output. It goes on to cover the science and structure of optimal decision-making: payoff matrices, types of expected value, decision trees.  It also covers queuing theory, and closes with a discussion of multi-stage decisions.

The software used is Risk Solver Platform, from Frontline Systems.  If you don't already have a licensed copy of Risk Solver Platform, you will be provided with a license for the course so you can use all the features and capabilities.

Who Should Take This Course?
Business analysts with responsibility for specifying, creating, deploying or interpreting quantitative decision models.  Users of risk analysis or queuing software who need to attain a more solid grounding in the subject.

Course Program:

Course outline: The course is structured as follows
SESSION 1: Risk Analysis
  • Simulation
  • Interpreting output
  • Implementing & optimizing the model

SESSION 2: Advanced Simulation
  • Poisson distribution, arrival rate
  • Service rate
  • Operating characteristics
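
As one concrete case of how an arrival rate and a service rate determine operating characteristics, the sketch below uses the standard formulas for a single-server queue with Poisson arrivals and exponential service times (M/M/1). Choosing that particular model is an assumption made for illustration; the outline does not say which queue models are covered, and the rates are hypothetical.

```python
def mm1_characteristics(arrival_rate, service_rate):
    """Steady-state operating characteristics of an M/M/1 queue."""
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable unless arrival_rate < service_rate")
    rho = arrival_rate / service_rate  # server utilization
    return {
        "utilization": rho,
        "avg_in_system": rho / (1 - rho),
        "avg_in_queue": rho ** 2 / (1 - rho),
        "avg_time_in_system": 1 / (service_rate - arrival_rate),
        "avg_wait_in_queue": rho / (service_rate - arrival_rate),
    }


# Hypothetical rates: 2 arrivals and 3 service completions per hour.
for name, value in mm1_characteristics(2.0, 3.0).items():
    print(f"{name}: {value:.3f}")
```

The closer the arrival rate gets to the service rate, the faster the queue length and waiting times grow.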

SESSION 3: Queuing
  • Mastering complexity
  • Payoff matrix
  • Decision rules
  • Expected monetary value
  • Expected regret
  • Expected value of information
  • Decision trees
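
The payoff matrix, expected monetary value, and expected regret listed above fit together in a few lines of arithmetic. The sketch below uses a hypothetical two-decision, two-state payoff matrix; the numbers, names, and probabilities are invented for illustration and do not come from the course or from Risk Solver Platform.

```python
def expected_monetary_value(payoffs, probs):
    """EMV of each decision: probability-weighted average payoff."""
    return {
        decision: sum(p * v for p, v in zip(probs, row))
        for decision, row in payoffs.items()
    }


def expected_regret(payoffs, probs):
    """Expected shortfall of each decision versus the best payoff in each state."""
    best_per_state = [max(col) for col in zip(*payoffs.values())]
    return {
        decision: sum(p * (best - v)
                      for p, best, v in zip(probs, best_per_state, row))
        for decision, row in payoffs.items()
    }


# Hypothetical payoffs by state of nature: [strong demand, weak demand].
payoffs = {"small plant": [50, 20], "large plant": [100, -40]}
probs = [0.6, 0.4]

emv = expected_monetary_value(payoffs, probs)
regret = expected_regret(payoffs, probs)
print(emv)
print(regret)
# The smallest expected regret equals the expected value of perfect information.
print(min(regret.values()))
```

The decision with the highest EMV is also the one with the lowest expected regret, so the two decision rules always agree.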

SESSION 4: Multistage Decision Problems
  • Risk profile
  • Sensitivity analysis
  • Tornado charts
  • Using sample information
  • Conditional probabilities
  • Utility functions

Cliff T. Ragsdale is Professor of Business Information Technology. His primary research interests involve applications of quantitative modeling techniques to managerial decision making problems using microcomputers. Dr. Ragsdale has served as a consultant for a variety of organizations including General Mills, The World Bank, Frontline Systems, and Dominion Energy. His research has been published in Decision Sciences, Naval Research Logistics, Operations Research Letters, Computers and Operations Research, OMEGA, Personal Financial Planning, Financial Services Review, Decision Support Systems, and a number of other scholarly journals. He is a Fellow of Decision Sciences Institute and a member of INFORMS. He has also served as the faculty advisor for the Virginia Tech student chapter of APICS and on the Board of Directors for the Southwest Chapter of APICS.

This course takes place over the internet at the Institute for 4 weeks. During each course week, you participate at times of your own choosing - there are no set times when you must be online. Course participants will be given access to a private discussion board so that they will be able to ask questions and exchange comments with the instructor, Dr. Cliff Ragsdale. In class discussions led by the instructor, you can post questions, seek clarification, and interact with your fellow students and the instructor.

The course typically requires 15 hours per week. At the beginning of each week, you receive the relevant material, in addition to answers to exercises from the previous session. During the week, you are expected to go over the course materials, work through exercises, and submit answers. Discussion among participants is encouraged. The instructor will provide answers and comments, and at the end of the week, you will receive individual feedback on your homework answers.

For Indian participants, statistics.com accepts registration for its courses at special prices in Indian Rupees through its partner, the Center for eLearning and Training (C-eLT), Pune.

For India registration and pricing, please visit www.india.statistics.com.

For more details, call: 020 6600 9116