Data mining is a fast-growing area in statistics, but if you lack
institutional access to standard data mining software from SAS and IBM/SPSS,
your options are limited. If you are interested in data mining, Dr. Luis
Torgo's online course "Data Mining in R - Learning with Case Studies"
introduces you to both R and data mining.
The course teaches you how to do data mining in the
increasingly dominant open-source R software. It follows a "learn
by doing it" strategy, where data mining topics are introduced as needed
when addressing a series of real-world data mining case studies. The course
is offered online at statistics.com. For more details, please visit
http://www.statistics.com/data-mining-r.
Aim of Course:
The main goal of this course is to teach users
how to perform data mining tasks using R. The course follows a "learn by doing
it" strategy, where data mining topics are introduced as needed when addressing
a series of real-world data mining case studies.
Who can take this course:
- R users who want to learn how to apply R to data mining
- Data mining analysts in search of new tools
- Students in statistics.com's PASS program in Data Mining seeking an
affordable data mining tool
Note that working in R will be more involved than using a specially designed
interface for data mining, such as those found in major commercial data mining
programs.
Course Program:
The course is structured as follows:
SESSION 1: Predicting Algae Blooms (Case Study 1)
- Descriptive statistics
- Data visualization
- Strategies to handle unknown variable values
- Regression tasks
- Evaluation metrics for regression tasks
SESSION 2: Predicting Algae Blooms (Continuation of Case Study 1)
- Multiple linear regression
- Regression trees
- Model selection/comparison through k-fold cross-validation
SESSION 3: Detecting Fraudulent Transactions (Case Study 2)
- Clustering methods
- Classification methods
- Imbalanced class distributions and methods for handling this type of problem
- Naive Bayes classifiers
- Precision/recall and precision/recall curves
SESSION 4: Classifying Microarray Samples (Case Study 3)
- Feature selection methods for problems with a very large number of predictors
- Random forests
- k-Nearest neighbors
The instructor, Dr. Luis Torgo, is an Associate Professor in the Department of Computer Science of the Faculty of Sciences of the University of Porto and a researcher at the Laboratory of Artificial Intelligence and Data Analysis (LIAAD), which belongs to INESC Porto LA. He is the author of Data Mining With R, as well as a number of scholarly articles and other publications. He teaches R at different levels and has given courses on the use of R for data mining in several countries. Dr. Torgo will be available to course participants throughout the course period, taking comments and questions on a private discussion forum.
Schedule:
January 10, 2014 to February 07, 2014
June 27, 2014 to July 25, 2014
You will be able to ask questions and exchange
comments with the instructor via a private discussion board throughout the
course. The course takes place online at statistics.com in a series
of 4 weekly lessons and assignments, and requires about 15 hours/week.
Participate at your own convenience; there are no set times when you must be
online. You have the flexibility to work a bit every day, if that is your
preference, or to concentrate your work in just a couple of days.
For Indian participants, statistics.com accepts
registration for its courses at special prices in Indian Rupees through its
partner, the Center for eLearning and Training (C-eLT), Pune.
Call:
020 6680 0300