Tuesday, 7 January 2014

Data Mining in R

Data mining is a fast-growing area in statistics, but if you lack institutional access to standard data mining software from SAS and IBM/SPSS, your options are limited. If you are interested in data mining, Dr. Luis Torgo's online course "Data Mining in R: Learning with Case Studies" introduces you to both R and data mining.

This course teaches you how to do data mining in the increasingly dominant open-source R software. It follows a "learn by doing" strategy, in which data mining topics are introduced as needed while working through a series of real-world case studies. Join Dr. Luis Torgo in his online course "Data Mining in R: Learning with Case Studies" at statistics.com. For more details, please visit http://www.statistics.com/data-mining-r.

Aim of Course:
The main goal of this course is to teach users how to perform data mining tasks using R. The course follows a learn-by-doing strategy, in which data mining topics are introduced as needed while addressing a series of real-world data mining case studies.

Who can take this course:
This course is intended for R users who want to learn how to apply R to data mining, data mining analysts in search of new tools, and students in statistics.com's PASS program in Data Mining who are seeking an affordable data mining tool. Note that working in R will be more involved than using a specially designed interface for data mining, such as those found in major commercial data mining programs.

Course Program:
The course is structured as follows:
SESSION 1: Predicting Algae Blooms (Case Study 1)
  • Descriptive statistics
  • Data visualization
  • Strategies to handle unknown variable values
  • Regression tasks
  • Evaluation metrics for regression tasks

SESSION 2: Predicting Algae Blooms (Continuation of Case Study 1)
  • Multiple linear regression
  • Regression trees
  • Model selection/comparison through k-fold cross-validation

SESSION 3: Detecting Fraudulent Transactions (Case Study 2)
  • Clustering methods
  • Classification methods
  • Imbalanced class distributions and methods for handling them
  • Naive Bayes classifiers
  • Precision/recall and precision/recall curves

SESSION 4: Classifying Microarray Samples (Case Study 3)
  • Feature selection methods for problems with a very large number of predictors
  • Random forests
  • k-Nearest neighbors

The instructor, Dr. Luis Torgo, is an Associate Professor in the Department of Computer Science of the Faculty of Sciences of the University of Porto and a researcher at the Laboratory of Artificial Intelligence and Data Analysis (LIAAD), which belongs to INESC Porto LA. He is the author of Data Mining with R, as well as a number of scholarly articles and other publications. He teaches R at different levels and has given courses on the use of R for data mining in several countries. Dr. Torgo will be available to course participants throughout the course, taking comments and questions on a private discussion forum.

Course dates:
January 10, 2014 to February 7, 2014
June 27, 2014 to July 25, 2014

You will be able to ask questions and exchange comments with the instructor via a private discussion board throughout the course. The course takes place online at statistics.com in a series of four weekly lessons and assignments, and requires about 15 hours per week. Participate at your own convenience; there are no set times when you must be online. You have the flexibility to work a bit every day, if that is your preference, or to concentrate your work in just a couple of days.

For Indian participants, statistics.com accepts registration for its courses at special prices in Indian Rupees through its partner, the Center for eLearning and Training (C-eLT), Pune.

For registration and pricing in India, please visit www.india.statistics.com.

Call: 020 6680 0300