Statistics and Analytics

Statistics, Analytics, Calculus, Modeling, Bayesian, Logistic Regression, Spatial Analysis, GIS, Clinical Trials, Data Mining, Microarray, R Programming, STATISTICA, Natural Language Processing, Sentiment Analysis, Text Mining, Rasch, Sample Size, Survey, Simulation, Life Science, Biostatistics, Clinical Trial, Pharmacokinetics, Bioequivalence, Epidemiology, Bootstrap, Meta Analysis, Inference, Survival Analysis, Forecasting, Bioequivalence, Linear Models, Quantitative Risk, Sampling, ANOVA

Wednesday, 9 April 2014

Biostatistics in R: Clinical Trial Applications

Would you like to learn how to use R to compare treatments, incorporate covariates into the analysis, analyse survival (time-to-event) trials, model longitudinal data, and analyse bioequivalence trials?

Would you like to learn the implementation in R of statistical procedures important for the clinical trial statistician?

Learn all this and more in Prof. Din Chen and Prof. Karl Peace’s online course Biostatistics in R: Clinical Trial Applications at Statistics.com.

For more details, please visit http://www.statistics.com/Clinical-Trials-R.

Course Program:

Course outline: The course is structured as follows

WEEK 1: Treatment Comparisons

· R fundamentals associated with clinical trials

· A simple simulated clinical trial

· Statistical models for treatment comparisons

· Incorporating covariates

WEEK 2: Survival Analysis

· Time-to-event data structure

· Statistical models for survival data

· Right-censored data analysis

· Interval-censored data analysis

WEEK 3: Analysis of Data from Longitudinal Clinical Trials

· Trial designs and data structure

· Statistical models and analysis

WEEK 4: Analysis of Bioequivalence Clinical Trials

· Data from bioequivalence clinical trials

· Bioequivalence clinical trial endpoints

· Statistical methods to analyze bioequivalence
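
To make the Week 2 topic of right-censored data a little more concrete, here is a minimal sketch of the Kaplan-Meier product-limit estimator. It is written in plain Python purely for illustration; the course itself works in R (where the survival package's survfit function is the usual tool), and the function name kaplan_meier and the toy data are assumptions of this sketch, not course material.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct observed event time.

    times  : follow-up time for each subject
    events : 1 if the event was observed, 0 if the subject was right-censored
    (Hypothetical helper for illustration only.)
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)   # every subject leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the fraction surviving time t
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Events and censored subjects at time t both drop out afterwards
        at_risk -= leaving[t]
    return curve


if __name__ == "__main__":
    # Toy data: six subjects, two of them censored (event flag 0)
    for t, s in kaplan_meier([2, 3, 3, 5, 8, 9], [1, 1, 0, 1, 0, 1]):
        print(t, round(s, 4))
```

Censored subjects never trigger a drop in the curve, but they do shrink the risk set, which is why the step sizes grow as follow-up continues.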

HOMEWORK:

Homework in this course consists of short answer questions to test concepts, guided data analysis problems using software, and guided data modeling problems using software.

In addition to assigned readings, this course also has an end of course data modeling project, example software files, and supplemental readings available online.

Instructors: Prof. Din Chen, University of Rochester Medical Center, co-author of "Clinical Trial Methodology" and "Clinical Trial Data Analysis Using R," and author or co-author of 80 refereed articles in scholarly journals.

Prof. Karl E. Peace, Jiann-Ping Hsu College of Public Health at Georgia Southern University, is a Georgia Cancer Coalition Distinguished Cancer Scholar, founding director of the Center for Biostatistics, founder of Biopharmaceutical Research Consultants, Inc. (BRCI), and Founder and Chair of the Biopharmaceutical Applied Statistics Symposium (BASS). He has contributed heavily to the medical, scientific and statistical literature by authoring or co-authoring over 150 articles and six books.

Who Should Take This Course:

Analysts and statisticians at pharmaceutical companies and other health research organizations who need or want to become involved in the design, monitoring or analysis of clinical trials and who are familiar with R software and considering its use in clinical trials.

You will be able to ask questions and exchange comments with the instructors via a private discussion board throughout the course. The course takes place online at statistics.com in a series of 4 weekly lessons and assignments, and requires about 15 hours/week. Participate at your own convenience; there are no set times when you must be online. You have the flexibility to work a bit every day, if that is your preference, or concentrate your work in just a couple of days.

We, the Center for eLearning and Training (C-eLT), Pune, partner with Statistics.com and offer these courses to Indian participants at special prices payable in INR.

For India Registration and pricing, please visit us at www.india.statistics.com.

Email: info@c-elt.com

Call: 020 6680 0300 / 322

Websites:

www.statistics.com/india

www.c-elt.com

Spatial Statistics with Geographic Information Systems (GIS)

Peter Bruce (President and Founder, Statistics.com) says, “Wandering around Memphis, TN recently, I was able to use my phone to tell me the value of a mansion I was passing by, and locate the hotel famous for letting ducks use the elevator. I could also have used it to find Graceland, but some tasks still lie well within the realm of the human brain. Location data is now the fastest growing type of data, and its effective use is the province of spatial statistics.”

Learn more about the statistical foundations of geospatial analysis in David Unwin's online course "Spatial Statistics with Geographic Information Systems (GIS)" at Statistics.com. For more details, please visit http://www.statistics.com/spatial-statistics-GIS/.

Course Program:

Course outline: The course is structured as follows

SESSION 1: Some Basics:

· Geographical data

· Statistics

· Describing spatial data using maps

SESSION 2: The Analysis of Patterns in Point Data:

· Introductory methods for detecting non-randomness in dot/pin map distributions

SESSION 3: The Analysis of Patterns in Area Data:

· Detecting and measuring spatial autocorrelation in lattice data

SESSION 4: The Analysis of Continuous Field Data:

· Creating contour-type maps using inverse distance weighting and geostatistical methods

Note that the course does not concentrate on the analysis of spatially continuous data using the methods collectively referred to as geostatistics. Session 4 gives a brief introduction to the basic concepts as used in interpolation, but nothing more.

HOMEWORK:

Homework in this course is a mixture of simple exercises and guided data analysis problems using public domain software.

In addition to assigned readings, this course also has an end of course data modeling project, and supplemental video lectures.

The instructor, Dr. David Unwin, is Emeritus Chair of Geography at Birkbeck College, and Visiting Professor in the Department of Geomatic Engineering at University College, both in the University of London. His work using and developing spatial statistics in research stretches back some 40 years, and he has authored over a hundred academic papers in the field, together with a series of texts, of which the most recent are “Geographic Information Analysis, 2nd edition” (with D. O'Sullivan, 2010) and a series of edited collections at the interface between geography and computer science: “Visualization in GIS” (Hearnshaw and Unwin, 1994), “Spatial Analytical Perspectives on GIS” (Fischer, Scholten and Unwin, 1996), “Virtual Reality in Geography” (Fisher and Unwin, 2002) and, most recently, on representation issues, “Re-presenting GIS” (Fisher and Unwin, 2005). Having developed the world's first wholly internet-delivered Master's program in GIS in 1998, David Unwin has considerable experience of teaching and tutoring online. Participants can ask questions and exchange comments directly with Dr. Unwin via a private discussion board during the course.

Aim of the course:

Spatial analysis often uses methods adapted from conventional analysis to address problems in which spatial location is the most important explanatory variable. This course is directed particularly at students who have a background in either computing or statistics but lack the necessary geospatial concepts. It will explain and give examples of the analysis that can be conducted in a geographic information system such as ArcGIS or MapInfo. The motivation is simple: it is one thing to run a GIS, but quite another to use it analytically to help answer questions such as:

- Is there an unusual cluster of crimes/cases of a disease here that we need to worry about?

- Do these data show variation across the country that I need to know about?

- What is the most probable air temperature here?

In the course we will explore methods that enable answers to be given to these, and similar, questions involving spatial variation.
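
As a rough illustration of the third question, the sketch below estimates a value at an unsampled location by inverse distance weighting, one of the interpolation methods listed under Session 4. It is plain Python rather than anything from the course, which uses GIS software; the function name idw, the default power of 2, and the two imaginary weather stations are all assumptions made for this example.

```python
import math


def idw(points, target, power=2.0):
    """Inverse distance weighted estimate at `target`.

    points : list of (x, y, value) observations
    target : (x, y) location to estimate
    power  : distance exponent; larger values favour nearer observations
    (Hypothetical helper for illustration only.)
    """
    if not points:
        raise ValueError("at least one observation is required")
    tx, ty = target
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in points:
        dist = math.hypot(x - tx, y - ty)
        if dist == 0.0:
            # The target coincides with an observation: return it unchanged
            return float(value)
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


if __name__ == "__main__":
    # Two imaginary stations reporting air temperature in degrees Celsius
    stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
    print(idw(stations, (1.0, 0.0)))   # halfway between the two stations
    print(idw(stations, (0.5, 0.0)))   # closer to the cooler station
```

Inverse distance weighting is a deterministic smoother: it gives an estimate but no measure of uncertainty, which is one reason the geostatistical methods mentioned in the outline exist.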

Who Should Take This Course?

Analysts and researchers who need to know how to use and interpret the data from Geographic Information Systems (GIS), including those in environmental analysis and management, banking, insurance, logistics, law enforcement services, defence, media, real estate, retail and more.

Monday, 7 April 2014

Webinar on exact tests for correlated data (Tuesday, April 8, 11:00 am - 12:00 US Eastern time)

This coming Tuesday, the folks at Cytel will be giving an instructional webinar on exact tests for correlated data.

Webinar: Tuesday, April 8, 11:00 am - 12:00 US Eastern time (NO Charge)

Correlated data are common in many research areas, but especially in multicentre clinical trials, genetics, epidemiology, ophthalmology, and teratology. They often arise when multiple outcomes are measured on an individual over time, or on several individuals sharing common genetic or environmental exposures.

Conventional statistical methods for analysing correlated categorical outcomes rely on large-sample distributional assumptions (e.g., approximate normality) to justify their results. These approaches work poorly for small or sparse samples. That's when you need exact methods.

A recently introduced StatXact® module provides a suite of exact tests for analysing correlated data tables. These tools provide correlated-data analogues for common exact tests, including Fisher’s test. We’ll illustrate use and results interpretation using real-world examples, including ophthalmological, developmental toxicology, family-based Alzheimer’s genetics studies, two multicentre clinical trials, and brain pathology research applications.

We’ll analyse the examples using:

• Trend tests for ordered binomials

• Wilcoxon test for ordered multinomials

• Kruskal-Wallis test (for a two-way table with one ordered and one unordered variable)

• Fisher’s exact test (for a two-way table with two unordered variables)

• Stratified 2 x 2 tables

• Exact test for clustering

• Exact trend test for multiple binomial outcomes
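
For readers who have not met exact tests before, the sketch below computes the ordinary two-sided Fisher's exact test for a single 2 x 2 table. Please note the assumptions: this is the standard independent-data version, not the correlated-data analogue implemented in StatXact, and the function name fisher_exact_two_sided and the example table are made up for illustration. The two-sided p-value here sums the probabilities of all tables that are no more likely than the observed one, which is one common convention.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2 x 2 table [[a, b], [c, d]].

    With all margins fixed, the top-left cell follows a hypergeometric
    distribution. (Hypothetical helper for illustration only.)
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)   # number of equally likely arrangements

    # Integer weight of each possible top-left count; comparing integers
    # avoids floating-point ties between equally likely tables.
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    weights = {x: comb(row1, x) * comb(row2, col1 - x)
               for x in range(lowest, highest + 1)}

    observed = weights[a]
    extreme = sum(w for w in weights.values() if w <= observed)
    return extreme / total


if __name__ == "__main__":
    # Small illustrative table: 3 of 4 successes in one group, 1 of 4 in the other
    print(fisher_exact_two_sided(3, 1, 1, 3))
```

Because the p-value comes from exact enumeration rather than a chi-square approximation, it remains valid however small or sparse the table is, which is the point the announcement makes about exact methods.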

Interested but can't attend? Email mweitz@cytel.com for the slides.

The lead presenter is Dr. Christopher Corcoran, Ph.D., Associate Dept. Head, Utah State University. Dr. Corcoran received his B.S. from USU in Statistics and Computer Science in 1995. He earned his Biostatistics and Genetic Epidemiology doctorate from Harvard in 1999, then joined the faculty of USU's Department of Mathematics and Statistics. At USU, Chris has collaborated on several large research projects, including studies of aging, dementia, cardiovascular disease, cancer, and autism.

Chris focuses on genetic causes of disease, and how genetic and environmental factor interactions alter disease risk. Considering thousands or even millions of genes simultaneously requires carefully designed statistical and computational methods. Chris has steadily developed, implemented, and documented such approaches in StatXact, LogXact, and in SAS-compatible functions.

Wednesday, 2 April 2014

Applied Predictive Analytics

A saying often attributed to Einstein goes: "In theory, theory and practice are the same. In practice, they are not." This course is about the practice of predictive analytics, and is taught in partnership with a company with years of consulting experience with telecoms (predicting churn), large retailers (predicting holiday sales), online merchants (microtargeting), and more.

This is a fully practical and applied course, so you should have some familiarity with predictive modeling before taking it. In Part 1 of this course, you will get some hands-on experience with real-world data and the issues of problem definition and data prep. In Part 2 you will take a defined dataset, develop models, assess them, and submit your best model. Your instructor will serve as your personal guide in this effort.

Mr. Mohan Singh will present his course “Applied Predictive Analytics, in partnership with CrowdANALYTIX” online at Statistics.com. For more details, please see the course page, Applied Predictive Analytics, in partnership with CrowdANALYTIX.

Aim of Course:

The goal of this course is to teach users (who have basic knowledge of R programming, predictive analytics and statistics) to apply machine learning techniques in real-world case studies. The course takes a hands-on approach and gives participants the opportunity to take part in a private educational competition hosted by CrowdANALYTIX.

Business Case Study: We will study data from the "daily deals" industry (consisting of websites like Groupon, Living Social etc. which source local deals to offer each day). The daily deals industry is emerging and highly competitive. The goal will be to predict the revenue from each offering using the given data.

Course Walkthrough: Each week, participants will be given a set of exercises and instructions to work on the raw data or processed data (details below). Users will apply their statistical/machine learning knowledge, along with their business understanding, to solve the problems and interpret the modeling results for the given business objective. The course will follow an iterative approach for problem solving, in which users are required to submit modeling responses multiple times.

Who Should Take This Course:

Business analysts, R users, SAS users and statistical analysts who want to learn to implement and apply machine learning techniques to solve predictive analytics problems in a real-world business case. Data mining analysts who want to extend their knowledge to include machine learning techniques for data modelling.

Course Program:

WEEK 1: Business Case Study, Introduction and Data Pre-processing

Course and business case study introduction
Data Cleaning, pre-processing & data visualization
Understanding business objective to be modelled based on the case study

Assignment: To read data in R and perform simple statistical measures. Perform data cleaning, data pre-processing and data manipulation as required by the case study problem.

WEEK 2: Classification Problem, Data Sampling, Feature Selection and Model Building

Introduction to machine learning

Supervised classification problem (Machine learning packages in R)

Perform feature selection and processing on given data
Model building based on classification techniques

Assignment: To perform simple exercises for machine learning packages in R. Converting raw data into features or creating new features, if needed for the modelling objective.

WEEK 3: Evaluation Metrics, Scoring Model Output, Leaderboard Prediction

Build machine learning models such as decision trees, ensemble models, etc.
Evaluation metrics such as the confusion matrix, RMSE, etc.
Leaderboard scoring for peer-to-peer feedback

Assignment: Build basic to intermediate-level machine learning models for the problem. The resulting scores will be reflected on the private contest leaderboard for all participants.
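
The two evaluation metrics named for Week 3 are simple enough to sketch by hand. The example below is plain Python for illustration only (the course works in R), and the function names and toy numbers are assumptions, not course material. A confusion matrix counts how often each actual class was given each predicted class; RMSE measures the typical size of the error in a numeric prediction such as deal revenue.

```python
import math
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count each (actual label, predicted label) pair."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return Counter(zip(actual, predicted))


def accuracy(actual, predicted):
    """Share of records whose predicted label matches the actual label."""
    matrix = confusion_matrix(actual, predicted)
    correct = sum(n for (a, p), n in matrix.items() if a == p)
    return correct / len(actual)


def rmse(actual, predicted):
    """Root mean squared error between two equal-length numeric sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


if __name__ == "__main__":
    # Classification toy example: 1 = deal sold out, 0 = it did not
    y_true = [1, 0, 1, 1, 0]
    y_pred = [1, 0, 0, 1, 1]
    print(dict(confusion_matrix(y_true, y_pred)))
    print(accuracy(y_true, y_pred))

    # Regression toy example: actual versus predicted revenue
    print(rmse([100.0, 200.0], [110.0, 190.0]))
```

Which metric a leaderboard uses depends on the target: class labels call for confusion-matrix summaries, while a revenue prediction like the one in the case study is naturally scored with an error measure such as RMSE.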

WEEK 4: Visualization of Modeling Output and Analysis of Features & Model for Business Case Study

Model building and leaderboard updates on a daily basis
Visualize the output of the model
Interpret modelling results for case study objective

Assignment: Final week assignment is based on understanding the various model outputs for the business objective.

The instructor, Mr. Mohan Singh, is a senior data scientist at CrowdANALYTIX with extensive research experience in healthcare and imaging. He has taught data mining & computer science at University College Dublin and UC, Irvine. Participants can ask questions and exchange comments with Mr. Singh via a private discussion board throughout the period.

Friday, 28 March 2014

Data Mining: Unsupervised Techniques

Data mining, the art and science of learning from data, covers a number of different procedures. This course covers key unsupervised learning techniques: association rules, principal components analysis, and clustering. (Introduction to Predictive Modeling covers techniques that are used to predict a record's class or the value of an outcome variable on the basis of a set of records with known outcomes).

Learn with Mr. Anthony Babinec in the online course "Data Mining: Unsupervised Techniques" at Statistics.com. For more details, please visit http://www.statistics.com/datamining/.

We, C-eLT, Pune, partner with Statistics.com and offer these courses to Indian participants at special prices payable in INR.

The course will include an integration of supervised and unsupervised learning techniques. This is a hands-on course - participants in the course will have access to an Excel-based comprehensive tool for data-mining, XLMiner, the use of which will be explained in the course. Participants will apply data mining algorithms to real data, and will interpret the results.

Who Should Take This Course:

Marketers seeking to specify customer segments and identify associations among products purchased, environmental scientists seeking to cluster observations, analysts who need to identify the key variables out of many, MBAs seeking to update their knowledge of quantitative techniques, managers and scientists who want to see what data mining can do, and anyone who wants a practical hands-on grounding in basic data-mining techniques.

Course Program:

Course outline: The course is structured as follows

SESSION 1: Principal Components Analysis

The goal - dimensionality reduction
The principal components
Scale variance estimation
Normalizing the data
Principal components and orthogonal least squares
Exercises

SESSION 2: Clustering

What is cluster analysis?
Hierarchical methods
Nearest neighbor (single linkage)
Farthest neighbor (complete linkage)
Group average (average linkage)
Optimization and the k-means algorithm
Similarity measures
Other distance measures
The curse of dimensionality
Exercises

SESSION 3: Association Rules

Discovering association rules in transaction databases
Support and confidence
The apriori algorithm
Shortcomings
Exercises
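
The two measures named in Session 3, support and confidence, can be computed directly from a list of transactions. The sketch below is plain Python for illustration only; the course itself uses XLMiner, and the function names and the five toy baskets are assumptions made for this example. Support is the share of transactions containing an itemset; confidence of a rule "if A then B" is the share of transactions containing A that also contain B.

```python
def count_containing(transactions, itemset):
    """Number of transactions that contain every item in `itemset`."""
    wanted = set(itemset)
    return sum(1 for basket in transactions if wanted <= set(basket))


def support(transactions, itemset):
    """Share of all transactions that contain `itemset`."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    return count_containing(transactions, itemset) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Share of transactions with `antecedent` that also hold `consequent`."""
    with_antecedent = count_containing(transactions, antecedent)
    if with_antecedent == 0:
        raise ValueError("antecedent never occurs in the transactions")
    both = count_containing(transactions, set(antecedent) | set(consequent))
    return both / with_antecedent


if __name__ == "__main__":
    # Five toy shopping baskets
    baskets = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
        {"bread", "milk"},
    ]
    print(support(baskets, {"bread", "milk"}))
    print(confidence(baskets, {"bread"}, {"milk"}))
```

The apriori algorithm listed in the outline builds on the same counts: it relies on the fact that an itemset can only meet a minimum support threshold if all of its subsets do, which lets it skip most candidate itemsets.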

SESSION 4: Integration of Supervised and Unsupervised learning

Clustering into customer segments
Profiling of customer segments
Classifying new records by segment

The final lesson is an integration of supervised and unsupervised techniques. To get the full benefit of this course, familiarity with supervised learning is needed, but those not requiring this integration can learn about clustering, association rules and principal components without having had a course in supervised learning.

The instructor, Anthony Babinec, is the president of AB Analytics and served previously as Director of Advanced Products Marketing at SPSS; he worked on the marketing of Clementine and introduced CHAID, neural nets and other advanced technologies to SPSS.
