Data Mining – Basic

Course Summary

This course is designed for engineers, quality professionals, researchers, and managers who need to understand and extract information from observational data such as key process input variables or process drivers.

Event Details

16 hours
Instructor-led class training, with opportunities to practice learned skills using prepared data, live demonstrations, and data collected in real time in class
Minitab or JMP Statistical Software


In today’s data-rich environment, vast amounts of data are routinely collected. These are termed ‘happenstance’, ‘non-experimental’, or ‘observational’ data. The role of statistics with such observational data is to extract all available information, a task often called Data Mining, and in particular to identify the Key Process Input Variables (KPIVs) for use in process improvement and process control. With a suitable sampling plan and knowledge of how to prepare data for analysis, the engineer or researcher can use statistical methods, much like a detective looking for clues, to uncover otherwise hidden information in the data and provide a sound basis for decisions.

Observational data require special techniques and care in order to extract meaningful information and reach valid conclusions. Such data are common in most process industries, and normal process data can yield valuable information without resorting to designed experiments, which may be more costly to run. This course presents basic methods for comparing a single input to a single output. It covers discrete or continuous inputs with continuous outputs, and discrete inputs with discrete outputs. The methods introduced here are building blocks for more advanced data mining techniques as well as the basis for single-factor experiments.
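The pairing of data types with methods described above can be sketched as a small lookup. This is only an illustration, not course material: the course itself uses Minitab or JMP, and the function name `suggest_test`, its arguments, and the returned labels are hypothetical. The mapping is inferred from the course outline.

```python
def suggest_test(input_type: str, output_type: str, n_groups: int = 2) -> str:
    """Suggest a test family for one input and one output.

    Hypothetical helper: the mapping follows the course outline,
    not any Minitab or JMP feature.
    """
    if input_type == "continuous" and output_type == "continuous":
        return "simple linear regression"
    if input_type == "discrete" and output_type == "discrete":
        return "chi-square test on a contingency table"
    if input_type == "discrete" and output_type == "continuous":
        # n_groups is the number of samples (levels of the input) compared.
        if n_groups == 1:
            return "one-sample t-test (non-parametric: sign or Wilcoxon)"
        if n_groups == 2:
            return "two-sample t-test (non-parametric: Mann-Whitney)"
        if n_groups >= 3:
            return "ANOVA (non-parametric: Kruskal-Wallis)"
        raise ValueError("n_groups must be at least 1")
    # A continuous input with a discrete output is outside this course's scope.
    raise ValueError("combination not covered by the basic course")


print(suggest_test("discrete", "continuous", n_groups=3))
```

A lookup like this only narrows the choice. Checking the assumptions behind each test is still what decides between the parametric and non-parametric option.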

Who Should Attend

This course is designed for engineers, quality professionals, researchers, and managers who need to understand and extract information from observational data such as key process input variables or process drivers.

Learning Objectives

Through training, participants will:

  • Understand statistical reasoning
  • Be able to plan a multi-vari study and clean datasets
  • Learn the different types of statistical tests based on data type (t-tests, ANOVA, non-parametric tests, simple linear regression, and chi-square test)
  • Learn how to avoid the pitfalls and perils of analyzing observational data
  • Improve utilization of available data to extract relevant information

Course Outline

Introduction to Data Mining

  • The Purpose of Data Mining and When it Should be Used
    • Six Sigma Roadmap Application
    • The Data Difference
    • Step-by-Step Guide
    • Analysis Tools
    • What Can Be Learned
  • Pitfalls of Data Mining
    • Observational Data Conclusions
    • Specification Influence
    • Confounding
    • Interactions
    • Large Amounts of Data
  • Planning the Study
    • Questions vs. Data Sources
    • Sampling
    • Data Organization
  • Data Mining Strategy
  • Data Cleaning

Statistical Reasoning

  • The Logic of Statistical Reasoning
    • Scientific Method
    • Fundamental Question Statistics Can Answer
  • Statistical Testing
    • Four Steps of Statistical Testing
    • Two Decision Errors
    • Two Ways to Control the More Serious Error
    • p-Values
    • Five Conditions to Accept a Conclusion from Data
    • Confidence Intervals

One and Two Sample Comparison of Means

  • One Sample Comparison
    • Analysis Roadmap
    • t-Distribution
    • Non-Parametric Sign and Wilcoxon Tests
    • Examples
  • Two Sample Comparison
    • Analysis Roadmap
    • Comparison of Standard Deviations
    • F-Distribution
    • Non-Parametric Mann-Whitney Test
    • Examples
    • Paired t-Test

Three or More Sample Comparison of Means

  • Analysis of Variance
    • Null and Alternative
    • Partitioning Variation
    • Signal to Noise Ratios
    • Assumptions
    • Analysis Roadmap
    • Examples
    • Residuals
    • Multiple Comparisons
    • Non-Parametric Kruskal-Wallis Test and Multiple Comparisons

Simple Linear Regression

  • Correlation
  • Analysis Roadmap
  • Coefficient of Determination
  • Assumptions and Transformations
  • Polynomial Regression
  • Examples
  • Exercise: Hands-On Helicopter Demonstration

Chi-Square Analysis

  • Contingency Table
  • Chi-Square Distribution
  • Assumption
  • Examples
  • Cross Tabulation and Layers
  • Examples


Prerequisites

Basic SPC or the equivalent

