Basic Concepts and Definitions in Statistics

Dave Amiana - Apr 27 '21 - - Dev Community

Human civilization has long been exposed to statistical data; our collective experience with statistical information extends back to the beginning of mankind (Walpole, 1968). Turning data into information demands a statistical process that is crucial to decision-making. Our understanding of statistical methods rests on our familiarity with the basic concepts of statistics.

This article will introduce some basic concepts that we encounter during our elementary courses in statistics. Here is the list of terms we should be familiar with:

  • variable - pertains to an attribute that can assume different values
  • data - pertains to the value of a variable
    • random data - pertains to the value of variables determined by chance
  • data set - pertains to a collection of data
  • data value - pertains to each value in a data set
  • population - pertains to all entities that are subject to the study
    • parameter - pertains to a numerical summary or any measurement derived from a population
  • sample - pertains to a subset of the population
    • statistic - pertains to a numerical summary or any measurement derived from a sample
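
To make the difference between a parameter and a statistic concrete, here is a minimal Python sketch. The population is simulated (the attendance percentages, the population size of 1,000, and the sample size of 50 are all assumptions made up for illustration, not real survey data).

```python
import random
import statistics

# Hypothetical population: attendance percentages for 1,000 students.
# The values are simulated for illustration only.
random.seed(42)
population = [random.uniform(50, 100) for _ in range(1000)]

# Parameter: a numerical summary computed from the whole population.
population_mean = statistics.mean(population)

# Sample: a subset of the population, here 50 students chosen at random.
sample = random.sample(population, k=50)

# Statistic: the same summary, computed from the sample only.
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two numbers will usually be close but not identical, which is why we distinguish between them: in practice we rarely observe the whole population, so we use the statistic to estimate the parameter.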

Let's apply what we learned.

Consider this example:

A survey conducted at the University of Arizona revealed that students who attended class 80-100% of the time mostly received an A in the class. Meanwhile, those who attended 70-80% of the time usually received a B or C.

Try answering the following questions:

  1. what are the variables considered in the survey?
  2. what are the data in the survey?
  3. what is the population under the survey?
  4. where is the sample in the survey drawn from?

Answers

  1. The variables considered in the survey are grades and attendance.
  2. The data consist of specific grades and attendance numbers.
  3. The population under the survey is the students at the University of Arizona.
  4. While not specified, the sample is drawn from the students at the University, since a sample is a subset of the population.

Data and Variables

Let's define the types that our data can take. Two broad categories determine the operations we can perform on our data.

  • Qualitative. Some data describe categories or attributes rather than numerical amounts; we can represent them in nominal or ordinal forms.
  • Quantitative. Some data take numerical values. Quantitative data can be either discrete or continuous.
    • discrete - assumes values that can be counted, e.g. the number of people.
    • continuous - can assume an infinite number of values between any two specific values, e.g. temperature.
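
As a rough illustration of these categories, the sketch below guesses the type of a single value from its Python type. The function name `describe` is hypothetical, and the rule is only a heuristic: a numeric code such as a ZIP code is stored as a number but is still qualitative, so real classification depends on what the variable means, not on how it is stored.

```python
from numbers import Integral, Real

def describe(value):
    """Heuristic guess at a value's data type, based only on its Python type."""
    # Text labels and True/False flags name categories, not amounts.
    if isinstance(value, (bool, str)):
        return "qualitative"
    # Whole numbers are treated as counts.
    if isinstance(value, Integral):
        return "quantitative, discrete"
    # Other real numbers are treated as measurements.
    if isinstance(value, Real):
        return "quantitative, continuous"
    return "unknown"

# Made-up example values.
for example in ["single", 3, 36.6]:
    print(repr(example), "->", describe(example))
```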

In an experimental setting, we can categorize our variables as:

  • Dependent variable - the outcome measured in an experiment.
  • Independent variable(s) - the variable(s) that we manipulate in an experimental setting and that may affect the outcome of the experiment.

Let's reinforce our understanding by trying to answer the following exercises:

Identify whether each entity is discrete or continuous.

  1. wind speed.
  2. weight.
  3. number of pages in a book.
  4. amount of money a person spends.

Identify whether each entity is qualitative or quantitative.

  1. Marital status.
  2. Time.
  3. Age of person.
  4. Different vitamins taken.

Consider the experiment below and identify the independent variable and dependent variable.

Richard made 6 paper planes with different wing lengths. He is trying to determine which plane covers the longest distance. Assume that Richard throws every plane at the same angle.

Answers

  1. Independent variable: the planes' wing lengths.
  2. Dependent variable: the distance covered by each plane.

Levels of Measurement

There are generally four levels of measurement. The list below is organized according to the number of operations we can perform on a given measurement. Let's begin with the simplest.

Type | Represents | Operations | Central tendency
Nominal | category | count | mode
Ordinal | rank and order | compare (less than, greater than) | median
Interval | entities with equal spacing | add, subtract | mean
Ratio | entities with true zero | multiply, divide | geometric mean

Note that as we go down the table, the operations valid for the levels above remain valid for the levels below, i.e. ratio accumulates all the operations above it.
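
The measures of central tendency for each level can be sketched with Python's standard `statistics` module. All the values below are made up for illustration, and the rank mapping for the ratings is an assumption about how the categories are ordered.

```python
import statistics

# Made-up values for illustration only.
blood_types = ["A", "O", "B", "O", "AB", "O"]        # nominal: categories
ratings = ["bad", "fair", "good", "good", "fair"]    # ordinal: ordered categories
temps_celsius = [18.5, 21.0, 19.5, 22.0]             # interval: equal spacing, no true zero
weights_kg = [55.0, 62.5, 70.0, 81.0]                # ratio: true zero

# Nominal: the most frequent category (mode).
mode_blood_type = statistics.mode(blood_types)

# Ordinal: map each label to its rank, take the median rank, map it back.
rank = {"bad": 0, "fair": 1, "good": 2}
label = {r: name for name, r in rank.items()}
median_rating = label[statistics.median_low([rank[r] for r in ratings])]

# Interval: differences are meaningful, so the arithmetic mean applies.
mean_temp = statistics.mean(temps_celsius)

# Ratio: ratios are meaningful, so the geometric mean also applies.
geo_mean_weight = statistics.geometric_mean(weights_kg)

print("mode:", mode_blood_type)
print("median:", median_rating)
print("mean:", mean_temp)
print("geometric mean:", round(geo_mean_weight, 2))
```

Because each level inherits the measures of the levels above it, we could also compute the mode, median, and mean of `weights_kg`, but not the mean of `blood_types`.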

Let's reinforce our understanding!

Classify each as nominal, ordinal, interval, or ratio level of measurement.

  1. Rankings of students in school.
  2. Weights of people in gym class.
  3. Movie rating as good, fair, or bad.

Note that the answers to some questions here can be found in this blog post: https://dcode.hashnode.dev/basic-concepts-and-definitions-in-statistics.

For some reason, I can't find how to make dropdowns.


References

  1. Walpole, R. E. (1982). Introduction to statistics.
  2. Bluman, A. G. (2013). Elementary statistics: A step by step approach: A brief version. McGraw-Hill.