# Objectives of Statistics

### From MM*Stat International

## A Definition of Statistics

Statistics is the science of collecting, describing and interpreting data, i.e. the tool box underlying empirical research.
In analyzing data, scientists aim to describe our perception of the world. Descriptions of stable relationships among observable phenomena in the form of theories are sometimes referred to as being explanatory. (Though one could argue that science merely describes *how* things happen rather than *why*.) Inventing a theory is a creative process of restructuring information embedded in existing (and accepted) theories and extracting exploitable information from the real world. (Here we leave aside purely axiomatic theories derived by logical deduction.)
A first exploratory approach to groups of phenomena is typically carried out using methods of *statistical description*.

### Descriptive Statistics

*Descriptive statistics* encompasses tools devised to organize and display data in an accessible fashion, i.e. in a way that doesn’t exceed the perceptual limits of the human mind. It involves the quantification of recurring phenomena. Various summary statistics, mainly averages, are calculated; raw data and statistics are displayed using tables and graphs.
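As a minimal sketch of what such a summary can look like, the following Python snippet computes a few averages and a frequency table using only the standard library. The data values are hypothetical counts of a recurring event, invented purely for illustration, and the variable names are assumptions of this sketch.

```python
from collections import Counter
import statistics

# Hypothetical raw data: how often an event was observed in ten periods.
counts = [3, 0, 2, 5, 2, 1, 2, 4, 0, 2]

# Summary statistics (averages).
mean = statistics.mean(counts)      # arithmetic mean
median = statistics.median(counts)  # middle value of the sorted data
mode = statistics.mode(counts)      # most frequent value

# Frequency table: value -> number of periods with that value.
frequency_table = Counter(counts)

print(f"mean={mean}, median={median}, mode={mode}")
for value in sorted(frequency_table):
    print(f"{value}: {frequency_table[value]}")
```

Ten raw values are condensed into three averages and a short table, which is the kind of reduction descriptive statistics aims for.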
Statistical description can offer important insights into the occurrence of isolated phenomena and indicate associations among them. But can it provide results that can be considered laws in a scientific context? Statistics is a means of dealing with variation in the characteristics of distinct objects. A single object is thus not representative of the population of objects possessing the quantifiable feature under investigation. Yet variability can be the result of the (controlled or random) variation of other, underlying variables. Physics, for example, is mainly concerned with the extraction and mathematical formulation of exact relationships, leaving little room for random fluctuations. In statistics, such random fluctuations are modelled explicitly. Statistical relationships are thus relationships that account for a certain proportion of stochastic variability.

### Inductive Statistics

In contrast to large parts of physics, empirical relationships observed in the natural sciences, sociology and psychology (and more eclectic subjects such as economics) are statistical. Empirical work in these fields is typically carried out on the basis of experiments or sample surveys. In either case, the entire population cannot be observed, for either practical or economic reasons. Inferring characteristics of the underlying population from a limited sample of objects is the goal of *inferential* or *inductive statistics*. Here, variability is a reflection of variation in the sample and the sampling process.
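The role of the sampling process can be illustrated with a small simulation. The sketch below is an illustration under stated assumptions, not a method from the text: the population is hypothetical (10,000 simulated values with mean 50 and standard deviation 10), and the sample size of 100 and the number of samples are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so repeated runs give the same numbers

# Hypothetical population; in practice it could not be observed in full.
population = [random.gauss(50, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Draw five samples of 100 objects each (without replacement within a sample)
# and use each sample mean as an estimate of the population mean.
sample_means = [
    statistics.mean(random.sample(population, 100))
    for _ in range(5)
]

print(f"population mean: {population_mean:.2f}")
for i, m in enumerate(sample_means, start=1):
    print(f"sample {i} mean: {m:.2f}")
```

Each sample yields a slightly different estimate even though the population is fixed. That spread is the variability due to sampling that inductive statistics has to account for.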

### Statistics and the Scientific Process

Depending on the stage of the scientific investigation, data are examined with varying degrees of prior information. Data can be collected to explore a phenomenon in a first approach, but they can also serve to statistically test (verify/falsify) hypotheses about the structure of the characteristic(s) under investigation. Thus, statistics is applied at all stages of the scientific process wherever quantifiable phenomena are involved.

Here, our concept of quantifiability is sufficiently general to encompass a very broad range of scientifically interesting propositions. Take, for example, a proposition such as "a bumble bee is flying by". By counting the number of such occurrences in various settings, we quantify the occurrence of the phenomenon. On this basis we can try to infer the likelihood of coming across a bumble bee under specific circumstances (e.g. on a rainy summer day in Berlin).

Descriptive statistics provide the means to summarize and visualize data. The following table, which contains the frequency distribution of numbers drawn in the National Lottery, provides an example of such a summary. Cursory examination suggests that some numbers occur more frequently than others. Does this suggest bias in the way numbers are selected? As we shall see, statistical methods can also be used to test such propositions.

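| Number | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| Frequency | 311 | 337 | 345 | 316 | 321 | 335 | 322 |
| **Number** | **8** | **9** | **10** | **11** | **12** | **13** | **14** |
| Frequency | 309 | 324 | 331 | 315 | 302 | 276 | 310 |
| **Number** | **15** | **16** | **17** | **18** | **19** | **20** | **21** |
| Frequency | 322 | 319 | 337 | 331 | 326 | 312 | 334 |
| **Number** | **22** | **23** | **24** | **25** | **26** | **27** | **28** |
| Frequency | 322 | 319 | 304 | 325 | 337 | 323 | 285 |
| **Number** | **29** | **30** | **31** | **32** | **33** | **34** | **35** |
| Frequency | 321 | 311 | 333 | 378 | 340 | 291 | 330 |
| **Number** | **36** | **37** | **38** | **39** | **40** | **41** | **42** |
| Frequency | 340 | 320 | 357 | 326 | 329 | 335 | 335 |
| **Number** | **43** | **44** | **45** | **46** | **47** | **48** | **49** |
| Frequency | 311 | 314 | 304 | 327 | 311 | 337 | 361 |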
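
One way to make the question of bias concrete is Pearson's chi-square statistic, which compares the observed frequencies with those expected if all 49 numbers were equally likely. The following Python sketch is not taken from the source: it copies the frequencies from the table above, assumes equal selection probabilities under the hypothesis of no bias, and uses illustrative variable names.

```python
# Observed frequencies for the numbers 1..49, copied from the table above.
observed = [
    311, 337, 345, 316, 321, 335, 322,
    309, 324, 331, 315, 302, 276, 310,
    322, 319, 337, 331, 326, 312, 334,
    322, 319, 304, 325, 337, 323, 285,
    321, 311, 333, 378, 340, 291, 330,
    340, 320, 357, 326, 329, 335, 335,
    311, 314, 304, 327, 311, 337, 361,
]

total = sum(observed)
# Expected count per number if every number is equally likely (assumption).
expected = total / len(observed)

# Pearson's chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((o - expected) ** 2 / expected for o in observed)
degrees_of_freedom = len(observed) - 1

# Numbers with the highest and lowest observed frequency (1-based).
most_frequent = observed.index(max(observed)) + 1
least_frequent = observed.index(min(observed)) + 1

print(f"total draws counted: {total}")
print(f"expected frequency per number: {expected:.2f}")
print(f"chi-square statistic: {chi_square:.2f} on {degrees_of_freedom} degrees of freedom")
print(f"most frequent: {most_frequent}, least frequent: {least_frequent}")
```

A statistic that is large relative to a chi-square distribution with 48 degrees of freedom would speak against equal selection probabilities. If several numbers are drawn without replacement in each draw, as is usual in lotteries, the counts are not independent and this reference distribution is only approximate.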