# Objectives of Statistics


## A Definition of Statistics

Statistics is the science of collecting, describing and interpreting data, i.e. the toolbox underlying empirical research. In analyzing data, scientists aim to describe our perception of the world. Descriptions of stable relationships among observable phenomena in the form of theories are sometimes referred to as being explanatory. (Though one could argue that science merely describes how things happen rather than why.) Inventing a theory is a creative process of restructuring information embedded in existing (and accepted) theories and extracting exploitable information from the real world. (We are abstracting from purely axiomatic theories derived by logical deduction.) A first exploratory approach to groups of phenomena is typically carried out using methods of statistical description.

### Descriptive Statistics

Descriptive statistics encompasses tools devised to organize and display data in an accessible fashion, i.e. in a way that does not exceed the perceptual limits of the human mind. It involves the quantification of recurring phenomena. Various summary statistics, mainly averages, are calculated; raw data and statistics are displayed using tables and graphs.

Statistical description can offer important insights into the occurrence of isolated phenomena and indicate associations among them. But can it provide results that can be considered laws in a scientific context? Statistics is a means of dealing with variations in characteristics of distinct objects. Isolated objects are thus not representative of the population of objects possessing the quantifiable feature under investigation. Yet variability can be the result of the (controlled or random) variation of other, underlying variables. Physics, for example, is mainly concerned with the extraction and mathematical formulation of exact relationships, not leaving much room for random fluctuations. In statistics such random fluctuations are modelled. Statistical relationships are thus relationships which account for a certain proportion of stochastic variability.
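The contrast between exact and statistical relationships can be written compactly. The notation below is an illustrative assumption, not taken from the text: $y$ is the observed characteristic, $x$ an underlying variable, $f$ a systematic relationship and $\varepsilon$ a random term with mean zero.

$$
\text{exact: } y = f(x), \qquad \text{statistical: } y = f(x) + \varepsilon, \quad \mathrm{E}[\varepsilon] = 0 .
$$

If, as a further assumption, $\varepsilon$ is uncorrelated with $f(x)$, the variability of $y$ splits into a systematic and a stochastic part, $\mathrm{Var}(y) = \mathrm{Var}(f(x)) + \mathrm{Var}(\varepsilon)$, which is one way to read "a certain proportion of stochastic variability".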

### Inductive Statistics

In contrast to wide areas of physics, empirical relationships observed in the natural sciences, sociology and psychology (and more eclectic subjects such as economics) are statistical. Empirical work in these fields is typically carried out on the basis of experiments or sample surveys. In either case, the entire population cannot be observed, either for practical or for economic reasons. Inferring characteristics of the underlying population from a limited sample of objects is the goal of inferential or inductive statistics. Here, variability is a reflection of variation in the sample and the sampling process.
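The role of the sampling process can be illustrated with a small simulation. The sketch below is not part of the original text: the population (10,000 normally distributed values), the sample sizes and the helper name `sample_means` are all hypothetical choices made for illustration. It repeatedly draws samples from the same population and records how much the sample mean varies from one sample to the next.

```python
import random
import statistics


def sample_means(population, sample_size, repetitions, rng):
    """Draw `repetitions` samples without replacement and return their means."""
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(repetitions)
    ]


rng = random.Random(42)  # fixed seed so that the run is reproducible

# Hypothetical population: in practice it could not be observed in full.
population = [rng.gauss(170.0, 10.0) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Means of many small samples and of many larger samples.
small = sample_means(population, sample_size=25, repetitions=1000, rng=rng)
large = sample_means(population, sample_size=400, repetitions=1000, rng=rng)

# The spread of the sample means reflects sampling variability.
spread_small = statistics.stdev(small)
spread_large = statistics.stdev(large)

print(f"population mean:             {population_mean:.2f}")
print(f"spread of means (n = 25):    {spread_small:.2f}")
print(f"spread of means (n = 400):   {spread_large:.2f}")
```

Under these assumptions the sample means scatter around the population mean, and the scatter should shrink as the sample size grows. This is the sense in which an estimate based on a sample carries variability inherited from the sampling process.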

### Statistics and the Scientific Process

Depending on the stage of the scientific investigation, data are examined with varying degrees of prior information. Data can be collected to explore a phenomenon in a first approach, but they can also serve to statistically test (verify/falsify) hypotheses about the structure of the characteristic(s) under investigation. Thus, statistics is applied at all stages of the scientific process wherever quantifiable phenomena are involved. Here, our concept of quantifiability is sufficiently general to encompass a very broad range of scientifically interesting propositions. Take, for example, a proposition such as "a bumble bee is flying by". By counting the number of such occurrences in various settings we are quantifying the occurrence of the phenomenon. On this basis we can try to infer the likelihood of coming across a bumble bee under specific circumstances (e.g. on a rainy summer day in Berlin).

Descriptive statistics provide the means to summarize and visualize data. The following table, which contains the frequency distribution of numbers drawn in the National Lottery, provides an example of such a summary. Cursory examination suggests that some numbers occur more frequently than others. Does this suggest bias in the way numbers are selected? As we shall see, statistical methods can also be used to test such propositions.

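| Number | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| Frequency | 311 | 337 | 345 | 316 | 321 | 335 | 322 |
| **Number** | **8** | **9** | **10** | **11** | **12** | **13** | **14** |
| Frequency | 309 | 324 | 331 | 315 | 302 | 276 | 310 |
| **Number** | **15** | **16** | **17** | **18** | **19** | **20** | **21** |
| Frequency | 322 | 319 | 337 | 331 | 326 | 312 | 334 |
| **Number** | **22** | **23** | **24** | **25** | **26** | **27** | **28** |
| Frequency | 322 | 319 | 304 | 325 | 337 | 323 | 285 |
| **Number** | **29** | **30** | **31** | **32** | **33** | **34** | **35** |
| Frequency | 321 | 311 | 333 | 378 | 340 | 291 | 330 |
| **Number** | **36** | **37** | **38** | **39** | **40** | **41** | **42** |
| Frequency | 340 | 320 | 357 | 326 | 329 | 335 | 335 |
| **Number** | **43** | **44** | **45** | **46** | **47** | **48** | **49** |
| Frequency | 311 | 314 | 304 | 327 | 311 | 337 | 361 |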
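As a rough sketch of how the table's frequencies might be summarized and compared with what an unbiased selection would produce, the following Python snippet computes a few descriptive figures and a Pearson chi-square statistic. The counts are copied from the table; everything else is an assumption made for illustration. In particular, the calculation treats all 49 numbers as equally likely and ignores the fact that numbers within a single draw are selected without replacement, so it should be read as a first approximation and not as the test the text goes on to develop.

```python
# Frequencies of the numbers 1..49, copied row by row from the table.
counts = [
    311, 337, 345, 316, 321, 335, 322,
    309, 324, 331, 315, 302, 276, 310,
    322, 319, 337, 331, 326, 312, 334,
    322, 319, 304, 325, 337, 323, 285,
    321, 311, 333, 378, 340, 291, 330,
    340, 320, 357, 326, 329, 335, 335,
    311, 314, 304, 327, 311, 337, 361,
]
frequencies = dict(zip(range(1, 50), counts))

# Descriptive summary.
total = sum(counts)
expected = total / len(counts)  # count per number if all were equally likely
most_frequent = max(frequencies, key=frequencies.get)
least_frequent = min(frequencies, key=frequencies.get)

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((observed - expected) ** 2 / expected for observed in counts)
degrees_of_freedom = len(counts) - 1

print(f"total count:         {total}")
print(f"expected per number: {expected:.1f}")
print(f"most frequent:       {most_frequent} ({frequencies[most_frequent]} times)")
print(f"least frequent:      {least_frequent} ({frequencies[least_frequent]} times)")
print(f"chi-square:          {chi_square:.2f} on {degrees_of_freedom} degrees of freedom")
```

A large value of the statistic relative to a chi-square distribution with 48 degrees of freedom would count as evidence against equal selection probabilities; a moderate value would be compatible with the differences in the table arising by chance.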