**Probability** is a mathematical language used to discuss uncertain events, and probability plays a crucial role in statistics. Any measurement or data collection effort is subject to several sources of variation. By this, we mean that if the same measurement were repeated, then the answer would likely change.

The concept of probability, used since the 17th century, has over time become the basis of several scientific disciplines, although it does not have a single, universally accepted meaning. In particular, it underlies inferential statistics, a branch of statistics used by many natural and social sciences.

The mathematical discipline that studies the intuitive concept of probability, specifying its substantial content and formal rules and developing them into broader theories, is called the calculus of probability.

The ancients appear to have had no explicit concept of probability, although the idea was implicitly present. The first known studies of probability problems concern the game of dice: the Liber de ludo aleae by G. Cardano and a letter by G. Galilei. A more extensive development came with the correspondence between B. Pascal and P. Fermat, which originated around 1650 from problems posed to Pascal by an avid gambler, the Chevalier de Méré. Various authors took up the study of these problems, and in 1657 there appeared what can be considered the first treatise on probability, De ratiociniis in ludo aleae, by C. Huygens. But the most important contribution in this phase of development is due to J. Bernoulli, whose book Ars Conjectandi, published posthumously in 1713, gave a first systematic treatment.

Another milestone is the Théorie analytique des probabilités by P.-S. de Laplace, published in 1812, which presents the classical developments of the calculus of probability as a unified theory. In the 20th century the notion of random variable was clarified (in particular by F.P. Cantelli). Infinitely divisible (or indefinitely decomposable) distributions were introduced and studied (B. de Finetti, A.N. Kolmogorov, P. Lévy), and with their help the central limit theorem reached its most complete form, with essential contributions, in addition to the three just mentioned, from W. Feller, A. Khinchin, and B. Gnedenko. But the most important contributions came from the study of random processes, which began with the chains introduced by A. Markov and then developed through the fundamental contributions of Kolmogorov, Lévy, J.L. Doob, Khinchin, Feller, and K.L. Chung.

In everyday language, whenever there are several alternatives and we lack sufficient grounds to decide which of them will occur, we speak of the probability of the various alternatives. Attempts to clarify the concept of probability have historically proceeded by analyzing how it is used in the various contexts in which it arises, and have led to two main ways of using the term probability:

- the probability of an event understood as an expression of a physical property of the event itself and the conditions under which it occurs (ontological point of view);
- the probability of an event as the degree of confidence that an individual has in the occurrence of the event in question, and therefore concerning the state of our knowledge about the event at a given instant (or time interval) rather than the objective properties of the event itself (epistemic viewpoint).

Somewhat imprecisely (and sometimes polemically), some call the ontological point of view on probability objective and the epistemic one subjective. The first presupposes the (intuitive) notion of equiprobable events. Events are said to be equiprobable when there is no sufficient reason why, among the various ways in which a phenomenon can turn out, one should occur more often than the others.

The calculus of probability developed enormously at the beginning of the 20th century, in connection with its growing importance in applications. Indeed, all experimental sciences share a situation that can be framed in the following general scheme:

- we consider a set I of physical systems for which it makes sense to talk about the occurrence of an event A;
- we assume that each system of the set I satisfies a set of conditions, denoted by C, which represent the preparation conditions of the experiment in which the event A may occur;
- we assume that all other conditions that, in addition to C, can influence the occurrence of the event A vary in a completely random way within the set I;
- we determine experimentally the number of occurrences of the event A in a large number, say N, of systems chosen at random from the set I, and we denote this number by N(A∣C);
- as N increases, the ratios N(A∣C)/N (the relative frequencies of the event A) tend to stabilize around a fixed number independent of N and of the collective I; this number, denoted P(A∣C), is called the conditional probability of A given C.

Sometimes the set of conditions C is considered fixed once and for all and is therefore omitted from the notation: one simply writes P(A) instead of P(A∣C) and speaks of the probability of the event A without specifying the conditions under which it is evaluated. Every set I of systems that satisfies the first three conditions above is called a statistical collective, or ensemble, or set of independent and identically prepared systems.

The phenomenon described in the last point of the scheme above is also called the empirical law of chance or the law of stabilization of relative frequencies. The term ‘law’ is to be understood in a statistical sense: the observable quantities to which statistical laws refer are relative frequencies, i.e., quantities that concern not individuals but statistical collectives; of these quantities, statistical laws do not say what values they will assume, but what values they will tend to assume.
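The stabilization of relative frequencies in the scheme above can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the "physical systems" are simulated rolls of a fair six-sided die (the conditions C), the event A is "the outcome is even", and the names `relative_frequencies`, `roll_die`, and `is_even` are hypothetical helpers introduced here, not part of the text. A pseudo-random generator stands in for systems chosen at random from the set I.

```python
import random


def relative_frequencies(event, sample_system, checkpoints, seed=0):
    """Return a dict mapping each checkpoint N to the ratio N(A|C)/N.

    event         -- function outcome -> bool, True when A occurred
    sample_system -- function rng -> outcome, one system prepared under C
    checkpoints   -- positive sample sizes N at which to record the ratio
    seed          -- seed for the pseudo-random generator (an assumption
                     made for reproducibility, not part of the scheme)
    """
    rng = random.Random(seed)
    targets = sorted(set(checkpoints))
    occurrences = 0  # running count N(A|C)
    recorded = {}
    next_target = 0
    for n in range(1, targets[-1] + 1):
        if event(sample_system(rng)):
            occurrences += 1
        if n == targets[next_target]:
            recorded[n] = occurrences / n
            next_target += 1
    return recorded


def roll_die(rng):
    """One 'system': a single roll of a fair six-sided die (conditions C)."""
    return rng.randint(1, 6)


def is_even(outcome):
    """Event A: the roll shows an even number."""
    return outcome % 2 == 0


if __name__ == "__main__":
    sizes = [10, 100, 1_000, 10_000, 100_000]
    for n, freq in relative_frequencies(is_even, roll_die, sizes).items():
        print(f"N = {n:>7}: N(A|C)/N = {freq:.4f}")
```

For small N the printed ratios typically fluctuate noticeably, while for large N they settle close to 1/2, the value one would assign to P(A∣C) for a fair die. As the text stresses, this says what the frequencies tend to do, not what value they take in any particular run.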