2.2.1. Single Bottle Statistics—Descriptive
The data and analysis from experiments performed by the author, in which a single bottle was emptied 100 times, are now provided. When students were asked to perform similar experiments for the purpose of single bottle statistics, the number of trials was reduced to 10–25 in the interest of limiting experimentation time outside of class. However, a larger number of trials is preferable from the standpoint of obtaining statistically significant results. Here, the measurand of interest is the emptying time. As demonstrated in what follows, data collected from a single bottle can be used to introduce and/or reinforce the key elements of descriptive statistics, graphical presentation of data, and basic inferential statistics for a single variable. Later, we show how these findings are used to estimate overall emptying time uncertainty.
Before considering in detail the data collected during the course of experiments, it is worthwhile to generate a sequential plot to ensure that the emptying time does not itself change with time/order, i.e., to ensure that time is not a lurking variable [16] in the analysis that follows (although it is not anticipated, based on the physics of bottle emptying, that the emptying time will vary with the order of collection). Pedagogically, this is an instance in which, prior to generating the plot, students can reflect on what they expect to see from a sequential plot of the data and explain their expectations.
Figure 2a depicts the sequential plot of bottle emptying time for 100 trials performed with the same bottle (cf. Figure 7a for a diagram of the bottle). The data appears as expected, possessing a time-invariant central tendency and spread, quantified later, with the scatter in the data likely caused by subtle differences in the initial conditions of the experiment, for example, differences in filled volume, initial fluid motion within the bottle, and geometry of the air–water interface at the bottle opening. Reaction time likely plays a role as well. We can therefore proceed with further analysis and turn our attention to descriptive statistics and graphical presentations of those results.
At this point, we make no assumption regarding the distribution of the data, nor do we make any inference from the data. Rather, we are interested in providing concise summary measures of the data at hand [16]. The summary measures of interest will describe the central tendency of the data, the spread (or variability) about the central tendency, and the shape of the data. More specifically, these measures will be the sample mean, median, sample standard deviation, five-number summary, skewness, and kurtosis. We can also graphically present the data to highlight these metrics.
The mean (
A1) is easily calculated using a spreadsheet, and for the data shown in
Figure 2a, the mean value is
. The median—the middle value in the ordered set of data—is
. A difference of only
exists between the two measures of the central tendency. There are many ways to characterize the spread, or variability, of the data about the measure of the central tendency. We will begin by generating a “five-number summary” after the data has been ordered from minimum to maximum (
j will be the index for the ordered dataset). Once ordered, we can easily obtain the range (the maximum minus the minimum value) and the interquartile range, i.e., $Q_3 - Q_1$, where $Q$ is used to denote a quartile. The complete five-number summary is presented in
Table 1.
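As a minimal sketch of how these summary measures could be computed outside a spreadsheet, the Python snippet below evaluates the mean, median, range, and interquartile range for a short list of emptying times. The values in `times` are synthetic placeholders (not the measured data), the function name is hypothetical, and the quartile convention (`method="inclusive"`) is an assumption, since several conventions exist and the one used for Table 1 is not stated here.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) of the data."""
    ordered = sorted(data)  # order from minimum to maximum
    # Assumed quartile convention: linear interpolation, end points included.
    q1, q2, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, q2, q3, ordered[-1]

# Synthetic emptying times in seconds (placeholders, not the measured data).
times = [9.8, 10.1, 10.4, 9.9, 10.0, 10.3, 10.2, 9.7, 10.6]

minimum, q1, median, q3, maximum = five_number_summary(times)
print("mean   =", round(statistics.mean(times), 3))
print("median =", median)
print("range  =", round(maximum - minimum, 3))
print("IQR    =", round(q3 - q1, 3))
```

A different quartile convention will shift the first and third quartiles slightly for small samples, so it is worth stating which one is used when students compare results.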
The five-number summary can be depicted graphically in the form of the box-and-whisker plot [16]. This type of plot is provided on the right edge of Figure 2a. The shape of the box-and-whisker plot reinforces the notion that the distribution of emptying times measured for a single bottle is nearly symmetric. Aside from the five-number summary, we can calculate a single value to characterize the spread, namely, the sample standard deviation
s (
A2), which for the data has a value of
. The standard deviation can be considered a kind of “average” distance of the data, as a whole, from the central tendency. The band of one standard deviation about the sample mean, i.e., from one standard deviation below the mean to one standard deviation above it, is also shown in
Figure 2a, and we find that 71 out of 100 data points, i.e., 71%, fall within this region.
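The count of points inside the one-standard-deviation band can be reproduced with a few lines of Python. This is a sketch only: the function name is hypothetical, the data are synthetic placeholders, and the sample standard deviation is assumed to use the usual n − 1 denominator.

```python
import statistics

def fraction_within_one_std(data):
    """Fraction of points lying within one sample standard deviation of the mean."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)
    inside = sum(1 for x in data if mean - s <= x <= mean + s)
    return inside / len(data)

# Synthetic emptying times in seconds (placeholders, not the measured data).
times = [9.8, 10.1, 10.4, 9.9, 10.0, 10.3, 10.2, 9.7, 10.6]

print("s        =", round(statistics.stdev(times), 3))
print("fraction =", round(fraction_within_one_std(times), 3))
```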
The symmetry and peakedness of the measurand distribution, both measures of the shape, can each be expressed by a single value—the skewness
(
A3) and kurtosis
(
A4), respectively. We find, for the data in
Figure 2a, the value of the skewness to be
= −0.34 and the kurtosis
= 3.03. Skewness can take on both positive and negative values. Positive skewness implies a distribution with a longer tail to the right or, more specifically, a mean value that exceeds the mode. Negative skewness indicates the reverse: the value of the mode exceeds the mean. For a symmetric distribution,
= 0, i.e., the mean value is equal to the mode value [
17]. To interpret the value of the skewness obtained from our data, let us consider the following rules of thumb [
18]; highly skewed,
; moderately skewed,
; approximately symmetric,
. Using this interpretation with a value of
= −0.34, we can conclude that the distribution is approximately symmetric. The kurtosis calculated from a set of data is often compared against the value for a known distribution, the normal distribution being a frequent comparator. For a normal distribution, the kurtosis equals 3, with values below 3 representative of a flatter distribution and values above 3 representative of more peaked distributions [
17]. Therefore, the calculated value for the kurtosis (
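= 3.03) is reasonably close to that of a normal distribution. It should not be surprising, then, that we found 71% of the data to fall within one sample standard deviation of the sample mean, as about 68% of data fall within one standard deviation of the mean for a normal distribution (and for n = 100 we expect the sample mean and sample standard deviation to lie close to the corresponding population values; in other words, the sample is approaching a population).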
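For readers who wish to check the shape measures numerically, the sketch below computes skewness and kurtosis from the central moments of the sample. The moment-based definitions used here (kurtosis not reduced by 3, so that a normal distribution gives 3) are an assumption; the expressions in (A3) and (A4) may include small-sample corrections that would change the values slightly. The function name and the data are illustrative only.

```python
def shape_measures(data):
    """Return (skewness, kurtosis) from the central moments of the data."""
    n = len(data)
    mean = sum(data) / n
    # Central moments of order 2, 3, and 4 (population form, divided by n).
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # equals 3 for a normal distribution
    return skewness, kurtosis

# Illustrative data: a symmetric set and a set with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 2, 10]

print(shape_measures(symmetric))
print(shape_measures(right_tailed))
```

The symmetric set returns zero skewness, while the set with one large value returns a positive skewness, consistent with the sign convention described above.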
The quantitative measures of the central tendency, spread, and shape can also be graphically presented, as we have seen with the sequential plot and box-and-whisker plot. However, a frequency distribution (i.e., histogram) more clearly illustrates the mode and shape, and it is worth developing one here. Later, we will use this histogram from our experimentally determined data to support the selection of a standard probability density function, so as to further interpret the data.
For the student, it is important to recognize that the bin number (or number of classes), k, should be selected to adequately convey the nature of the distribution of data. Little can be learned from a histogram with too many or too few bins [16]. Although the development of a histogram can be highly subjective (the selection of bin number and boundaries is at the discretion of the student), there are a few rules of thumb to consider: (1) in general [19], the bin number should be some value between 5 and 20; (2) for
, consider using
as a starting estimate [
20]; and (3) for large enough
n, try
as an initial estimate [
21]. Regardless of method, one should take care to consider the resolution with which the measurand was obtained, as this can influence the frequency of values contained in bins as the value of
k gets large with increasing
n, i.e., bin sizes smaller than the resolution of the measuring instrument can lead to artificially empty bins.
For
n = 100, the rules of thumb suggest that
k = 10 or 13 will be reasonable. Using the value of the range (cf.
Table 1), this yields bins with widths of approximately 0.20 and 0.15 s, respectively. The data from
Figure 2a was binned using
, and the frequency
F of occurrences in each bin was tabulated. The result is shown in
Figure 2b, and the distribution demonstrates that the data is unimodal and approximately symmetric (consistent with the calculated value of
and the box-and-whisker plot shape).
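The bin-number estimates can be explored with a short script. The two formulas below, a square-root rule and a power-law rule, are assumptions standing in for the rule-of-thumb expressions, chosen because they return k = 10 and k = 13 for n = 100; the function names are hypothetical, and the range of 2.0 s is an illustrative value implied by the quoted bin widths rather than a figure taken from Table 1.

```python
import math

def bins_square_root(n):
    """Assumed rule of thumb: k is approximately the square root of n."""
    return round(math.sqrt(n))

def bins_power_law(n):
    """Assumed rule of thumb for larger n: k = 1.87 (n - 1)^0.40 + 1."""
    return round(1.87 * (n - 1) ** 0.40 + 1)

def bin_width(data_range, k):
    """Width of each bin when the range is divided into k equal bins."""
    return data_range / k

n = 100
illustrative_range = 2.0  # seconds; illustrative value, not read from Table 1

for k in (bins_square_root(n), bins_power_law(n)):
    print(k, "bins of width", round(bin_width(illustrative_range, k), 2), "s")
```

Before settling on k, the resulting bin width should be compared against the resolution of the timer, since bins narrower than the resolution can appear artificially empty.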
2.2.2. Single Bottle Statistics—Inference
In the previous section, we did not utilize a standard distribution to interpret the data; we simply described the data quantitatively and graphically. However, the purpose of obtaining a sample set of emptying times is to ultimately develop an estimate of the “true” emptying time for any particular bottle in the complete set of , i.e., an estimate of the population mean including a statistical uncertainty (which will be combined with a measurement uncertainty). This will be used in later portions of this work when the results of different bottles are used collectively.
The evidence presented thus far suggests that the emptying times are, or can be modeled as being, normally distributed. Specifically, the skewness
(near zero) implies the distribution is approximately symmetric, and the kurtosis
= 3.03 suggests the distribution is near-normal; the number of data points contained within
and the frequency distribution of
Figure 2b hint at the characteristic Gaussian shape. If the assumption that emptying times are normally distributed is valid, then with the data from a sample, we can estimate the population mean using
, where the
subscript implies a population parameter and the degrees of freedom is expressed as
. The Student-
t variable
is tabulated and is a function of the degrees of freedom and percent confidence specified [
21]. The sample mean and sample standard deviation have already been calculated; to utilize this estimate we must clearly demonstrate normalcy in the data.
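As a hedged illustration of the estimate described above, the sketch below assumes the common form of the interval, i.e., the sample mean plus or minus the Student-t value times s divided by the square root of n, with n − 1 degrees of freedom. The t value is passed in as a number read from a table, as the text describes; the function name and the data are hypothetical.

```python
import math
import statistics

def mean_confidence_interval(data, t_value):
    """Return (sample mean, half-width) of the interval for the population mean.

    t_value is the tabulated Student-t value for n - 1 degrees of freedom
    at the chosen confidence level.
    """
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation
    half_width = t_value * s / math.sqrt(n)
    return mean, half_width

# Illustrative data with n = 5, so 4 degrees of freedom.
sample = [1, 2, 3, 4, 5]
t_4_95 = 2.776  # tabulated two-sided value for 4 degrees of freedom, 95%

mean, half_width = mean_confidence_interval(sample, t_4_95)
print(f"{mean} +/- {half_width:.3f} (95%)")
```

For n = 100 the tabulated value for 99 degrees of freedom at 95% confidence is about 1.984, so the half-width shrinks both through the smaller t value and through the larger n.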
To supplement the evidence, we can strengthen the visual assessment of normalcy in two ways. The first involves superimposing atop the data from
Figure 2b a continuous curve representing a normal frequency distribution. We can generate such a curve by starting with the probability density function for a normally distributed variable (see
Appendix B). Recalling that the probability density
p can be related to the relative frequency
f and therefore the frequency
F for a finite sample, we can create the normal distribution
. Added to the frequency distribution in
Figure 2b is the continuous curve representing a normal distribution. A visual inspection of
Figure 2b shows that the frequency distribution for the data is consistent with the shape of a normal distribution.
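The overlay curve can be sketched as follows, assuming the expected frequency in a bin is the number of points times the bin width times the normal probability density evaluated at the bin centre. The sample mean and standard deviation stand in for the population parameters; all numerical values and the function name are illustrative assumptions.

```python
from statistics import NormalDist

def expected_frequency(x, mean, s, n, bin_width):
    """Expected count near x for n normally distributed points and the given bin width."""
    density = NormalDist(mean, s).pdf(x)  # probability density p(x)
    return n * bin_width * density        # relative frequency scaled to a count

# Illustrative parameters (not the measured values).
mean, s, n, width = 10.0, 0.5, 100, 0.2

for centre in (9.0, 9.5, 10.0, 10.5, 11.0):
    print(centre, round(expected_frequency(centre, mean, s, n, width), 2))
```

Evaluating the function on a fine grid of x values gives the smooth curve to superimpose on the histogram.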
A second graphical method to evaluate the normalcy of data is the normal probability plot. This is a plot of the standard normal variable,
z, versus the measurand, i.e.,
. Each value of
z is obtained by assigning every ordered data point an equal cumulative probability and then calculating the corresponding value of
z based on the cumulative probability, assuming that the data is normally distributed. Specific details for developing the plot can be found in standard statistics textbooks [
16,
21]. The utility of the normal probability plot is that, if the data is approximately normal, the plot of
z vs.
will be linear according to the definition of the standard normal variable, i.e.,
. Therefore, rather than relying on a comparison of curved shapes as in
Figure 2b, we can visually inspect how the data falls against a straight line. A normal probability plot for the bottle emptying data is given in
Figure 3a along with a straight line to guide the eye. This graphical assessment of normalcy is reasonable if there are at least 20 data points [
16], as is the case here (
), and under conditions when students are asked to perform 25 measurements (e.g., if the instructor is intending to emphasize the statistics aspect of this exercise). The majority of the points in
Figure 3a fall along the straight line and would pass a “fat pencil test”, with only a few points at the upper and lower tails showing deviation. Little weight is placed on the extreme points if they fail to fall along the straight line [
21].
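One way to generate the coordinates for a normal probability plot is sketched below. The plotting position (j − 0.5)/n used to assign cumulative probabilities is an assumption; textbooks differ slightly on this choice. The function name and the data are hypothetical.

```python
from statistics import NormalDist

def normal_scores(data):
    """Return (ordered data, z values) for a normal probability plot."""
    ordered = sorted(data)
    n = len(ordered)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    # Assumed plotting position: each ordered point j gets probability (j - 0.5)/n.
    z = [standard_normal.inv_cdf((j - 0.5) / n) for j in range(1, n + 1)]
    return ordered, z

# Synthetic emptying times in seconds (placeholders, not the measured data).
times = [10.2, 9.7, 10.0, 10.5, 9.9]

ordered, z = normal_scores(times)
for x, zj in zip(ordered, z):
    print(x, round(zj, 3))
```

Plotting z against the ordered values and checking how closely the points follow a straight line reproduces the visual test described above.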
Lastly, a quantitative justification for treating the bottle emptying time as normally distributed can be obtained by determining whether the empirical distribution, when compared to a normal distribution, passes the Kolmogorov–Smirnov test, cf.
Appendix B and Equation (
A9). This involves comparing the maximum deviation between the empirical cumulative frequency and the cumulative frequency for a continuous normal distribution. For the
points, the maximum deviation was found to be
, which is below the critical value of
for a level of confidence of
. This indicates that the hypothesis of normally distributed emptying times is not rejected by the Kolmogorov–Smirnov test. A visual comparison of the empirical cumulative frequency and the theoretical normal cumulative frequency can be found in
Figure 3b.
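A minimal sketch of the comparison is given below. It assumes the test statistic is the largest gap between the empirical cumulative frequency and the normal cumulative distribution fitted with the sample mean and standard deviation, and it uses the common large-sample approximation 1.36 divided by the square root of n for the critical value at the 5% significance level. Both choices are assumptions and may differ in detail from Equation (A9); the function names and data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def ks_statistic(data):
    """Maximum deviation between the empirical and fitted normal cumulative frequencies."""
    ordered = sorted(data)
    n = len(ordered)
    model = NormalDist(mean(ordered), stdev(ordered))
    d_max = 0.0
    for j, x in enumerate(ordered, start=1):
        cdf = model.cdf(x)
        # The empirical step function jumps from (j - 1)/n to j/n at x,
        # so both sides of the step are checked.
        d_max = max(d_max, j / n - cdf, cdf - (j - 1) / n)
    return d_max

def ks_critical_value(n, coefficient=1.36):
    """Assumed large-sample critical value (coefficient 1.36 for 5% significance)."""
    return coefficient / math.sqrt(n)

# Illustrative data only; the approximation above is intended for larger n.
sample = [1, 2, 3, 4, 5]
print(round(ks_statistic(sample), 4), round(ks_critical_value(100), 3))
```

Because the mean and standard deviation are estimated from the same data, the tabulated critical value is conservative; a Lilliefors-type correction is sometimes applied in that situation.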
In summary, the collective evidence presented in this section suggests that bottle emptying times can be reasonably modeled as normally distributed, for the sake of estimating the population mean for emptying time. Thus, using Equation (
A5), we can determine that the emptying time for this particular bottle, accounting for the statistical uncertainty, is
(with 95% confidence).
2.2.3. Single Bottle Statistics—Additional Examples
Over the many academic quarters in which the author has used this exercise, there have been numerous instances in which the same type of bottle has been used by students for data collection. Although these redundant sets of data are not included in the sections that follow (where we attempt to use unique bottles for each data point), here they present the opportunity to show that a similar presentation of single bottle statistics (descriptive and inferential) can be had without subjecting one experimenter to the tedious task of emptying a bottle 100 times. Examples of data collected by multiple students using the same type of bottle are shown in
Figure 4. Within
Figure 4a we see sequential plots and box-and-whisker plots for three different bottles (milk, tea, and soda).
Figure 4b shows the corresponding histograms and normal distributions. Again, the data suggests that bottle emptying time can be reasonably modeled as normally distributed. Perhaps the only distinct feature of
Figure 4a as compared to
Figure 2a is the presence of localized increases in the magnitude of the data spread and abrupt changes in the value of the central tendency (e.g.,
60–70 “tea” data and
70–80 “soda” data). It is thought that these instances are due to differences in student interpretation of when a bottle has fully emptied.
Another permutation of the single-bottle experiment is to provide students with the same type of bottle and to group the emptying time data that is collected. In this case, given the size of a class, a significant number of data points for a single bottle can be obtained in a short period of time. An example of the data that results is shown in
Figure 5, where 75 students were provided with wine bottles of the same style and size. Each student was asked to contribute
values of bottle emptying time, yielding a total of
. The data arranged in sequential order is provided in
Figure 5a. Here again, we see instances of abrupt changes in the magnitude of the central tendency for each student data cluster (e.g.,
200–210), which can be attributed to differences in student interpretation of the end of an emptying experiment. Given a large enough set of emptying times, the tendency is for the data to appear as though all values were collected from a single experimenter; this is especially true if the data is randomized. To see this, compare the sequential plots for the sequential and randomized data in
Figure 5a. The histogram for this data is shown in
Figure 5b, where a subtle skewness relative to the normal distribution is observed, reflecting longer emptying times reported by a small fraction of students.