Problem 3

Question

An exam had a maximum achievable score of 50 points. The following table gives, for each grade, the minimum score \(x\) required to obtain that grade.

\begin{tabular}{|c||ccccc|}
\hline
Grade & 1 & 2 & 3 & 4 & 5 \\
\hline
\(x\) & 35 & 20 & 15 & 10 & 0 \\
\hline
\end{tabular}

The 20 participants achieved the following scores \(x_{1}, x_{2}, \ldots, x_{20}\):

\(\begin{array}{cccccccccc}2 & 6 & 6 & 6 & 9 & 10 & 10 & 13 & 15 & 15 \\ 18 & 18 & 18 & 20 & 23 & 23 & 27 & 27 & 36 & 48\end{array}\)

a) Sketch the empirical distribution function of the given data series \(x_{1}, x_{2}, \ldots, x_{20}\).
b) Draw a histogram of the data. Use the class partition \([0,10),[10,15),[15,20),[20,35),[35,50]\) (corresponding to the grading scheme).
c) Determine the following statistical measures for the data series: arithmetic mean, median, 0.33-quantile, interquartile range, range, empirical variance, empirical standard deviation.
d) Illustrate the structure of the data series with a boxplot.

Step-by-Step Solution

Answer
Mean: 17.5, Median: 16.5, 0.33-Quantile: 10, IQR: 13.5, Range: 46, Variance: 118.75 with divisor n (125 with divisor n - 1), Standard Deviation: 10.90 with divisor n (11.18 with divisor n - 1)
Step 1 - List the Scores
List out the given scores in ascending order: 2, 6, 6, 6, 9, 10, 10, 13, 15, 15, 18, 18, 18, 20, 23, 23, 27, 27, 36, and 48.
Step 2 - Empirical Distribution Function
The empirical distribution function gives, for each \( x \), the cumulative relative frequency of scores up to \( x \): \[F(x) = \frac{k}{20}\] where \( k \) is the number of students scoring less than or equal to \( x \). For example, \( F(2) = \frac{1}{20} = 0.05 \), \( F(6) = \frac{4}{20} = 0.2 \), \( F(9) = \frac{5}{20} = 0.25 \), and so on up to \( F(48) = 1 \). The graph is a right-continuous step function: it is 0 for \( x < 2 \), jumps at each observed score, and stays constant in between.
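The heights of the steps can be tabulated with a short Python sketch. The helper name `edf` is an illustrative choice and not part of the original solution.

```python
from bisect import bisect_right

scores = [2, 6, 6, 6, 9, 10, 10, 13, 15, 15,
          18, 18, 18, 20, 23, 23, 27, 27, 36, 48]

def edf(x, data):
    """Proportion of observations less than or equal to x."""
    ordered = sorted(data)
    # bisect_right counts how many ordered values are <= x
    return bisect_right(ordered, x) / len(ordered)

# Value of F at each distinct score, i.e. at the jump points of the step function
for value in sorted(set(scores)):
    print(value, edf(value, scores))
```

Between two neighbouring jump points the function keeps the value it took at the left one, which is why the plot consists of horizontal segments.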
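Step 3 - Construct Histogram
Create a histogram with the class intervals [0,10), [10,15), [15,20), [20,35), and [35,50]. Count the frequencies for each interval:
  • [0,10): 5 scores
  • [10,15): 3 scores
  • [15,20): 5 scores
  • [20,35): 5 scores
  • [35,50]: 2 scores
Because the classes have unequal widths, the bar heights should be the relative frequency divided by the class width, so that the area of each bar represents the share of scores in that class. This gives heights of 0.025, 0.03, 0.05, approximately 0.0167, and approximately 0.0067.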
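The class counts and the density-scaled bar heights can be checked with a small Python sketch. It assumes that every class is half-open on the right except the last, which also contains 50; the variable names are illustrative.

```python
scores = [2, 6, 6, 6, 9, 10, 10, 13, 15, 15,
          18, 18, 18, 20, 23, 23, 27, 27, 36, 48]
edges = [0, 10, 15, 20, 35, 50]

counts = [0] * (len(edges) - 1)
for x in scores:
    for i in range(len(counts)):
        is_last = i == len(counts) - 1
        # Classes are [a, b); only the last class [35, 50] includes its right edge
        if edges[i] <= x < edges[i + 1] or (is_last and x == edges[-1]):
            counts[i] += 1
            break

n = len(scores)
# Height = relative frequency / class width, so bar areas sum to 1
heights = [c / (n * (edges[i + 1] - edges[i])) for i, c in enumerate(counts)]

print(counts)
print(heights)
```

Note that the score 36 falls into [35,50], not into [20,35).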
Step 4 - Calculate Arithmetic Mean
Calculate the arithmetic mean: \[ \bar{x} = \frac{1}{20} \sum_{i=1}^{20} x_{i} = \frac{2 + 6 + 6 + 6 + 9 + 10 + 10 + 13 + 15 + 15 + 18 + 18 + 18 + 20 + 23 + 23 + 27 + 27 + 36 + 48}{20} = \frac{350}{20} = 17.5 \]
Step 5 - Find Median
Since there are 20 scores, the median will be the average of the 10th and 11th scores. The 10th score is 15 and the 11th score is 18. Thus, \[ \text{Median} = \frac{15 + 18}{2} = 16.5 \]
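The mean and the median can be cross-checked with Python's standard `statistics` module. This is only a verification sketch, not part of the original solution.

```python
from statistics import mean, median

scores = [2, 6, 6, 6, 9, 10, 10, 13, 15, 15,
          18, 18, 18, 20, 23, 23, 27, 27, 36, 48]

total = sum(scores)          # sum of all 20 scores
average = mean(scores)       # total divided by the number of scores
middle = median(scores)      # average of the 10th and 11th ordered scores

print(total, average, middle)
```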
Step 6 - Calculate 0.33-Quantile
The 0.33-quantile is located via the position \( 0.33 \times 20 = 6.6 \). Since this is not an integer, it lies between the 6th and 7th ordered scores, and rounding up selects the 7th score. Both the 6th and the 7th score equal 10, so interpolating between them gives the same value: \[ x_{0.33} = 10 \]
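A minimal Python sketch of one common textbook quantile rule follows. It is an assumption that this rule matches the one intended by the exercise: if \( n p \) is an integer, average the two neighbouring ordered values, otherwise round the position up. The function name `quantile` is illustrative.

```python
import math

scores = [2, 6, 6, 6, 9, 10, 10, 13, 15, 15,
          18, 18, 18, 20, 23, 23, 27, 27, 36, 48]

def quantile(data, p):
    """Empirical p-quantile for 0 < p < 1 (assumed textbook convention)."""
    ordered = sorted(data)
    n = len(ordered)
    pos = n * p
    k = round(pos)
    if math.isclose(pos, k):
        # n*p is an integer: average of the k-th and (k+1)-th ordered values
        return (ordered[k - 1] + ordered[k]) / 2
    # otherwise take the ordered value at the rounded-up position
    return ordered[math.ceil(pos) - 1]

print(quantile(scores, 0.33))
```

Other conventions, for example linear interpolation between positions, exist and can give slightly different values on other data sets. Here the 6th and 7th scores coincide, so the result does not depend on the choice.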
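Step 7 - Find Interquartile Range
To find the interquartile range (IQR), calculate the first quartile Q1 (0.25-quantile) and the third quartile Q3 (0.75-quantile). Since \( 0.25 \times 20 = 5 \) and \( 0.75 \times 20 = 15 \) are integers, each quartile is the average of two neighbouring ordered scores.
  • Q1 is the average of the 5th and 6th scores: \[ Q1 = \frac{9 + 10}{2} = 9.5 \]
  • Q3 is the average of the 15th and 16th scores: \[ Q3 = \frac{23 + 23}{2} = 23 \]
Then, \[ IQR = Q3 - Q1 = 23 - 9.5 = 13.5 \]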
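The quartiles can also be obtained as the medians of the lower and upper halves of the ordered data, which for an even number of scores gives the same values. The sketch below assumes this median-of-halves convention. Note that `statistics.quantiles` uses interpolation rules that yield slightly different quartiles for this data set.

```python
from statistics import median

scores = [2, 6, 6, 6, 9, 10, 10, 13, 15, 15,
          18, 18, 18, 20, 23, 23, 27, 27, 36, 48]

ordered = sorted(scores)
half = len(ordered) // 2      # 10 scores in each half, since n = 20 is even

q1 = median(ordered[:half])   # median of the 10 smallest scores
q3 = median(ordered[half:])   # median of the 10 largest scores
iqr = q3 - q1

print(q1, q3, iqr)
```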
Step 8 - Calculate Range
The range is the difference between the highest and lowest scores: \[ \text{Range} = 48 - 2 = 46 \]
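Step 9 - Find Empirical Variance
Calculate the empirical variance using: \[ s^2 = \frac{1}{n} \sum_{i=1}^{n} (x_{i} - \bar{x})^2 = \frac{1}{20} \sum_{i=1}^{20} (x_{i} - 17.5)^2 = \frac{2375}{20} = 118.75 \] If the empirical variance is instead defined with the divisor \( n - 1 \), as in many textbooks, the result is \( \frac{2375}{19} = 125 \).
Step 10 - Find Empirical Standard Deviation
Calculate the empirical standard deviation as the square root of the variance: \[ s = \sqrt{118.75} \approx 10.90 \] With the divisor \( n - 1 \), the value is \( \sqrt{125} \approx 11.18 \).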
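The sum of squared deviations and both variance conventions can be verified with a few lines of Python. The variable names are illustrative, and which divisor the exercise intends is not stated in the source.

```python
from math import sqrt

scores = [2, 6, 6, 6, 9, 10, 10, 13, 15, 15,
          18, 18, 18, 20, 23, 23, 27, 27, 36, 48]

n = len(scores)
mean = sum(scores) / n

# Sum of squared deviations from the mean
ss = sum((x - mean) ** 2 for x in scores)

var_n = ss / n             # divisor n
var_n_minus_1 = ss / (n - 1)   # divisor n - 1

print(ss, var_n, var_n_minus_1)
print(round(sqrt(var_n), 2), round(sqrt(var_n_minus_1), 2))
```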
Step 11 - Boxplot Visualization
To illustrate the data using a boxplot, draw a diagram with the minimum, Q1, median, Q3, and maximum values.
  • Minimum: 2
  • Q1: 9.5
  • Median: 16.5
  • Q3: 23
  • Maximum: 48
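Some boxplot conventions do not draw the whiskers to the extremes but only to the most distant scores within 1.5 times the IQR of the box, marking anything beyond as an outlier. The sketch below computes the five-number summary and, as an assumption not made in the solution above, applies that 1.5 IQR rule.

```python
from statistics import median

scores = [2, 6, 6, 6, 9, 10, 10, 13, 15, 15,
          18, 18, 18, 20, 23, 23, 27, 27, 36, 48]

ordered = sorted(scores)
half = len(ordered) // 2

q1 = median(ordered[:half])
med = median(ordered)
q3 = median(ordered[half:])
iqr = q3 - q1

# Assumption: fences at 1.5 * IQR beyond the quartiles
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

inside = [x for x in ordered if lower_fence <= x <= upper_fence]
outliers = [x for x in ordered if x < lower_fence or x > upper_fence]
whisker_low, whisker_high = inside[0], inside[-1]

print(ordered[0], q1, med, q3, ordered[-1])
print(whisker_low, whisker_high, outliers)
```

Under this rule the upper whisker ends at 36 and the score 48 is drawn as a separate point. With whiskers drawn to the minimum and maximum, as in the list above, the plot spans 2 to 48.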

Key Concepts

empirical distribution function, histogram, arithmetic mean, median, 0.33-quantile, quartile range, range, empirical variance, empirical standard deviation, boxplot
empirical distribution function
To understand the empirical distribution function (EDF), think of it as a cumulative plot that shows the proportion of data points less than or equal to a given value. For each score, we calculate the cumulative relative frequency. For instance, since 4 of the 20 students scored 6 or below, the EDF at 6 is 0.2. The graph is a step function: it jumps at each observed score and stays constant between scores, so the points are not joined by sloped lines.
histogram
A histogram is a type of bar chart that represents the frequency distribution of numerical data grouped into intervals. For instance, given the class intervals [0,10), [10,15), [15,20), [20,35), [35,50], each bar represents the scores falling into one interval. When the intervals have unequal widths, as they do here, the area of a bar rather than its height should be proportional to the frequency. This visual representation helps in understanding data patterns and distributions.
arithmetic mean
The arithmetic mean, or average, is calculated by summing all data points and dividing by the number of points. For example, with scores: 2, 6, 6, 6, 9, 10, 10, 13, 15, 15, 18, 18, 18, 20, 23, 23, 27, 27, 36, and 48, the mean is the total sum of 350 divided by 20. This gives us an average score of 17.5.
median
The median is the middle value of a data set when ordered from least to greatest. If the dataset has an even number of points, it is the average of the two middle numbers. Here, with 20 scores, the median is the average of the 10th and 11th scores, which are 15 and 18. Thus, the median is 16.5.
0.33-quantile
A p-quantile is a value that splits an ordered data set so that roughly a proportion p of the data points lie at or below it and a proportion 1 - p lie at or above it. The 0.33-quantile therefore has about 33% of the data points at or below it. Here, the position \( 0.33 \times 20 = 6.6 \) lies between the 6th and 7th scores, both being 10, so the 0.33-quantile is 10.
quartile range
The quartile range (or interquartile range, IQR) is the difference between the third quartile (Q3) and the first quartile (Q1). Q1 is the median of the first half of the data, and Q3 is the median of the second half. In this set, Q1 is 9.5 and Q3 is 23, making the IQR 13.5.
range
The range of a data set is the difference between the highest and lowest values. It provides a measure of how spread out the data is. With scores ranging from 2 to 48, the range is calculated as 48 - 2 = 46.
empirical variance
Empirical variance measures how much the values in a data set differ from the mean. It's calculated by averaging the squared differences from the mean. For these scores, the sum of squared differences is 2375, so the variance is 118.75 with the divisor n, or 125 with the divisor n - 1.
empirical standard deviation
The empirical standard deviation is the square root of the variance. It measures the dispersion of the data around the mean in the same units as the scores. For the given scores, with a variance of 118.75, the standard deviation is about 10.90 (about 11.18 if the divisor n - 1 is used).
boxplot
A boxplot is a graphical representation of a data set's minimum, first quartile (Q1), median, third quartile (Q3), and maximum. For the given data, the boxplot includes:
  • Minimum: 2
  • Q1: 9.5
  • Median: 16.5
  • Q3: 23
  • Maximum: 48
This visualization helps to see the central tendency, dispersion, and skewness of the data easily.