Problem 40

Question

Listed below are forty ordered computer-generated observations that presumably represent a normal distribution with \(\mu=5\) and \(\sigma=2\). Can the sample be considered random with respect to the number of runs up and down?

| Obs. # | \(y_i\) | Obs. # | \(y_i\) | Obs. # | \(y_i\) | Obs. # | \(y_i\) |
| ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| 1 | 7.0680 | 11 | 7.6979 | 21 | 5.9828 | 31 | 5.2625 |
| 2 | 4.0540 | 12 | 4.4338 | 22 | 1.4614 | 32 | 5.9047 |
| 3 | 6.6165 | 13 | 5.6538 | 23 | 9.2655 | 33 | 4.6342 |
| 4 | 1.2166 | 14 | 8.0791 | 24 | 4.9281 | 34 | 5.3089 |
| 5 | 4.6158 | 15 | 4.7458 | 25 | 10.5561 | 35 | 5.4942 |
| 6 | 7.7540 | 16 | 3.5044 | 26 | 6.1738 | 36 | 6.6914 |
| 7 | 7.7300 | 17 | 1.3071 | 27 | 5.4895 | 37 | 1.4380 |
| 8 | 6.5109 | 18 | 5.7893 | 28 | 3.6629 | 38 | 8.2604 |
| 9 | 3.8933 | 19 | 4.5241 | 29 | 3.7223 | 39 | 5.0209 |
| 10 | 2.7533 | 20 | 5.3291 | 30 | 3.5211 | 40 | 0.5544 |

Step-by-Step Solution

Answer
Reading the observations in the order given, the signs of the successive differences form \(W = 25\) runs up and down. Under the hypothesis of randomness, the number of runs in a sequence of \(n\) observations is approximately normal with mean \(E(W) = (2n-1)/3\) and variance \(\operatorname{Var}(W) = (16n-29)/90\). For \(n = 40\) this gives \(E(W) \approx 26.33\) and \(\operatorname{Var}(W) \approx 6.79\), so \(z = (25 - 26.33)/\sqrt{6.79} \approx -0.51\). Because \(|z| < 1.96\), the hypothesis of randomness is not rejected at the 5% level: the sample can be considered random with respect to the number of runs up and down.
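Step 1: Determine the direction of each change
Keep the observations in the order in which they were generated; sorting them would destroy the information the test uses. Compare each observation with the one before it and record the change as 'up' if the value increased or 'down' if it decreased. Forty observations give 39 such changes. A run is a maximal stretch of consecutive changes in the same direction, so a new run starts whenever the direction switches from up to down, or vice versa. Here the second observation (4.0540) is smaller than the first (7.0680), so the sequence opens with a 'down' run.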
Step 2: Count the runs
Move through the remaining changes one at a time. If the next change has the same direction as the current run, the run continues; if the direction switches, a new run begins. The first change opens run 1, and the count increases by one at every switch from up to down or from down to up. For these data the count is \(W = 25\).
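The counting procedure can be sketched in a few lines of Python. This is a minimal illustration, not part of the original solution: the function name `count_runs_up_down` is hypothetical, and the sketch assumes that no two neighbouring observations are equal (true for this data set).

```python
def count_runs_up_down(values):
    """Count runs up and down, assuming no ties between neighbours."""
    # Record the direction of each successive change: +1 for up, -1 for down.
    signs = []
    for prev, curr in zip(values, values[1:]):
        if curr == prev:
            raise ValueError("tied neighbours need a separate convention")
        signs.append(1 if curr > prev else -1)
    if not signs:
        return 0
    # The first change opens run 1; every later switch of direction opens a new run.
    switches = sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    return 1 + switches


# The forty observations from the problem, in their original order.
observations = [
    7.0680, 4.0540, 6.6165, 1.2166, 4.6158, 7.7540, 7.7300, 6.5109, 3.8933, 2.7533,
    7.6979, 4.4338, 5.6538, 8.0791, 4.7458, 3.5044, 1.3071, 5.7893, 4.5241, 5.3291,
    5.9828, 1.4614, 9.2655, 4.9281, 10.5561, 6.1738, 5.4895, 3.6629, 3.7223, 3.5211,
    5.2625, 5.9047, 4.6342, 5.3089, 5.4942, 6.6914, 1.4380, 8.2604, 5.0209, 0.5544,
]

runs = count_runs_up_down(observations)
print(runs)  # prints 25
```

Working from the signs of the differences, rather than from the raw values, keeps the run definition explicit: a run is a maximal block of identical signs.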
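Step 3: Evaluate randomness
For a random sequence of \(n\) observations, the number of runs up and down \(W\) has mean \(E(W) = (2n-1)/3\) and variance \(\operatorname{Var}(W) = (16n-29)/90\), and for large \(n\) the standardized statistic \(z = (W - E(W))/\sqrt{\operatorname{Var}(W)}\) is approximately standard normal. With \(n = 40\), \(E(W) = 79/3 \approx 26.33\) and \(\operatorname{Var}(W) = 611/90 \approx 6.79\), so \(W = 25\) gives \(z \approx -0.51\). If \(|z|\) is below the critical value for the chosen significance level (1.96 for \(\alpha = 0.05\)), the sequence can be considered random with respect to its order. Far too few runs suggest trends in the data; far too many suggest systematic alternation.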
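As a check on the arithmetic, the large-sample statistic for the runs-up-and-down test can be computed directly. This is a sketch under stated assumptions: the helper name `runs_up_down_z` is hypothetical, the formulas are the standard mean \((2n-1)/3\) and variance \((16n-29)/90\) for the number of runs up and down, and the value 25 is the run count obtained by applying Step 2 to the forty observations.

```python
import math


def runs_up_down_z(num_runs, n):
    """Return (mean, variance, z) for the runs-up-and-down test with n observations."""
    expected = (2 * n - 1) / 3      # E(W) under randomness
    variance = (16 * n - 29) / 90   # Var(W) under randomness
    z = (num_runs - expected) / math.sqrt(variance)
    return expected, variance, z


# 25 runs counted in 40 observations (assumed from Step 2).
expected, variance, z = runs_up_down_z(num_runs=25, n=40)
print(f"E(W) = {expected:.2f}, Var(W) = {variance:.2f}, z = {z:.2f}")
# prints: E(W) = 26.33, Var(W) = 6.79, z = -0.51
```

Since \(|z| \approx 0.51\) is well below 1.96, the observed number of runs is consistent with a random ordering at the 5% level.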

Key Concepts

Normal distribution
The normal distribution is one of the most important probability distributions in statistics. It is often called the bell curve because its density is symmetric and bell-shaped. Its probability density function is completely determined by two parameters, the mean \( \mu \) and the standard deviation \( \sigma \). In simple terms:
- Mean (\( \mu \)) reflects the center of the distribution.
- Standard deviation (\( \sigma \)) indicates the spread of the data.

In a normally distributed data set, about 68% of observations lie within one standard deviation of the mean, about 95% within two, and about 99.7% within three. This property makes the normal distribution very useful in statistical analysis and hypothesis testing. The data in this exercise are assumed to come from a normal distribution with \( \mu=5 \) and \( \sigma=2 \). The runs-up-and-down test itself does not use these parameter values: it depends only on whether each observation is larger or smaller than its predecessor, so it checks the order of the observations rather than their distribution.
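The 68%, 95%, and 99.7% figures can be reproduced for the distribution in this exercise with Python's standard library. This is a small illustrative sketch; the helper name `prob_within` is hypothetical.

```python
from statistics import NormalDist

# The distribution assumed in the exercise: mean 5, standard deviation 2.
dist = NormalDist(mu=5, sigma=2)


def prob_within(k):
    """Probability that an observation lies within k standard deviations of the mean."""
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    return dist.cdf(high) - dist.cdf(low)


for k in (1, 2, 3):
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    print(f"within {k} sd, i.e. [{low:g}, {high:g}]: {prob_within(k):.4f}")
```

For \( \mu=5 \) and \( \sigma=2 \) the three intervals are \([3, 7]\), \([1, 9]\), and \([-1, 11]\), which is a quick way to see that all forty listed observations fall inside the three-standard-deviation range.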
Run test
The run test used here is the runs-up-and-down test, a non-parametric statistical test for assessing the randomness of a data sequence. It is related to, but distinct from, the Wald–Wolfowitz runs test, which counts runs of two categories (for example, values above and below the median). The term 'run' refers to a series of consecutive elements sharing a property. In this context, we consider runs of increasing and decreasing observations in a sequence:
- An 'up' run occurs when consecutive ordered observations increase.
- A 'down' run occurs when they decrease.

To perform a run test, we start from the first data point and count the runs, with each switch between increasing and decreasing marking the end of one run and the beginning of another. The test helps determine whether a sequence is randomly ordered or shows a notable pattern. The observed number of runs is compared with the number expected in a random sequence: far fewer runs than expected points to trends, and far more points to systematic alternation.
Statistical hypothesis testing
Statistical hypothesis testing is a fundamental method in statistics for deciding whether a hypothesis about a population or process is consistent with the evidence from a sample. It involves several steps:
- Formulate null and alternative hypotheses. The null hypothesis often represents a statement of 'no effect' or 'no difference'.
- Determine a significance level, typically denoted as \( \alpha \), which is the probability of rejecting the null hypothesis when it is actually true.
- Calculate a test statistic based on the sample data, and compare it to a critical value associated with the chosen \( \alpha \).

If the test statistic falls within the critical region, the null hypothesis is rejected; otherwise it is not rejected. In this exercise, the null hypothesis is that the observations appear in random order, and the alternative is that their order is influenced by some systematic pattern.
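The decision rule for a two-sided test based on a standard normal statistic can be sketched as follows. The function name `two_sided_z_test` is hypothetical, and the input \(z = -0.51\) is assumed here as the rounded value the runs-up-and-down statistic takes for 25 runs in 40 observations.

```python
from statistics import NormalDist


def two_sided_z_test(z, alpha=0.05):
    """Return (critical value, p-value, reject?) for a two-sided z test."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    critical = std_normal.inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return critical, p_value, abs(z) > critical


critical, p_value, reject = two_sided_z_test(z=-0.51)
print(f"critical value = {critical:.2f}, p-value = {p_value:.2f}, reject = {reject}")
```

Comparing \(|z|\) with the critical value and comparing the p-value with \( \alpha \) are equivalent ways of reaching the same decision; here neither leads to rejection.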
Ordered observations
In this exercise, ordered observations are data points listed in the order in which they were generated or recorded, not sorted by value. A sample sorted in increasing order would form a single 'up' run regardless of how it was produced, so the original sequence must be preserved. In tests like the run test, what matters is how the values move from one observation to the next:
- This ordering helps in detecting patterns or randomness in data.
- By analyzing the sequence of ups and downs in the ordered dataset, one can perform tests to check the randomness.

The process involves evaluating how often values rise or fall as you move through the data points. Ordered observations provide a structured way to see and assess patterns, which can be essential in validating the assumptions such as randomness or normality, often crucial for making broader conclusions about the population from which the sample is drawn. In the exercise, the ordered 40 observations are critically assessed to test their randomness using the run test technique.