Q. 4.72

Question

Movie Grosses. Box Office Mojo collects and posts data on movie grosses. For a random sample of 50 movies, we obtained both the domestic (U.S.) and overseas grosses, in millions of dollars. The data are presented on the Weiss Stats site.

a. Obtain a scatterplot for the data.

b. Decide whether finding a regression line for the data is reasonable. If so, then also do parts (c)-(f).

Step-by-Step Solution

Answer

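a. The scatterplot of the data is constructed in Part (a) below, with the domestic gross on the x-axis and the overseas gross on the y-axis.

b. Finding a regression line for the data is not reasonable because the data show strong curvature.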

Part (a) Step 1: Given Information

The sample size is n = 50 movies.

Construct a scatterplot of the data obtained from Box Office Mojo.

Part (a) Step 2: Explanation

As stated in the question, we have the domestic and overseas grosses for each of the 50 movies in the sample.

We then use MATLAB to create a scatterplot with the domestic gross on the x-axis and the overseas gross on the y-axis.

Program:

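clc
clear
close all

n = 50;   % sample size

% Placeholder data: random integers stand in for the grosses (in millions
% of dollars). Replace these two lines with the actual WeissStats values.
domestic = randi(500, n, 1);
overseas = randi(700, n, 1);

% Scatterplot: domestic gross on the x-axis, overseas gross on the y-axis
scatter(domestic, overseas, 'linewidth', 1.2)
set(gca, 'linewidth', 1.2, 'fontsize', 12)
box on
xlabel('Domestic gross (millions of dollars)')
ylabel('Overseas gross (millions of dollars)')
title('N = 50')
axis square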

The resulting scatterplot:

[Figure: scatterplot of overseas gross versus domestic gross for the 50 movies]

Query:

First, we set the sample size to 50 movies, each with a domestic and an overseas gross.

Next, we create the scatterplot.

The x-axis represents the domestic gross.

The y-axis represents the overseas gross.

Part (b) Step 1: Given Information

Decide whether finding a regression line for the data is reasonable.

Part (b) Step 2: Explanation

A regression line is reasonable only if the data points are scattered about a straight line, that is, if the scatterplot shows no strong curvature.
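One rough way to put a number on "strong curvature" is to compare a straight-line fit with a quadratic fit and see how much the sum of squared errors drops. The sketch below is only an illustration under stated assumptions: the vectors x and y are hypothetical curved data, not the Box Office Mojo sample, and the comparison is an informal aid rather than the textbook's method, which relies on inspecting the scatterplot.

```matlab
% Hypothetical illustration data (NOT the movie grosses): a quadratic trend
% plus a small deterministic wiggle that stands in for random scatter.
x = (1:50)';
y = 0.1 * x.^2 + 5 + 3 * sin(x);

% Least-squares fits: straight line (degree 1) and parabola (degree 2)
p1 = polyfit(x, y, 1);
p2 = polyfit(x, y, 2);

% Sum of squared errors (observed minus fitted) for each fit
sse1 = sum((y - polyval(p1, x)).^2);
sse2 = sum((y - polyval(p2, x)).^2);

% Coefficient of determination for each fit
sst = sum((y - mean(y)).^2);
r2_line = 1 - sse1 / sst;
r2_quad = 1 - sse2 / sst;

fprintf('SSE (line) = %.1f, SSE (parabola) = %.1f\n', sse1, sse2);
fprintf('R^2 (line) = %.4f, R^2 (parabola) = %.4f\n', r2_line, r2_quad);
```

If the quadratic fit reduces the error sharply, as it does for this made-up data, a straight line is leaving a systematic pattern unexplained. A high R^2 for the line alone does not rule out curvature, which is why the scatterplot remains the primary check.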

As stated in the question, we have the domestic and overseas grosses for each of the 50 movies in the sample.

We create a scatterplot in MATLAB and add the fitted line to it.

The domestic gross is on the x-axis and the overseas gross is on the y-axis.

If the scatterplot shows strong curvature, a regression line is not appropriate for the data.

Program:

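clc
clear
close all

n = 50;   % sample size

% Placeholder data: random integers stand in for the grosses (in millions
% of dollars). Replace these two lines with the actual WeissStats values.
domestic = randi(500, n, 1);
overseas = randi(700, n, 1);

% Least-squares straight line: p(1) is the slope, p(2) is the intercept
[p, s] = polyfit(domestic, overseas, 1);

% Fitted values; delta estimates the standard error of each prediction
[y_fit, delta] = polyval(p, domestic, s);

% Scatterplot with the fitted line drawn on top
scatter(domestic, overseas, 'linewidth', 1.2)
hold on
plot(domestic, y_fit, 'r-', 'linewidth', 1.2)
set(gca, 'linewidth', 1.2, 'fontsize', 12)
box on
xlabel('Domestic gross (millions of dollars)')
ylabel('Overseas gross (millions of dollars)')
title('N = 50')
axis square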

The data show strong curvature, as seen in the scatterplot, so finding a regression line for the data is not reasonable. Parts (c)-(f) are therefore not required.
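Curvature is often easier to see in the residuals than in the scatterplot itself. The following is a minimal sketch, assuming hypothetical curved data in x and y (not the actual movie grosses): it fits a straight line and plots the residuals against x. A U-shaped band of residuals, rather than a patternless cloud around zero, suggests that a straight line is not appropriate.

```matlab
% Hypothetical curved data (an assumption, not the actual movie grosses)
x = linspace(10, 500, 50)';
y = 0.002 * x.^2;

% Fit a straight line and compute the residuals (observed minus fitted)
p = polyfit(x, y, 1);
res = y - polyval(p, x);

% Residuals against x, with a dashed reference line at zero
scatter(x, res, 'linewidth', 1.2)
hold on
plot([min(x) max(x)], [0 0], 'k--')
hold off
box on
xlabel('x')
ylabel('Residual')
title('Residuals from the straight-line fit')
```

For this made-up data the residuals are positive at both ends of the x range and negative in the middle, which is the systematic pattern that signals curvature.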

Query:

  • First, we set up the data for the 50 sample movies, each with a domestic and an overseas gross.
  • Next, we create a scatterplot.
  • Then we draw the regression line on the scatterplot.
  • The domestic gross is shown on the x-axis.
  • The overseas gross is shown on the y-axis.