Q1) This problem uses the RETIREMENT FUNDS data file that has been referred to multiple times in class. The file is posted on BLACKBOARD next to this assignment, in the GRADED ASSIGNMENTS folder of the MODULES section.
a) One of the variables in the data file is the categorical variable STAR RATING. This variable rates the funds, analogous to how hotels or movies might be given a certain number of stars. Here the ratings run from one to five stars, with higher ratings indicating better performance. Provide a pie chart summarizing the STAR RATING data. Include the count and percentage of funds in each rating category, as well as the names of the categories, in your chart. What percentage of the funds had a five-star rating? (HINT: To show the labels for the slices of the pie, click the LABELS button, choose “Slice Label”, then check category name, frequency, and percent.)
b) Provide a contingency table that summarizes data simultaneously on the RISK and STAR RATING variables. Just include the counts for each cell and not any corresponding percentages. Use the contingency table to determine how many of the one star rated funds were in the High Risk category.
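The counting behind a contingency table can be sketched outside MINITAB as follows. This is only an illustration: the (risk, star rating) pairs below are hypothetical stand-ins, not values from the RETIREMENT FUNDS file, and the risk labels "Low", "Average", and "High" are assumed.

```python
from collections import Counter

# Hypothetical (risk, star_rating) pairs; the real values come from
# the RETIREMENT FUNDS data file and will differ.
funds = [
    ("High", 1), ("Low", 3), ("Average", 3), ("High", 1),
    ("Low", 5), ("Average", 4), ("High", 2), ("Low", 3),
]

def contingency_counts(pairs):
    """Count how many records fall into each (row, column) cell."""
    return Counter(pairs)

counts = contingency_counts(funds)
risks = sorted({risk for risk, _ in funds})
stars = sorted({star for _, star in funds})

# Print a plain-text table: one row per risk level, one column per star rating.
print("Risk".ljust(10) + "".join(f"{s:>4}" for s in stars))
for r in risks:
    print(r.ljust(10) + "".join(f"{counts[(r, s)]:>4}" for s in stars))

# A single cell answers a question such as "how many one-star funds are High risk?"
high_risk_one_star = counts[("High", 1)]
```

Each cell is simply the number of funds that share one risk level and one star rating, so reading a single cell answers a question like the one in part b).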
For the remainder of this question, use only the data on the 10YrReturn variable. The 10-year return is the average percentage gain per year for the fund’s investments over a ten-year period. For example, the first fund in the file had an average gain of 11.93% per year over that ten-year period.
c) Generate a histogram of the 10YrReturn for these 479 funds. Using the histogram, describe within which rates of return the values are concentrated.
d) Generate a boxplot for the ten-year returns. During this ten-year period, Dr. Markowski owned a fund that had an average return of 8.4%. Based on the boxplot, approximate the percentage of funds that had a ten-year average rate of return higher than that earned by Dr. M’s fund.
e) Use the histogram and boxplot to determine the shape of the distribution of average ten-year returns. Choose among approximately symmetric, clearly skewed to the left, and clearly skewed to the right.
f) Using the Display Descriptive Statistics option in MINITAB, display a set of statistics for the ten year returns that includes the mean, standard deviation, and each of the five numbers in the five-number summary. Use this data summary to determine an interval of values for the variable, 10YrReturn, that should include the 10YrReturn for about 95% of the funds.
(HINT: Use the empirical rule)
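As a rough illustration of the hint, the empirical rule says that for approximately bell-shaped data about 95% of values lie within two standard deviations of the mean. The sketch below computes that interval. The return values are hypothetical placeholders rather than the 479 values in the data file, and the function name is made up for this example.

```python
import statistics

def empirical_rule_interval(values, k=2):
    """Return (mean - k*sd, mean + k*sd) using the sample standard deviation.

    With k=2 this is the interval the empirical rule associates with
    roughly 95% of the data, assuming an approximately bell-shaped distribution.
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return mean - k * sd, mean + k * sd

# Hypothetical ten-year returns (percent per year), for illustration only.
returns = [9.1, 8.4, 6.1, 9.7, 7.5, 10.2, 5.8, 8.9]

low, high = empirical_rule_interval(returns)
print(f"About 95% of returns are expected between {low:.2f}% and {high:.2f}%")
```

With the real data, the mean and standard deviation reported by Display Descriptive Statistics would take the place of the values computed here.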
Q2) The data file RESTAURANTS2 contains data for 66 restaurants, including ratings in each of three categories, the cost of a meal, a popularity index, and the type of cuisine. This problem refers to three of the variables in the data set: the food rating, the restaurant popularity index, and the type of cuisine.
a) Provide basic descriptive statistics on the Food Rating for each type of Cuisine. (To generate separate descriptive statistics for each type of cuisine, choose Stat, then Basic Statistics, then Display Descriptive Statistics, and select Cuisine for the “By variables” box.)
b) Which types of cuisine tend to get the highest food ratings? Which tend to receive the lowest food ratings? What statistic(s) did you use to decide?
c) Which types of cuisine tend to have the greatest variability in food rating? Which types of cuisine tend to have the least variability in food rating? Which statistic(s) did you use to decide?
d) Provide basic descriptive statistics on the Popularity Index for each type of Cuisine. Would you say that the restaurants that have the highest food ratings tend to have the highest popularity scores? Justify your answer.
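The "By variables" grouping used in this question can be sketched as follows: split the ratings by cuisine, then summarize each group separately. The cuisine names and ratings below are hypothetical and do not come from the RESTAURANTS2 file.

```python
import statistics
from collections import defaultdict

# Hypothetical (cuisine, food_rating) rows, for illustration only.
rows = [
    ("Italian", 22), ("Italian", 24), ("Italian", 20),
    ("Thai", 25), ("Thai", 19), ("Thai", 28),
]

def describe_by_group(pairs):
    """Return count, mean, median, and sample standard deviation per group."""
    groups = defaultdict(list)
    for group, value in pairs:
        groups[group].append(value)
    return {
        group: {
            "n": len(values),
            "mean": statistics.mean(values),
            "median": statistics.median(values),
            "stdev": statistics.stdev(values),
        }
        for group, values in groups.items()
    }

summary = describe_by_group(rows)
for cuisine, stats in summary.items():
    print(cuisine, stats)
```

Comparing the means or medians across groups addresses which cuisines tend to rate higher, while comparing the standard deviations addresses which cuisines vary more.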
Q3) A firm classifies its customers’ accounts in two ways: according to the balance outstanding and according to whether or not the account is overdue. The following contingency table gives the number of accounts falling into various categories.
                Overdue   Not Overdue   Total
Under $100          120           330     450
$100 to $500        150           300     450
Over $500            20            80     100
Total               290           710    1000
a) Convert this table of frequencies into a corresponding table of probabilities.
Use your result from part a) to answer the following.
b) Find the probability that the account is overdue.
c) Find the probability that the account has a balance under $100 or is overdue.
d) Find the probability that the account has a balance over $500 and is overdue.
e) What is the conditional probability that the account is overdue assuming that the balance is over $500?
f) If the account is not overdue, what is the probability that the balance is under $100?
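The rules needed for these parts can be sketched in a few lines. The sketch assumes that the first count column of the table is the overdue accounts and the second is the accounts that are not overdue. The dictionary layout and function names are choices made for this illustration.

```python
# Counts from the contingency table; the column labels are an assumption
# (first column = overdue, second column = not overdue).
counts = {
    "Under $100":   {"Overdue": 120, "Not Overdue": 330},
    "$100 to $500": {"Overdue": 150, "Not Overdue": 300},
    "Over $500":    {"Overdue": 20,  "Not Overdue": 80},
}

total = sum(sum(row.values()) for row in counts.values())

# Joint probabilities: each cell count divided by the grand total.
joint = {
    balance: {status: n / total for status, n in row.items()}
    for balance, row in counts.items()
}

def p_status(status):
    """Marginal probability of an overdue status (a column total)."""
    return sum(row[status] for row in joint.values())

def p_balance(balance):
    """Marginal probability of a balance category (a row total)."""
    return sum(joint[balance].values())

def p_or(balance, status):
    """P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_balance(balance) + p_status(status) - joint[balance][status]

def p_status_given_balance(status, balance):
    """P(status | balance) = P(status and balance) / P(balance)."""
    return joint[balance][status] / p_balance(balance)

def p_balance_given_status(balance, status):
    """P(balance | status) = P(balance and status) / P(status)."""
    return joint[balance][status] / p_status(status)

print(round(p_status("Overdue"), 4))
print(round(p_status_given_balance("Overdue", "Over $500"), 4))
```

The "and" probabilities are read directly from a cell of the probability table, the "or" probabilities use the addition rule, and the conditional probabilities divide a cell by the matching row or column total.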