STATS1900 Business Statistics
Major Assignment
Semester 2, 2013
Date Due: Please refer to course description
Total Marks: 40 marks
Worth: 20% of final assessment
This assignment requires a substantial amount of computer work and written comment. You may need to seek guidance from your tutor along the way. Do not leave things until it is too late!
The questions give a careful statement of what is required and information about the presentation of your answers. Please follow these carefully! Marks may be deducted for poor presentation.
In this assignment you will examine data used by a real estate investment advisor. She wants you to answer some specific questions put by clients about house prices in the neighbourhood encompassed by 4 suburbs around the city of Melbourne. The data are in the file 'Real_Estate.xls', which contains the following columns (variables):
Variable Name    Description
ID               House identity number
Price            Selling price of the house (in 000's)
Bedrooms         Number of bedrooms
Size             House size (m²)
Pool             0 = House without a Pool, 1 = House with a Pool
Distance         Distance from city centre (km)
Suburb           Suburb number
Garage           0 = House without a Garage, 1 = House with a Garage
1. Random Sample: Before you begin your analysis you are required to take a random sample of size 110 from the 170 cases in the file. Use the file Random_Sample_Generator-13-2.xls to do this. Your tutor will show you how this can be done in EXCEL. Answers to the questions below are to be based on your sample of 110 cases. Make sure to keep a safe copy of the sample you use since you cannot use Random_Sample_Generator-13-2.xls to reproduce it. Provide a printout of the data in your sample, with ID numbers in ascending order.
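The sample you submit must come from Random_Sample_Generator-13-2.xls. Purely to illustrate the idea of drawing 110 cases without replacement and listing them in ascending ID order, a minimal Python sketch follows. It assumes the houses are numbered 1 to 170, and the function name `draw_sample` and the seed are hypothetical. Unlike the Excel generator, a fixed seed here reproduces the same sample.

```python
import random

def draw_sample(population_ids, k, seed=None):
    """Draw k distinct IDs without replacement and return them in ascending order."""
    rng = random.Random(seed)
    return sorted(rng.sample(list(population_ids), k))

# Assumption: the 170 houses carry ID numbers 1..170.
sample_ids = draw_sample(range(1, 171), 110, seed=2013)
print(len(sample_ids), sample_ids[:5])
```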
2. Summary table:
a)
i) Prepare a summary table that shows the mean, standard deviation and 95% confidence interval for the mean of the following variables:
Selling Price, Number of bedrooms, Size of house, Distance from city centre
ii) Use some of the information in (i) to describe a typical house in these suburbs.
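As a rough guide to the calculation behind part (a)(i), the sketch below computes a mean, a sample standard deviation and a t-based 95% confidence interval for the mean. The prices are hypothetical and are not taken from Real_Estate.xls, and the helper name `mean_ci` is an assumption. The critical value is passed in by hand: for the five illustrative values it is 2.776 (df = 4), and for a sample of 110 (df = 109) it is about 1.982. In recent versions of Excel the same value should be available from T.INV.2T(0.05, n-1).

```python
import math
import statistics

def mean_ci(values, t_crit):
    """Return (mean, sample sd, lower, upper) for a t-based confidence interval."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)          # sample standard deviation (n - 1 divisor)
    margin = t_crit * sd / math.sqrt(n)    # half-width of the interval
    return mean, sd, mean - margin, mean + margin

# Hypothetical prices in 000's, for illustration only.
prices = [350, 420, 390, 460, 380]
# Two-sided 95% critical value for df = 4.
mean, sd, lo, hi = mean_ci(prices, t_crit=2.776)
print(f"mean={mean:.1f} sd={sd:.1f} 95% CI=({lo:.1f}, {hi:.1f})")
```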
b)
i) Prepare a summary table that shows the mean and standard deviation of Price for houses in each of the 4 Suburbs, broken down by the variable Bedrooms. Think carefully about the layout of the rows and columns of your table. As well as means and standard deviations, you should include the number of houses in each group, so each cell in your final table should contain the mean, the standard deviation and n, the number of houses in that group.
ii) Comment, in bullet point form, on the Price for any notable combinations of Suburb and Bedrooms (i.e. cells in the table).
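One possible way to picture the cells of the table in part (b)(i) is as groups keyed by (Suburb, Bedrooms), each holding a mean, a standard deviation and n. The sketch below uses hypothetical rows and a hypothetical helper `group_summary`; in Excel a PivotTable would usually do the same job.

```python
import statistics
from collections import defaultdict

def group_summary(rows):
    """Map (suburb, bedrooms) -> (mean price, sd of price, n)."""
    groups = defaultdict(list)
    for suburb, bedrooms, price in rows:
        groups[(suburb, bedrooms)].append(price)
    summary = {}
    for key, prices in sorted(groups.items()):
        # A standard deviation needs at least two houses in the cell.
        sd = statistics.stdev(prices) if len(prices) > 1 else None
        summary[key] = (statistics.mean(prices), sd, len(prices))
    return summary

# Hypothetical (suburb, bedrooms, price in 000's) rows, for illustration only.
rows = [(1, 3, 400), (1, 3, 420), (1, 4, 500),
        (2, 3, 380), (2, 3, 360), (2, 3, 370)]
for (suburb, bedrooms), (mean, sd, n) in group_summary(rows).items():
    print(suburb, bedrooms, mean, sd, n)
```

Cells with a single house have no standard deviation, which is worth keeping in mind when commenting on small groups.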
3. A local real estate firm has told a client that the average Price of a house in Suburb 2 is $420,000. You have been asked to evaluate this claim. Use a One Sample t Test for the Mean to evaluate the claim that the average price is $420,000.
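The test statistic for a one-sample t test is t = (sample mean - claimed mean) / (s / sqrt(n)), with n - 1 degrees of freedom. A minimal sketch of that arithmetic is given below, using hypothetical Suburb 2 prices and a hypothetical helper `one_sample_t`. The hard-coded critical value 2.571 is the two-sided 5% value for df = 5 and would change with your own sample size.

```python
import math
import statistics

def one_sample_t(values, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean == mu0."""
    n = len(values)
    se = statistics.stdev(values) / math.sqrt(n)   # standard error of the mean
    return (statistics.mean(values) - mu0) / se, n - 1

# Hypothetical Suburb 2 prices in 000's, for illustration only.
suburb2 = [400, 435, 410, 450, 425, 440]
t_stat, df = one_sample_t(suburb2, mu0=420)
t_crit = 2.571   # two-sided 5% critical value for df = 5
print(f"t={t_stat:.3f}, df={df}, reject H0: {abs(t_stat) > t_crit}")
```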
4. Size and Price: One of the clients wants information on Size of houses as it relates to price.
a) First create a new variable (column) labelled Size Group which divides Size up into two size groups as follows:
Under 200 square metres: Small
200 square metres and over: Large
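The grouping rule can be sketched as a single comparison; note that a house of exactly 200 square metres belongs to the Large group. The function name `size_group` and the sizes are hypothetical. In Excel an IF formula of the form =IF(cell<200,"Small","Large") expresses the same rule.

```python
def size_group(size_m2):
    """Label a house 'Small' if under 200 square metres, otherwise 'Large'."""
    return "Small" if size_m2 < 200 else "Large"

# Hypothetical sizes in m2; 200 itself falls in the Large group.
sizes = [150, 199.9, 200, 260]
groups = [size_group(s) for s in sizes]
print(groups)
```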
b) Produce suitable graphs or charts to help in providing the information requested on the Size of the house as it relates to Price.
c) Find 95% confidence intervals for the mean Price of small houses and of large houses. Do the two confidence intervals overlap? What does this tell you about the Prices for the two Size Groups?
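Checking whether two intervals overlap comes down to comparing their end points, as in the sketch below. The interval values and the helper `intervals_overlap` are hypothetical. Intervals that do not overlap suggest the two mean Prices differ; overlapping intervals do not, on their own, show that the means are equal.

```python
def intervals_overlap(ci_a, ci_b):
    """True if two (lower, upper) intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

# Hypothetical 95% CIs for mean Price (000's), for illustration only.
small_ci = (340.0, 380.0)
large_ci = (410.0, 470.0)
overlap = intervals_overlap(small_ci, large_ci)
print(overlap)
```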
5. Produce a scatter plot of Price vs Size (Size should be on the horizontal axis). Make sure you label your axes properly and that your graph has an appropriate title. Briefly describe the nature of the relationship between these two variables.
Use Excel to carry out a regression analysis on these two variables. Copy the output into your assignment and use it to respond to the following:
a) Write down the regression equation.
b) State the R-Square value and the Standard Error and explain what they mean with respect to this data.
c) Write down the value of the gradient of the regression line and explain what it means in this case.
d) Is the constant or intercept value significant in this case? How do you know this?
e) Briefly explain why you think this regression model is, or is not, a good model.
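To show where the numbers in the regression output come from, the sketch below fits a least-squares line by hand and reports the gradient, intercept, R-Square and Standard Error (the square root of the residual sum of squares divided by n - 2). The data and the helper `simple_regression` are hypothetical; your own answers should come from the Excel output for your sample.

```python
import math

def simple_regression(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared, standard_error).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - sse / sst
    standard_error = math.sqrt(sse / (n - 2))
    return slope, intercept, r_squared, standard_error

# Hypothetical Size (m2) and Price (000's) values, for illustration only.
size = [120, 150, 180, 210, 240]
price = [310, 350, 400, 430, 480]
slope, intercept, r2, se = simple_regression(size, price)
print(f"Price = {intercept:.1f} + {slope:.2f} * Size, R2={r2:.4f}, SE={se:.2f}")
```

With these illustrative numbers the gradient would be read as the estimated change in Price (in 000's) for each additional square metre of Size.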
Price and Suburb indices:
a) Determine the Suburb index for each suburb after regressing Price on Size of the house. Use the multiplicative model in calculating suburb indices: Improved Predicted Price = Predicted Price (as a function of Size) × Suburb Index.
Hint: Use a similar technique to the time series technique that calculates seasonal indices.
b) Interpret the suburb indices in the context of the problem.
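One reading of the hint, by analogy with seasonal indices, is to take the ratio of actual Price to the Price predicted from Size for each house, then average those ratios within each suburb. The sketch below follows that reading only; the coefficients, the rows and the helper `suburb_indices` are hypothetical, and any further normalisation of the indices is left out.

```python
from collections import defaultdict

def suburb_indices(houses, intercept, slope):
    """Average the ratio actual / predicted Price within each suburb."""
    ratios = defaultdict(list)
    for suburb, size, price in houses:
        predicted = intercept + slope * size   # Price predicted from Size alone
        ratios[suburb].append(price / predicted)
    return {s: sum(r) / len(r) for s, r in sorted(ratios.items())}

# Hypothetical regression coefficients and (suburb, size m2, price 000's) rows.
intercept, slope = 100.0, 1.5
houses = [(1, 200, 440), (1, 100, 275), (2, 200, 360), (2, 100, 225)]
indices = suburb_indices(houses, intercept, slope)

# Improved prediction for a 150 m2 house in suburb 1 (multiplicative model).
improved = (intercept + slope * 150) * indices[1]
print(indices, improved)
```

Under this reading an index above 1 means houses in that suburb tend to sell for more than their Size alone would predict, and an index below 1 means they tend to sell for less.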
6. Using information from your analyses, write a short concluding paragraph about house prices and sizes for the different suburbs.