Statistics homework help.
The objectives of this assignment are as follows:
- Demonstrate how to import two .csv files into SPSS.
- Merge two relatively large datasets into one SPSS dataset.
- Reinforce the knowledge gained in previous assignments about generating summary statistics in SPSS.
- Create variables in SPSS using the compute command.
- Reinforce the knowledge gained in previous assignments about generating regression output in SPSS.
- Test for the presence of:
- Multicollinearity in SPSS using the VIF statistic.
- Autocorrelation in SPSS using the Durbin–Watson statistic.
- Heteroscedasticity in SPSS using Cook’s Distance statistic.
- Save output files created in SPSS in other file formats, such as .htm, for non-SPSS users.
Please remember to generate SPSS syntax on all aspects of the assignment, and to use the syntax you have generated to create output. (As shown in class, SPSS syntax will appear in the output file that you will print out.)
Method of Submission/Deliverables:
To be determined, but in electronic format. It will involve submitting SPSS output, including the syntax generated. (I DO NOT NEED THE MERGED SPSS DATASET YOU CREATED.)
You can save portions of your output file embedded in a Word document, or you can save your file in HTML format.
Assignment Specifics Part I – Getting Started
- Download S&P500.csv from the Datasets folder on the course's Blackboard page.
- Download VIX–OHLC.csv from the Datasets folder on the course's Blackboard page.
- Launch SPSS and import both .csv files (individually) into SPSS.
- Remember to generate SPSS syntax, and to adjust features of the import wizard as you import both files.
- All variables imported in the combined dataset (except date) should be scale variables.
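To picture the layout the import should produce (one date column, everything else numeric), here is a minimal Python sketch. It is not SPSS syntax and does not replace the syntax the assignment asks you to generate. The column names, the date format, and the two sample rows are assumptions made up for illustration, since the actual layout of the .csv files is not described here.

```python
import csv
import io
from datetime import datetime

# Toy stand-in for one downloaded file. The header and values are
# illustrative assumptions, not the real contents of S&P500.csv.
SAMPLE_CSV = """Date,Open,High,Low,Close,Volume
2020-01-02,100.0,104.0,99.0,103.0,1500
2020-01-03,103.0,105.0,101.0,102.0,1800
"""


def read_prices(text, date_column="Date", date_format="%Y-%m-%d"):
    """Parse CSV text into rows: the date column becomes a date, the rest floats."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        row = {}
        for name, value in record.items():
            if name == date_column:
                row[name] = datetime.strptime(value, date_format).date()
            else:
                # Numeric storage, playing the role of an SPSS "scale" variable.
                row[name] = float(value)
        rows.append(row)
    return rows


rows = read_prices(SAMPLE_CSV)
print(rows[0])
```

If a column that should be numeric fails the `float` conversion here, the equivalent column in SPSS would likely have been imported as a string, which is the kind of thing to adjust in the import wizard.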
Part II – Performing the Assignment
- Using either SPSS code or pull-down menus, merge both datasets into one dataset on the date variable in both datasets.
- Using either SPSS code or pull-down menus, compute two new variables:
- Range (The difference between the daily high and low price on the S&P 500.)
- Open_Close (The difference between the opening and closing trade of the day on the S&P 500.)
- Generate summary statistics
- standard deviation
- standard error of the mean
for all variables in the merged dataset except for date. (Note: Since the data in the dataset is a time series, some of the summary statistics generated may not be very meaningful.)
- Run a simple regression with Volume as the dependent variable and Range as the independent variable.
- Run a multiple regression with Volume as the dependent variable, with Range and Open_Close as independent variables.
- Run a multiple regression with Volume as the dependent variable and all the other variables (except for date) as independent variables.
- In the multiple regression models, test for:
- Multicollinearity (a situation in which one predictor variable in a multiple regression model is highly linearly related to one or more of the other predictor variables in the model) by requesting collinearity diagnostics.
- Autocorrelation (a condition often present in time series data where the value of one observation of a variable is highly influenced by the value of that same variable in a previous period) by requesting a Durbin–Watson test in SPSS.
- For this project, do not worry about the interpretation of these tests.
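The Part II steps can be sketched outside SPSS as a way to sanity-check what the SPSS output reports. The Python below is a minimal illustration only, not the required SPSS syntax. Everything specific in it is an assumption: the toy values, the column names (including the hypothetical `VIX_Close`), the sign convention `Open - Close`, and the choice to keep only dates present in both tables (an SPSS merge may keep unmatched cases, depending on the options chosen). The VIF shortcut `1 / (1 - r^2)` applies only to a model with exactly two predictors, and the Durbin–Watson statistic is shown on simple-regression residuals, although the same formula applies to residuals from any of the models.

```python
import math

# Toy rows keyed by ISO date. Column names and values are made-up stand-ins
# for the two imported files, not real market data.
sp500 = {
    "2020-01-02": {"Open": 100.0, "High": 104.0, "Low": 99.0, "Close": 103.0, "Volume": 1500.0},
    "2020-01-03": {"Open": 103.0, "High": 105.0, "Low": 101.0, "Close": 102.0, "Volume": 1800.0},
    "2020-01-06": {"Open": 102.0, "High": 108.0, "Low": 100.0, "Close": 107.0, "Volume": 2600.0},
    "2020-01-07": {"Open": 107.0, "High": 109.0, "Low": 106.0, "Close": 108.0, "Volume": 1400.0},
}
vix = {
    "2020-01-02": {"VIX_Close": 12.5},
    "2020-01-03": {"VIX_Close": 14.0},
    "2020-01-06": {"VIX_Close": 13.9},
    "2020-01-07": {"VIX_Close": 13.4},
    "2020-01-08": {"VIX_Close": 13.1},  # no S&P row for this date
}


def merge_on_date(left, right):
    """Keep dates found in both tables and combine their columns."""
    return {d: {**left[d], **right[d]} for d in sorted(left) if d in right}


merged = merge_on_date(sp500, vix)
for row in merged.values():
    row["Range"] = row["High"] - row["Low"]
    row["Open_Close"] = row["Open"] - row["Close"]  # sign convention is assumed


def column(name):
    """Values of one merged column, in date order."""
    return [row[name] for row in merged.values()]


def mean(xs):
    return sum(xs) / len(xs)


def standard_error(xs):
    """Sample standard deviation (n - 1 denominator) divided by sqrt(n)."""
    m = mean(xs)
    sd = math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))
    return sd / math.sqrt(len(xs))


def correlation(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model: 1 / (1 - r^2)."""
    return 1.0 / (1.0 - correlation(x1, x2) ** 2)


def simple_regression_residuals(x, y):
    """Residuals from ordinary least squares of y on x with an intercept."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    return num / sum(e ** 2 for e in residuals)


residuals = simple_regression_residuals(column("Range"), column("Volume"))
print("SE of Volume:", standard_error(column("Volume")))
print("VIF (Range, Open_Close):", vif_two_predictors(column("Range"), column("Open_Close")))
print("Durbin-Watson:", durbin_watson(residuals))
```

The Durbin–Watson statistic always falls between 0 and 4, and a VIF of 1 means the two predictors are uncorrelated. Both facts give a quick plausibility check on the figures in the SPSS output.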