Graphical Exploratory Data Analysis
Psychology 597G, 2004
Fridays, 9 - 11:50 am, Room CCIT 311
(Exception: the first meeting, Jan. 16, will be held in Psychology 408.)
Richard R. Bootzin (Bootzin@U.Arizona.edu)
Office hours: 1-1:45 pm TTH and by appointment, 217A Psychology
The goal of the seminar is to explore graphical methods for displaying and understanding data. There will be a strong emphasis on exploratory data analysis as developed by John Tukey and others. All too often, we run powerful statistical analyses without knowing much about our data. Many fairly simple and robust procedures can help us explore and visualize our data, and innovative graphical procedures can help us visualize complex multivariate data. Often a picture or a series of pictures is more informative than pages of statistical analyses.
This will be an active, hands-on course. Each enrollee will be asked to purchase a license to use DataDesk 6.1, a graphical exploratory data analysis (gEDA) program developed by Paul Velleman, who was one of John Tukey’s students. Each enrollee should have a data set to examine throughout the semester. During the second half of the semester, we will have a number of sessions in which enrollees make presentations about their data and the gEDA procedures they used to help understand it.
Grading will be based on class participation and on the presentation of the gEDA analysis of the enrollee’s data set.
Textbooks:
DataDesk, Version 6, Handbook 2, Operating and Navigating the Program.
DataDesk, Version 6, Handbook 3, Statistics Guide.
William S. Cleveland (1993). Visualizing Data. Summit, N.J.: Hobart Press.
Jan 16: Introduction; Numeracy; Displaying Data
Jan 23: Statistical Graphics; Working with DataDesk
Wainer, H., & Velleman, P.F. (2001). Statistical graphics: Mapping the pathways of science. Annual Review of Psychology, 52, 305-335.
(available online through the university library)
DataDesk Handbook 2, chapters 1-8 (including entering and editing data, importing and exporting, simple summaries, and displaying data)
Jan 30: Univariate Data
Cleveland: pp 1-41 (through Fits and Residuals in Chapter 2).
DataDesk Handbook 2, chapters 8-9 (displaying data and working with displays)
Feb 6: Re-expression or Transformation of Data
Cleveland: pp 42-85 (remainder of Chapter 2)
DataDesk Handbook 2, chapter 11 (derived variables)
Feb 13: Bivariate Data
Cleveland: pp 86-151 (through Bivariate Distributions in Chapter 3)
DataDesk Handbook 2, chapters 10 and 12 (brushing, slicing, and rotating; manipulating variables)
Feb 20: Exploratory Data Analysis
Behrens, J.T. (1997). Principles and procedures of exploratory data analysis. Psychological Methods, 2, 131-160.
West, S.G. (2004, forthcoming). Seeing your data: Using modern statistical graphics to display and detect relationships. In R.R. Bootzin & P.E. McKnight (Eds.), Measurement, Methodology, and Evaluation: Festschrift in Honor of Lee Sechrest. Washington, D.C.: APA Books.
Feb 27: Statistical Analysis with DataDesk
DataDesk Handbook 2, chapter 13 (integrated analyses)
Mar 5: Time-Series; Growth Curves
Cleveland: pp 152-179 (remainder of Chapter 3)
DataDesk Handbook 3, chapter 33 (smoothing)
Mar 12: Templates
DataDesk Handbook 2, chapters 14 and 15 (templates; layouts and presentations)
Spring Break (March 15-19)
Mar 26: Analysis of Means; Class Presentations
DataDesk Handbook 3, chapters 19 and 21 (comparing two samples and ANOVA)
Apr 2: Regression; Class Presentations
DataDesk Handbook 3, chapters 22, 23, 24, 25 (simple regression, correlation, multiple regression, regression diagnostics)
Apr 9: GLM and MANOVA; Class Presentations
DataDesk Handbook 3, chapters 28, 29, 30 (GLM, repeated measures, exploratory MANOVA)
Apr 16: Cluster and Factor Analysis; Class Presentations
DataDesk Handbook 3, chapters 26 and 27 (clustering and principal components)
Apr 23: Trivariate Data; Class Presentations
Cleveland: Chapter 4.
Apr 30: Hypervariate Data; Class Presentations
Cleveland: Chapters 5 and 6.
Last class!