Graphical Exploratory Data Analysis

 

Psychology 597G, 2004

Fridays, 9 - 11:50 am, Room CCIT 311

(Exception: the first meeting, Jan. 16, will be held in Psychology 408.)

Richard R. Bootzin (Bootzin@U.Arizona.edu)

Office hours:  1-1:45 pm TTH and by appointment, 217A Psychology

 

The goal of the seminar is to explore graphical methods for displaying and understanding data.  There will be a strong emphasis on exploratory data analysis as developed by John Tukey and others.  All too often, we do analyses using powerful statistical techniques without knowing much about our data.  There are many fairly simple and robust procedures for helping us to explore and visualize our data.  There are also innovative graphical procedures for helping us visualize complex multivariate data.  Many times a picture or a series of pictures is more informative than pages of statistical analyses.

 

This will be an active, hands-on course.  Each enrollee will be asked to purchase a license to use DataDesk 6.1, a graphical exploratory data analysis (gEDA) program developed by Paul Velleman, who was one of John Tukey’s students.  Each enrollee should have a data set to examine throughout the semester.  During the second half of the semester, we will have a number of sessions in which enrollees make presentations about their data and the gEDA procedures they used to help understand the data.

 

Grading will be based on class participation and on the presentation of the gEDA analysis of the enrollee’s data set.

 

Textbooks:

 

DataDesk, Version 6, Handbook 2, Operating and Navigating the Program.

DataDesk, Version 6, Handbook 3, Statistics Guide.

William S. Cleveland (1993).  Visualizing Data.  Summit, N.J.:  Hobart Press.

 

Jan 16: Introduction; Numeracy; Displaying Data

 

Jan 23: Statistical Graphics; Working with DataDesk

 

Wainer, H., & Velleman, P.F. (2001).  Statistical graphics: Mapping the pathways of science.  Annual Review of Psychology, 52, 305-335.

(available online through the university library)

 

DataDesk Handbook 2, chapters 1-8 (including entering and editing data, importing and exporting, simple summaries, and displaying data)


 

 

Jan 30: Univariate Data

 

Cleveland: pp 1-41 (through Fits and Residuals in Chapter 2).

 

DataDesk Handbook 2, chapters 8-9 (displaying data and working with displays)

 

Feb 6: Re-expression or Transformation of Data

 

Cleveland: pp 42-85 (remainder of Chapter 2)

 

DataDesk Handbook 2, chapter 11 (derived variables)

 

Feb 13: Bivariate Data

 

Cleveland: pp 86-151 (through Bivariate Distributions in Chapter 3)

 

DataDesk Handbook 2, chapters 10 and 12 (brushing, slicing, and rotating; manipulating variables)

 

Feb 20: Exploratory Data Analysis

 

Behrens, J.T. (1997).  Principles and procedures of exploratory data analysis.  Psychological Methods, 2, 131-160.

 

West, S.G. (2004, forthcoming).  Seeing your data: Using modern statistical graphics to display and detect relationships.  In R.R. Bootzin & P.E. McKnight (Eds.), Measurement, Methodology, and Evaluation: Festschrift in Honor of Lee Sechrest.  Washington, D.C.: APA Books.

 

Feb 27: Statistical Analysis with DataDesk

 

DataDesk Handbook 2, chapter 13 (integrated analyses)

 

Mar 5: Time-Series; Growth Curves

 

Cleveland: pp 152-179 (remainder of Chapter 3)

 

DataDesk Handbook 3, chapter 33 (smoothing)

 

Mar 12: Templates

 

DataDesk Handbook 2, chapters 14 and 15 (templates; layouts and presentations)

 

 

Spring Break (March 15-19)


 

 

Mar 26: Analysis of Means; Class Presentations

 

DataDesk Handbook 3, chapters 19 and 21 (comparing two samples and ANOVA)

 

Apr 2: Regression; Class Presentations

 

DataDesk Handbook 3, chapters 22, 23, 24, 25 (simple regression, correlation, multiple regression, regression diagnostics)

 

Apr 9: GLM and MANOVA; Class Presentations

 

DataDesk Handbook 3, chapters 28, 29, 30 (GLM, repeated measures, exploratory MANOVA)

 

Apr 16: Cluster and Factor Analysis; Class Presentations

 

DataDesk Handbook 3, chapters 26 and 27 (clustering and principal components)

 

Apr 23: Trivariate Data; Class Presentations

 

Cleveland: Chapter 4.

 

Apr 30: Hypervariate Data; Class Presentations

 

Cleveland: Chapters 5 and 6.

 

Last class!