Introduction to SPSS


What is SPSS?

SPSS (Statistical Package for the Social Sciences) is a fairly user-friendly statistics package that is appropriate for both novice and sophisticated users. SPSS allows users to operate in a menu-driven environment or in a command-driven (syntax) environment. It is available for the Mac and Windows. Dartmouth’s licensed copy can only be installed on machines that are owned by the college and used by faculty, staff, or students. If you would like to install SPSS on a college machine, contact your computing consultant.


SPSS Basics: Windows

Available windows may vary from one version to another
Each has a toolbar that you can use instead of or along with the menus

Data editor      Where you open files; edit, select, or transform data; conduct statistical analyses; and produce graphs
Output           Where you look at statistical results, tables, and charts
Chart editor     Where you edit graphs
Syntax editor    Where you write and execute commands in the SPSS command language

SPSS Basics: Menus

Help menu

If you have the SPSS tutorial installed, it is a very easy way to get started with SPSS

Help topics
   SPSS at a glance
   Getting help
   Data management
   Statistical analysis
   Graphical analysis
   Interactive charts
   Output management
   Saving files
   Printing files
   Customizing SPSS

Window menu

Allows you to switch back and forth among the data editor, output, syntax, and chart windows
I do most of my work in the data editor window

File menu

New: for defining a new data file and entering your own data (or for opening a new syntax or output window)
Open: existing data files in various formats, or SPSS output, syntax, and other files
Read text data: open text data files and define variables
     This is where you must know how your data file is organized and formatted
     Information is usually included in a codebook
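
To show why the codebook matters, here is a minimal Python sketch (not SPSS syntax) of reading fixed-column text data. The column positions, variable names, and the two sample records are hypothetical stand-ins for what a real codebook would supply.

```python
# Hypothetical codebook: variable name -> (start column, end column, type).
# Columns are 1-based and inclusive, the way codebooks usually list them.
CODEBOOK = {
    "id":  (1, 3, int),
    "sex": (4, 4, str),
    "rdg": (5, 8, float),
}

def read_fixed(lines, codebook):
    """Turn fixed-column text records into a list of dicts, one per case."""
    cases = []
    for line in lines:
        case = {}
        for name, (start, end, kind) in codebook.items():
            field = line[start - 1:end].strip()
            # A blank field is treated as missing rather than as zero.
            case[name] = kind(field) if field else None
        cases.append(case)
    return cases

records = ["001F52.5", "002M    "]  # made-up data, one record per case
cases = read_fixed(records, CODEBOOK)
print(cases)
```

If the column positions are off by even one character, every value in that field is read incorrectly, which is why the file layout has to be known before the data are opened.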

Using existing data files

What type of file is it?
     Mac or Windows?
     Text, spreadsheet, SPSS, SPSS portable?
     Is it only data, or are labels included?

What variables are where?
     How are records formatted?
     How many records per case?
     Columns?  Delimiters?
     String vs. numeric variables?

How is missing data coded?

Value labels?

Define variables

Variable View in data editor
Assign variable names in an organized manner; keep them short
Type: allows you to define string, numeric, and other variable types
Width: total columns allowed
Label: allows you to give longer references to variables
     Variable names are 8 characters or fewer and start with a letter
     Variable labels can be longer and can include spaces
Values: assign descriptive labels (strings) to numeric codes
Missing: define discrete missing values or ranges of missing values for variables
Columns: width of the column as displayed
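
The idea behind value labels and missing-value definitions can be sketched in a few lines of Python. This is only an illustration of the concept, not how SPSS stores these definitions. The codes, labels, and data below are made up.

```python
# Hypothetical definitions for one variable, mirroring the Values and
# Missing settings described above.
VALUE_LABELS = {1: "Male", 2: "Female"}
MISSING_CODES = {9}          # a discrete missing value
MISSING_RANGE = (97, 99)     # a range of missing values, inclusive

def is_missing(value):
    """True if the value is a declared missing code or falls in the range."""
    low, high = MISSING_RANGE
    return value in MISSING_CODES or low <= value <= high

def display(value):
    """Show the label for a value; the stored number itself is unchanged."""
    if is_missing(value):
        return "(missing)"
    return VALUE_LABELS.get(value, str(value))

sex = [1, 2, 9, 2, 98]       # made-up data
shown = [display(v) for v in sex]
print(shown)
```

The stored data stay numeric throughout. Labels and missing-value codes only change how the values are displayed and which cases enter an analysis.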

Edit menu

This works very much like common spreadsheet editors
     Cut, copy, paste, find

Options

     General
     I prefer to open a syntax window when the program launches
     Display order for variable lists

     Viewer/Draft viewer
     Fonts, page width and length

     Output labels
     Variable names and/or labels

     Charts
     Aspect ratio, frame, font


File menu (cont.)

Save and Save as: data can be saved in various formats
SPSS portable files are transportable across Mac, Unix, Windows platforms
SPSS files save variable definitions and formats with the data
Tab-delimited files save data, with or without variable names
      Value labels are not saved
Fixed ASCII: writes variables to fixed columns


View menu

Grid lines
Value labels
Switch to variable view

Utilities menu

Variables
Shows the format and characteristics for variables

File info
Gives information about all the variables in a file

Data menu

Insert variable in spreadsheet
Insert case in spreadsheet

Sort cases
Ascending or descending
Multiple variables
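
Sorting on more than one variable, with a different direction for each, can be sketched in Python as follows. The variable names (ses, rdg) follow the handout's examples and the values are made up.

```python
cases = [
    {"id": 1, "ses": 2, "rdg": 47.0},
    {"id": 2, "ses": 1, "rdg": 55.0},
    {"id": 3, "ses": 2, "rdg": 63.0},
    {"id": 4, "ses": 1, "rdg": 39.0},
]

# Goal: ses ascending, and within each ses level rdg descending.
# Python's sort is stable, so sort on the secondary key first and on
# the primary key last.
by_rdg_desc = sorted(cases, key=lambda c: c["rdg"], reverse=True)
sorted_cases = sorted(by_rdg_desc, key=lambda c: c["ses"])

order = [c["id"] for c in sorted_cases]
print(order)  # -> [2, 4, 3, 1]
```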

Split file
Allows you to repeat analyses by groups
It stays on until you turn it off
The file must be sorted by the grouping variables (SPSS does this by default)
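
A rough Python analogue of repeating an analysis by groups is shown below. It also shows why the sort matters: `itertools.groupby` only merges adjacent rows, so unsorted data would produce fragmented groups. The data are made up.

```python
from itertools import groupby
from statistics import mean

cases = [("F", 52.0), ("M", 47.0), ("F", 60.0), ("M", 55.0)]  # (sex, rdg)

# Sort by the grouping variable first; groupby only merges adjacent rows.
cases_sorted = sorted(cases, key=lambda c: c[0])

# Repeat the same analysis (here, a mean) once per group.
group_means = {
    sex: mean(score for _, score in rows)
    for sex, rows in groupby(cases_sorted, key=lambda c: c[0])
}
print(group_means)  # -> {'F': 56.0, 'M': 51.0}
```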

Select cases
Allows you to filter out a subset of cases
It is on until you turn it off
Filter rather than delete cases
Selecting conditionally: keypad, formulas, and functions
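
The difference between filtering and deleting can be sketched in Python. Filtering adds a 0/1 flag and leaves every case in the file, so the selection can be undone later. The selection rule and the data here are hypothetical.

```python
cases = [
    {"id": 1, "sex": "F", "rdg": 52.0},
    {"id": 2, "sex": "M", "rdg": 47.0},
    {"id": 3, "sex": "F", "rdg": 60.0},
]

def condition(case):
    """Hypothetical selection rule: females with a reading score above 50."""
    return case["sex"] == "F" and case["rdg"] > 50

# Filtering: flag each case and leave the data intact.
for case in cases:
    case["filter"] = 1 if condition(case) else 0

# Analyses then use only the flagged cases.
selected = [c["id"] for c in cases if c["filter"] == 1]
print(selected)    # -> [1, 3]
print(len(cases))  # -> 3 (nothing was deleted)
```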

Transform menu

Compute
Allows you to construct a new variable
This can be done conditionally
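
As a sketch of what constructing a variable means, the Python below builds one new variable for every case and another only where a condition holds. The variable names and data are made up. Treating a missing input as producing a missing result is an assumption built into this sketch.

```python
cases = [
    {"sex": "F", "rdg": 52.0, "wrtg": 48.0},
    {"sex": "M", "rdg": 47.0, "wrtg": None},   # None marks a missing value
]

for case in cases:
    # Unconditional: average of two scores.
    # Assumption: if either input is missing, the result is missing.
    if case["rdg"] is None or case["wrtg"] is None:
        case["verbal"] = None
    else:
        case["verbal"] = (case["rdg"] + case["wrtg"]) / 2

    # Conditional: the new variable gets a value only where the
    # condition holds; other cases are left missing.
    case["f_rdg"] = case["rdg"] if case["sex"] == "F" else None

print([(c["verbal"], c["f_rdg"]) for c in cases])
```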

Recode
Allows you to assign new values to represent old values
Useful for collapsing interval variables to categorical variables
Be careful about recoding into the same variable, because the original values are overwritten
String variables can be recoded to numeric, and vice versa
Recodes can be done conditionally
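
Collapsing an interval variable into categories can be sketched in Python as below. The result goes into a new list, so the original scores are kept. The cut points (45 and 55) and the data are hypothetical.

```python
def recode_score(score):
    """Collapse an interval score into three ordered categories.

    The cut points are hypothetical: below 45 -> 1, 45 through 55 -> 2,
    above 55 -> 3. Missing values stay missing.
    """
    if score is None:
        return None
    if score < 45:
        return 1
    if score <= 55:
        return 2
    return 3

rdg = [38.0, 45.0, 55.0, 61.5, None]           # made-up data
rdg_cat = [recode_score(s) for s in rdg]       # recode into a NEW variable
print(rdg_cat)  # -> [1, 2, 2, 3, None]
```

Writing the result to a new variable means a mistake in the cut points can be fixed by rerunning the recode.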

Count
Counts the occurrences of a value (or set of values) across a list of variables within each case
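
A small Python sketch of counting within a case: for each case, count how many of a set of variables hold a target value. The item names (q1 to q3) and the data are made up.

```python
cases = [
    {"q1": 1, "q2": 0, "q3": 1},
    {"q1": 0, "q2": 0, "q3": 0},
    {"q1": 1, "q2": 1, "q3": 1},
]
items = ["q1", "q2", "q3"]
target = 1   # the value being counted, e.g. a "yes" response

# One count per case, taken across the listed variables.
n_yes = [sum(1 for name in items if case[name] == target) for case in cases]
print(n_yes)  # -> [2, 0, 3]
```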

Categorize variables
Collapses a variable with many values to a specific number of categories
Done by percentiles
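
Categorizing by percentiles can be sketched with Python's `statistics.quantiles`, here splitting scores into four groups at the quartiles. The data are made up. Statistical packages use slightly different percentile formulas, so the exact cut points may not match what SPSS reports.

```python
from bisect import bisect_left
from statistics import quantiles

scores = [50, 10, 80, 30, 20, 70, 40, 60]   # made-up data

# Three cut points divide the scores into four groups.
cuts = quantiles(scores, n=4)

# Group 1 is the lowest quarter, group 4 the highest. A score exactly
# equal to a cut point lands in the lower group.
groups = [bisect_left(cuts, s) + 1 for s in scores]
print(cuts)
print(groups)
```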

Automatic recode
Recodes string or numeric variables to consecutive integers
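
The idea of automatic recoding can be sketched in Python: each distinct value, taken in sorted order, gets the next consecutive integer, and the original value is kept as a label for that integer. The variable and its values are made up.

```python
majors = ["math", "art", "physics", "art", "math"]   # made-up string data

# Assign 1, 2, 3, ... to the distinct values in sorted order.
codes = {value: i for i, value in enumerate(sorted(set(majors)), start=1)}
major_num = [codes[m] for m in majors]

# Keep the original strings as labels for the new integer codes.
labels = {code: value for value, code in codes.items()}

print(major_num)  # -> [2, 1, 3, 1, 2]
print(labels)     # -> {1: 'art', 2: 'math', 3: 'physics'}
```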

Rank cases
Different types of ranks
Different ways of dealing with ties
Can be done by subgroups
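
The effect of the tie-handling choice is easiest to see in a small example. This Python sketch ranks scores in ascending order and resolves ties three ways. The function name and the option names are my own, not SPSS keywords.

```python
def rank(values, ties="mean"):
    """Rank values in ascending order (1 = smallest).

    ties: "mean" gives tied values the average of their ranks,
          "low" gives them the lowest rank, "high" the highest.
    """
    ordered = sorted(values)
    ranks = []
    for v in values:
        low = ordered.index(v) + 1           # first position of v
        high = low + ordered.count(v) - 1    # last position of v
        choice = {"mean": (low + high) / 2, "low": low, "high": high}
        ranks.append(choice[ties])
    return ranks

scores = [50, 40, 50, 60]                    # made-up data with one tie
print(rank(scores))           # -> [2.5, 1.0, 2.5, 4.0]
print(rank(scores, "low"))    # -> [2, 1, 2, 4]
print(rank(scores, "high"))   # -> [3, 1, 3, 4]
```

Ranking within subgroups amounts to calling the same function separately on each group's values.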


Data Analysis: The Analyze Menu


Case summary
Allows you to print values for selected cases and variables
Can be done for subgroups
Can also produce some descriptive statistics


Descriptive Statistics

Frequencies
Gets a table of counts and percentages
Usually used for categorical variables and ordinal variables with not too many categories
Additional statistics like percentiles can be obtained
Most summary statistics like means are better obtained from other procedures
The format subcommand can be used to order the table by counts or values
Graphs are better obtained from graphing procedures

Crosstabs
Gets contingency tables (two-way tables)
Used for summarizing two categorical or ordinal variables at once
Cell information can include observed and expected frequencies, row and column %
Layers can give you tables by subgroups (i.e., three-way tables)
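
To make the cell information concrete, this Python sketch builds a two-way table from made-up data and computes observed counts, expected counts, and row percentages. The variable names (hsp, sex) follow the handout's examples and the category values are hypothetical. Expected counts use the usual independence formula: row total times column total, divided by N.

```python
from collections import Counter

# Made-up (hsp, sex) pairs, one per case.
pairs = [("public", "F"), ("public", "M"), ("public", "F"),
         ("private", "M"), ("private", "M"), ("public", "M")]

observed = Counter(pairs)
row_totals = Counter(r for r, _ in pairs)
col_totals = Counter(c for _, c in pairs)
n = len(pairs)

# Expected count under independence: row total * column total / N.
expected = {(r, c): row_totals[r] * col_totals[c] / n
            for r in row_totals for c in col_totals}

# Row percentage: the cell count as a percentage of its row total.
row_pct = {(r, c): 100 * observed[(r, c)] / row_totals[r]
           for r in row_totals for c in col_totals}

for cell in sorted(expected):
    print(cell, observed[cell], round(expected[cell], 2), row_pct[cell])
```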

Descriptives
Computes summary statistics like mean, standard deviation, minimum, maximum
Options lets you select the statistics you want and order the display of results
This procedure can also save standardized values (z-scores) to the data set
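
A standardized value (z-score) is the raw value minus the mean, divided by the standard deviation. The Python sketch below computes the summary statistics and the z-scores for a made-up variable, using the sample standard deviation (n - 1 in the denominator).

```python
from statistics import mean, stdev

rdg = [40.0, 50.0, 60.0]     # made-up data

m = mean(rdg)
s = stdev(rdg)               # sample standard deviation (n - 1 denominator)
summary = {"mean": m, "sd": s, "min": min(rdg), "max": max(rdg)}

# z-score: how many standard deviations a value lies from the mean.
z_rdg = [(x - m) / s for x in rdg]

print(summary)
print(z_rdg)
```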

Explore
Computes summary statistics and produces simple graphs and plots
As the name says, it is useful for exploring data
Stem and leaf plots and boxplots can be used to identify outliers
Analyses can be conducted by groups to explore differences
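
Boxplots conventionally flag points that lie more than 1.5 times the interquartile range beyond the quartiles. The Python sketch below applies that rule to made-up scores. The quartile formula used by `statistics.quantiles` may differ slightly from the one SPSS uses, so treat the exact fences as approximate.

```python
from statistics import quantiles

def iqr_outliers(values):
    """Return values more than 1.5 * IQR below Q1 or above Q3."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

sci = [48, 50, 51, 52, 53, 55, 56, 95]   # made-up data with one extreme score
print(iqr_outliers(sci))  # -> [95]
```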

Bivariate correlations
Computes correlations between two ordinal or interval variables
Will produce correlation matrices
One can specify listwise or pairwise treatment of missing values
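
The difference between listwise and pairwise treatment of missing values shows up in which cases enter each correlation. In this Python sketch (made-up data, `None` marking a missing value), listwise deletion drops any case missing on any variable in the analysis, while pairwise deletion drops a case only from the correlations that involve its missing variable.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def complete(cases, names):
    """Keep only the cases with no missing value on the named variables."""
    return [c for c in cases if all(c[v] is not None for v in names)]

cases = [
    {"rdg": 40, "wrtg": 42, "math": 45},
    {"rdg": 50, "wrtg": 49, "math": None},   # missing on math only
    {"rdg": 60, "wrtg": 61, "math": 58},
    {"rdg": 55, "wrtg": 50, "math": 52},
]

# Correlation of rdg with wrtg under each rule.
listwise = complete(cases, ["rdg", "wrtg", "math"])   # 3 cases remain
pairwise = complete(cases, ["rdg", "wrtg"])           # all 4 cases remain

r_listwise = pearson([c["rdg"] for c in listwise], [c["wrtg"] for c in listwise])
r_pairwise = pearson([c["rdg"] for c in pairwise], [c["wrtg"] for c in pairwise])
print(len(listwise), r_listwise)
print(len(pairwise), r_pairwise)
```

Pairwise deletion uses more of the data, but each correlation in the matrix may then be based on a different set of cases.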



Graphs menu

Bar charts: summarize variables by groups and subgroups

Simple: displays summaries by groups
Display frequencies or percents for groups [Ex: N for HSP]
Display summary statistics for a variable by groups [Ex: Mean(RDG) by HSP]
Statistical displays for a set of variables [Ex: Medians (RDG WRTG MATH)]

Cluster: displays summaries by groups and subgroups (clusters) [Ex: percentages for HSP SEX]
An elaboration of simple bar charts
Bars for "missing" can be discarded in the options menu
Can use pairwise or listwise deletion

Histograms: summarize an interval variable [Ex: SCI with normal curve]
A normal curve can be superimposed

Line charts: summarize by groups

Simple: like a simple bar chart, but the midpoints of the bar tops are connected [Ex: Mean(RDG--CIV)]

Multiple: analogous to cluster bar charts
Can get separate lines for groups or variables
Series--transpose data: switches groups with variables [Ex: Mean(RDG--CIV) by SES]

Scatterplots: show relationships between variables

Simple: shows the relationship between two variables [Ex: RDG with WRTG by SEX]
Axes are usually interval variables
Markers can be used for separate groups
Options can fit regression lines to the total group or subgroups
Sunflowers mark the number of overlapping (hidden) cases at a point (can't be done for subgroups)

Some other charts
Pie, area, high-low, boxplots, control, time series


Editing and Saving Output

Output Viewer

Clicking an item in the outline tree on the left takes you to the corresponding table on the right
Double-click on a table to enter edit mode
   (or go to the Edit menu and open/edit the SPSS Pivot Table Object)
Double-click on the cell you wish to change
Somewhat limited editing options for tables
Objects can be cut and pasted to other applications

File Menu

Tables can be saved as SPSS output files (.spo)
They can be exported as HTML or text files

Editing charts

Done in the chart editor (or in chart windows for other versions of SPSS)
Double-click on chart to open it in chart editor
Most changes are made by drilling down into the chart, highlighting an element, and selecting from menus
Depending on the type of chart being edited, different characteristics can be changed, including:

       Axis scaling and ticks
       Number of intervals
       Interpolation, fitting lines, reference line
       Labels, text styles, and fonts
       Titles, footnotes, annotation, and legends
       Fill patterns, spacing, and colors for bars
       Bar style
       Marker colors, styles, and sizes
       Line styles and colors
       Swapping axes
       Transposing data

Saving charts

Save as SPSS output (.spo)
Export as JPEG, PICT, PNG, TIFF, or BMP (these are not SPSS files)
Cut and paste to another application