## Course: Reading In and Transforming Variables for Analysis in SPSS (tutorial and practical exercise guide, PDF)

**Reading in and Transforming Variables for Analysis in SPSS**

Before any data can be analyzed by SPSS with the technique termed analysis of variance (ANOVA), the data to be analyzed must be introduced to, or entered into, SPSS. Subsequent chapters explain how to perform the several variants of ANOVA presuming that the data are already entered. This chapter’s focus is to provide information on how to get the variables into SPSS beforehand.

Methods for reading in or directly entering the data are described, as well as those for performing simple data transformations (e.g., computing an average).

**READING IN DATA WITH SYNTAX**

Before examining the syntax in Fig. 2.1, the reader is strongly advised to reread the syntax conventions discussed in the previous chapter. For example, the line numbers are not to be typed in.

The first statement in Fig. 2.1, “TITLE”, is an optional command (i.e., it is perfectly acceptable to leave it off; note the o beside the line number) that allows the user to specify a title in the printout. You decide what the title should be. In this example, ‘a one-factor anova design’ was used. The title does not in any way affect the analysis. It will simply appear at the top of each page of the printout.

The space between the command “TITLE” and the actual title is required. If you wish to use a long descriptive title, you may continue the title on the next line. To do this, simply indent the second line one space. However, SPSS will only repeat the first 60 letters of the title at the top of every page. Additional descriptive information can be added to the top of every page with the “SUBTITLE” command.

The “SUBTITLE” command is placed on the line following the “TITLE” command, with the actual subtitle separated from the “SUBTITLE” command by a space as follows: SUBTITLE example from chapter 2 of Page et al.

By default on many platforms, SPSS prints the results or output of your requested analyses on lines that are 132 characters wide. The optional command on line 2 reduces the size of the output to 80 columns, which will make it easier for you to see the entire output on your computer screen and, moreover, the output will fit on an 8.5- × 11-in. piece of paper. In SPSS for Windows, this control over output size is given on Edit–Options–Viewer (or Draft Viewer); then click the desired alternative on Text Output.

**Entering Data with the “DATA LIST” Command**

One of the most crucial steps in programming is telling the statistical package how to “read” your data file. There are several ways to do this, including, in Windows, typing values directly into the Data Editor Window, which is described later. The most general method, available to both non-Windows and Windows users, is through the use of syntax, specifically the “DATA LIST” command on line 3 in Fig. 2.1, which tells SPSS (a) where the data are and (b) what value to give each variable (or measure, or score) for each participant. If you have a very small data set, you may want to type the data within the SPSS program, as was done in the example of Fig. 2.1. If your data set is large, however, you may prefer to type the data in a separate file containing only data, called an external file, which you will read into the program with the “DATA LIST” command, to be described later in this chapter.

The first example, however, assumes that the data are within the SPSS program, as in the example in Fig. 2.1. As seen in line 3, the command “DATA LIST” is followed by a keyword describing the type of format of the data, “FIXED” or “FREE” (more on this later). This line is followed by a subcommand (here in line 4; recall, however, that subcommands need not be on different lines) that provides the names you wish to give the variables and, for “FIXED” format, their column locations.

Thus, this subcommand will tell SPSS which variables are in which columns for “FIXED” or in which order for “FREE”. In this example of “FIXED”, as will be explained in more detail later, line 4 specifies that the variable to be called ‘facta’ is in column 1 and the variable to be called ‘dv’ is in columns 3 and 4. Because, in this example, the data are included in the program, the “BEGIN DATA” command on line 5 is used, followed by all of the data in lines 6 through 20 (in the columns or order specified on the “DATA LIST” command), followed by the “END DATA” command in line 21.

FIG. 2.1. Syntax commands to read in data.

**“FREE” or “FIXED” Data Format**

The data format can be “FIXED” or “FREE”. “FREE” format data indicates that each participant’s score on each variable will be separated by one or more blank spaces. Furthermore, scores on a given variable may be located in different columns for different participants. However, the measures must be entered in the same order for all participants. Following is an example of a “DATA LIST” command for “FREE” format. To be particularly clear here, the blank spaces in the data are indicated with a “^”:

DATA LIST FREE /id age m1 m2 m3.

SPSS will understand that, for the first participant (i.e., the first line of data just presented), the variable you wish to call ‘id’ is to have the value 142, the variable you want called ‘age’ is to have the value 48, that he or she is to get a 4 on the variable you want called ‘m1’ (perhaps shorthand for “Measure 1”), a 16 on ‘m2’, and a 7 on ‘m3’. For the second participant (second line of data), the participant’s ‘id’ is 78, his or her ‘age’ is 24, and he or she got a 1 on ‘m1’, a 2 on ‘m2’, and a 33 on ‘m3’. (Note that there is inconsistently more than one space between scores; this is completely permissible with the “FREE” data format.)
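The “FREE” rule can be illustrated outside SPSS. The following Python sketch (not SPSS syntax) mimics how scores separated by any number of blanks are assigned to the names in order. The function name `read_free`, the one-participant-per-line simplification, and the exact spacing of the sample lines are assumptions made for illustration; only the values come from the description above.

```python
def read_free(names, lines):
    """Assign whitespace-separated scores to variable names, in order."""
    records = []
    for line in lines:
        # split() with no argument treats any run of blanks as one separator
        values = line.split()
        if len(values) != len(names):
            raise ValueError(f"expected {len(names)} values, got {len(values)}")
        records.append({name: int(value) for name, value in zip(names, values)})
    return records


names = ["id", "age", "m1", "m2", "m3"]
# Spacing below is illustrative; the number of blanks between scores does not matter.
lines = ["142 48 4 16  7", " 78 24   1 2 33"]
for record in read_free(names, lines):
    print(record)
```

Because only the order of the scores matters, changing the number of blanks in either line leaves the resulting records unchanged.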

Names you wish to give to variables can have no more than eight characters and they cannot begin with a number. Additionally, there are some sets of letters that can form keywords for some commands and, therefore, must be avoided as names. The sets of letters that you cannot use as variable names are the following: ALL, AND, BY, EQ, GE, GT, LE, LT, NE, NOT, OR, TO, and WITH. Ideally, the variable names should also be mnemonic, easily recognized by you later. For example, if the first variable in the data file represents a participant’s identification number, you might call that variable ‘subjid’ or ‘id’. The program must be consistent in the use of the names in the “DATA LIST” and other later (e.g., “MANOVA”) commands referring to the same variables.
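As a quick way to check candidate names before typing them into a “DATA LIST”, the naming rules just listed can be written as a small Python sketch. The function name `is_valid_name` is hypothetical, and the check covers only the three rules stated here (length, first character, reserved keywords); any further restrictions SPSS may impose are not modeled.

```python
# Reserved keywords that cannot be used as variable names
RESERVED = {"ALL", "AND", "BY", "EQ", "GE", "GT", "LE", "LT",
            "NE", "NOT", "OR", "TO", "WITH"}


def is_valid_name(name):
    """Check only the three rules stated in the text."""
    if not 1 <= len(name) <= 8:       # no more than eight characters
        return False
    if name[0].isdigit():             # cannot begin with a number
        return False
    return name.upper() not in RESERVED  # keywords are compared ignoring case


for candidate in ["subjid", "id", "2ndscore", "participant", "to"]:
    print(candidate, is_valid_name(candidate))
```

Here ‘subjid’ and ‘id’ pass, while ‘2ndscore’ (starts with a number), ‘participant’ (more than eight characters), and ‘to’ (a reserved keyword) are rejected.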

“FIXED” is the other common data format. It is the default and thus the keyword “FIXED” does not actually have to be typed in if your data are in “FIXED” format. “FIXED” format means that the data are organized so that each variable is stored in a particular column (or columns). In this format, the subcommand contains an ordered list of the variable names you wish to use, each followed by the specific column or a successive series of columns where that variable is found. The columns containing a specific measure must be the same for all participants. Here is an example:

DATA LIST FIXED /id 1-3 age 4-5 m1 6 m2 7-8 m3 9-10.

Note that each variable is followed by a single number or a pair of numbers. The ‘6’ following ‘m1’, for example, tells SPSS that ‘m1’ can always be found in column 6 for every participant. In contrast, ‘id’, ‘age’, ‘m2’, and ‘m3’ each occupy more than one column; the first number following each refers to the column containing the first digit of the variable and the final number refers to the column containing the last digit of the variable. These are separated by a dash (‘-’) in the subcommand. Thus, ‘id’ is in columns 1 through 3 and ‘age’ is in columns 4 through 5. For the following data:

14248416^7

^78241^233

the first participant’s ID number is 142 (first 3 columns), his or her ‘age’ is 48, and he or she got a 4 on ‘m1’, a 16 on ‘m2’, and a 7 on ‘m3’. For the second participant (i.e., second line of data), the ‘id’ is 78, the ‘age’ is 24, and ‘m1’, ‘m2’, and ‘m3’ are 1, 2, and 33, respectively. Note that, when a variable is declared by the “DATA LIST” to have more than one column, but a certain participant has a value that requires fewer columns than specified, the columns to the left are blank. For example, whereas ‘m2’ has columns 7 through 8 devoted to it, the second participant’s value is only one column long, the value 2. The initial column (i.e., 10’s place) is therefore left blank (this process is called right justifying).
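The column logic of “FIXED” format can be traced with a short Python sketch (again, an emulation rather than SPSS syntax). The function name `read_fixed` and the dictionary layout of the column specification are assumptions for illustration; the column ranges and the two data lines are the ones given above, with “^” standing for a blank column.

```python
def read_fixed(spec, line):
    """Slice one data line by 1-based, inclusive column ranges."""
    record = {}
    for name, (first, last) in spec.items():
        field = line[first - 1:last]  # convert 1-based columns to a Python slice
        record[name] = int(field)     # int() ignores the leading blank of a right-justified value
    return record


# Mirrors: DATA LIST FIXED /id 1-3 age 4-5 m1 6 m2 7-8 m3 9-10.
spec = {"id": (1, 3), "age": (4, 5), "m1": (6, 6), "m2": (7, 8), "m3": (9, 10)}

for raw in ["14248416^7", "^78241^233"]:
    line = raw.replace("^", " ")  # "^" marks a blank column in the printed example
    print(read_fixed(spec, line))
```

The second line shows why right justifying matters: the value 2 for ‘m2’ sits in column 8, so column 7 is blank and the score is still read from the declared columns.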

If the variable in, say, the sixth column were called ‘m1’, the one in the seventh column ‘m2’, and the one in the eighth ‘m3’, you could give the column numbers just once, as with:

/m1 m2 m3 6-8. or /m1 TO m3 6-8.

If the variables took up more than one space, but all took up the same number of spaces, the same economy of space indication would be possible. For example:

/k1 k2 k3 10-15. or /k1 TO k3 10-15.

would mean that the variable ‘k1’ is in spaces 10 and 11, ‘k2’ is in 12 and 13, and ‘k3’ is in 14 and 15.

The following three subcommand lines tell SPSS the same thing and are interchangeable:

/id 1-3 age 4-5 m1 6 m2 7 m3 8 iq 24-26.

/id 1-3 age 4-5 m1 m2 m3 6-8 iq 24-26.

/id 1-3 age 4-5 m1 TO m3 6-8 iq 24-26.
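To see why these three lines are interchangeable, the following Python sketch expands each subcommand into an explicit name-to-columns mapping. It is an emulation written for this explanation, not SPSS's own parser: the function names `expand_to` and `parse_subcommand` are hypothetical, and the sketch assumes that a shared column range is divided evenly among the names listed before it, as described above.

```python
import re


def expand_to(first, last):
    """Expand e.g. m1 TO m3 into ['m1', 'm2', 'm3'] (same stem, ascending suffix)."""
    a = re.fullmatch(r"([A-Za-z_]+)(\d+)", first)
    b = re.fullmatch(r"([A-Za-z_]+)(\d+)", last)
    if not a or not b or a.group(1) != b.group(1):
        raise ValueError("TO needs names that share a stem and end in a number")
    return [f"{a.group(1)}{i}" for i in range(int(a.group(2)), int(b.group(2)) + 1)]


def parse_subcommand(text):
    """Map each variable name to its (first, last) columns."""
    tokens = text.strip().lstrip("/").rstrip(".").split()
    columns, pending, i = {}, [], 0
    while i < len(tokens):
        token = tokens[i]
        if re.fullmatch(r"\d+(-\d+)?", token):
            if not pending:
                raise ValueError("column specification without a variable name")
            # A column specification applies to every name waiting in `pending`
            first, _, last = token.partition("-")
            first, last = int(first), int(last or first)
            width, remainder = divmod(last - first + 1, len(pending))
            if remainder:
                raise ValueError("columns do not divide evenly among the variables")
            for k, name in enumerate(pending):
                columns[name] = (first + k * width, first + (k + 1) * width - 1)
            pending = []
            i += 1
        elif token.upper() == "TO":
            # Replace the name before TO with the whole numbered series
            pending = pending[:-1] + expand_to(pending[-1], tokens[i + 1])
            i += 2
        else:
            pending.append(token)
            i += 1
    return columns


variants = [
    "/id 1-3 age 4-5 m1 6 m2 7 m3 8 iq 24-26.",
    "/id 1-3 age 4-5 m1 m2 m3 6-8 iq 24-26.",
    "/id 1-3 age 4-5 m1 TO m3 6-8 iq 24-26.",
]
results = [parse_subcommand(v) for v in variants]
print(results[0])
print(results[0] == results[1] == results[2])
print(parse_subcommand("/k1 TO k3 10-15."))
```

All three variants produce the same mapping, and the ‘k1 TO k3 10-15’ case yields two columns per variable, matching the description of ‘k1’ in 10 and 11, ‘k2’ in 12 and 13, and ‘k3’ in 14 and 15.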

The “TO” keyword is an excellent shortcut for entering a series of variables whose names differ only by the sequential number at the end. (In subsequent commands, the “TO” keyword can be used in a different way, as a shortcut to identify variables that were sequentially named on the “DATA LIST” or later created with transformations. For example, suppose the “DATA LIST” creates data in this order: q2, x, v3, iq, v4. Then ‘q2 TO v4’ can be used in later commands to refer to this set of successive variables.)
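The second, positional use of “TO” depends only on the order in which the variables were created, not on their names. A minimal Python sketch of that idea follows; the function name `expand_positional_to` is hypothetical, and the variable order is the one from the example above.

```python
def expand_positional_to(ordered_names, first, last):
    """Return every variable from `first` through `last` in creation order."""
    start = ordered_names.index(first)
    stop = ordered_names.index(last)
    if start > stop:
        raise ValueError(f"{first} comes after {last} in the variable order")
    return ordered_names[start:stop + 1]


# Order in which the example DATA LIST created the variables
order = ["q2", "x", "v3", "iq", "v4"]
print(expand_positional_to(order, "q2", "v4"))
print(expand_positional_to(order, "x", "iq"))
```

The first call returns all five variables; the second returns only ‘x’, ‘v3’, and ‘iq’, even though those names share no stem or numbering.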

Look back at Fig. 2.1, beginning with line 6, and observe the succeeding rows. The first number in each row ranges between 1 and 3; that is because there are three values to the variable called ‘facta’. The first five participants (each having a separate line) are in the first value or “level” of ‘facta’, the next five are in the second level or group, and so on. The second number for each participant refers to that participant’s score on the dependent variable, called ‘dv’.

**Some Special Cases**

You can leave blank spaces between the numbers in “FIXED” format; just be sure to skip the same columns each time and be sure to identify the correct starting columns for your variables. Occasionally, you might have a string (i.e., text, word, or alphabetic) variable, in which the value is not a number, but a letter or string of letters. For example, imagine that you have recorded gender in the data in column 7 as M or F, rather than, say, 1 or 2. In this case, you would follow its name and column in the subcommand with an “(a)”, that is, ‘gender 7 (a)’.

Sometimes you may have a variable that inherently contains a decimal place but you have not actually typed the decimal place in the data. In this case, you may identify the number of decimal places you wish the variable to have in parentheses in the subcommand (e.g., ‘gpa 8-10(2)’).
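Both special cases can be traced with a small Python sketch that emulates the reading rule (it is not SPSS syntax). The function name `read_field`, its `kind` and `decimals` parameters, and the sample data line are hypothetical; the line simply places a gender code in column 7 and a grade point average typed without its decimal point in columns 8 through 10.

```python
def read_field(line, first, last, kind="numeric", decimals=0):
    """Read one field from 1-based, inclusive columns.

    kind="string" returns the text unchanged, like an (a) variable;
    decimals=n shifts the decimal point n places left, like (n).
    """
    field = line[first - 1:last]
    if kind == "string":
        return field
    value = int(field)
    return value / 10 ** decimals if decimals else value


# Hypothetical data line: columns 1-6 blank, gender in column 7, gpa in columns 8-10
line = " " * 6 + "M" + "325"

print(read_field(line, 7, 7, kind="string"))  # string variable
print(read_field(line, 8, 10, decimals=2))    # implied two decimal places
```

With two implied decimal places, the digits 325 are read as 3.25, and the gender code is kept as the letter M rather than converted to a number.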

1 USING SPSS AND USING THIS BOOK

Conventions for Syntax Programs

Creating Syntax Programs in Windows

2 READING IN AND TRANSFORMING VARIABLES FOR ANALYSIS IN SPSS

Reading In Data With Syntax

Entering Data with the “DATA LIST” Command

“FREE” or “FIXED” Data Format

Syntax for Using External Data

Data Entry for SPSS for Windows Users

Importing Data

Saving and Printing Files

Opening Previously Created and Saved Files

Output Examination

Data Transformations and Case Selection

“COMPUTE”

“IF”

“RECODE”

“SELECT IF”

Data Transformations with PAC

3 ONE-FACTOR BETWEEN-SUBJECTS ANALYSIS OF VARIANCE

Basic Analysis of Variance Commands

Testing the Homogeneity of Variance Assumption

Comparisons

Planned Contrasts

Post Hoc Tests

Trend Analysis

Monotonic Hypotheses

PAC

4 TWO-FACTOR BETWEEN-SUBJECTS ANALYSIS OF VARIANCE

Basic Analysis of Variance Commands

The Interaction

Unequal N Factorial Designs

Planned Contrasts and Post Hoc Analyses of Main Effects

Exploring a Significant Interaction

Simple Effects

Simple Comparisons and Simple Post Hocs

Interaction Contrasts

Trend Interaction Contrasts and Simple Trend Analysis

PAC

5 THREE (AND GREATER) FACTOR BETWEEN-SUBJECTS ANALYSIS OF VARIANCE

Basic Analysis of Variance Commands

Exploring a Significant Three-Way Interaction

Simple Two-Way Interactions

A Nonsignificant Three-Way: Simple Effects

Interaction Contrasts, Simple Comparisons, Simple Simple Comparisons, and Simple Interaction Contrasts

Collapsing (Ignoring) a Factor

More Than Three Factors

PAC

6 ONE-FACTOR WITHIN-SUBJECTS ANALYSIS OF VARIANCE

Basic Analysis of Variance Commands

Analysis of Variance Summary Tables

Correction for Bias in Tests of Within-Subjects Factors

Planned Contrasts

The “TRANSFORM/RENAME” Method for Nonorthogonal Contrasts

The “CONTRAST/WSDESIGN” Method for Orthogonal Contrasts

Post Hoc Tests

PAC

7 TWO- (OR MORE) FACTOR WITHIN-SUBJECTS ANALYSIS OF VARIANCE

Basic Analysis of Variance Commands

Analysis of Variance Summary Tables

Main Effect Contrasts

Analyzing Orthogonal Main Effects Contrasts (Including Trend Analysis)

Using “CONTRAST/WSDESIGN”

Nonorthogonal Main Effects Contrasts Using “TRANSFORM/RENAME”

Simple Effects

Analyzing Orthogonal Simple Comparisons Using “CONTRAST/WSDESIGN”

Analyzing Orthogonal Interaction Contrasts Using “CONTRAST/WSDESIGN”

Nonorthogonal Simple Comparisons Using “TRANSFORM/RENAME”

Nonorthogonal Interaction Contrasts Using “TRANSFORM/RENAME”

Post Hocs

More Than Two Factors

8 TWO-FACTOR MIXED DESIGNS IN ANALYSIS OF VARIANCE: ONE BETWEEN-SUBJECTS FACTOR AND ONE WITHIN-SUBJECTS FACTOR

Basic Analysis of Variance Commands

Main Effect Contrasts

Between-Subjects Factor(s)

Within-Subjects Factor(s)

Interaction Contrasts

Simple Effects

Simple Comparisons

Post Hocs and Trend Analysis

9 THREE- (OR GREATER) FACTOR MIXED DESIGNS

Simple Two-Way Interactions

Simple Simple Effects

Main Effect Contrasts and Interaction Contrasts

Simple Contrasts: Simple Comparisons, Simple Simple Comparisons, and Simple Interaction Contrasts

10 ANALYSIS OF COVARIANCE

Testing the Homogeneity of Regression Assumption

Multiple Covariates

Contrasts

Post Hocs

Multiple Between-Subjects Factors

ANCOVAs in Designs With Within-Subjects Factors

Constant Covariate

Varying Covariate

11 DESIGNS WITH RANDOM FACTORS

Random Factors Nested in Fixed Factors

Subjects as Random Factors in Within-Subjects Designs: The One-Line-per-Level Setup

The One-Factor Within-Subjects Design

Two-Factor Mixed Design

Using One-Line-per-Level Setup to Get Values to Manually Compute Adjusted Means in Varying Covariate Within-Subjects ANCOVA

12 MULTIVARIATE ANALYSIS OF VARIANCE: DESIGNS WITH MULTIPLE DEPENDENT VARIABLES TESTED SIMULTANEOUSLY

Basic Analysis of Variance Commands

Multivariate Planned Contrasts and Post Hocs

Extension to Factorial Between-Subjects Designs

Multiple Dependent Variables in Within-Subject Designs: Doubly Multivariate Designs

Contrasts in Doubly Multivariate Designs

13 GLM AND UNIANOVA SYNTAX

One-Factor Between-Subjects ANOVA

Basic Commands

Contrasts

Post Hoc Tests

Two-Factor Between-Subjects ANOVA

Unequal N

Main Effects Contrasts and Post Hocs

Simple Effects

Simple Comparisons

Interaction Contrasts

Three or More Factor ANOVA

One-Factor Within-Subjects ANOVA

Basic Commands

Planned Contrasts

Post Hoc Tests

Two or More Factor Within-Subjects ANOVA

Main Effect and Interaction Contrasts

Simple Effects and Simple Comparisons

Mixed Designs

More Complex Analyses

REFERENCES
