### Data Loading and Management

#### 1. Importing Data

```sps
* Import from Excel.
GET DATA
  /TYPE=XLSX
  /FILE="C:\YourPath\YourData.xlsx"
  /SHEET=NAME 'Sheet1'
  /CELLRANGE=FULL
  /READNAMES=ON.
EXECUTE.

* Import from CSV/Text file (FIRSTCASE=2 skips the header row).
GET DATA
  /TYPE=TXT
  /FILE="C:\YourPath\YourData.csv"
  /DELIMITERS=","
  /ARRANGEMENT=DELIMITED
  /FIRSTCASE=2
  /VARIABLES=
    Var1 A10
    Var2 F8.2
    Var3 F8.0.
EXECUTE.

* Import from other SPSS files (.sav).
GET FILE='C:\YourPath\YourData.sav'.
```

#### 2. Saving Data

```sps
* Save current dataset.
SAVE OUTFILE='C:\YourPath\ProcessedData.sav'
  /COMPRESSED.
```

#### 3. Data Selection

```sps
* These commands drop cases permanently unless preceded by TEMPORARY.

* Select cases based on condition.
SELECT IF (Gender = 1 AND Age > 25).

* Select random sample (e.g., 50% of cases).
SAMPLE .5.

* Select first N cases.
N OF CASES 100.
```

#### 4. Merging Data

```sps
* Add variables (matching cases). Both files must be sorted by ID.
SORT CASES BY ID.
MATCH FILES
  /FILE=*
  /FILE="C:\YourPath\AdditionalData.sav"
  /BY ID.
EXECUTE.

* Add cases (matching variables).
ADD FILES
  /FILE=*
  /FILE="C:\YourPath\MoreCases.sav".
EXECUTE.
```

### Data Processing

#### 1. Variable Definition

```sps
* Define variable labels.
VARIABLE LABELS
  Var1 "Demographic Variable"
  Var2 "Continuous Measurement".

* Define value labels.
VALUE LABELS Var1
  1 "Male"
  2 "Female".

* Define missing values.
MISSING VALUES Var2 (99, 999).

* Change variable type.
ALTER TYPE Var1 (F8.0).
```

#### 2. Computing New Variables

```sps
* Compute a new variable.
COMPUTE TotalScore = Q1 + Q2 + Q3.

* Conditional computation (adjust the age bounds as needed).
IF (Age >= 18 AND Age <= 65) NewGroup = 3.
EXECUTE.

* Recode values.
RECODE OldVar (1=1) (2=2) (3 thru 5=3) (ELSE=SYSMIS) INTO NewVar.
EXECUTE.
```

#### 3. Data Transformation

```sps
* Rank cases within groups (ascending).
RANK VARIABLES=Score (A) BY Group
  /RANK INTO RankScore.

* Recode categories into consecutive integer codes.
AUTORECODE VARIABLES=Gender
  /INTO Gender_Code.

* Create a 0/1 dummy variable (assumes Gender is coded 1/2).
COMPUTE Gender_Dummy = (Gender = 2).
EXECUTE.
```

### Data Cleaning and Validation

#### 1. Identifying Duplicates

```sps
* Identify duplicate cases based on ID.
SORT CASES BY ID.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=ID
  /Count_ID=N.

* List the duplicated cases without dropping the others.
TEMPORARY.
SELECT IF (Count_ID > 1).
LIST VARIABLES=ID Count_ID.
```

#### 2. Handling Missing Values

```sps
* List cases with missing values on any of the specified variables.
TEMPORARY.
SELECT IF (NMISS(Var1, Var2, Var3) > 0).
LIST VARIABLES=Var1 Var2 Var3.

* Delete cases with missing values (listwise deletion).
SELECT IF NOT (MISSING(Var1) OR MISSING(Var2)).
EXECUTE.

* Impute missing values with the variable mean (numeric variable).
* RMV writes the result to a new variable and leaves Var1 unchanged.
DESCRIPTIVES VARIABLES=Var1
  /STATISTICS=MEAN.
RMV /Var1_Imputed=SMEAN(Var1).
```

#### 3. Outlier Detection

```sps
* Explore outliers using boxplots (visual inspection).
EXAMINE VARIABLES=Var1 BY Group
  /PLOT=BOXPLOT
  /STATISTICS=NONE
  /NOTOTAL.

* Identify outliers using Z-scores (e.g., Z > 3 or Z < -3).
* DESCRIPTIVES /SAVE stores the standardized values as ZVar1.
DESCRIPTIVES VARIABLES=Var1
  /SAVE.
COMPUTE Outlier_Var1 = (ABS(ZVar1) > 3).
EXECUTE.
```

### Descriptive Statistics

#### 1. Basic Descriptives

```sps
* Frequencies for categorical variables (add /FORMAT=NOTABLE to show only the charts).
FREQUENCIES VARIABLES=Gender Education
  /BARCHART.

* Descriptives for continuous variables.
DESCRIPTIVES VARIABLES=Age Income Score
  /STATISTICS=MEAN STDDEV MIN MAX SKEWNESS KURTOSIS.
```

#### 2. Exploring Relationships

```sps
* Crosstabulations.
CROSSTABS
  /TABLES=Gender BY Education
  /CELLS=COUNT ROW COLUMN TOTAL
  /STATISTICS=CHISQ PHI.

* Means by group.
MEANS TABLES=Score BY Group
  /CELLS=MEAN STDDEV COUNT.
```

#### 3. Data Visualization

```sps
* Histogram.
GRAPH
  /HISTOGRAM=Age.

* Scatterplot.
GRAPH
  /SCATTERPLOT(BIVAR)=Var1 WITH Var2.

* Bar chart of group means (GROUPED would need a second BY variable).
GRAPH
  /BAR(SIMPLE)=MEAN(Score) BY Group.
```

### Inferential Statistics - Basic

#### 1. T-Tests

```sps
* One-sample T-test.
T-TEST
  /TESTVAL=50
  /VARIABLES=Score.

* Independent samples T-test.
T-TEST GROUPS=Gender(1 2)
  /VARIABLES=Score
  /MISSING=ANALYSIS.

* Paired samples T-test.
T-TEST PAIRS=PreTest WITH PostTest (PAIRED).
```

#### 2. ANOVA

```sps
* One-way ANOVA.
ONEWAY Score BY Group
  /STATISTICS=DESCRIPTIVES HOMOGENEITY
  /POSTHOC=BONFERRONI.

* Two-way ANOVA (via UNIANOVA).
UNIANOVA Score BY Group1 Group2
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /PRINT=ETASQ
  /CRITERIA=ALPHA(.05)
  /DESIGN=Group1 Group2 Group1*Group2.
```

#### 3. Chi-Square Test

```sps
* Chi-Square test of independence.
CROSSTABS
  /TABLES=SmokingStatus BY HealthCondition
  /CELLS=COUNT EXPECTED
  /STATISTICS=CHISQ.
```

#### 4. Correlation

```sps
* Pearson correlation.
CORRELATIONS
  /VARIABLES=Var1 Var2 Var3
  /PRINT=TWOTAIL NOSIG.
```

### Inferential Statistics - Advanced

#### 1. Regression

```sps
* Linear Regression.
REGRESSION
  /DESCRIPTIVES MEAN STDDEV CORR SIG N
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT Score
  /METHOD=ENTER Predictor1 Predictor2.

* Logistic Regression (binary outcome).
LOGISTIC REGRESSION VARIABLES Outcome
  /METHOD=ENTER Predictor1 Predictor2
  /PRINT=CI(95)
  /CRITERIA=PIN(.05) POUT(.10).
```

#### 2. Non-parametric Tests

```sps
* Mann-Whitney U Test.
NPAR TESTS
  /M-W=Score BY Group(1 2)
  /STATISTICS=DESCRIPTIVES.

* Kruskal-Wallis H Test.
NPAR TESTS
  /K-W=Score BY Group(1 3)
  /STATISTICS=DESCRIPTIVES.

* Wilcoxon Signed-Rank Test.
NPAR TESTS
  /WILCOXON=PreTest WITH PostTest (PAIRED).
```

#### 3. Factor Analysis

```sps
* Principal Component Analysis.
* Factor scores are saved as FAC1_1, FAC2_1, and so on.
FACTOR
  /VARIABLES Var1 Var2 Var3 Var4 Var5
  /PRINT UNIVARIATE INITIAL KMO AIC ROTATION
  /CRITERIA FACTORS(3) ITERATE(25)
  /EXTRACTION=PC
  /ROTATION=VARIMAX
  /SAVE REG(ALL).
```

#### 4. Reliability Analysis

```sps
* Cronbach's Alpha.
RELIABILITY
  /VARIABLES=Item1 Item2 Item3 Item4
  /SCALE('Overall Scale') ALL
  /MODEL=ALPHA
  /STATISTICS=DESCRIPTIVE SCALE.
```