### Tutorial Files

Before we begin, you may want to download the dataset (.csv) used in this tutorial. Be sure to right-click and save the file to your R working directory.

### The scale() Function

The scale() function makes use of the following arguments.

- x: a numeric object
- center: if TRUE, the object's column means are subtracted from the values in those columns (ignoring NAs); if FALSE, centering is not performed
- scale: if TRUE, the centered column values are divided by the column's standard deviation (when center is also TRUE; otherwise, the root mean square is used); if FALSE, scaling is not performed
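
As a rough illustration of how these arguments combine, the sketch below applies scale() to a small matrix. The matrix `x` and its values are made up for this example and are not part of the tutorial dataset.

```r
# Hypothetical 3 x 2 numeric matrix, made up for illustration
x <- matrix(c(1, 2, 3, 10, 20, 30), ncol = 2)

# center only: subtract each column's mean
centered <- scale(x, center = TRUE, scale = FALSE)

# center and scale: subtract the mean, then divide by the standard deviation
standardized <- scale(x, center = TRUE, scale = TRUE)

# scale only: divide each column by its root mean square,
# which scale() computes as sqrt(sum(x^2) / (n - 1))
rms_scaled <- scale(x, center = FALSE, scale = TRUE)
rms <- apply(x, 2, function(col) sqrt(sum(col^2) / (length(col) - 1)))
```

Note that the root mean square used when center is FALSE divides by n - 1 rather than n, so it generally differs from the column's standard deviation.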

### Centering Variables

Normally, to center a variable, you would subtract the mean of all data points from each individual data point. With scale(), this can be accomplished in one simple call.

You can verify these results by making the calculation by hand, as demonstrated in the following screenshot.

    # center variable A using the scale() function
    scale(A, center = TRUE, scale = FALSE)

Centering a variable with the scale() function and by hand
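
In place of the screenshot, the hand calculation can be sketched in a few lines. The values assigned to `A` here are hypothetical stand-ins, since the tutorial dataset is not reproduced in this text.

```r
# Hypothetical values standing in for variable A from the tutorial dataset
A <- c(2, 4, 6, 8, 10)

# center with scale(); the result is a one-column matrix
centered_scale <- scale(A, center = TRUE, scale = FALSE)

# center by hand: subtract the mean from every data point
centered_hand <- A - mean(A)

# compare the values, dropping the matrix attributes first
all.equal(as.vector(centered_scale), centered_hand)
```

The as.vector() call is needed because scale() returns a matrix with extra attributes, while the hand calculation returns a plain numeric vector.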

### Generating Z-Scores

Normally, to create z-scores (standardized scores) from a variable, you would subtract the mean of all data points from each individual data point, then divide each difference by the standard deviation of all points. Again, this can be accomplished in one call using scale().

Again, the following screenshot demonstrates equivalence between the function results and hand calculation.

    # generate z-scores for variable A using the scale() function
    scale(A, center = TRUE, scale = TRUE)

Generating z-scores from a variable by hand and using the scale() function
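
The same comparison can be sketched for z-scores. As before, the values of `A` are hypothetical and only stand in for the tutorial's variable.

```r
# Hypothetical values standing in for variable A from the tutorial dataset
A <- c(2, 4, 6, 8, 10)

# z-scores with scale(); the result is a one-column matrix
z_scale <- scale(A, center = TRUE, scale = TRUE)

# z-scores by hand: subtract the mean, then divide by the standard deviation
z_hand <- (A - mean(A)) / sd(A)

# compare the values, dropping the matrix attributes first
all.equal(as.vector(z_scale), z_hand)
```

Both approaches use the sample standard deviation (dividing by n - 1), which is what sd() computes.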

Good function, but why are there "center" and "scale" attributes attached to the scaled data? This will slow down the computation and take more memory when you copy the matrix.

Is there an easy way to get back from z-scores to original scores? Say you used the scaled values to generate a model, and you predict new values but they are in z-scores. Is there an easy command to get the prediction values in original units?
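
One possible approach, sketched here rather than offered as a built-in command, is to reuse the "scaled:center" and "scaled:scale" attributes that scale() attaches to its result: multiply by the stored standard deviation, then add back the stored mean. The vector `A` and the predictions in `z_pred` are hypothetical values invented for this example.

```r
# Hypothetical values standing in for the original variable
A <- c(2, 4, 6, 8, 10)

# standardize; scale() stores the mean and standard deviation as attributes
z <- scale(A)
center <- attr(z, "scaled:center")
spread <- attr(z, "scaled:scale")

# reverse the transformation: multiply by the sd, then add the mean
original <- as.vector(z) * spread + center

# hypothetical model predictions expressed in z-score units
z_pred <- c(-1, 0, 1.5)
pred_original <- z_pred * spread + center
```

This only works if the attributes from the original scale() call are still available, so it may be worth saving `center` and `spread` before the scaled data is passed through other functions.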
