R Tutorial Series: Introduction to The R Project for Statistical Computing (Part 1)

R is a free, cross-platform, open-source language and environment for statistical analysis. It is also an alternative to expensive commercial statistics software such as SPSS. The R environment differs from the point-and-click interface found in most office applications: you work mainly by typing commands. Although it takes some effort to become familiar with, R ultimately proves to be an affordable, customizable, and expandable statistical analysis solution. This tutorial aims to bring new users up to speed with R quickly. Only the most basic elements are covered here; detailed statistical analyses and advanced techniques will be covered in future articles.
Below is a list of the topics to be covered in part one of this tutorial. Feel free to jump to a particular section of interest at any time.
  • Acquiring R
  • The R Interface
  • R Commands
  • The Working Directory
  • Packages

Acquiring R

R is free, open-source software that runs on Mac OS, Windows, Linux, and Unix platforms. Download links for all versions can be found at the official R Project website (http://www.r-project.org). After downloading R, you should install the program as appropriate for your operating system. In Mac OS X 10.6, the installation process was simple and consistent with almost every other application that I have used.

The R Interface

The R interface is composed of three main parts. The first is the Console window, which resembles a simple programming interface. This is the default view that loads when R is launched. The Console executes functions and commands and displays the output of those operations. The second is the graphics window, called the Quartz window on Mac OS X (other platforms use a different graphics device, but it serves the same purpose). It displays visual output such as graphs, histograms, and plots. It appears automatically when a plotting command is executed in the Console and can also be displayed manually through the Window menu. Lastly, the Editor window, which resembles a basic word processor, opens when a text file is opened in R. This is especially useful for looking back at past work done in the Console that can be applied to a new project. While all three are useful, the vast majority of your time working with R will be spent in the Console window.
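
As a small illustration of how a plotting command typed in the Console brings up the graphics window, here is a minimal sketch. The scores vector is made-up data used only for this example.

```r
# Hypothetical sample data, invented for this illustration
scores <- c(52, 61, 67, 70, 73, 75, 78, 81, 84, 90)

# hist() draws a histogram in the graphics window and also
# returns (invisibly) an object describing the bins it used
h <- hist(scores, main = "Sample scores", xlab = "Score")
```

Storing the result in h is optional; calling hist(scores) on its own is enough to open the graphics window.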



R Commands

Commands are most commonly issued to R in the form of functions. A function is called by entering its name followed by parentheses. Some functions take arguments, or parameters, which are specified inside the parentheses. Normally you enter one command per line in the Console window. Unlike many other programming languages, R does not require a terminating character such as a semicolon at the end of a line, although a semicolon can be used to separate several commands entered on the same line. A line is executed by pressing the return key. Afterwards, R displays the output of the command (where there is any) and moves down to a new, blank line in the Console. One example of an R function is q(). This is used to exit, or quit, the program and can be called as follows.
  > q()
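
To make the idea of arguments more concrete, here is a short sketch using two functions that ship with R, Sys.Date() and round(). The variable name rounded is arbitrary and chosen only for this example.

```r
# Call a function with no arguments: report today's date
Sys.Date()

# Call a function with one positional argument and one named argument
rounded <- round(3.14159, digits = 2)
rounded

# List the arguments that a function accepts
args(round)
```

Typing ?round in the Console opens the help page for a function, which describes each argument in more detail.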

The Working Directory

One of the first things to do when you launch R for the first time is to set its working directory. This is the default location on your hard disk where R reads and writes files whenever you give a file name without a full path. The working directory is comparable to what is called the "default folder" in many other applications. It is important to select a location that is easy to find and remember, so you can access your files when you need them. I dedicated an entire folder to R on my system, with subfolders for each project.
To display the current working directory, use the function getwd().
  > getwd()
  [1] "/Users/Admin/Documents/R"
To change the current working directory, use the function setwd('PATH'), replacing PATH with the directory path that you would like to use.
  > setwd('/Users/Admin/Documents/R/newProject')
Use getwd() again to verify that the change took place.
  > getwd()
  [1] "/Users/Admin/Documents/R/newProject"
Note that you have the option to set the working directory at any time. Do this when you want to access files in a new location, such as when you are working on multiple projects at the same time or at the start of a new project.
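
The sketch below shows one way to switch to a project folder and back again without losing track of where you started. The folder name newProject and the file name notes.txt are placeholders, and the folder is created under R's temporary directory only so that the example runs on any system.

```r
# Remember the starting directory so we can return to it later
original_dir <- getwd()

# Hypothetical project folder, created under R's temporary directory
project_dir <- file.path(tempdir(), "newProject")
dir.create(project_dir, showWarnings = FALSE)

# Switch to the project folder; setwd() stops with an error
# if the folder does not exist
setwd(project_dir)

# A file written with a relative name lands in the working directory
writeLines("hello", "notes.txt")
file_written <- file.exists(file.path(project_dir, "notes.txt"))

# Return to the original working directory
setwd(original_dir)
```

If setwd() reports that it cannot change the working directory, the usual cause is a path that does not exist or is misspelled, so checking it with file.exists('PATH') first can help.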

Packages

The ability to install packages is a major benefit of R over its competitors. Because R is an open project, anyone can contribute custom functions to the R community. Packages extend the functionality of R by adding visual capabilities, statistical methods, and discipline-specific functions, to name a few.

Choosing A CRAN Repository

A number of CRAN repositories, or mirror sites that host R packages, are available. A complete listing can be found on the official website. When choosing a repository to download from, you may want to consider things such as its location and reliability. To set your mirror site, use the options(repos = c(CRAN = "URL")) command, where URL is the address of the CRAN repository. In the example below, the user connects to the UCLA repository.
  > options(repos = c(CRAN = "http://cran.stat.ucla.edu"))
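
In current versions of R, the mirror is stored in the repos option, so it can be set and then read back to confirm the change. The sketch below assumes the address https://cloud.r-project.org, which is not the mirror used in this article; substitute the mirror you prefer.

```r
# Set the CRAN mirror for this session
# (the URL here is an assumption; use whichever mirror you prefer)
options(repos = c(CRAN = "https://cloud.r-project.org"))

# Read the setting back to confirm which repository R will use
current_mirror <- getOption("repos")[["CRAN"]]
current_mirror
```

The setting lasts only for the current session, so it needs to be repeated, or placed in a startup file, each time R is launched.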

Finding Available Packages

To obtain a list of all packages available at your chosen mirror site, use the available.packages() command. The result is a large table with one row per package.
  > available.packages()
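
Because the full listing is long, it can help to search it by name. The following is a minimal sketch; find_packages is a hypothetical helper written for this example, not a built-in R function, and demo_pkgs is a made-up table (with placeholder version numbers) that stands in for the real listing so the example runs without an internet connection.

```r
# Hypothetical helper: return the package names that match a pattern.
# By default it searches the listing from the current mirror.
find_packages <- function(pattern, pkgs = available.packages()) {
  grep(pattern, rownames(pkgs), ignore.case = TRUE, value = TRUE)
}

# Made-up stand-in for the real listing; package names are row names
demo_pkgs <- matrix(c("0.0-1", "0.0-2", "0.0-3"), ncol = 1,
                    dimnames = list(c("foreign", "FitAR", "survival"),
                                    "Version"))

# Search the stand-in table for names containing "ar"
matches <- find_packages("ar", pkgs = demo_pkgs)
matches

# Against the real mirror (requires an internet connection):
# find_packages("ar")
```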

Installing Packages

To install a specific package, use the install.packages("NAME") command, where NAME is the name of the desired package.
  > install.packages("foreign")
  trying URL 'http://cran.stat.ucla.edu/bin/macosx/leopard/contrib/2.9/foreign_0.8-38.tgz'
  Content type 'application/x-tar' length 254281 bytes (248 Kb)
  opened URL
  ===================================
  downloaded 248 Kb
Note that you can install all packages on a mirror site using the install.packages(rownames(available.packages())) command; the package names are the row names of the table that available.packages() returns. This gives you everything up front, although at nearly five gigabytes as of this writing, an entire CRAN repository can take a significant amount of time to download and install. The other option is to wait until you know that you need specific packages and install them on an a la carte basis.
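
The a la carte approach can be sketched as a check-then-install step, so that packages you already have are not downloaded again. Both missing_packages and install_if_missing are hypothetical helpers written for this example; they rely on the built-in requireNamespace() and install.packages() functions.

```r
# Hypothetical helper: which of these packages are not installed?
# requireNamespace() returns TRUE or FALSE (and loads the package's
# namespace as a side effect when the package is present)
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1),
                      quietly = TRUE, USE.NAMES = FALSE)
  pkgs[!installed]
}

# Hypothetical helper: install only the packages that are missing
install_if_missing <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)
}

# "stats" ships with R, so nothing is reported as missing
missing_packages("stats")

# Example call (needs an internet connection if anything is missing):
# install_if_missing(c("foreign"))
```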

Removing Packages

To remove a specific package, use the remove.packages("NAME") command, where NAME is the name of the package to be removed.
  > remove.packages("foreign")

Updating Packages

To update all installed packages, use the update.packages() command. For each out of date package that is found, you will be prompted to confirm the update, as demonstrated below. In this case, you should type "y" and press enter to continue.
  > update.packages()
  foreign :
  Version 0.8-37 installed in /Library/Frameworks/R.framework/Resources/library
  Version 0.8-38 available at http://cran.stat.ucla.edu
  Update (y/N/c)? y
  ===================================
  downloaded 244 Kb
  trying URL 'http://cran.stat.ucla.edu/bin/macosx/leopard/contrib/2.9/foreign_0.8-38.tgz'
  Content type 'application/x-tar' length 254281 bytes (248 Kb)
  opened URL
  ===================================

Loading Packages

To use a package, you first need to load it with the library(NAME) command, where NAME is the name of the package. Loading lasts only for the current session, so each time you start R you will have to reload any add-on packages that you need.
  > library(foreign)
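
To confirm that a package has been loaded, you can look for it in the list returned by search(). The sketch below uses stats4 only because it ships with every R installation but is not loaded at startup; is_attached is a hypothetical helper written for this example.

```r
# Hypothetical helper: is the named package currently loaded
# and visible on R's search path?
is_attached <- function(name) {
  paste0("package:", name) %in% search()
}

# Check before loading, load the package, then check again
before <- is_attached("stats4")
library(stats4)
after <- is_attached("stats4")

before
after
```

The same check works for any add-on package, for example is_attached("foreign") after calling library(foreign).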

Conclusion

This concludes part one of the introductory tutorial to using R. In part two, more of the basic features of R will be presented, including how to import data, create and use variables, and manage workspace and console files.

14 comments:

  1. The above procedure recommended for quitting < using q() > is not really advisable when using the Mac GUI. It is specifically advised against by Simon Urbanek, and is the cause of difficult to debug problems posted in the R Mac-SIG mailing list.

  2. Thanks for your comment. Would you mind providing some URLs to the stated issues so other readers can become informed on this topic?

  3. this is incredible info, i can not thank you enough!!

  4. Thanks, I'm glad that it was helpful to you.

  5. Thank you for presenting information that is clear, concise and so useful for people starting out with R.

  6. this is awesome...
    But I have one question. I recently installed a package (PathRanker) under Ubuntu 10.04 using R 2.11.1, and the package has some dependent packages that need to be installed. After I installed all the packages and ran it, I noticed that I can't remove/uninstall the package, and the folder shows a locked sign, as do the dependent packages. How can I remove it? Please help.

  7. Hi. I have no experience with the PathRanker package or with Linux installations of R. I recommend contacting the creator of the package and checking out the R Help listserv archives at https://stat.ethz.ch/mailman/listinfo/r-help.

    John

  8. Hi. I cannot change the working directory. What is the problem?

  9. Hi. You should be able to use the instructions described in the "The Working Directory" section of this article to set and verify your working directory. Depending on your operating system and R version, you may also be able to set the working directory using your application menu/options/settings.
    John

  10. Thanks so much for your information. Maybe I am just in over my head if I am having this much trouble simply locating a package named FitAR from a CRAN repository. You clearly state how to do this in your instructions above. Yet after looking at numerous repositories I can't locate this package or any that may be more generically called AR. I am able to easily locate your example "foreign" however. Is there another way to locate a package, or could FitAR have been removed? I find it discussed in the Journal of Statistical Software but it's a 2008 article so who knows. Thanks in advance if you are able to help me.

    John W.

  11. Hi John W,

    I'm not familiar with the FitAR package, but here is the URL from the CRAN/R website: http://cran.r-project.org/web/packages/FitAR/index.html

    My advice is to contact the authors of your article and the ones listed at that URL.

    Not all R packages are put into the main repository. For example, some are available for download from personal websites. I also heard that R stopped accepting new packages for a period of time. Therefore, I think the authors of the package are the best people to contact.

  12. Thanks a lot for contributing your ideas......You are helping beginners like me dramatically...So thanks again

  13. Hi all,

    I am supposed to work with the BIOMOD package in the future and I have just started learning R. I installed R on my computer. Could you advise me which handout is good for a complete beginner, please?

    Thanks all for your kind help.

    Odno
