R

From BarikWiki

The R Project for Statistical Computing
  • Wordfile for R. See UltraEdit for installation reminders. However, its syntax highlighting is broken when using square brackets.
  • What are the HOME and working directories? For you, this is C:\Users\tbarik\Documents. See help(Startup) for the intricate details. You can create a .Rprofile in this directory to initialize settings.
  • Your settings are located in Rconsole under Documents.
  • Revolution Analytics version of R. Free for academics after filling out a form. This could be a really useful tool, except that the console colors (in particular, the background color) are somewhat annoying and can't be changed.
  • The R Project
  • A list of books for learning the R Language. The R Book, available electronically at NCSU.
  • If you get annoyed with the "Save workspace image?" prompt on exit, you can add --no-save as an argument. Or, if you want to automatically save, use --save.
  • When you want to run a script from the command line (or through UltraEdit), use Rscript. All the other random binaries are leftovers.
  • The tapply (table apply) function is useful, but still looks too complicated for me.
  • It's good practice to use with instead of attach.
  • An R style guide, based on Google's style guide for R.
  • Quick-R home page, also for the book R in Action.
  • Notes on the use of R for psychology experiments and questionnaires.
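
As a small illustration of the tapply and with items above, the following sketch computes a per-group mean without attaching the data frame. The data frame scores and its columns group and value are made up for this example.

```r
# Hypothetical example data: two groups with a numeric measurement.
scores <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  value = c(1, 3, 2, 4, 6)
)

# with() evaluates the expression inside the data frame, so there is no
# need to attach() it (and no risk of leaving it on the search path).
# tapply() splits `value` by `group` and applies mean() to each piece.
means <- with(scores, tapply(value, group, mean))
print(means)  # group a has mean 2, group b has mean 4
```

Reading it as "apply mean to value, split by group" may make tapply feel less complicated.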

Importing Data

  • To read a CSV file, use read.csv, which assumes header = TRUE by default. read.table also returns a data frame and has many more options, but it defaults to header = FALSE, so remember to set header = TRUE there.
  • If you want counts, there are many ways to get them: str, nrow, dim.
  • To serialize structures to disk, use dump("variable").
  • You can move data between Excel and R using clipboard!
  • For integer binning, you can use the tabulate function.
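
The import notes above can be tried end to end with a throwaway file. This is only a sketch: the CSV contents, the column names id and score, and the temporary file paths are invented for the example.

```r
# Write a tiny CSV to a temporary file so the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,score", "1,3", "2,1", "3,3"), csv_path)

# read.csv treats the first line as a header by default.
df <- read.csv(csv_path)

# Several ways to get counts.
n_rows <- nrow(df)  # number of rows
shape  <- dim(df)   # rows and columns
str(df)             # compact summary, including the row count

# Integer binning: how many times each of 1, 2, 3 occurs in score.
bins <- tabulate(df$score, nbins = 3)

# Serialize the data frame as R source text, then load it back.
dump_path <- tempfile(fileext = ".R")
dump("df", file = dump_path)
rm(df)
source(dump_path)   # recreates df
```

Without the file argument, dump writes to dumpdata.R in the working directory.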

Library

  • Libraries are useful. To see the installed libraries, use library(). To load a library, say library('MASS').
  • fitdistrplus has useful functions for AIC and BIC.

Packages

  • Installing packages is easy: install.packages("akima"). You can also update packages with update.packages().
  • To see which packages need updating, use old.packages(). You can also use installed.packages() or library() (less verbose) to see which packages you have installed.
  • To search for packages, use RSiteSearch('neural networks').
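
A minimal sketch of checking what is installed before calling install.packages. It only inspects the local library and prints a hint, so it does not need network access; akima is simply the package named above.

```r
# installed.packages() returns a matrix whose row names are package names.
installed <- rownames(installed.packages())
has_stats <- "stats" %in% installed  # stats ships with every R installation

# requireNamespace() returns FALSE instead of failing when a package is missing.
if (!requireNamespace("akima", quietly = TRUE)) {
  message("akima is not installed; install.packages(\"akima\") would fetch it")
}
```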

Installing R Under RHEL

  • After setting up EPEL, we find that the package is simply called R.

RMySQL

Compiling RMySQL

This section contains notes on installing RMySQL under Windows, which is a considerable chore, since almost every site is incomplete for some critical detail.

It doesn't seem like it's possible to get this working under R-2.15.0, using binary builds. This means that you'll have to compile the package using the source. These instructions are specific to the 64-bit version of R in Windows; the 32-bit version has some minor variations from this.

Warning message:
package ‘RMySQL’ is not available (for R version 2.15.0)
  • Okay, start by grabbing RTools, which will just make you go to <CRAN mirror>/bin/windows/Rtools anyway, so here's the UCLA mirror. Make sure RTools is at the front of your path (to avoid clashes with anything like mingw).
  • The path should thus include C:\Rtools\bin;C:\Rtools\gcc-4.6.3\bin (see Appendix D).
  • Install MySQL Community Server, with development components. Then, set MYSQL_HOME to C:\Program Files\MySQL\MySQL Server 5.5, except that you actually need the short name version! I haven't found a good way to find this except through repeated application of dir /x from this discussion.
  • For me, the short path ends up being C:/PROGRA~1/MySQL/MYSQLS~1.5.
  • Copy libmySQL.dll into the MySQL bin folder (from lib).
  • As an alternative to adding it to your path (though this didn't work for me), you can instead add it to Renviron.site. This file isn't created by default, so you'll have to make it and put it in the etc directory under your R home directory. If you don't know where that is, use R.home() at your R prompt.
  • Use the command: install.packages("RMySQL", type = "source").
  • Some discussions about RMySQL on stackoverflow.
  • I run R directly through R.exe rather than through Rgui. Either works, but I use the command-line version just for isolation.
  • Additional RMySQL Installation Notes from Vanderbilt. Some instructions by Jeff Walker.
  • Some relatively old instructions.

In R 2.15.1, if you don't have the PATH correct, you'll end up with an error such as:

 RS-DBI.c:1:0: sorry, unimplemented: 64-bit mode not compiled in
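
Some of the environment checks above can be done from the R prompt before attempting the source install. This is a sketch under assumptions: the MySQL directory is the one from these notes, and shortPathName is a base R function that exists only on Windows, so that branch is skipped elsewhere. It may be a less painful way to get the short name than repeated dir /x.

```r
# Install location taken from the notes above; adjust to your machine.
mysql_dir <- "C:/Program Files/MySQL/MySQL Server 5.5"

if (.Platform$OS.type == "windows" && dir.exists(mysql_dir)) {
  # shortPathName() returns the 8.3 form with backslashes;
  # convert them to forward slashes as used in these notes.
  short <- chartr("\\", "/", shortPathName(mysql_dir))
  Sys.setenv(MYSQL_HOME = short)
}

# Split PATH into entries and look at what comes first.
path_entries <- strsplit(Sys.getenv("PATH"), .Platform$path.sep, fixed = TRUE)[[1]]
rtools_first <- length(path_entries) > 0 &&
  grepl("Rtools", path_entries[1], ignore.case = TRUE)

print(head(path_entries, 2))
print(Sys.getenv("MYSQL_HOME"))

# Once both look right, the install step from the list above is:
# install.packages("RMySQL", type = "source")
```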

Running RMySQL

To ensure that R is functioning correctly:

> Sys.getenv("MYSQL_HOME")
[1] "C:/PROGRA~1/MySQL/MYSQLS~1.5"
> library("RMySQL")
Loading required package: DBI
MYSQL_HOME defined as C:/PROGRA~1/MySQL/MYSQLS~1.5 

The official instructions claim that MYSQL_HOME is needed only during installation, not loading, but I have not found this to be the case: MYSQL_HOME needs to be set at all times.

For other usage, take a look at the RMySQL Reference Manual. You may also need to examine the R/S-Database Interface.

Statistical Tests

  • For comparing distributions, you can use Kullback-Leibler. This shows up multiple times: seewave and kl.dist.
  • For t-test, use t.test. For chi-squared, use chisq.test.
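
A short sketch of the tests named above, using simulated data. The Kullback-Leibler part is a hand-rolled formula for two discrete distributions, not the implementation from seewave; the vectors p and q and the count matrix are made up.

```r
set.seed(1)  # make the simulated samples reproducible

# Two-sample t-test on simulated data with different means.
x <- rnorm(30, mean = 0)
y <- rnorm(30, mean = 1)
tt <- t.test(x, y)
print(tt$p.value)

# Chi-squared test of independence on a 2x2 table of counts.
counts <- matrix(c(20, 15, 30, 35), nrow = 2)
cs <- chisq.test(counts)
print(cs$p.value)

# Kullback-Leibler divergence D(p || q) for discrete distributions,
# assuming both sum to 1 and q is nonzero wherever p is nonzero.
p <- c(0.5, 0.5)
q <- c(0.25, 0.75)
kl <- sum(p * log(p / q))
print(kl)
```

Both t.test and chisq.test return an object of class htest, so the statistic, p.value, and other fields are accessed the same way.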

Plotting

Plotting seems to be exceedingly difficult in R.

  • ggplot2 is a plotting system for R, which tries to take the good parts of base and lattice graphics and none of the bad parts. ggplot2 documentation. You can install ggplot2 with: install.packages("ggplot2").
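
For a first plot, a scatter plot in ggplot2 takes only a few lines. This sketch assumes ggplot2 is already installed and uses the built-in mtcars data set; the choice of columns is arbitrary.

```r
has_ggplot2 <- requireNamespace("ggplot2", quietly = TRUE)

if (has_ggplot2) {
  library(ggplot2)
  # Map car weight to x and fuel economy to y, then draw one point per car.
  p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
    geom_point() +
    labs(x = "Weight (1000 lbs)", y = "Miles per gallon")
  # print(p) draws the plot on the active graphics device.
}
```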

Miscellaneous

  • Compute Pearson's r using cor.test.
  • Couple Python and R with RPy. Unfortunately, the authors have stopped providing Windows binaries with rpy2. If you're using straight Python, scipy may be an alternative.
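
For the cor.test item above, a minimal sketch with made-up vectors:

```r
# Hypothetical paired measurements that are close to linear.
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 8.1, 9.8)

# Pearson is the default method; it is spelled out here for clarity.
res <- cor.test(x, y, method = "pearson")
print(res$estimate)  # the correlation coefficient r
print(res$p.value)   # p-value for the null hypothesis r = 0
```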

Useful Magic

  • gsubfn for string interpolation (a simpler alternative may be to use sprintf; there is also paste). Sanitize your data first, since building queries this way is open to SQL injection attacks.
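
One way to read "sanitize your data first" is to reject anything outside a strict whitelist before it reaches sprintf. The sketch below assumes a hypothetical users table with a name column, and build_query is an illustrative helper, not part of any package.

```r
# Hypothetical helper: only letters, digits, and underscores are accepted,
# so quotes and other SQL metacharacters never reach the query string.
build_query <- function(user_name) {
  if (!grepl("^[A-Za-z0-9_]+$", user_name)) {
    stop("unexpected characters in user_name")
  }
  sprintf("SELECT * FROM users WHERE name = '%s'", user_name)
}

ok <- build_query("tbarik")
print(ok)

# An injection attempt is refused instead of being interpolated.
rejected <- tryCatch(
  build_query("x' OR '1'='1"),
  error = function(e) "rejected"
)
print(rejected)
```

A whitelist is deliberately restrictive; where the database interface offers its own quoting or parameter binding, that is likely the better choice.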