cluster:r
Differences
This shows you the differences between two versions of the page.
cluster:r [2018/10/01 17:39] – mcloughlin | cluster:r [2024/11/11 20:55] (current) – removed mcloughlin | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== R ====== | ||
- | R is available on the cluster. R can also be installed on your computer for free by visiting the [[http:// | ||
- | |||
- | For more information about R, you might want to use the [[http:// | ||
- | |||
- | ===== Running R on the Cluster ===== | ||
- | You can run R interactively on the cluster with:< | ||
- | You can also run a .R file in batch mode with:< | ||
- | To run your R command in the background, see [[cluster: | ||
- | |||
- | ===== Introduction to R ===== | ||
- | The following section is based on an introductory talk on R given by Paul Bailey in February 2011. The data used in the examples is located at [[http:// | ||
- | |||
- | ===== R Background ===== | ||
- | * Based on Bell Labs S | ||
- | * Open source software | ||
- | * Large group of contributors | ||
- | * Most R code is written in R | ||
- | * Computationally intensive code written in FORTRAN or C | ||
- | * Datasets, matrices are native types | ||
- | * Easy, customizable graphics | ||
- | |||
- | ==== R Pros ==== | ||
- | * Free | ||
- | * Easy to get a sense of what is going on with data | ||
- | * Excellent at simulation | ||
- | * Interfaces with lots of other software (e.g. WinBUGS, SQL) | ||
- | |||
- | ==== R Cons ==== | ||
- | * Uses RAM to store data | ||
- | * Support is mainly via mailing lists (listservs) | ||
- | * Difficult to get started | ||
- | |||
- | ==== Read in Data ==== | ||
- | * Some type-specific methods, and a general method < | ||
- | general methods < | ||
- | |||
- | ==Getting Help== | ||
- | * You can use < | ||
- | * To search the help text, use < | ||
- | |||
- | ==Summary== | ||
- | * Getting summaries is easy | ||
- | | ||
- | * You can also focus on one variable | ||
- | | ||
- | | ||
- | * (Extended example) | ||
- | |||
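The summary bullets above can be sketched as follows. This is a minimal illustration, not the original talk's code: the `earnings` data frame and its values are made up, with column names borrowed from the model examples on this page.

```r
# Hypothetical data frame (values invented for illustration)
earnings <- data.frame(
  weekly_earn = c(520, 610, 480, 750, 690, 560),
  age         = c(25, 34, 22, 45, 39, 28),
  year        = c(2008, 2008, 2009, 2009, 2010, 2010)
)

# Summary of every column: min, quartiles, median, mean, max
print(summary(earnings))

# Summary of a single variable
earn_summary <- summary(earnings$weekly_earn)
print(earn_summary)

# Individual statistics for one variable
earn_mean <- mean(earnings$weekly_earn)
earn_sd   <- sd(earnings$weekly_earn)
```

`summary()` dispatches on the type of its argument, so the same call works on a whole data frame, a single column, or a fitted model.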
- | ==Subset Data== | ||
- | * When you reference something with < | ||
- | | ||
- | | ||
- | |||
- | ==Linear Models== | ||
- | * The < | ||
- | lm1 <- lm(weekly_earn ~ age + year, | ||
- | | ||
- | * You can also treat a variable as a "factor" | ||
- | | ||
- | lm2 <- lm(weekly_earn ~ age + yearf, | ||
- | | ||
- | * And change constraints | ||
- | | ||
- | lm3 <- lm(weekly_earn ~ age + yearf, | ||
- | | ||
- | |||
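A self-contained sketch of the three model fits described above, using made-up data. The `data =` argument, the `earnings` data frame, and the use of sum-to-zero contrasts are assumptions; in particular, it assumes "change constraints" refers to changing the contrasts used to code the factor.

```r
# Hypothetical data frame (values invented for illustration)
earnings <- data.frame(
  weekly_earn = c(520, 610, 480, 750, 690, 560),
  age         = c(25, 34, 22, 45, 39, 28),
  year        = c(2008, 2008, 2009, 2009, 2010, 2010)
)

# Year entered as a numeric variable: one slope for year
lm1 <- lm(weekly_earn ~ age + year, data = earnings)
print(summary(lm1))

# Year entered as a factor: one coefficient per non-reference year
earnings$yearf <- factor(earnings$year)
lm2 <- lm(weekly_earn ~ age + yearf, data = earnings)
print(coef(lm2))

# Same model with sum-to-zero contrasts instead of the default
# treatment contrasts (assumed meaning of "change constraints")
lm3 <- lm(weekly_earn ~ age + yearf, data = earnings,
          contrasts = list(yearf = "contr.sum"))
print(coef(lm3))
```

Changing the contrasts changes how the factor coefficients are parameterized and interpreted, but not the fitted values, so `lm2` and `lm3` describe the same fit.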
- | ==Aggregate== | ||
- | * Allows you to create summary statistics for groups | ||
- | * First argument is what you want to summarize | ||
- | * Second argument is what you want to group by | ||
- | * Third argument is what to do to the groups | ||
- | | ||
- | * Result column names are a little odd. | ||
- | |||
- | |||
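A minimal sketch of the three-argument form of `aggregate()` described above, on made-up data (the `earnings` data frame is hypothetical). It also shows the default result names and one way to replace them.

```r
# Hypothetical data frame (values invented for illustration)
earnings <- data.frame(
  weekly_earn = c(520, 610, 480, 750, 690, 560),
  year        = c(2008, 2008, 2009, 2009, 2010, 2010)
)

# 1st argument: what to summarize
# 2nd argument: what to group by (must be a list)
# 3rd argument: the function applied to each group
mean_by_year <- aggregate(earnings$weekly_earn,
                          by  = list(earnings$year),
                          FUN = mean)

# The default column names are generic
default_names <- names(mean_by_year)
print(default_names)

# Rename the columns to something more descriptive
names(mean_by_year) <- c("year", "mean_weekly_earn")
print(mean_by_year)
```

Passing a named list to `by`, for example `by = list(year = earnings$year)`, names the grouping column directly.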
- | ==Merge== | ||
- | * Joins two datasets on shared columns | ||
- | merged <- merge(data.a, | ||
- | * Lots of options for this one | ||
- | |||
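A small sketch of `merge()` on two made-up data frames. The name `data.a` comes from the call above; `data.b`, the `id` key column, and all values are assumptions for illustration.

```r
# Two hypothetical data frames sharing an "id" column
data.a <- data.frame(id = c(1, 2, 3), age = c(25, 34, 22))
data.b <- data.frame(id = c(2, 3, 4), weekly_earn = c(610, 480, 750))

# Default behaviour: keep only rows whose key appears in both (inner join)
merged <- merge(data.a, data.b, by = "id")
print(merged)

# all = TRUE keeps unmatched rows from both sides, filling gaps with NA
merged_all <- merge(data.a, data.b, by = "id", all = TRUE)
print(merged_all)
```

The `all.x` and `all.y` arguments keep unmatched rows from only one side, and `by.x` / `by.y` handle key columns that have different names in the two datasets.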
- | ==Parallel== | ||
- | Some basic info can be found at the [http:// | ||
- | |||
- | You can also use [http:// | ||
- | |||
- | {| class=" | ||
- | |- | ||
- | ! | ||
- | ! OpenMPI | ||
- | ! MPICH2 | ||
- | |- | ||
- | | Before anything (installation or usage) | ||
- | | >module load openmpi-x86_64 | ||
- | | >module load mpich2-x86_64 | ||
- | |- | ||
- | | Installation | ||
- | | R> install.packages("< | ||
- | | R> install.packages("< | ||
- | |} | ||
- | |||
- | A good intro guide is [http:// | ||
- | |||
- | ==Other functions== | ||
- | * < | ||
- | * < | ||
- | * < | ||
- | * [http:// | ||
cluster/r.1538415572.txt.gz · Last modified: 2018/10/01 17:39 by mcloughlin