The following software is currently installed on the cluster. If you would like additional software installed, please email econcluster@umd.edu.
Software | Version | Terminal Command |
---|---|---|
GCC | 11.4.1 | gcc |
Matlab | R2023a | matlab |
Python | 3.9 | python |
Python | 3.11 | python3.11 |
R | 4.4.2 | R |
Stata | 18 MP8 | stata-mp |
To run a pre-written Python script, type
python script.py
To install a library that doesn't come with the initial installation, you first need to create a virtual environment (where $NAME is the name you choose for your virtual environment):
python -m venv $NAME
After creating it, activate the environment:
source $NAME/bin/activate
You should now see the environment name at the far left of the terminal prompt. After that, you can install any library using pip from the command line:
pip install pandas
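The steps above can also be sketched with Python's standard `venv` module, which is what `python -m venv` runs. This is only an illustration of where the activation script ends up; the helper names `create_env` and `activate_script` and the environment name `myenv` are made up for this example, and the `bin/activate` path assumes a Linux system such as the cluster.

```python
import tempfile
import venv
from pathlib import Path


def create_env(parent: Path, name: str) -> Path:
    """Create a virtual environment called `name` inside `parent`.

    Similar in spirit to `python -m venv $NAME`. pip is left out
    (with_pip=False) only to keep this sketch quick and offline.
    """
    env_dir = parent / name
    venv.create(env_dir, with_pip=False)
    return env_dir


def activate_script(env_dir: Path) -> Path:
    """Return the script that `source` is pointed at on Linux."""
    return env_dir / "bin" / "activate"


if __name__ == "__main__":
    # A throwaway directory is used so the example leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        env = create_env(Path(tmp), "myenv")
        print("activate with: source", activate_script(env))
```

The path printed at the end is the same `$NAME/bin/activate` file that the `source` command above refers to.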
You can run an R file in batch mode with
R CMD BATCH filename.R
To run your R command in the background, see Managing Jobs.
To install an R package, type the following in interactive mode (where $PACKAGE_NAME is the name of the package, which must be in quotes):
install.packages("$PACKAGE_NAME")
The following section was originally taken from an introductory talk on R given by Paul Bailey in February 2011. The data used in the examples is located at this link.
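Read the data from a CSV file with
dat <- read.csv("MDemp.csv")
or with the more general read.table (header=TRUE is needed here because, unlike read.csv, read.table does not assume a header row):
dat <- read.table("MDemp.csv",sep=",",header=TRUE)
Use ? followed by a function name to open its help page, and ?? followed by a keyword to search the help pages.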
summary(dat)
summary(dat$num_child)
table(dat$num_child)
Using [condition,] you can select rows:
dat.lf <- dat[dat$emp %in% c("emp","unemp"),]
dat.hs <- dat.lf[dat.lf$educ==39,]
lm1 <- lm(weekly_earn ~ age + year,data=dat)
summary(lm1)
dat$yearf <- as.factor(dat$year)
lm2 <- lm(weekly_earn ~ age + yearf,data=dat)
summary(lm2)
contrasts(dat$yearf) <- "contr.sum"
lm3 <- lm(weekly_earn ~ age + yearf,data=dat)
summary(lm3)
agg.hs <- aggregate(dat.hs$emps,by=list(dat.hs$yq),mean)
merged <- merge(data.a,data.b)
* merge has many options; see ?merge
Some basic info can be found at the High Performance Computing CRAN view. You can use the “parallel” package (which merges both “snow” and “multicore”).
You can also use Rmpi and npRmpi packages. You have your choice of MPI2 libraries (both OpenMPI and MPICH2). You will have to install the packages in your userspace (requiring compilation).
Step | OpenMPI | MPICH2 |
---|---|---|
Before anything (installation or usage) | >module load openmpi-x86_64 | >module load mpich2-x86_64 |
Installation | R> install.packages("<package>", configure.args="--with-Rmpi-include=/usr/lib64/openmpi/1.4-gcc/include --with-Rmpi-libpath=/usr/lib64/openmpi/1.4-gcc/lib --with-Rmpi-type=OPENMPI") | R> install.packages("<package>", configure.args="--with-Rmpi-include=/usr/include/mpich2-x86_64 --with-Rmpi-libpath=/usr/lib64/mpich2/lib --with-Rmpi-type=MPICH") |
A good intro guide is npRmpi: A package for parallel distributed kernel estimation in R.
* merge: merges datasets
* glm: fits limited dependent variable models
* optim: minimizes / finds zeros
You can run a .do file in batch mode with
stata-mp -b do dofile.do
To allow your do-file to continue running when you log off from your terminal, preface the command with “nohup”. For example:
nohup stata-mp -b do dofile.do &
For more information on how to run your Stata command in the background, see Managing Jobs.
By default, Stata saves tempfiles (from -tempfile- or -preserve-) to /nfs/home/$USERNAME/stata-tmp/. If you would like Stata to save temporary files in a different location (e.g. $HOME/statatmp), run the following from the command line before starting Stata:
export STATATMP=$HOME/statatmp
One reason you might want to do this is that files are removed from /home/stata-tmp/ if they haven't been touched for a day. If you have a Stata process that runs for longer, this may cause problems with reading from tempfiles or with -restore-.
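If you launch Stata from a Python wrapper script rather than directly from the shell, the same setting can be passed through the process environment. The following is a minimal sketch under that assumption; `stata_env` and `stata_batch_command` are hypothetical helper names, and the directory is created up front as a precaution, on the assumption that Stata expects the STATATMP directory to already exist.

```python
import os
import tempfile
from pathlib import Path


def stata_env(tmp_dir: Path, base_env=None) -> dict:
    """Return a copy of the environment with STATATMP set to tmp_dir.

    The directory is created first (assumption: it must exist before
    Stata starts). base_env defaults to the current environment.
    """
    tmp_dir.mkdir(parents=True, exist_ok=True)
    env = dict(os.environ if base_env is None else base_env)
    env["STATATMP"] = str(tmp_dir)
    return env


def stata_batch_command(dofile: str) -> list:
    """Build the batch-mode command shown on this page."""
    return ["stata-mp", "-b", "do", dofile]


if __name__ == "__main__":
    # A throwaway directory is used for the demonstration; on the cluster
    # you would pass something like Path.home() / "statatmp" instead.
    demo_dir = Path(tempfile.mkdtemp()) / "statatmp"
    env = stata_env(demo_dir)
    print("STATATMP =", env["STATATMP"])
    print("command  =", " ".join(stata_batch_command("dofile.do")))
    # To actually start Stata (only where stata-mp is installed):
    # import subprocess
    # subprocess.run(stata_batch_command("dofile.do"), env=env, check=True)
```

Setting the variable only in the child process environment leaves your login shell unchanged, so other Stata sessions keep using the default location.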
If you are using extra packages on your home/work computer and need them installed on the cluster, you can install them via ssc:
ssc install outreg
You will then have a folder installed within your home directory called “ado”, which contains your new commands filed away.