cluster:software

Differences

This shows you the differences between two versions of the page.

cluster:software [2018/10/01 17:18] – [SAS] mcloughlin
cluster:software [2024/11/14 14:47] (current) – external edit 127.0.0.1
Line 1: Line 1:
-====== Cluster Software ====== +======Cluster Software======
-===== C ===== +
-The cluster can compile C or C++ code using the [[http://gcc.gnu.org/|GNU Compiler Collection]]. +
-===== Fortran ===== +
-The cluster provides at least a Fortran compiler from the [[http://gcc.gnu.org/|GNU Compiler Collection]]. It might have your preferred compiler too, or perhaps you can install it. +
  
-It also has IMSL Fortran Library 7.0 License Service. +The following is the list of software currently installed on the cluster. If you wish to have additional software installed, please email econcluster@umd.edu.
-===== Gauss ===== +
-Gauss is installed on the cluster. To run it in interactive mode, use the command **tgauss**. +
  
-To run Gauss in batch mode, run: +^Software ^Version ^Terminal Command ^ 
 +| GCC | 11.4.1 | gcc | 
 +| Matlab | R2023a | matlab | 
 +| Python | 3.9 | python | 
 +| Python | 3.11 | python3.11 | 
 +| R | 4.4.2 | R | 
 +| Stata | 18 MP8 | stata-mp | 
 + 
 +=====Python===== 
 + 
 +To run a pre-written Python script, type: <code>python script.py</code> 
 + 
 +==== Installing Libraries via VENV ==== 
 + 
 +To install a library that doesn't come with the initial installation, first create a virtual environment (where $NAME is the name you choose for it): 
 +<code>python -m venv $NAME</code> 
 +Then activate the environment: 
 +<code>source $NAME/bin/activate</code> 
 +(You should now see the environment name at the far left of the terminal prompt.) After that, you can install any library using pip from the command line: 
 +<code>pip install pandas</code> 
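A job script can confirm that a virtual environment is active before it calls pip. The sketch below relies on the fact that the venv ''activate'' script exports the VIRTUAL_ENV variable; the function name ''venv_status'' is made up for illustration and is not part of Python or of the cluster setup.

```shell
#!/bin/sh
# venv_status (hypothetical helper): report which virtual environment is active.
# The venv "activate" script exports VIRTUAL_ENV; "deactivate" unsets it.
venv_status() {
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        # Print only the directory name, i.e. the $NAME chosen at creation
        echo "active: $(basename "$VIRTUAL_ENV")"
    else
        echo "none"
    fi
}

venv_status
```

If this prints ''none'', run the ''source $NAME/bin/activate'' step first, so that pip installs into the environment rather than attempting a system-wide install.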
 + 
 +=====R===== 
 +====Batch Mode==== 
 +You can run an R file in batch mode with <code>R CMD BATCH filename.R</code> 
 +To run your R command in the background, see [[cluster:managing_jobs|Managing Jobs]]. 
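When no output file is given, R CMD BATCH writes its log to a file named after the script, with a trailing ''.R'' stripped and ''.Rout'' appended. The sketch below only computes that name so a wrapper script knows where to look; ''rout_name'' is a hypothetical helper, and the assumption that the log lands in the current working directory should be checked against ''?BATCH'' on the cluster.

```shell
#!/bin/sh
# rout_name (hypothetical helper): name of the log that
# "R CMD BATCH <script>" writes when no output file is given.
rout_name() {
    # Strip any directory part and a trailing ".R", then append ".Rout"
    echo "$(basename "$1" .R).Rout"
}

script="filename.R"
log="$(rout_name "$script")"
echo "expect output in: $log"

# On the cluster (not executed in this sketch):
#   R CMD BATCH "$script" && tail "$log"
```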
 + 
 +====Installing Packages==== 
 +To install an R package, type the following in interactive mode (with the package name in quotes): <code>install.packages("PACKAGE_NAME")</code> 
 +  
 +====Introduction to R==== 
 +This section originally comes from an introductory talk on R given by Paul Bailey in February 2011. The data used in the examples are available at [[http://terpconnect.umd.edu/~pdbailey/R/MDemp.csv|this link]]. 
 + 
 +===R Background=== 
 +  * Based on Bell Labs S 
 +  * Open source software 
 +    * Large group of contributors 
 +    * Most R code is written in R 
 +    * Computationally intensive code written in FORTRAN or C 
 +  * Datasets, matrices are native types 
 +  * Easy, customizable graphics 
 + 
 +===R Pros=== 
 +  * Free 
 +  * Easy to get a sense of what is going on with data 
 +  * Excellent at simulation 
 +  * Interfaces with lots of other software (e.g. WinBUGS, SQL) 
 + 
 +===R Cons=== 
 +  * Uses RAM to store data 
 +  * Support is mainly via mailing lists 
 +  * Difficult to get started 
 + 
 +===Read in Data=== 
 +  * There are type-specific methods, such as <code>dat <- read.csv("MDemp.csv")</code> and general methods, such as <code>dat <- read.table("MDemp.csv",sep=",",header=TRUE)</code> 
 + 
 +===Getting Help=== 
 +  * To get the help page for a command, put ''?'' before its name, for example: <code>?lm</code> 
 +  * To search the help text for a term, put ''??'' before it, for example: <code>??regression</code> 
 + 
 +===Summary=== 
 +  * Getting summaries is easy: ''summary(dat)'' 
 +  * You can also focus on one variable:
 <code> <code>
-tgauss -v -b gauss.in > gauss.out+summary(dat$num_child) 
 +table(dat$num_child)
 </code> </code>
-===== Mathematica ===== 
-(NEW LICENSES AVAILABLE: Mathematica is now also installed on econ1 and econ2; feel free to spread your jobs across the unused resources there. To request a specific node (e.g. econ1), please refer to Request Node.)
  
-Mathematica is currently installed on cluster econ8. To run it in interactive mode, type the command at the command prompt: +===Subset Data===
 +  * When you reference something with ''[condition,]'' you can select rows:
 <code> <code>
-math +dat.lf <- dat[dat$emp %in% c("emp","unemp"),]
 +dat.hs <- dat.lf[dat.lf$educ==39,]
 </code> </code>
  
-To run Mathematica in batch mode, use:+===Linear Models=== 
 +  * The **lm** function fits linear models with a formula:
 <code> <code>
-nohup math -noprompt -run "<<file.m" > output & +lm1 <- lm(weekly_earn ~ age + year,data=dat) 
 +summary(lm1) 
 +</code> 
 +  * You can also treat a variable as a factor: 
 +<code> 
 +dat$yearf <- as.factor(dat$year) 
 +lm2 <- lm(weekly_earn ~ age + yearf,data=dat) 
 +summary(lm2) 
 +</code> 
 +  * And change contrasts: 
 +<code> 
 +contrasts(dat$yearf) <- "contr.sum" 
 +lm3 <- lm(weekly_earn ~ age + yearf,data=dat) 
 +summary(lm3)
 </code> </code>
  
-To run the GUI version, make sure you have X Forwarding working. +===Aggregate=== 
-===== Matlab =====+  * Allows you to create summary statistics for groups 
 +  * First argument is what you want to summarize 
 +  * Second argument is what you want to group by 
 +  * Third argument is what to do to the groups 
 +<code> 
 +agg.hs <- aggregate(dat.hs$emps,by=list(dat.hs$yq),mean) 
 +</code> 
 +  * The column names of the result are a little odd. 
 + 
 +===Merge=== 
 +  * Joins two datasets on their shared columns 
 +<code> 
 +merged <- merge(data.a,data.b) 
 +</code> 
 +  * Lots of options for this one
  
-The cluster has Matlab R2013a and R2014a. To run Matlab, use the command <code>matlab -nojvm -nodisplay</code> +===Parallel===
 +Some basic info can be found at the [[http://cran.r-project.org/web/views/HighPerformanceComputing.html|High Performance Computing CRAN view]]. You can use the "parallel" package (which merges both "snow" and "multicore"). 
  
-Running just the command below also works, though it reports errors because it cannot find a display: <code>matlab</code> +You can also use the [[http://cran.r-project.org/web/packages/Rmpi/index.html|Rmpi]] and [[http://cran.r-project.org/web/packages/npRmpi/index.html|npRmpi]] packages. You have your choice of MPI-2 libraries (OpenMPI or MPICH2). You will have to install the packages in your userspace (which requires compilation).
  
-To run Matlab in batch mode, use the command <code>matlab < your-file.m</code> +^ ^OpenMPI^MPICH2^
 +|Before anything (installation or usage)|>module load openmpi-x86_64|>module load mpich2-x86_64| 
 +|Installation|R> install.packages("<package>", configure.args="--with-Rmpi-include=/usr/lib64/openmpi/1.4-gcc/include --with-Rmpi-libpath=/usr/lib64/openmpi/1.4-gcc/lib --with-Rmpi-type=OPENMPI")|R> install.packages("<package>", configure.args="--with-Rmpi-include=/usr/include/mpich2-x86_64 --with-Rmpi-libpath=/usr/lib64/mpich2/lib --with-Rmpi-type=MPICH")|
  
-This command will print the results of **your-file.m** to the screen. If you want to redirect them to an output file, use <code>matlab < your-file.m > log.txt</code> +A good intro guide is [[http://onlinelibrary.wiley.com/doi/10.1002/jae.1221/pdf|npRmpi: A package for parallel distributed kernel estimation in R]].
  
-==== Graphics ==== +===Other functions=== 
-You should bear in mind that the cluster excels at numerical analysis and NOT graphics.   By default, there is only a command line interface to Matlab.  It is possible to get graphics from the server forwarded to your computer, but it is VERY slow.  +  * ''merge'' merges datasets 
 +  * ''glm'' fits limited dependent variable models. 
 +  * ''optim'' minimizes functions (''uniroot'' finds zeros) 
 +  * [[http://cran.r-project.org/web/views/Econometrics.html|Contributed econometrics packages]]
  
-If you want to create graphs, this is still possible.  By assigning your plot command to a variable and then using **saveas**, you can get this effect:+=====Stata===== 
 +====Batch Mode====
 +You can run a .do file in batch mode with
 <code> <code>
-h = plot(x,y) +stata-mp -b do dofile.do
-saveas(h,'yourFigure.jpg')+
 </code> </code>
  
-==== Tomlab ==== +To allow your do-file to continue running after you log off from your terminal, preface the command with "nohup". For example: 
-The department has a license for Tomlab, which contains several optimization algorithms that improve on Matlab's defaults. If you are solving computationally difficult problems, you might try Tomlab instead. +<code> 
 +nohup stata-mp -b do dofile.do & 
 +</code> 
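The nohup pattern above can be wrapped so that the process ID is kept for later checks. This is a minimal sketch, not a cluster-provided tool: ''run_bg'' is a hypothetical helper name, and ''sleep 1'' stands in for the Stata command so that the example runs anywhere.

```shell
#!/bin/sh
# run_bg (hypothetical helper): start a command under nohup in the
# background, discard its terminal output, and print its process ID.
run_bg() {
    nohup "$@" > /dev/null 2>&1 &
    echo $!
}

# On the cluster this would be: pid=$(run_bg stata-mp -b do dofile.do)
pid=$(run_bg sleep 1)

# "kill -0" sends no signal; it only tests whether the process still exists
if kill -0 "$pid" 2>/dev/null; then
    echo "job $pid is still running"
fi
```

Discarding terminal output is a choice made for this sketch; in batch mode Stata writes its results to a log file named after the do-file, so nothing is lost.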
 + 
 +For more information on how to run your Stata command in the background, see [[Managing_Jobs|Managing Jobs]]
  
-==== Dynare ==== +==== Temporary Files ==== 
-Dynare can handle a large class of dynamic economic models and is especially good at solving DSGE and OLG models. Dynare is currently available on the cluster as a toolbox. You will need to modify the access path by adding the following line to your code: +By default, Stata saves tempfiles (from -tempfile- or -preserve-) to /nfs/home/$USERNAME/stata-tmp/. If you would like Stata to save temporary files in a new location (e.g. $HOME/statatmp), execute the following from the command line before starting Stata:
 <code> <code>
-addpath /usr/local/matlab/toolbox/dynare+export STATATMP=$HOME/statatmp 
 </code> </code>
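A slightly more defensive version of the step above creates the directory before exporting the variable. It assumes that Stata expects the STATATMP directory to exist already, which is worth verifying on the cluster.

```shell
#!/bin/sh
# Choose the temp directory (same example path as in the text above)
STATATMP="$HOME/statatmp"

# Create it if missing; -p makes this a no-op when it already exists
mkdir -p "$STATATMP"

# Export so that a Stata process started from this shell inherits it
export STATATMP
echo "Stata temp dir: $STATATMP"
```

Placing these lines in the job script that launches Stata keeps the setting next to the job, instead of relying on the login shell's environment.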
  
-For more information on how to use Dynare, visit the [[http://www.dynare.org/|Dynare Website]]. +One reason you might want to do this is that files are removed from /home/stata-tmp/ if they haven't been touched for a day. If you have a Stata process that runs for longer, this may cause problems with reading from tempfiles or with -restore-.
  
-==== Parallel ==== +==== Installing Extra Packages ==== 
-The cluster also has Matlab Parallel Computing Toolbox.+If you are using extra packages on your home/work computer and need them installed on the cluster, you can install them via ssc: 
  
 +<code>
 +ssc install outreg
 +</code>
  
-===== R ===== +You will then have a folder installed within your home directory called "ado", which contains your new commands filed away.
-===== SAS ===== +
-===== Stata ===== +
-===== Other Software =====+
cluster/software.1538414298.txt.gz · Last modified: 2018/10/01 17:18 by mcloughlin