
Commit 3a935a9

Convert to MD
1 parent 25147cd commit 3a935a9

23 files changed: +1517 -0 lines changed

.DS_Store

0 Bytes
Binary file not shown.

MPIuseguide.md

+138
@@ -0,0 +1,138 @@
---
highlighter: none
layout: default
title: MPI Use Guide
---


Contents
========

1. [Overview of installed MPI libraries](#overview)
2. [Using the MPI libraries to compile your code](#lib)
3. [Running your job](#run)

This page describes how to use the MPI libraries installed on the HPC
cluster.

<a name="overview"></a>

**1. Overview of Installed MPI Libraries**
======================================

There are multiple MPI libraries installed on the cluster, many compiled
in at least two ways (see below). The SLURM scheduler also has
integrated libraries for most MPI versions, which you can read about
[here](https://computing.llnl.gov/linux/slurm/mpi_guide.html).
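
A quick way to check which MPI integration types your SLURM installation supports
(assuming a standard SLURM build; the output varies by site configuration) is to ask
`srun` to list its MPI plugins:

```
[alice@service]$ srun --mpi=list
```
{:.term}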

There are ten different MPI libraries installed:

- OpenMPI version 1.6.4
- OpenMPI version 1.10.2
- OpenMPI version 2.0.1
- OpenMPI version 2.1.0
- OpenMPI version 3.1.1
- MPICH version 3.0.4
- MPICH version 3.1
- MPICH2 version 1.5
- MVAPICH2 version 1.9
- MVAPICH2 version 2.1

**IMPORTANT: If your software can use OpenMPI or MVAPICH2, these are the
recommended MPI libraries for CHTC's HPC Cluster and will perform the
fastest on the cluster's Infiniband networking.** MPICH and MPICH2 do
not use Infiniband by default and will perform more slowly than OpenMPI or
MVAPICH2, though we've configured them to work as they would on
ethernet-only clusters, so they'll still work if your software will
*only* run with MPICH or MPICH2.

Note that many of our MPI libraries have been compiled with different
base compilers, to accommodate codes that require a particular base
compiler to build successfully. The compiler used to build each MPI
library is shown in the names of our software modules (see
[below](#lib)); the base compilers include:

- `gcc` - the base system version of `gcc`
- `intel` - compiled with Intel Composer XE and Intel MPI Library
Development Kit 4.1 compilers
- `intel-2016` - compiled with the 2016 version of the Intel compilers
- `2.1.0-GCC-7.3.0-2.30` - compiled with version 7.3 of `gcc`

<a name="lib"></a>

**2. Using the MPI Libraries to Compile Your Code**
===============================================

In order to successfully compile and run your code using these MPI
libraries, you need to set a few environment variables. To set these
variables you will be using the Environment Modules package
(<http://modules.sourceforge.net>). This package is very easy to use and
it will automatically set the environment variables necessary to use
the flavor and version of MPI that you need.

First, run the following command to see the available modules:

```
[alice@service]$ module avail
```
{:.term}

When you run the above command you will receive output similar to this:

```
[alice@service]$ module avail
---------------------------------------- /etc/modulefiles ----------------------------------------
mpi/gcc/mpich-3.0.4 mpi/gcc/openmpi-1.6.4 mpi/intel/mvapich2-1.9
mpi/gcc/mvapich2-1.9 mpi/intel/mpich-3.0.4 mpi/intel/openmpi-1.6.4
matlab-r2015b compile/intel-2016 mpi/gcc/openmpi/2.1.0-GCC-7.3.0-2.30
```
{:.term}

As you can see, the MPI libraries compiled with GCC compilers are listed
under `mpi/gcc/` and the MPI libraries compiled with Intel compilers are
listed under `mpi/intel/`.

To load a module (for example, OpenMPI compiled with GCC compilers),
simply run this command:

```
[alice@service]$ module load mpi/gcc/openmpi-1.6.4
```
{:.term}

Now all necessary environment variables are set correctly and you can
go ahead and compile your code!
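
Once the module is loaded, the MPI compiler wrappers it provides (for example
`mpicc`) should be on your PATH. As a minimal sketch, assuming a C source file
(`hello_mpi.c` and `hello_mpi` below are placeholder names for your own code):

```
[alice@service]$ mpicc -o hello_mpi hello_mpi.c
```
{:.term}

Fortran code can be compiled the same way with the `mpif90` wrapper.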

If you loaded the wrong module, let's say MPICH compiled with Intel
compilers, you can unload it by running:

```
[alice@service]$ module unload mpi/intel/mpich-3.0.4
```
{:.term}

You can see what modules you already have loaded by running:

```
[alice@service]$ module list
```
{:.term}

**NOTE:** Before using any of the MPI libraries under `mpi/intel/` you
first need to load the `compile/intel` module.
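
For example, to use the Intel-compiled OpenMPI from the listing above, load the
compiler module first and then the MPI module:

```
[alice@service]$ module load compile/intel
[alice@service]$ module load mpi/intel/openmpi-1.6.4
```
{:.term}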

<a name="run"></a>


**3. Running Your Job**
===================

**NOTE:** To run your job, you need the same module loaded as when you
compiled your code. When you log out of your terminal, all loaded modules
are automatically unloaded.

In order to ensure that your job has the appropriate modules loaded when
it runs, we recommend adding the appropriate `module load` command to your
submit file. See our sample submit file in the
[HPC Use Guide](/HPCuseguide.shtml#batch-job) to see what this looks
like.
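
As a rough sketch only (the resource requests, module, and executable names below
are placeholders rather than CHTC's recommended settings; follow the sample submit
file linked above for the real thing), a SLURM submit file that loads a module
before launching an MPI program might look like this:

```
#!/bin/sh
# Placeholder resource requests -- adjust to your job's actual needs
#SBATCH --ntasks=32
#SBATCH --time=04:00:00

# Load the same MPI module that was loaded when the code was compiled
module load mpi/gcc/openmpi-1.6.4

# Launch the MPI program ("hello_mpi" is a placeholder executable name)
mpirun -n 32 ./hello_mpi
```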

Several files renamed without changes, including jobs.shtml → archive_html/jobs.shtml.

chtc-projects.md

+58
@@ -0,0 +1,58 @@
---
layout: default
title: CHTC Projects
---

<b>As part of its many services to UW-Madison and beyond, the CHTC is home to or supports the following projects:</b>

<br>

<div id = "tile-wrapper">

<a href = "http://research.cs.wisc.edu/htcondor/" class = "tile" id = "condor-tile">
<h2>HTCondor project</h2>

<img src = "images/condor.png">

<p>
The HTCondor project provides the globally-used HTCondor software that powers the CHTC's <i>HTC resources</i>. HTCondor enables the CHTC to
maximize job throughput for many simultaneous users, each with different needs. It supports
complex workflows, the ability to take checkpoints and then resume jobs, and it enables access
to additional computing resources across the globe, like the Open Science Grid.
</p>

</a>

<a href = "http://www.opensciencegrid.org" class = "tile" id = "osg-tile">
<h2>Open Science Grid</h2>

<img src = "images/osg.png">

<p>
The Open Science Grid (OSG) is a high-throughput computing infrastructure that supports science across the country.
It is an expanding alliance of more than 100 universities, national laboratories,
scientific collaborations, and software developers, all combining their computational resources with one
another for maximal throughput of large-scale computational work.
CHTC users have automatic and FREE access to OSG's considerable computing and storage resources.
</p>

</a>

<a href = "http://www.neos-server.org" class = "tile" id = "neos-tile">
<h2>NEOS Online Optimization Service</h2>

<img src = "images/neos.png">

<p>
The NEOS Online Optimization Service is a publicly available, high-throughput computing infrastructure that provides solvers for a variety of
optimization problems. It also provides background information on the field of optimization
in general, as well as case studies that provide an understanding
of optimization problems and approaches to solving them.
</p>

</a>

</div>

<p><a href="/projects.shtml">Learn more</a>
about how the CHTC has helped advance a diverse body of research on campus.</p>

chtcpolicy.md

+30
@@ -0,0 +1,30 @@
---
highlighter: none
layout: default
title: CHTC Job Policy Description
---

These policies relate to the general resources, funded by **WARF**,
to which **UW** researchers have equal access. There are
additional **OWNED** resources in the center which are made available
for opportunistic use.

- A user may recycle a slot for up to 12 hours without having to
renegotiate for the slot. No single job will be interrupted before
it has run for 24 hours. If (for any reason) a job is suspended, this
does not count against its run time.
- A user may request a whole machine for use (i.e., a whole-machine
slot). ~45 of the current 200 machines are available in this
capacity. When a user claims a machine, their job will be initially
suspended while jobs on any single-core slots complete. Like
single-core slots, no whole-machine job will be interrupted before
it has run for 24 hours.
- Groups of users are identified by their associated project. (One
user working on one project counts as one group.) This is used to
make accounting more readable.
- User Priority Factors in this pool are adjusted to attempt to avoid
a group getting a larger share of resources only because it has more
users (or logical identities) submitting jobs. This policy currently
weights each identified group equal to the weight of one individual
user.
- User priority is calculated with a 3-day half-life (see the
illustration below).
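
As a rough illustration of the half-life rule (this assumes the usual
exponential-decay reading of "half-life", as in HTCondor-style fair-share
accounting; the exact formula is not spelled out in this policy), a user's
accumulated usage $U_0$ would be discounted after $t$ days as

$$U(t) = U_0 \cdot 2^{-t/3},$$

so usage from three days ago counts half as much toward priority as usage
happening now.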

get-submit-node.md

+55
@@ -0,0 +1,55 @@
---
highlighter: none
layout: default
title: Getting a Submit Node
---


In order to submit jobs to our campus-wide collection of resources, you
will need access to a submit node. There are several options for getting
access to a submit node:

1. **[Use ours](use-submit-node.shtml).** We operate a submit node that
is shared by many researchers. This is a great way to get started
quickly, and it is sufficient if you do not need to run tens of
thousands of jobs with heavy data transfer requirements.
2. **Use your department's.** Perhaps your department already has its
own submit node, in which case you can contact your local
administrator for an account. You will still need to provide all the
info requested on the [getting started](get-started.shtml) form, so
we can set up things on our end. The benefits of using a
departmental or group submit node are: access to data on local file
systems; limited impact from other, potentially new users; and
greater scalability in the number of simultaneous jobs you can run,
as well as the amount of data you can transfer.
3. **Set up a new submit node on a server.** If you do not already have
one and need access to data on local file systems, or if you believe
that you will have a significant job and/or data volume, getting
your own submit node is probably the best way to go. Here's an
example system configuration that we've found works well for a
variety of submit workloads. You can expect to spend around
$4,000 - $5,000 for such a system.

**Typical submit node configuration**

- A 1U rack-mount enclosure, like a Dell PowerEdge 410.
- Two processors with 12 cores total, for example Intel Xeon
E5645, 2.4GHz 6-core processors.
- 24GB of 1.3 GHz RAM.
- Two drives for the operating system. 500GB each is enough. You
can use mirroring or a RAID configuration like RAID-6 for
reliability.
- Two or more 2-3TB drives for data, depending on your needs.

4. **Use your desktop.** Depending on your department's level of
system administration support, you may be able to have HTCondor
installed on your desktop and configured to submit into our campus
resources. Another option that is under development is
[Bosco](https://twiki.grid.iu.edu/bin/view/CampusGrids/BoSCO), a
user-installable software package that lets you submit jobs into
resources managed by HTCondor, PBS or SGE.

Still not sure which option is right for you? No worries. This is one of
the topics we discuss in our initial consultation. To schedule an
initial consultation, fill out our [getting started](get-started.shtml)
form.

0 commit comments

Comments
 (0)