Commit 1d64f878 authored by Mani Tofigh's avatar Mani Tofigh


1) Deleted Stallo's env module page and added our own completed version under the Software section. 2) Deleted the Guides folder and moved the Slurm doc under the Jobs section (will further have to incorporate both versions).
parent 891dc118
@@ -3,3 +3,4 @@ _site
.sass-cache
Gemfile.lock
.DS_Store
@@ -3,3 +3,5 @@ source "https://rubygems.org"
gem "jekyll-rtd-theme", "~> 2.0.10"
gem "github-pages", group: :jekyll_plugins
gem "webrick"
---
sort: 2
---
# Account
---
sort: 100
---
# Guides
{% include list.liquid all=true %}
# I'm folder2
source: `{{ page.path }}`
# file1
source: `{{ page.path }}`
# file2
source: `{{ page.path }}`
# file3
source: `{{ page.path }}`
# I'm folder1
source: `{{ page.path }}`
# file1
source: `{{ page.path }}`
# file2
source: `{{ page.path }}`
# file3
source: `{{ page.path }}`
---
sort: 1
---
# Getting Help
---
sort: 3
---
# Slurm
Slurm Workload Manager, or SLURM (Simple Linux Utility for Resource Management), is a free and open-source job scheduler for managing workloads on Linux and Unix-based clusters, grids, and supercomputers. Slurm is widely used in high-performance computing (HPC) environments, where it manages the allocation of resources such as CPU time, memory, and storage across a large number of compute nodes. Slurm provides tools for users to submit, monitor, and control the execution of their jobs. Other key features include support for parallel and serial job execution, job dependencies and job arrays, resource reservations and QoS (quality of service), and job priority and backfilling. Slurm has a modular design that makes it highly configurable, so it can be tailored to a wide variety of environments. It is used widely in academia and industry, and is supported by a large and active community of users and developers.
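As a brief illustration of the submission workflow (the job name and resource requests below are placeholders; actual partitions, limits, and defaults depend on the cluster configuration), a minimal Slurm batch script might look like this:
```bash
#!/bin/bash
#SBATCH --job-name=hello        # a short name for the job
#SBATCH --ntasks=1              # run a single task
#SBATCH --time=00:10:00         # wall-clock limit of ten minutes
#SBATCH --mem=1G                # memory requested for the job

echo "Running on $(hostname)"
```
Such a script would typically be submitted with `sbatch`, monitored with `squeue`, and cancelled with `scancel`.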
# Environment modules
## Introduction to Environment Modules
An HPC cluster hosts a large and diverse collection of software, often with several versions of the same application installed side by side. These applications are frequently installed in non-standard locations for ease of maintenance, practicality, and security reasons. Because the cluster is shared and operates at a much larger scale than a typical desktop machine, it is neither possible nor desirable to expose all of these software versions at once, as conflicts between them may arise.

To manage this complexity, we provide the production environment for each application outside of the application itself, through a set of instructions and variable settings known as an application module. This approach not only prevents conflicts but also simplifies control over which application versions are available for use in any given session. We use the `lmod` module system for this purpose, with the `module` command being the primary tool for managing these software environments.
For example, if a user needs to work with a specific Python environment provided by Anaconda, they can simply load the Anaconda module by executing `module load anaconda`.
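In practice (the exact module name, such as `anaconda` or `anaconda3`, depends on what is installed on the cluster), that might look like:
```bash
# Load the Anaconda module (the name may differ on your cluster)
module load anaconda
# The python interpreter found first on PATH should now be the Anaconda one
which python
```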
This is just one instance of how the `module` command can be used. The following sections discuss the `module` command and its use cases in more detail.
For a complete list of options with the `module` command:
```bash
[me@cluster ~]$ module --help
```
## Loading and Managing Modules
### Checking Loaded Modules
To see the modules currently active in your session:
```bash
module list
```
### Listing Available Modules
To view all available modules:
```bash
module avail
```
The list will include both local modules (specific to the node or head node) and shared modules (available from shared storage).
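On most module systems, `module avail` also accepts a name (or partial name) to narrow the output, which helps when the full listing is long:
```bash
# Show only modules whose names match "gcc"
module avail gcc
```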
### Loading a Module
To load a module, for example, `gcc`:
```bash
module load gcc
```
To load a specific version of a module:
```bash
module load gcc/11.2.0
```
### Unloading Modules
To unload a module:
```bash
module unload gcc
```
### Switching Module Versions
To switch to a different version of a module:
```bash
module switch intel intel/2016b
```
### Avoiding Module Conflicts
Be aware of potential conflicts, especially with MPI modules. Loading conflicting modules like `openmpi` and `impi` simultaneously should be avoided.
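One way to avoid this, sketched below with the module names mentioned above (actual names depend on the cluster), is to unload whichever MPI module is currently loaded before loading the other:
```bash
# Ensure only one MPI implementation is active at a time
module unload openmpi
module load impi
```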
### Using the `shared` Module
The `shared` module provides access to shared libraries and is often a dependency for other modules. It is typically loaded first:
```bash
module load shared
```
### Setting Default Modules
To avoid manually loading the same modules at every login, users can set an initial default state for modules using the `module init*` subcommands:
* **Add a module to initial state**: `module initadd <module_name>`
* **Remove a module from initial state**: `module initrm <module_name>`
* **List initial modules**: `module initlist`
* **Clear initial modules**: `module initclear`
Example:
```bash
module initclear
module initadd shared gcc openmpi/gcc
```
### Available Commands and Practical Usage
For practical use of the module commands:
* **Loading and unloading modules**: `module load <module_name>`, `module unload <module_name>`
* **Listing loaded and available modules**: `module list`, `module avail`
* **Switching modules**: `module switch <old_module> <new_module>`
* **Finding out what a module does**: `module whatis <module_name>`
Example of loading modules:
```bash
[fred@cluster ~]$ module load shared gcc openmpi/gcc
```
> Tab completion is available to suggest module names for the add/load commands.
Example of unloading modules:
```bash
[fred@cluster ~]$ module unload gcc openmpi/gcc
```
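Example of finding out what a module does (the one-line description comes from the modulefile, so the exact wording varies):
```bash
[fred@cluster ~]$ module whatis gcc
```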
### Managing the Default Environment
Users can customize their default environment using the `module init*` commands. This ensures the desired modules are automatically loaded at login.
Example:
```bash
[fred@cluster ~]$ module initclear
[fred@cluster ~]$ module initadd shared gcc openmpi/gcc
[fred@cluster ~]$ module initlist
```
## Additional Information
* **Conflicts and Dependencies**: Users should be mindful of conflicts between modules and dependencies, particularly with MPI implementations.
* **Testing New Software Versions**: Modules allow new software versions to be tested easily without a permanent installation; a short sketch follows below.
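A minimal sketch of that version-testing workflow, assuming a hypothetical newer `gcc/12.2.0` module is installed alongside the default `gcc` module:
```bash
# Switch the current session to the newer compiler (version is illustrative)
module switch gcc gcc/12.2.0
# ... build and run tests against the new version ...
# Switch back to the default version when finished
module switch gcc/12.2.0 gcc
```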
For further details, users are encouraged to refer to the man pages for `module` and `modulefile`:
```bash
man module
man modulefile   # describes the modulefile format, where available
```
---
sort: 1
---
# Star cluster
{% include list.liquid all=true %}