Commit 8bbd1f17 authored by Mani Tofigh

Merge remote-tracking branch 'origin/master'

parents b675c22c 520c5632
root = true
[*]
end_of_line = lf # or cr if that's the format you want
_site
.sass-cache
Gemfile.lock
.bundle
.DS_Store
Code examples are provided under the [MIT](https://opensource.org/licenses/MIT) license.
### Install build tools and dependencies.
<details>
<summary>Liquid 4.0.3</summary>
> [!WARNING]
> If you have the latest dependencies installed, the following does not apply anymore.
>
> The original `jekyll-rtd-theme` 2.0.10 required `github-pages` 209, which effectively capped the version of Liquid to 4.0.3.
> Due to Liquid 4.0.3 and older not being updated to work with Ruby 3.2.x, Ruby 3.1.x or older was required for Liquid 4.0.3.
> https://talk.jekyllrb.com/t/liquid-4-0-3-tainted/7946/18
>
> #### With Cygwin
> As of 8/8/2024, Cygwin provided Ruby versions 2.6.4-1 and 3.2.2-2. You would need to make sure to install the former. As the version of bundler supplied with Ruby 2.6 is too old and the version of RubyGems is too new, the correct versions of RubyGems and bundler would need to be installed manually after installing all the other dependencies:
> ```
> gem update --system 3.2.3
> gem install bundler -v 2.1.4
> bundler -v
> ```
</details>
To allow building of native extensions, install `ruby-devel`, `gcc`, and `make`.
Install `libxml2`, `libxml2-devel`, `libxslt`, `libxslt-devel`, `libiconv`,
plugins:
- jemoji
- jekyll-avatar
- jekyll-mentions
# Contact Us
If you need help, please file a support request via the provided forms at our [Issue Tracker](https://github.com/StarHPC/Issues/issues/new/choose).
If you need to contact us directly, you can reach us at <support@starhpc.hofstra.io>, and our team will try to assist you as soon as possible.
<!-- ![rtfm]({{ site.baseurl }}/help/rtfm.png "rtfm") -->
<!-- ![rtfm]({{ site.baseurl }}/help/rtfm2.png "rtfm2") -->
<!-- ![rtfm]({{ site.baseurl }}/help/rtfm3.png "rtfm3") -->
![rtfm]({{ site.baseurl }}/help/rtfm4.png "rtfm4")
### I forgot my password - what now?
{% comment %}You can reset it here: [link to be provided]{% endcomment %}
Please contact the [support team]({{site.baseurl}}{% link help/contact.md %}).
### How do I change my password on Star?
You can run the `passwd` command on the login node to change your password. Please note that `passwd` has no effect when run from the compute nodes.
{% comment %}A web portal is currently under development. Once launched, your password can also be changed from the password reset page, [link to be provided]. Log in using your username on Star.{% endcomment %}
### What is the ssh key fingerprint for star.hofstra.edu?
## Never send support requests directly to staff members
Please do not contact your cluster administrator directly. Instead, visit the [Issue Tracker](https://github.com/StarHPC/Issues) first, as different forms are provided for various types of queries.
## Please do not treat us as "Let me Google that for you" assistants
png, tiff, etc.) of what you saw on your monitor. From these, we would be
unable to copy and paste commands or error messages, unnecessarily slowing
down our research on your problem.
## New problem == new ticket
Do not send support requests by replying to unrelated issues. Every
issue gets a number and this is the number that you see in the subject
know the solution but sometimes we don't know the problem.
In short (quoting from <http://xyproblem.info>):
- User wants to do X.
- User doesn't know how to do X, but thinks they can fumble their way
to a solution if they can just manage to do Y.
- User doesn't know how to do Y either.
- User asks for help with Y.
- Others try to help user with Y, but are confused because Y seems
like a strange problem to want to solve.
- After much interaction and wasted time, it finally becomes clear
that the user really wants help with X, and that Y wasn't even a
suitable solution for X.
To avoid the XY problem, if you struggle with Y but really what you are
after is X, please also tell us about X. Tell us what you really want to
The better you describe the problem the less we have to guess and ask.
Sometimes, just seeing the actual error message is enough to give a
useful answer. For all but the simplest cases, you will need to make the
problem reproducible, which you should _always_ try anyway. See the
following points.
## Complex cases: Create an example which reproduces the problem
---
sort: 2
---
# Creating Jobs
This page is mainly dedicated to examples of different job types. For a more comprehensive explanation on different job types, please refer to `/jobs/Overview.html`.
## Batch jobs (Non-interactive)
Batch jobs allow users to execute tasks without direct interaction with the computing environment. Jobs are written as scripts that consist of the commands to be executed and a specification of the requested resources.
### BATCH directives
BATCH directives are essentially instructions embedded at the beginning of a batch job script and are interpreted by the scheduler (like Slurm in our case). These lines are prefixed with `#SBATCH` for Slurm and inform the scheduler about the resources needed for the job and any other execution preferences.
Here is a list of common directives: <br>
* `#SBATCH --nodes=<some-value>`: Requests a specific number of nodes to run your job on.
* `#SBATCH --mem=<some-value>`: Specifies the amount of RAM required.
* `#SBATCH --time=<some-value>`: Sets the maximum runtime.
* `#SBATCH --output=<some-value>`: Directs the job's output to a specific file.
**Note:** These bullets are just to give a basic understanding of the topic. Complete examples and line-by-line explanations are provided further down this page.
### Queues and partitions
Queues (or partitions in Slurm terminology) are categories within the cluster that organize jobs based on their resource requirements, priority, and other factors.
Our cluster is partitioned into the following categories: <br>
* **Standard Partition (future_partition_name)**: For general-purpose jobs with moderate resource requirements.
* **High-Memory Partition (future_partition_name)**: For jobs requiring significant amounts of memory.
* **GPU Partition (future_partition_name)**: For jobs that need GPU resources, such as gpu1 and gpu2 in our setup.
Choosing the right partition ensures your job is queued in an environment suited to its needs, and can potentially reduce wait times.
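Once partition names are finalized, you can list the available partitions and request one explicitly. The sketch below is only illustrative; `future_partition_name` is the placeholder used above, not an actual partition on the cluster:
```bash
# List the partitions that currently exist on the cluster, with their limits and state
sinfo -s

# Inside a batch script, request a specific partition by name (placeholder shown here):
#   #SBATCH --partition=future_partition_name
```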
### Basic batch job example
In this example we are going to run a Python program with specified resource limits through our batch script. <br>
Create two files, one containing your `.sbatch` script and the other containing your `.py` program: <br>
I'm going to name them `my_script.sbatch` and `my_script.py`. <br>
Then add the following to `my_script.sbatch`:
```bash
#!/bin/bash
#SBATCH --job-name=test_job
#SBATCH --output=test_job.out
#SBATCH --error=test_job.err
#SBATCH --nodes=1
#SBATCH --time=10:00
#SBATCH --mem=1G
module load python3
python3 my_script.py
```
And add a simple `print` statement to `my_script.py`:
```python
print("Hello World!")
```
**NOTE:** In this example it's assumed that both your batch and Python script files are in the same directory. If that is not the case, please make sure you provide the full path (starting from the root directory `/`) to your Python file inside your script. For example:
```bash
python3 /path/to/python_script/my_script.py
```
Now let's walk through `my_script.sbatch` line by line to see what each directive does.
* `#!/bin/bash`: This line needs to be included at the start of **all** your batch scripts. It specifies that the script should be run with the `bash` shell.
Lines 2-7 are your `SBATCH` directives. These lines are where you specify different options for your job, including its name, the path/name of its output and error files, the list of nodes you want to use, resource limits, and more if required. Let's walk through them line by line:
* `#SBATCH --job-name=test_job`: This directive gives your job a name that you can later use to more easily track and manage your job when looking for it in the queue. In this example, we've called it `test_job`. You can read about job management at `/software/env-modules.html`.
* `#SBATCH --output=test_job.out`: Used to specify where your output file is generated, and what it's going to be named. In this example, we have not provided a path, but only provided a name. When you use the `--output` directive without specifying a full path, just providing a filename, Slurm will store the output file in the current working directory from which the `sbatch` command was executed.
* `#SBATCH --error=test_job.err`: Functions similarly to `--output`, except it contains any error messages generated during the execution of your job. **The `.err` file is always generated even if your job executes successfully; however, it will be empty if there are no errors.**
* `#SBATCH --nodes=1`: Requests one available node for your job. This directive basically tells the scheduler, "Run my job on any available node you find; I don't care which one." **It's also possible to specify the name of the node(s) you'd like to use, which we will cover in later examples.**
* `#SBATCH --time=10:00`: This line specifies how long you want your job to run after it leaves the queue and starts execution. In this case, the job will be **terminated** after 10 minutes. Acceptable time formats include `mm`, `mm:ss`, `hh:mm:ss`, `days-hh`, `days-hh:mm` and `days-hh:mm:ss`.
* `#SBATCH --mem=1G`: Specifies the maximum main memory required *per* node. In this case we set the cap to 1 gigabyte. If you don't use a memory unit, Slurm assumes megabytes: `#SBATCH --mem=4096` requests 4096 MB of RAM. **If you want to request all the memory on a node, you can use `--mem=0`.**
After the last `#SBATCH` directive, commands are run just like in any other regular shell script.
* `module load python3`: Loads necessary files and modules in order for the command `python3` to be valid when used. Please refer to `/software/env-modules.html` for more detail on how the command `module` works.
* `python3 my_script.py`: Just like any other `python3` command, this line runs the `my_script.py` file using Python. **The output and/or errors of this operation are later written to the files we specified in our directives.**
#### Submit the job
This script, as discussed previously, is a non-interactive job. Non-interactive jobs are submitted to the queue with the use of the `sbatch` command. In this case, we submit our job using `sbatch my_script.sbatch`.
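After submitting, you can quickly confirm that the job was accepted and inspect its output once it finishes. A minimal sketch, reusing the file names from the example above:
```bash
sbatch my_script.sbatch   # prints something like: Submitted batch job <jobid>
squeue -u $USER           # check whether the job is pending (PD) or running (R)
cat test_job.out          # after completion, should contain "Hello World!"
cat test_job.err          # should be empty if the job ran without errors
```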
### Jupyter Notebook batch job example
As you know, there is no Graphical User Interface (GUI) available when you connect to the cluster through your shell, so in order to access some application's GUI, port forwarding is necessary [(What is SSH port forwarding?)](https://www.youtube.com/watch?v=x1yQF1789cE&ab_channel=TonyTeachesTech). In this example, we will do port forwarding to access Jupyter Notebook's web portal. You will basically send and receive your data through a specified port on your local machine that is tunneled to the port on the cluster where the Jupyter Notebook server is running. This setup enables you to work with Jupyter Notebooks as if they were running locally on your machine, despite actually being executed on a remote cluster node. After a successful setup, you can access Jupyter's portal **on your local machine** through your browser of choice, via a link generated by Jupyter.
First, create your sbatch script file. I'm going to call mine `jupyterTest.sbatch`. Then add the following to it:
```bash
#!/bin/bash
#SBATCH --nodelist=cn01
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --time=00:30:00
#SBATCH --job-name=jupyterTest1
#SBATCH --output=/home/mani/outputs/jupyterTest1.out
#SBATCH --error=/home/mani/outputs/jupyterTest1.err
# get tunneling info
XDG_RUNTIME_DIR=""
node=$(hostname -s)
user=$(whoami)
port=9001
# print tunneling instructions to jupyterTest1
echo -e "
Use the following command to set up ssh tunneling:
ssh -p5010 -N -f -L ${port}:${node}:${port} ${user}@binary.star.hofstra.edu"
module load jupyter
jupyter notebook --no-browser --port=${port} --ip=${node}
```
Replace the output and error paths (`/home/mani/outputs/` in this script) with your actual directory path for storing output and error files.
First, let's take a look at what the new directives and commands in this script do.
Note that most of the directives at the start of this script have previously been discussed in the "Basic batch job example" section, so we are only going to discuss the new ones:
* `--nodelist=cn01`: Using `--nodelist` you can specify the exact name(s) of the node(s) you want your job to run on. In this case, we have specified it to be `cn01`.
* `--ntasks=1`: This directive tells SLURM to allocate resources for one task. A "task" in this context is essentially an instance of your application or script running on the cluster. For many applications, especially those that don't explicitly parallelize their workload across multiple CPUs or nodes, specifying a single task is sufficient. However, if you're running applications that can benefit from parallel execution, you might increase this number. This directive is crucial for optimizing resource usage based on the specific needs of your job. For instance, running multiple independent instances of a data analysis script on different subsets of your data could be a scenario where increasing the number of tasks is beneficial.
* `--cpus-per-task=1`: This sets the number of CPUs allocated to each task specified by `--ntasks`. By default, setting it to 1 assigns one CPU to your task, which is fine for tasks that are not CPU-intensive or designed to run on a single thread. However, for applications that are multi-threaded and can utilize more than one CPU core for processing, you would increase this value to match the application's capability to parallelize its workload.
* The variable initializations such as `node=...`, `user=...` are used to retrieve some information from the node you are running your job on to produce the right command for you to later run **locally**, and set up the SSH tunnel. You don't need to worry about these.
* The `echo` command is going to write the ssh tunneling command to your `.out` file with the help of the variables. We will explain how to use that generated command further below.
* `module load jupyter`: Loads the required modules to add support for the command `jupyter`.
* `jupyter notebook --no-browser --port=${port} --ip=${node}` runs a jupyter notebook and makes it listen on our specified port and address to later be accessible through your local machine's browser.
Then, submit your Batch job using `sbatch jupyterTest.sbatch`. Make sure to replace `jupyterTest.sbatch` with whatever file name and extension you choose.
At this stage, if you go and read the content of `jupyterTest.out`, there is a generated command that should look like the following:
```bash
ssh -p5010 -N -f -L 9001:cn01:9001 <your-username>@binary.star.hofstra.edu
```
Copy that line and run it in your local machine's command line. Then, enter your login credentials for `binary` and hit enter. You should not expect anything magical to happen. In fact, if everything is successful, your shell will simply move to a new line without generating any output.
You can now access Jupyter's GUI through a browser of your choice on your local machine, at the address that Jupyter Notebook has generated for you. For some reason, Jupyter writes the address to `stderr`, so you need to look for it inside your `jupyterTest.err` file. Inside that file, there should be a line containing a link similar to the following:
```bash
http://127.0.0.1:9001/?token=...(your token is here)...
```
Copy that address and paste it into your browser, and you should successfully reach Jupyter's GUI.
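If you prefer not to scroll through the error file by hand, you can pull the link out with a quick search. This is just a convenience sketch and assumes Jupyter prints the URL in the usual `http://127.0.0.1:<port>/?token=...` form:
```bash
# Print only the line(s) containing the local access URL, token included
grep -o "http://127\.0\.0\.1:[0-9]*/?token=[A-Za-z0-9]*" jupyterTest.err
```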
### Apptainer TensorFlow batch job example
This example shows how to execute a TensorFlow script, `tfTest.py`, that trains a simple neural network on the MNIST dataset using GPUs.
First, create a Python script called `tfTest.py` with the provided content:
```python
import tensorflow as tf
physical_devices = tf.config.list_physical_devices(device_type=None)
print("Num of Devices:", len(physical_devices))
print("Devices:\n", physical_devices)
print("Tensorflow version information:\n",tf.__version__)
print("begin test...")
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
model = tf.keras.models.Sequential([
tf.keras.layers.Flatten(input_shape=(28, 28)),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Dense(10)
])
predictions = model(x_train[:1]).numpy()
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
loss_fn(y_train[:1], predictions).numpy()
model.compile(optimizer='adam',
loss=loss_fn,
metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10)
model.evaluate(x_test, y_test, verbose=2)
```
Next, create a SLURM batch job script named `job-test-nv-tf.sbatch`. This script requests GPU resources, loads necessary modules, and runs your TensorFlow script inside an Apptainer container:
```bash
#!/bin/bash
#SBATCH --job-name=tensorflow_test_job
#SBATCH --output=result.txt
#SBATCH --nodelist=gpu1
#SBATCH --gres=gpu:A100:2
#SBATCH --ntasks=1
#SBATCH --time=10:00
#SBATCH --mem-per-cpu=1000
module load python3
module load apptainer
echo "run Apptainer TensorFlow GPU"
apptainer run --nv tensorflowGPU.sif python3 tfTest.py
```
This script runs the `tfTest.py` script inside the TensorFlow GPU container (`tensorflowGPU.sif`).
You can now submit your job to Slurm using `sbatch job-test-nv-tf.sbatch`.
After the job completes, you can check the output in `result.txt`. The output should include information about the available physical devices (GPUs), the TensorFlow version, and the output from training the model on the MNIST dataset.
The beginning and end of the file might look something like this:
```text
run Apptainer TensorFlow GPU
Num of Devices: X
Devices:
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'), PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU'), ...]
Tensorflow version information:
X.XX.X
begin test...
...
313/313 - 0s - loss: X.XXXX - accuracy: 0.XXXX
```
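Before running the full training job, it can be useful to confirm that the container actually sees the GPUs. A minimal check, assuming the same `apptainer` module and `tensorflowGPU.sif` image as above, and that `nvidia-smi` is available on the GPU node:
```bash
# Get an interactive shell on the GPU node with two A100s allocated
srun --nodelist=gpu1 --gres=gpu:A100:2 --pty /bin/bash

module load apptainer
apptainer exec --nv tensorflowGPU.sif nvidia-smi   # should list the allocated GPUs
```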
## Interactive jobs
### Starting an Interactive job
To start an interactive job, you use the `srun` command with specific parameters that define your job's resource requirements. Here's an example:
```bash
srun --pty --nodelist=cn01 --ntasks=1 --cpus-per-task=1 --time=01:00:00 --mem=4G /bin/bash
```
Here's a breakdown of what each part of this command does:
**Note:** All of these options except the command `srun` and the option `--pty` are discussed in great detail under the "Batch jobs" section of this page. Refer back to them if you need a more comprehensive explanation.
* `srun`: The command used to start an interactive session on the cluster.
* `--pty`: Requests a pseudo-terminal, which is necessary for interactive sessions.
* `--nodelist=cn01`: Specifies the node on which to run the interactive session. Here, cn01 is used as an example.
* `--ntasks=1`: Allocates resources for one task.
* `--cpus-per-task=1`: Assigns one CPU to the task. This is enough for tasks that don't require parallel processing across multiple CPUs.
* `--time=01:00:00`: Sets the maximum duration of the interactive session to 1 hour.
* `--mem=4G`: Specifies that the job requires 4 gigabytes of memory.
* `/bin/bash`: After allocating the requested resources, srun starts a Bash shell in the interactive session.
When your job leaves the queue, you will see your shell prompt change, as you receive a shell on the allocated compute node. This shift means that you're now directly interacting with the HPC environment through an interactive session and can execute commands and run applications using the cluster's computational resources in real time.
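Inside the interactive session you work exactly as you would in a normal shell, except that everything runs on the allocated compute node. A short sketch reusing the earlier Python example (assumes `my_script.py` is in your current directory):
```bash
module load python3
python3 my_script.py   # runs on the compute node; output appears directly in your terminal
exit                   # ends the interactive session and releases the allocation
```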
### Additional `srun` Options for Interactive jobs
**Specifying a Partition**: If your job has specific resource requirements, you may want to specify a partition that matches those needs.
```bash
--partition=your_partition_name
```
**Allocating GPUs**: For tasks requiring GPU acceleration, you can request specific GPUs.
```bash
--gres=gpu:2
```
This option requests 2 GPUs for your job. Adjust the number to your task's requirements, within the supported limits.
**Exclusive Node Access**: To ensure that no other jobs share your allocated node, you can request exclusive access.
```bash
--exclusive
```
**Memory per CPU**: If your job requires a specific amount of memory per CPU, this can be specified.
```bash
--mem-per-cpu=4G
```
This option requests 4 GB of memory per allocated CPU.
**Quality of Service (QoS)**: For prioritizing jobs, you can specify the Quality of Service.
```bash
--qos=your_qos
```
**Email Notifications**: Stay informed about your job's status with email notifications.
```bash
--mail-type=ALL
--mail-user=your_email@example.com
```
This configuration sends an email to the specified address at the start, completion, and failure of the job.
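These options can be combined into a single `srun` call. The sketch below is illustrative only; the partition name, QoS, and email address are placeholders, and the GPU request assumes a node with available GPUs:
```bash
srun --pty \
     --partition=your_partition_name \
     --gres=gpu:1 \
     --mem-per-cpu=4G \
     --qos=your_qos \
     --mail-type=ALL --mail-user=your_email@example.com \
     --time=01:00:00 \
     /bin/bash
```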
## Array jobs
To submit an array job, you use the `--array` option as part of your `sbatch` script. This option specifies a range of indices that SLURM uses to create multiple tasks from a single job submission. Each task in the array is assigned a unique `SLURM_ARRAY_TASK_ID` that can be used within your scripts to differentiate between them.
### Array job example
Suppose you have a dataset split into multiple files and you want to process each file independently. Instead of submitting a separate job for each file, you can submit a single array job where each task processes a different file.
Now, for a simple, low-resource task that you can run as part of an array job, let's create a Python script that generates a basic report based on the `SLURM_ARRAY_TASK_ID`. This script will read from a specific input file based on the task ID, perform a simple operation (like counting the number of lines or words), and output the results to a file.
First, let's create your input files. You can make these as simple text files with a few lines of content. For example:
`input1.txt`, `input2.txt`, and `input3.txt`.
Each file could contain a few lines of arbitrary text. You can create these files manually or use the command line:
```bash
echo "This is a simple file." > input1.txt
echo "This file contains\nseveral lines of text." > input2.txt
echo "Each file will have\na different\nnumber of lines." > input3.txt
```
Now here's the content of the Python script, `process_data.py`, which reads from one of these input files based on the `SLURM_ARRAY_TASK_ID` and counts the number of lines:
```python
import sys
# Get the task ID from the command line arguments
task_id = sys.argv[1]
# Construct the filename based on the task ID
filename = f"input{task_id}.txt"
# Try to open and read the file
try:
with open(filename, 'r') as file:
lines = file.readlines()
num_lines = len(lines)
# Output the results
output_filename = f"output{task_id}.txt"
with open(output_filename, 'w') as outfile:
outfile.write(f"File: {filename}\nNumber of lines: {num_lines}\n")
print(f"Processed {filename} successfully.")
except FileNotFoundError:
print(f"File {filename} not found.")
```
This script takes an argument from the command line (expected to be the `SLURM_ARRAY_TASK_ID`), constructs a filename from this ID, reads the corresponding input file, counts its lines, and writes the count to an output file. If the input file is missing, it prints a message instead of crashing.
Finally, to run this script as part of an array job on 3 files, adjust the `--array` option in your SLURM script (`process_array.sbatch`) to `1-3`.
```bash
#!/bin/bash
#SBATCH --job-name=array_job
#SBATCH --output=array_job_%A_%a.out
#SBATCH --error=array_job_%A_%a.err
#SBATCH --array=1-3
#SBATCH --mem=1G
#SBATCH --time=00:10:00
module load python3
python3 process_data.py $SLURM_ARRAY_TASK_ID
```
In the context of SLURM job submission scripts, %A and %a are special placeholders used within directives like --output and --error to dynamically generate filenames based on the job's array ID and the individual task ID within the array. Here's what each placeholder represents:
* `%A`: This placeholder is replaced by the SLURM job array's ID. The job array ID is a unique identifier assigned by SLURM to the entire array job at the time of submission. It helps you group and identify all tasks belonging to the same array job.
* `%a`: This placeholder is substituted with the specific task ID within the job array. Since an array job consists of multiple tasks, each with a unique task ID (determined by the `--array` option when the job is submitted), `%a` allows you to create distinct output or error files for each task, making it easier to troubleshoot and analyze the results of individual tasks.
For example, if you submit an array job with the --array=1-10 option and use the following in your script:
```bash
#SBATCH --output=job_output_%A_%a.out
#SBATCH --error=job_error_%A_%a.err
```
SLURM will create separate output and error files for each of the ten tasks in the array. If the array job's ID is 12345, the files for the first task will be named job_output_12345_1.out and job_error_12345_1.err, the files for the second task will be job_output_12345_2.out and job_error_12345_2.err, and so on.
Now submit this job using `sbatch process_array.sbatch`. You should see six Slurm log files (three ending in `.out` and three in `.err`), along with the `output<N>.txt` files written by the script. Each `.out` file contains the message printed by the corresponding task, and the `.err` files are expected to be empty if everything ran smoothly.
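Once the array finishes, you can look over all the per-task files at once. A small sketch, assuming the array was assigned the job ID 12345 as in the naming example above:
```bash
cat output{1..3}.txt                               # line counts written by each task
ls -l array_job_12345_*.out array_job_12345_*.err  # per-task Slurm log files
cat array_job_12345_*.err                          # no output expected if all tasks succeeded
```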
## Parallel jobs
### Open MPI job example
First, you need to create your MPI program. In this example, we are calling it `mpi_hello_world.c`. This program initializes the MPI environment, gets the rank of each process, the total number of processes, and the name of the processor, then prints a greeting from each process.
```c
#include <mpi.h>
#include <stdio.h>
int main(int argc, char** argv) {
MPI_Init(NULL, NULL);
int PID;
MPI_Comm_rank(MPI_COMM_WORLD, &PID);
int number_of_processes;
MPI_Comm_size(MPI_COMM_WORLD, &number_of_processes);
char processor_name[MPI_MAX_PROCESSOR_NAME];
int name_length;
MPI_Get_processor_name(processor_name, &name_length);
printf("Hello MPI user: from process PID %d out of %d processes on machine %s\n", PID, number_of_processes, processor_name);
MPI_Finalize();
return 0;
}
```
Next, create the compilation and execution script. <br>
Create a Bash script named mpi_hello_world.sh to compile and run the MPI program. This script takes a parameter for the number of processes to spawn.
```bash
#!/bin/bash
SRC=mpi_hello_world.c
OBJ=mpi_hello_world
NUM=$1
mpicc -o $OBJ $SRC
mpirun -n $NUM ./$OBJ
```
This script compiles the MPI program using `mpicc` and runs it with `mpirun`, and specifies the number of processes with `-n`.
Next, prepare a SLURM batch job script named `job-test-mpi.sbatch` to submit your MPI job. This script requests cluster resources and runs your MPI program through `mpi_hello_world.sh`:
```bash
#!/bin/bash
#SBATCH --job-name=mpi_job_test
#SBATCH --output=result.txt
#SBATCH --error=error.txt
#SBATCH --nodelist=gpu1,gpu2,cn01
#SBATCH --time=10:00
#SBATCH --mem-per-cpu=1000
module load openmpi4
echo "run mpi program using parallel processes"
sh mpi_hello_world.sh $1
```
This script sets up a job with the name `mpi_job_test`, specifies output and error files, requests resources (all three nodes of the cluster), and loads the OpenMPI module. It then runs the `mpi_hello_world.sh` script and passes the number of processes as an argument.
In the end, submit your parallel MPI job to SLURM using the sbatch command `sbatch job-test-mpi.sbatch`, specifying the desired number of parallel processes with `-n`. For example, to run with 8 parallel processes:
```bash
sbatch -n 8 job-test-mpi.sbatch 8
```
After the job completes, check the output in `result.txt`. You should see greetings from each MPI process. The content might look something like this, shortened with `...` for brevity:
```bash
Hello MPI user: from process PID 0 out of 8 processes on machine gpu1
...
Hello MPI user: from process PID 7 out of 8 processes on machine cn01
```
---
sort: 3
---
# Monitoring Jobs
Here you can see how to manage and monitor your jobs on our HPC cluster. Whether you're running batch jobs, interactive sessions, or array jobs, these tools and commands will help you keep track of your work and manage your resources.
## Checking Job Status
### Using `squeue`
The `squeue` command is possibly your most common tool for viewing the status of jobs in the queue. Here's a basic usage:
```bash
squeue
```
Sample output:
```bash
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
1234 batch my_job jsmith R 5:23 1 cn01
1235 batch array_job jdoe R 2:45 1 cn02
1236 gpu gpu_task asmith PD 0:00 1 (Resources)
```
To see **only** your jobs:
```bash
squeue -u your_username
```
To see jobs on a specific partition:
```bash
squeue -p partition_name
```
### Job States
These are common job states that you might see under the `ST` column of `squeue`'s output:
- R: Running
- PD: Pending
- CG: Completing
- CD: Completed
- F: Failed
- TO: Timeout
- CA: Cancelled
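You can also filter the queue by these states. For example, to list only your pending or running jobs (a minimal sketch):
```bash
squeue -u your_username -t PENDING   # jobs still waiting for resources
squeue -u your_username -t RUNNING   # jobs currently executing
```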
## Detailed Job Information
### Using `scontrol`
To get detailed information about a specific job:
```bash
scontrol show job job_id
```
Sample output:
```bash
JobId=1234 JobName=my_job
UserId=jsmith(1001) GroupId=users(1001) MCS_label=N/A
Priority=4294901758 Nice=0 Account=default QOS=normal
JobState=RUNNING Reason=None Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
RunTime=00:10:12 TimeLimit=01:00:00 TimeMin=N/A
SubmitTime=2023-06-01T10:00:00 EligibleTime=2023-06-01T10:00:00
AccrueTime=2023-06-01T10:00:00
StartTime=2023-06-01T10:05:00 EndTime=2023-06-01T11:05:00 Deadline=N/A
PreemptEligibleTime=2023-06-01T10:05:00 PreemptTime=None
SuspendTime=None SecsPreSuspend=0 LastSchedEval=2023-06-01T10:05:00
Partition=batch AllocNode:Sid=login01:12345
ReqNodeList=(null) ExcNodeList=(null)
NodeList=cn01
BatchHost=cn01
NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
TRES=cpu=1,mem=4G,node=1,billing=1
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
MinCPUsNode=1 MinMemoryNode=4G MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
Command=/home/jsmith/my_script.sh
WorkDir=/home/jsmith
StdErr=/home/jsmith/my_job.err
StdIn=/dev/null
StdOut=/home/jsmith/my_job.out
Power=
```
### Cancelling Jobs
To cancel a job:
```bash
scancel job_id
```
To cancel all your jobs:
```bash
scancel -u your_username
```
To cancel all your pending jobs:
```bash
scancel -t PENDING -u your_username
```
## Modifying Jobs
If you initially submit a job and then remember some attribute needs to be changed, you don't need to cancel and resubmit the whole job. You can modify certain attributes of a job that's already in the queue using the `scontrol update` command.
For example, to change the time limit of a job:
```bash
scontrol update JobId=job_id TimeLimit=2:00:00
```
To change the number of CPUs:
```bash
scontrol update JobId=job_id NumCPUs=4
```
## Monitoring Resource Usage
### Using `sstat`
For running jobs, you can use `sstat` to get resource usage statistics:
```bash
sstat -j job_id --format=JobID,AveCPU,AveRSS,AveVMSize
```
Sample output:
```bash
JobID AveCPU AveRSS AveVMSize
-------- ------------ ------------ -----------
1234.0 00:05:23 1234K 4567K
```
### Using `sacct`
For completed jobs, use `sacct` to view accounting data:
```bash
sacct -j job_id --format=JobID,JobName,MaxRSS,Elapsed
```
Sample output:
```bash
JobID JobName MaxRSS Elapsed
------------ ---------- ---------- ----------
1234 my_job 4096K 00:15:23
```
## Monitoring Cluster Status
### Using `sinfo`
To see the overall status of the cluster:
```bash
sinfo
```
Sample output:
```bash
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
defq* up infinite 1 mix gpu1
defq* up infinite 2 idle cn01,gpu2
```
To see more detailed information for a specific node, for example `gpu1`:
```bash
sinfo -n gpu1 -o "%n %c %m %t %f %G %D %P %C %O"
```
Sample output:
```bash
HOSTNAMES CPUS MEMORY STATE AVAIL_FEATURES GRES NODES PARTITION CPUS(A/I/O/T) CPU_LOAD
gpu1 128 1 mix location=local (null) 1 defq* 5/123/0/128 1.67
```
## Job Arrays
For job arrays, you can use most of the above commands with some modifications.
To see the status of all tasks in a job array:
```bash
squeue -j array_job_id
```
Sample output:
```bash
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
1234_1 batch array_job jdoe R 5:23 1 cn01
1234_2 batch array_job jdoe R 5:23 1 cn02
1234_3 batch array_job jdoe PD 0:00 1 (Resources)
```
## Troubleshooting
If a job fails, try checking the following:
1. Look at the job's output and error files.
2. Check the job's resource usage with `sacct` (see the sketch after this list).
3. Verify that you requested sufficient resources, and your job did not get terminated due to needing more resources than requested.
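For steps 2 and 3, a single `sacct` query usually shows whether the job failed, timed out, or was killed for exceeding its requested resources. A minimal sketch (replace `job_id` with your job's ID):
```bash
sacct -j job_id --format=JobID,JobName,State,ExitCode,Elapsed,MaxRSS,ReqMem
```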
Remember, if you're having persistent issues, don't hesitate to reach out to the support team.
---
sort: 2
---
# Submitting Jobs
In `/jobs/creating-jobs.md`, we briefly touched on how to submit specific job types with the use of commands like `sbatch`, `srun`, etc. Here, we are going to focus on *how* to get the most out of your job submissions:
* What can help your jobs leave the queue faster.
* How Scheduler (Slurm) Policies affect your job.
This page is mainly dedicated to examples of different job types. For a more comprehensive explanation on different job types, please refer to [Jobs Overview]({{site.baseurl}}{% link jobs/Overview.md %})
## Batch jobs (Non-interactive)
Batch jobs allow users to execute tasks without direct interaction with the computing environment. Jobs are written as scripts that consist of the commands to be executed and a specification of the requested resources.
### BATCH directives
BATCH directives are essentially instructions embedded at the beginning of a batch job script and are interpreted by the scheduler (like Slurm in our case). These lines are prefixed with `#SBATCH` for Slurm and inform the scheduler about the resources needed for the job and any other execution preferences.
Here is a list of common directives: <br>
- `#SBATCH --nodes=<some-value>`: Requests a specific number of nodes to run your job on.
- `#SBATCH --mem=<some-value>`: Specifies the amount of RAM required.
- `#SBATCH --time=<some-value>`: Sets the maximum runtime.
- `#SBATCH --output=<some-value>`: Directs the job's output to a specific file.
**Note:** These bullets are just to give a basic understanding of the topic. Complete examples and line-by-line explanations are provided further down this page.
### Queues and partitions
Queues (or partitions in Slurm terminology) are categories within the cluster that organize jobs based on their resource requirements, priority, and other factors.
Our cluster is partitioned into the following categories: <br>
- **Standard Partition (future_partition_name)**: For general-purpose jobs with moderate resource requirements.
- **High-Memory Partition (future_partition_name)**: For jobs requiring significant amounts of memory.
- **GPU Partition (future_partition_name)**: For jobs that need GPU resources, such as gpu1 and gpu2 in our setup.
Choosing the right partition ensures your job is queued in an environment suited to its needs, and can potentially reduce wait times.
### Basic batch job example
In this example we are going to run a Python program with specified resource limits through our batch script. <br>
Create two files, one containing your `.sbatch` script and the other containing your `.py` program: <br>
I'm going to name them `my_script.sbatch` and `my_script.py`. <br>
Then add the following to `my_script.sbatch`:
```bash
#!/bin/bash
#SBATCH --job-name=test_job
#SBATCH --output=test_job.out
#SBATCH --error=test_job.err
#SBATCH --nodes=1
#SBATCH --time=10:00
#SBATCH --mem=1G
module load python3
python3 my_script.py
```
And add a simple `print` statement to `my_script.py`:
```python
print("Hello World!")
```
**NOTE:** In this example it's assumed that both your batch and Python script files are in the same directory. If that is not the case, please make sure you provide the full path (starting from the root directory `/`) to your Python file inside your script. For example:
```bash
python3 /path/to/python_script/my_script.py
```
Now let's walk through `my_script.sbatch` line by line to see what each directive does.
- `#!/bin/bash`: This line needs to be included at the start of **all** your batch scripts. It specifies that the script should be run with the `bash` shell.
Lines 2-7 are your `SBATCH` directives. These lines are where you specify different options for your job, including its name, the path/name of its output and error files, the list of nodes you want to use, resource limits, and more if required. Let's walk through them line by line:
- `#SBATCH --job-name=test_job`: This directive gives your job a name that you can later use to more easily track and manage your job when looking for it in the queue. In this example, we've called it `test_job`. You can read about job management at `/software/env-modules.html`.
- `#SBATCH --output=test_job.out`: Used to specify where your output file is generated, and what it's going to be named. In this example, we have not provided a path, but only provided a name. When you use the `--output` directive without specifying a full path, just providing a filename, Slurm will store the output file in the current working directory from which the `sbatch` command was executed.
- `#SBATCH --error=test_job.err`: Functions similarly to `--output`, except it contains any error messages generated during the execution of your job. **The `.err` file is always generated even if your job executes successfully; however, it will be empty if there are no errors.**
- `#SBATCH --nodes=1`: Requests one available node for your job. This directive basically tells the scheduler, "Run my job on any available node you find; I don't care which one." **It's also possible to specify the name of the node(s) you'd like to use, which we will cover in later examples.**
- `#SBATCH --time=10:00`: This line specifies how long you want your job to run after it leaves the queue and starts execution. In this case, the job will be **terminated** after 10 minutes. Acceptable time formats include `mm`, `mm:ss`, `hh:mm:ss`, `days-hh`, `days-hh:mm` and `days-hh:mm:ss`.
- `#SBATCH --mem=1G`: Specifies the maximum main memory required _per_ node. In this case we set the cap to 1 gigabyte. If you don't use a memory unit, Slurm assumes megabytes: `#SBATCH --mem=4096` requests 4096 MB of RAM. **If you want to request all the memory on a node, you can use** `--mem=0`.
After the last `#SBATCH` directive, commands are run just like in any other regular shell script.
- `module load python3`: Loads necessary files and modules in order for the command `python3` to be valid when used. Please refer to `/software/env-modules.html` for more detail on how the command `module` works.
- `python3 my_script.py`: Just like any other `python3` command, this line runs the `my_script.py` file using Python. **The output and/or errors of this operation are later written to the files we specified in our directives.**
### Batch Job Submission
This script, as discussed previously, is a non-interactive job. Non-interactive jobs are submitted to the queue with the use of the `sbatch` command. In this case, we submit our job using `sbatch my_script.sbatch`.
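`sbatch` prints the ID assigned to your job; you can then use that ID with the monitoring commands covered on the Monitoring Jobs page. A minimal sketch (the job ID 12345 is just an example):
```bash
sbatch my_script.sbatch    # prints: Submitted batch job 12345
squeue -j 12345            # follow this specific job in the queue
scontrol show job 12345    # detailed information while the job is queued or running
```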
### Jupyter Notebook batch job example
As you know, there is no Graphical User Interface (GUI) available when you connect to the cluster through your shell, so in order to access some application's GUI, port forwarding is necessary [(What is SSH port forwarding?)](https://www.youtube.com/watch?v=x1yQF1789cE&ab_channel=TonyTeachesTech). In this example, we will do port forwarding to access Jupyter Notebook's web portal. You will basically send and receive your data through a specified port on your local machine that is tunneled to the port on the cluster where the Jupyter Notebook server is running. This setup enables you to work with Jupyter Notebooks as if they were running locally on your machine, despite actually being executed on a remote cluster node. After a successful setup, you can access Jupyter's portal **on your local machine** through your browser of choice, via a link generated by Jupyter.
First, create your sbatch script file. I'm going to call mine `jupyterTest.sbatch`. Then add the following to it:
```bash
#!/bin/bash
#SBATCH --nodelist=cn01
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --time=00:30:00
#SBATCH --job-name=jupyterTest1
#SBATCH --output=/home/mani/outputs/jupyterTest1.out
#SBATCH --error=/home/mani/outputs/jupyterTest1.err
# get tunneling info
XDG_RUNTIME_DIR=""
node=$(hostname -s)
user=$(whoami)
port=9001
# print tunneling instructions to jupyterTest1
echo -e "
Use the following command to set up ssh tunneling:
ssh -p5010 -N -f -L ${port}:${node}:${port} ${user}@binary.star.hofstra.edu"
module load jupyter
jupyter notebook --no-browser --port=${port} --ip=${node}
```
Replace the output and error paths (`/home/mani/outputs/` in this script) with your actual directory path for storing output and error files.
First, let's take a look at what the new directives and commands in this script do.
Note that most of the directives at the start of this script have previously been discussed in the "Basic batch job example" section, so we are only going to discuss the new ones:
- `--nodelist=cn01`: Using `--nodelist` you can specify the exact name(s) of the node(s) you want your job to run on. In this case, we have specified it to be `cn01`.
- `--ntasks=1`: This directive tells SLURM to allocate resources for one task. A "task" in this context is essentially an instance of your application or script running on the cluster. For many applications, especially those that don't explicitly parallelize their workload across multiple CPUs or nodes, specifying a single task is sufficient. However, if you're running applications that can benefit from parallel execution, you might increase this number. This directive is crucial for optimizing resource usage based on the specific needs of your job. For instance, running multiple independent instances of a data analysis script on different subsets of your data could be a scenario where increasing the number of tasks is beneficial.
- `--cpus-per-task=1`: This sets the number of CPUs allocated to each task specified by `--ntasks`. By default, setting it to 1 assigns one CPU to your task, which is fine for tasks that are not CPU-intensive or designed to run on a single thread. However, for applications that are multi-threaded and can utilize more than one CPU core for processing, you would increase this value to match the application's capability to parallelize its workload.
- The variable initializations such as `node=...`, `user=...` are used to retrieve some information from the node you are running your job on to produce the right command for you to later run **locally**, and set up the SSH tunnel. You don't need to worry about these.
- The `echo` command is going to write the ssh tunneling command to your `.out` file with the help of the variables. We will explain how to use that generated command further below.
- `module load jupyter`: Loads the required modules to add support for the command `jupyter`.
- `jupyter notebook --no-browser --port=${port} --ip=${node}` runs a jupyter notebook and makes it listen on our specified port and address to later be accessible through your local machine's browser.
Then, submit your Batch job using `sbatch jupyterTest.sbatch`. Make sure to replace `jupyterTest.sbatch` with whatever file name and extension you choose.
At this stage, if you go and read the content of `jupyterTest.out`, there is a generated command that should look like the following:
```bash
ssh -p5010 -N -f -L 9001:cn01:9001 <your-username>@binary.star.hofstra.edu
```
Copy that line and run it in your local machine's command line. Then, enter your login credentials for `binary` and hit enter. You should not expect anything magical to happen. In fact, if everything is successful, your shell will simply move to a new line without generating any output.
You can now access Jupyter's GUI through a browser of your choice on your local machine, at the address that Jupyter Notebook has generated for you. For some reason, Jupyter writes the address to `stderr`, so you need to look for it inside your `jupyterTest.err` file. Inside that file, there should be a line containing a link similar to the following:
```bash
http://127.0.0.1:9001/?token=...(your token is here)...
```
Copy that address and paste it into your browser, and you should successfully reach Jupyter's GUI.
### Apptainer TensorFlow batch job example
This example shows how to execute a TensorFlow script, `tfTest.py`, that trains a simple neural network on the MNIST dataset using GPUs.
First, create a Python script called `tfTest.py` with the provided content:
```python
import tensorflow as tf
physical_devices = tf.config.list_physical_devices(device_type=None)
print("Num of Devices:", len(physical_devices))
print("Devices:\n", physical_devices)
print("Tensorflow version information:\n",tf.__version__)
print("begin test...")
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
model = tf.keras.models.Sequential([
tf.keras.layers.Flatten(input_shape=(28, 28)),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Dense(10)
])
predictions = model(x_train[:1]).numpy()
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
loss_fn(y_train[:1], predictions).numpy()
model.compile(optimizer='adam',
loss=loss_fn,
metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10)
model.evaluate(x_test, y_test, verbose=2)
```
Next, create a SLURM batch job script named `job-test-nv-tf.sbatch`. This script requests GPU resources, loads necessary modules, and runs your TensorFlow script inside an Apptainer container:
```bash
#!/bin/bash
#SBATCH --job-name=tensorflow_test_job
#SBATCH --output=result.txt
#SBATCH --nodelist=gpu1
#SBATCH --gres=gpu:A100:2
#SBATCH --ntasks=1
#SBATCH --time=10:00
#SBATCH --mem-per-cpu=1000
module load python3
module load apptainer
echo "run Apptainer TensorFlow GPU"
apptainer run --nv tensorflowGPU.sif python3 tfTest.py
```
This script runs the `tfTest.py` script inside the TensorFlow GPU container (`tensorflowGPU.sif`).
You can now submit your job to Slurm using `sbatch job-test-nv-tf.sbatch`.
After the job completes, you can check the output in `result.txt`. The output should include information about the available physical devices (GPUs), the TensorFlow version, and the output from training the model on the MNIST dataset.
The beginning and end of the file might look something like this:
```text
run Apptainer TensorFlow GPU
Num of Devices: X
Devices:
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'), PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU'), ...]
Tensorflow version information:
X.XX.X
begin test...
...
313/313 - 0s - loss: X.XXXX - accuracy: 0.XXXX
```
## Interactive jobs
### Starting an Interactive job
To start an interactive job, you use the `srun` command with specific parameters that define your job's resource requirements. Here's an example:
```bash
srun --pty --nodelist=cn01 --ntasks=1 --cpus-per-task=1 --time=01:00:00 --mem=4G /bin/bash
```
Here's a breakdown of what each part of this command does:
**Note:** All of these options except the command `srun` and the option `--pty` are discussed in great detail under the "Batch jobs" section of this page. Refer back to them if you need a more comprehensive explanation.
- `srun`: The command used to start an interactive session on the cluster.
- `--pty`: Requests a pseudo-terminal, which is necessary for interactive sessions.
- `--nodelist=cn01`: Specifies the node on which to run the interactive session. Here, cn01 is used as an example.
- `--ntasks=1`: Allocates resources for one task.
- `--cpus-per-task=1`: Assigns one CPU to the task. This is enough for tasks that don't require parallel processing across multiple CPUs.
- `--time=01:00:00`: Sets the maximum duration of the interactive session to 1 hour.
- `--mem=4G`: Specifies that the job requires 4 gigabytes of memory.
- `/bin/bash`: After allocating the requested resources, srun starts a Bash shell in the interactive session.
When your job leaves the queue, you will see your shell prompt change, as you receive a shell on the allocated compute node. This shift means that you're now directly interacting with the HPC environment through an interactive session and can execute commands and run applications using the cluster's computational resources in real time.
### Additional `srun` Options for Interactive jobs
**Specifying a Partition**: If your job has specific resource requirements, you may want to specify a partition that matches those needs.
```bash
--partition=your_partition_name
```
**Allocating GPUs**: For tasks requiring GPU acceleration, you can request specific GPUs.
```bash
--gres=gpu:2
```
This option requests 2 GPUs for your job. Adjust the number to your task's requirements, within the supported limits.
**Exclusive Node Access**: To ensure that no other jobs share your allocated node, you can request exclusive access.
```bash
--exclusive
```
**Memory per CPU**: If your job requires a specific amount of memory per CPU, this can be specified.
```bash
--mem-per-cpu=4G
```
This option requests 4 GB of memory per allocated CPU.
**Quality of Service (QoS)**: For prioritizing jobs, you can specify the Quality of Service.
```bash
--qos=your_qos
```
**Email Notifications**: Stay informed about your job's status with email notifications.
```bash
--mail-type=ALL
--mail-user=your_email@example.com
```
This configuration sends an email to the specified address at the start, completion, and failure of the job.
### Interactive Job Submission
To submit an interactive job, simply run the `srun` command with your desired options directly in the terminal. For example:
```bash
srun --pty --nodelist=cn01 --ntasks=1 --cpus-per-task=1 --time=01:00:00 --mem=4G /bin/bash
```
Once submitted, you'll be placed in an interactive shell on the allocated node when resources become available. Your prompt will change to indicate you're on the compute node.
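Once the prompt changes, you can confirm where you are and what was allocated. A quick sketch using standard Slurm environment variables:
```bash
hostname                    # should print the allocated node, e.g. cn01
echo $SLURM_JOB_ID          # ID of the interactive job
echo $SLURM_CPUS_ON_NODE    # number of CPUs allocated on this node
exit                        # leave the session and release the resources
```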
## Array jobs
To submit an array job, you use the `--array` option as part of your `sbatch` script. This option specifies a range of indices that SLURM uses to create multiple tasks from a single job submission. Each task in the array is assigned a unique `SLURM_ARRAY_TASK_ID` that can be used within your scripts to differentiate between them.
### Array job example
Suppose you have a dataset split into multiple files and you want to process each file independently. Instead of submitting a separate job for each file, you can submit a single array job where each task processes a different file.
Now, for a simple, low-resource task that you can run as part of an array job, let's create a Python script that generates a basic report based on the `SLURM_ARRAY_TASK_ID`. This script will read from a specific input file based on the task ID, perform a simple operation (like counting the number of lines or words), and output the results to a file.
First, let's create your input files. You can make these as simple text files with a few lines of content. For example:
`input1.txt`, `input2.txt`, and `input3.txt`.
Each file could contain a few lines of arbitrary text. You can create these files manually or use the command line:
```bash
echo "This is a simple file." > input1.txt
echo "This file contains\nseveral lines of text." > input2.txt
echo "Each file will have\na different\nnumber of lines." > input3.txt
```
Now here's the content of the Python script, `process_data.py`, which reads from one of these input files based on the `SLURM_ARRAY_TASK_ID` and counts the number of lines:
```python
import sys
# Get the task ID from the command line arguments
task_id = sys.argv[1]
# Construct the filename based on the task ID
filename = f"input{task_id}.txt"
# Try to open and read the file
try:
with open(filename, 'r') as file:
lines = file.readlines()
num_lines = len(lines)
# Output the results
output_filename = f"output{task_id}.txt"
with open(output_filename, 'w') as outfile:
outfile.write(f"File: {filename}\nNumber of lines: {num_lines}\n")
print(f"Processed {filename} successfully.")
except FileNotFoundError:
print(f"File {filename} not found.")
```
This script takes an argument from the command line (expected to be the `SLURM_ARRAY_TASK_ID`), constructs a filename from this ID, reads the corresponding input file, counts its lines, and writes the count to an output file. If the input file is missing, it prints a message instead of crashing.
Finally, to run this script as part of an array job on 3 files, adjust the `--array` option in your SLURM script (`process_array.sbatch`) to `1-3`.
```bash
#!/bin/bash
#SBATCH --job-name=array_job
#SBATCH --output=array_job_%A_%a.out
#SBATCH --error=array_job_%A_%a.err
#SBATCH --array=1-3
#SBATCH --mem=1G
#SBATCH --time=00:10:00
module load python3
python3 process_data.py $SLURM_ARRAY_TASK_ID
```
In the context of SLURM job submission scripts, `%A` and `%a` are special placeholders used within directives like `--output` and `--error` to dynamically generate filenames based on the job array's ID and the individual task ID within the array. Here's what each placeholder represents:
- `%A`: This placeholder is replaced by the SLURM job array's ID. The job array ID is a unique identifier assigned by SLURM to the entire array job at the time of submission. It helps you group and identify all tasks belonging to the same array job.
- `%a`: This placeholder is substituted with the specific task ID within the job array. Since an array job consists of multiple tasks, each with a unique task ID (determined by the `--array` option when the job is submitted), `%a` allows you to create distinct output or error files for each task, making it easier to troubleshoot and analyze the results of individual tasks.
For example, if you submit an array job with the `--array=1-10` option and use the following in your script:
```bash
#SBATCH --output=job_output_%A_%a.out
#SBATCH --error=job_error_%A_%a.err
```
SLURM will create separate output and error files for each of the ten tasks in the array. If the array job's ID is 12345, the files for the first task will be named `job_output_12345_1.out` and `job_error_12345_1.err`, the files for the second task will be `job_output_12345_2.out` and `job_error_12345_2.err`, and so on.
Now submit this job using `sbatch process_array.sbatch`. When it finishes, you should see six SLURM log files (three ending in `.out` and three in `.err`) plus the three `output<N>.txt` report files written by the script. Each `.out` file contains the status message printed by its task, and the `.err` files should be empty if everything ran smoothly.
### Array Job Submission
To submit an array job, use the `sbatch` command with your SLURM script that includes the `--array` option. For example:
```bash
sbatch process_array.sbatch
```
This will submit the entire array job. SLURM will then manage the execution of individual tasks within the array based on available resources.
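The tasks of an array job can then be monitored and cancelled individually. As a quick sketch (with `12345` standing in for the job ID reported by `sbatch`):
```bash
# Show the state of all tasks belonging to array job 12345
squeue -j 12345

# Cancel only task 2 of the array
scancel 12345_2

# Cancel the entire array job
scancel 12345
```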
## Parallel jobs
### Open MPI job example
First, you need to create your MPI program. In this example, we are calling it `mpi_hello_world.c`. This program initializes the MPI environment, gets the rank of each process, the total number of processes, and the name of the processor, then prints a greeting from each process.
```c
#include <mpi.h>
#include <stdio.h>
int main(int argc, char** argv) {
MPI_Init(NULL, NULL);
int PID;
MPI_Comm_rank(MPI_COMM_WORLD, &PID);
int number_of_processes;
MPI_Comm_size(MPI_COMM_WORLD, &number_of_processes);
char processor_name[MPI_MAX_PROCESSOR_NAME];
int name_length;
MPI_Get_processor_name(processor_name, &name_length);
printf("Hello MPI user: from process PID %d out of %d processes on machine %s\n", PID, number_of_processes, processor_name);
MPI_Finalize();
return 0;
}
```
Next, create the compilation and execution script. <br>
Create a Bash script named `mpi_hello_world.sh` to compile and run the MPI program. This script takes a parameter for the number of processes to spawn.
```bash
#!/bin/bash
SRC=mpi_hello_world.c
OBJ=mpi_hello_world
NUM=$1
mpicc -o $OBJ $SRC
mpirun -n $NUM ./$OBJ
```
This script compiles the MPI program with `mpicc` and runs it with `mpirun`, using the `-n` flag to set the number of processes.
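If you want to test the helper script on its own (for example on a development node, outside of SLURM), it can be invoked directly with the desired process count:
```bash
sh mpi_hello_world.sh 4
```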
Next, prepare a SLURM batch job script named `job-test-mpi.sbatch` to submit your MPI job. This script requests cluster resources and runs your MPI program through `mpi_hello_world.sh`:
```bash
#!/bin/bash
#SBATCH --job-name=mpi_job_test
#SBATCH --output=result.txt
#SBATCH --error=error.txt
#SBATCH --nodelist=gpu1,gpu2,cn01
#SBATCH --time=10:00
#SBATCH --mem-per-cpu=1000
module load openmpi4
echo "run mpi program using parallel processes"
sh mpi_hello_world.sh $1
```
This script sets up a job named `mpi_job_test`, specifies output and error files, requests resources (all three nodes of the cluster), and loads the OpenMPI module. It then runs the `mpi_hello_world.sh` script and passes the number of processes as an argument.
### Parallel Job Submission
Submit your parallel MPI job to SLURM with the `sbatch` command, specifying the desired number of parallel processes with `-n` and passing the same number to the batch script as its first argument (it becomes `$1` inside the script). For example, to run with 8 parallel processes:
```bash
sbatch -n 8 job-test-mpi.sbatch 8
```
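While the job is queued or running, you can keep an eye on it with the usual SLURM tools; once it finishes, the accounting record shows how it went. A rough sketch (replace `<jobid>` with the ID reported by `sbatch`; `sacct` only works if job accounting is enabled on the cluster):
```bash
# Check whether the job is still pending or running
squeue -u $USER

# Review the job after completion
sacct -j <jobid> --format=JobID,JobName,Elapsed,State,ExitCode
```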
After the job completes, check the output in `result.txt`. You should see greetings from each MPI process. The content might look something like this, shortened with `...` for brevity:
```bash
Hello MPI user: from process PID 0 out of 8 processes on machine gpu1
...
Hello MPI user: from process PID 7 out of 8 processes on machine cn01
```
File mode changed from 100755 to 100644
......@@ -6,114 +6,91 @@ sort: 1
## Overview
The Star HPC Cluster is a computing facility designed for a variety of research and computational tasks. It combines advanced computing **nodes** and a high-speed **storage system** with a suite of **software applications**.
The Star cluster is a High-Performance Computing (HPC) system at the Science and Innovation Center (SIC) that is designed for a variety of advanced research and computational tasks. It combines NVIDIA HGX-based **compute nodes**, a high-speed all-flash parallel file system-based **storage system**, an ultra-high-throughput/low-latency HDR200Gb/s Infiniband **network fabric**, and a suite of **software applications**. The compute nodes feature high-end H100 and A100 GPUs, AMD EPYC and Intel Xeon processors, and over 7 Terabytes of combined RAM.
SLURM (Simple Linux Utility for Resource Management) is our chosen job scheduler and queueing system that efficiently manages resource allocation, ensuring everyone gets the right amount of resources at the right time.
The cluster runs SLURM (Simple Linux Utility for Resource Management), a job scheduler and queueing system that efficiently allocates the cluster's resources to manage competing resource demands.
Apptainer (formerly Singularity) is also a major application on the cluster. Apptainer is a containerization platform similar to Docker with the major difference that it runs under user privileges instead of `root`. This platform is enhanced by NGC (NVIDIA GPU Cloud) which provides access to a wide array of pre-built, GPU-optimized software containers for diverse applications. This integration saves all users a lot of time as they don’t need to set up the software applications from scratch and can just pull and use the NGC images with Apptainer.
Users run many different applications on the cluster based on their needs, such as Python projects via Jupyter Notebooks, OpenMPI-based parallel jobs, and NetCDF (often used to manage large datasets in climatology, meteorology, oceanography, and GIS applications). Programs are run directly on the hardware (bare-metal) to maximize performance and minimize overhead.
Containerization is also increasingly popular in HPC, as it provides isolated environments that allow the reuse of images for better reproducibility and software portability, without the performance impact of other methods or the hassle of manually installing dependencies. Containers are run using Apptainer (formerly Singularity), a containerization platform similar to Docker, with the major difference that it runs under user privileges instead of `root`. Users can deploy images from NGC (NVIDIA GPU Cloud), which provides access to a wide array of pre-built images with GPU-optimized software for diverse applications. Leveraging container images can save a lot of time, as users don't need to set up the software applications from scratch and can just pull and use the NGC images with Apptainer.
The cluster also supports various software applications tailored to different needs: Python and R for data analysis, MATLAB for technical computing, Jupyter for interactive projects, and OpenMPI for parallel computing. Anaconda broadens these capabilities with packages for scientific computing, while NetCDF manages large datasets. For big data tasks, Hadoop/Spark offers powerful processing tools.
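As a rough sketch of the Apptainer/NGC workflow described above (the container image name and tag are purely illustrative, not a recommendation):
```bash
# Pull a GPU-optimized image from NGC into a local SIF file
apptainer pull pytorch.sif docker://nvcr.io/nvidia/pytorch:24.01-py3

# Run a command inside the container with GPU support enabled
apptainer exec --nv pytorch.sif python3 -c "import torch; print(torch.cuda.is_available())"
```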
## Hardware
### Login Node
- IBM System x3550 with 128GB RAM
- DL325 Gen10+ with an 8-core EPYC processor, 128GB RAM
- DL385 Gen10+ v2 with 2x AMD 32-core EPYC processors, 256GB RAM
### Compute Nodes
- Two Apollo 6500 Gen10+ HPE nodes, _each_ containing 8 NVIDIA A100 SXM GPUs.
- One HPE ProLiant DL385 Gen10+ v2, containing 2 A30 SXM NVIDIA GPUs.
- Two XL675d Gen10+ servers (Apollo 6500 Gen10+ chassis), _each_ containing 8 NVIDIA A100 SXM4 GPUs.
- One HPE DL385 Gen10+ v2 with 2 A30 PCIe GPUs.
- Two HPE DL380a Gen11 servers, _each_ containing 2 NVIDIA H100 80GB GPUs.
- Two Cray XD665 nodes, _each_ containing 4 NVIDIA HGX H100 80GB GPUs.
- One Cray XD670 node, containing 8 NVIDIA HGX H100 80GB GPUs.
{%comment%}
- 1x HPE DL385 Gen10+ v2 node with 2x NVIDIA A30 24GB PCIe GPUs, 2x AMD EPYC 32-core processors, 256GiB DDR4 RAM
- 2x HPE XL675d Gen10+ nodes (Apollo 6500 Gen10+ chassis), each with 8x NVIDIA A100 80GB SXM GPUs, 1TiB 3200 DDR4 RAM, 2x 480GB SSD
- 2x HPE DL380a Gen11, each with 2x NVIDIA H100 80GB PCIe GPUs, 2x Intel Xeon 32-core processors, 512GiB DDR5 RAM
- 2x Cray XD665 nodes, each with 4x NVIDIA H100 80GB SXM GPUs, 2x AMD EPYC 32-core processors, 768GiB DDR5 RAM
- 1x Cray XD670 node with 8x NVIDIA H100 80GB SXM GPUs, 2x Intel Xeon 32-core processors, 2TiB DDR5 RAM
{%endcomment%}
#### HPE Apollo 6500 Gen10
| Attribute\Node Name | gpu1 | gpu2 |
| ----------------------------- | -------------------------------------------------------------- | -------------------------------------------------------------- |
| Model Name | HPE ProLiant XL675d Gen10 Plus; Apollo 6500 Gen10 Plus Chassis | HPE ProLiant XL675d Gen10 Plus; Apollo 6500 Gen10 Plus Chassis |
| Sockets | 2 | 2 |
| Cores per Socket | 32 | 32 |
| Threads per Core | 2 | 2 |
| Memory | 1024 GiB Total Memory (16 x 64GiB DIMM DDR4) | 1024 GiB Total Memory (16 x 64GiB DIMM DDR4) |
| GPU | 8 SXM NVIDIA A100s | 8 SXM NVIDIA A100s |
| Local Storage (Scratch space) | 407GB | 407GB |
| Attribute\Node Name | gpu1 and gpu2 |
| ----------------------------- | -------------------------------------------------------------- |
| Model Name | HPE ProLiant XL675d Gen10 Plus; Apollo 6500 Gen10 Plus Chassis |
| Processors | AMD EPYC 7513 |
| Sockets | 2 |
| Cores per Socket | 32 |
| Threads per Core | 2 |
| Memory | 1024 GiB Total Memory (16 x 64GiB DIMM DDR4) |
| GPU | 8 SXM NVIDIA A100s |
| Local Storage (Scratch space) | 6.4TB (5.8TiB) SSD |
#### HPE DL385 Gen10
| Attribute\Node Name | cn01 |
| ----------------------------- | ------------------------------------------ |
| Model Name | HPE ProLiant DL385 Gen10 Plus v2 |
| Processors | AMD EPYC 7513 32-Core Processor |
| Sockets | 2 |
| Cores per Socket | 32 |
| Threads per Core | 2 |
| Memory | 256GiB Total Memory (16 x 16GiB DIMM DDR4) |
| GPU | 2 SXM NVIDIA A30s |
| Local Storage (Scratch Space) | 854G |
#### XL675d Gen10+ (Apollo 6500 Chassis)
| Attribute\Node Name | gpu4 | gpu5 |
| ----------------------------- | -------------------------------------- | -------------------------------------- |
| Model Name | HPE ProLiant XL675d Gen10 Plus Chassis | HPE ProLiant XL675d Gen10 Plus Chassis |
| Sockets | 2 (AMD EPYC 7513 @ 2.60 GHz) | 2 (AMD EPYC 7513 @ 2.60 GHz) |
| Cores per Socket | 64 Physical Cores | 64 Physical Cores |
| Threads per Core | 2 (128 Logical Cores) | 2 (128 Logical Cores) |
| Memory | 1024 GiB DDR4 3200 RAM | 1024 GiB DDR4 3200 RAM |
| GPU | 8 NVIDIA A100 80GB SXM4 GPUs | 8 NVIDIA A100 80GB SXM4 GPUs |
| Local Storage (Scratch Space) | 2x 480GB SSD | 2x 480GB SSD |
#### HPE DL385 Gen10+ v2
| Attribute\Node Name | cn02 |
| ----------------------------- | -------------------------------- |
| Model Name | HPE ProLiant DL385 Gen10 Plus v2 |
| Sockets | 2 (AMD EPYC 7513 @ 2.60 GHz) |
| Cores per Socket | 64 Physical Cores |
| Threads per Core | 2 (128 Logical Cores) |
| Memory | 256GiB DDR4 RAM |
| GPU | 2 NVIDIA A30 24GB HBM2 PCIe GPUs |
| Local Storage (Scratch Space) | 854G |
| Local Scratch | None |
#### HPE DL380a Gen11
| Attribute\Node Name | gpu6 | gpu7 |
| ----------------------------- | -------------------------------------------- | -------------------------------------------- |
| Model Name | HPE DL380a Gen11 | HPE DL380a Gen11 |
| Sockets | 2 (Intel Xeon-P 8462Y+ @ 2.8GHz) | 2 (Intel Xeon-P 8462Y+ @ 2.8GHz) |
| Cores per Socket | 64 | 64 |
| Threads per Core | 2 (128 Logical Cores) | 2 (128 Logical Cores) |
| Memory | 512 GiB DDR5 RAM | 512 GiB DDR5 RAM |
| GPU | 2 NVIDIA H100 80GB GPUs (NVAIE subscription) | 2 NVIDIA H100 80GB GPUs (NVAIE subscription) |
| Network | 4-port GbE, 1-port HDR200 InfiniBand | 4-port GbE, 1-port HDR200 InfiniBand |
| Local Storage (Scratch Space) | 1TB SSD | 1TB SSD |
| Attribute\Node Name | gpu3 and gpu4 |
| ----------------------------- | -------------------------------------------- |
| Model Name | HPE DL380a Gen11 |
| Processors | 64 Physical cores / 128 Logical Cores (2 x Intel Xeon-P 8462Y+ @ 2.8GHz) |
| Memory | 512 GiB DDR5 RAM |
| GPU | 2 NVIDIA H100 80GB GPUs (NVAIE subscription) |
| Network | 4-port GbE, 1-port HDR200 InfiniBand |
| Local Storage (Scratch Space) | None |
#### Cray XD665 Nodes
| Attribute\Node Name | cray01 | cray02 |
| ----------------------------- | -------------------------------------- | -------------------------------------- |
| Model Name | Cray XD665 | Cray XD665 |
| Sockets | 2 (AMD EPYC Genoa 9334 @ 2.7GHz) | 2 (AMD EPYC Genoa 9334 @ 2.7GHz) |
| Cores per Socket | 64 | 64 |
| Threads per Core | 2 (128 Logical Cores) | 2 (128 Logical Cores) |
| Memory | 768 GiB DDR5 RAM | 768 GiB DDR5 RAM |
| GPU | 4 NVIDIA HGX H100 80GB SXM GPUs | 4 NVIDIA HGX H100 80GB SXM GPUs |
| Network | 2-port 10GbE, 1-port HDR200 InfiniBand | 2-port 10GbE, 1-port HDR200 InfiniBand |
| Local Storage (Scratch Space) | 1TB SSD | 1TB SSD |
| Attribute\Node Name | gpu5 and gpu6 |
| ----------------------------- | -------------------------------------- |
| Model Name | Cray XD665 |
| Processors | 64 Physical cores / 128 Logical Cores (2 x AMD EPYC Genoa 9334 @ 2.7GHz) |
| Memory | 768 GiB DDR5 RAM |
| GPU | 4 NVIDIA HGX H100 80GB SXM GPUs |
| Network | 2-port 10GbE, 1-port HDR200 InfiniBand |
| Local Storage (Scratch Space) | None |
#### Cray XD670 Node
| Attribute\Node Name | cray03 |
| Attribute\Node Name | gpu7 |
| ----------------------------- | -------------------------------------- |
| Model Name | Cray XD670 |
| Sockets | 2 (Intel Xeon-P 8462Y+ @ 2.8GHz) |
| Cores per Socket | 64 Physical Cores |
| Threads per Core | 2 (128 Logical Cores) |
| Processors | 64 Physical cores / 128 Logical Cores (2 x Intel Xeon-P 8462Y+ @ 2.8GHz) |
| Memory | 2048 GiB DDR5 RAM |
| GPU | 8 NVIDIA HGX H100 80GB SXM GPUs |
| Network | 2-port 10GbE, 1-port HDR200 InfiniBand |
| Local Storage (Scratch Space) | 2TB SSD |
| Local Storage (Scratch Space) | None |
### Storage System
......
......@@ -65,7 +65,7 @@ Submit the above information through the online registration form.
## Login node
Access to the cluster is provided through SSH [(What is SSH?)](https://www.youtube.com/watch?v=qWKK_PNHnnA&ab_channel=Tinkernut) to the login node. The login node serves as the gateway or entry point to the cluster. It is important to understand that the login node is not for running computationally intensive tasks itself. Instead, it is for tasks such as file management, editing, and job submission. The actual computational work is done on the compute nodes, which you access indirectly by submitting jobs through Slurm, the job scheduling system.
Access to the cluster is provided through SSH access to the login node. The login node serves as the gateway or entry point to the cluster. Note that most software tools are not available on the login node and it is not for prototyping, building software, or running computationally intensive tasks itself. Instead, the login node is specifically for accessing the cluster and performing only very basic tasks, such as copying and moving files, submitting jobs, and checking the status of existing jobs. For development tasks, you would use one of the development nodes, which are accessed the same way as the large compute nodes. The compute nodes are where all the actual computational work is performed. They are accessed by launching jobs through Slurm with `sbatch` or `srun`.
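In practice, a typical session looks something like the sketch below (the login node address is a placeholder; use the one provided to you):
```bash
# Connect to the login node
ssh your_username@login.starhpc.example

# From the login node, send work to the compute nodes instead of running it locally
sbatch my_job.sbatch
squeue -u $USER
```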
## Scheduler policies
......
File mode changed from 100755 to 100644
......@@ -820,7 +820,7 @@ In C++, there's no virtual environment to delete. However, you can remove your l
Remember to also remove or comment out the environment variable settings in your `~/.bashrc` or `~/.bash_profile` if you no longer need them.
# Rust
## Rust
### How to simulate a virtual environment with rust
......
File mode changed from 100755 to 100644
......@@ -48,6 +48,20 @@ Rsync is a particularly useful tool and is recommended for transferring files to
When transferring very large files or datasets, it is advised to use rsync and to calculate and confirm checksums to ensure data integrity.
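A typical rsync transfer with a follow-up checksum verification might look like the sketch below (the hostname and paths are placeholders):
```bash
# Copy a local directory to the cluster, preserving attributes and showing progress
rsync -avh --progress ./dataset/ your_username@login.starhpc.example:~/project/dataset/

# Compute checksums locally and copy the list over as well
sha256sum dataset/* > dataset.sha256
scp dataset.sha256 your_username@login.starhpc.example:~/project/

# On the cluster, verify the transferred files against the list (run from ~/project)
sha256sum -c dataset.sha256
```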
## Cyberduck
Cyberduck is a file transfer application with an intuitive graphical interface for transferring files to or from a remote machine. Cyberduck is available for both Windows and Mac. Download it from [cyberduck.io](https://cyberduck.io/).
Click "Open Connection" and a new window will be displayed like below. Select "SFTP (SSH File Transfer Protocol)" from the top dropdown menu. Enter the server, port number, your username, and Linux Lab password. Then click "Connect".
![3-connection.png]({{ site.baseurl }}/images/cyberduck_setup_images/3-connection.png "3-connection.png")
If you see a window asking about an "Unknown fingerprint", click "Always" and then "Allow".
![4-fingerprint.png]({{ site.baseurl }}/images/cyberduck_setup_images/4-fingerprint.png "4-fingerprint.png")
You should now be able to see your user's home directory on the cluster. You can transfer files to and from it by dragging and dropping files between this window and your "Finder" windows.
## Network Interfaces and Bandwidth
All file transfer access to the Star HPC Cluster is currently through the login node's 1GbE interface. Users should be aware of potential bandwidth limitations, especially when transferring large amounts of data.
......@@ -55,3 +69,4 @@ All file transfer access to the Star HPC Cluster is currently through the login
## User Authentication and Permissions
File transfers are authenticated in the same way as SSH access. SSH keys are the preferred method for secure authentication, although password authentication is currently allowed. Plans for implementing Multi-Factor Authentication (MFA) are being considered for future security enhancements.
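To set up key-based authentication, you would generate a key pair locally and copy the public key to the cluster, roughly as follows (the login node address is a placeholder):
```bash
# Generate a key pair on your local machine (choose a passphrase when prompted)
ssh-keygen -t ed25519

# Install the public key on the cluster so future logins and transfers use it
ssh-copy-id your_username@login.starhpc.example
```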
File mode changed from 100755 to 100644
File mode changed from 100755 to 100644
File mode changed from 100755 to 100644
File mode changed from 100755 to 100644
File mode changed from 100755 to 100644