Pawsey Setonix for deep learning

Published: Tue 13 February 2024

Did you know that 'Setonix' is actually the scientific (genus) name of the quokka, an Australian native animal? I didn't until I started using Pawsey's Setonix for deep learning. This is a personal note on using the Setonix supercomputer for deep learning workflows. Please be aware that things might have changed since I last accessed the system, and/or my notes might contain mistakes.

Access

I access Setonix through the VSCode Remote SSH extension. This has the side effect of hogging the small HOME quota, as mentioned here. To solve it, I can configure the Remote SSH extension to use a different directory for the .vscode-server directory. Open the VSCode settings (Ctrl + ,), search for "Server install path", and add an item like this:

Item                     Value
setonix.pawsey.org.au    /software/projects/<project_id>/<user_id>/

Next, just open the Remote Explorer and add a new SSH host. Then select the host and connect; it will ask for the password and then connect. To avoid typing the password every time, we can use public key authentication (detailed here).
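
For reference, a minimal sketch of setting up key-based access from the local machine; <user_id> and the key path are placeholders:

ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519 # generate a key pair if you don't have one
ssh-copy-id -i ~/.ssh/id_ed25519.pub <user_id>@setonix.pawsey.org.au # copy the public key to Setonix
cat >> ~/.ssh/config <<'EOF'
Host setonix
    HostName setonix.pawsey.org.au
    User <user_id>
    IdentityFile ~/.ssh/id_ed25519
EOF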

GPU Computing

  • Based on this, the SLURM command to access a GPU node interactively (a filled-in example follows this list):
salloc -N <nodes> --gres=gpu:<gpus_per_node> -A <project_id>-gpu --partition=<gpu or gpu-dev or gpu-highmem> --time=<hh:mm:ss>
ssh <node_name> # node_name is the name of the node you get from the previous command
  • Important notes
    • "Project name to access the GPU nodes is different." It is <project_id>-gpu instead of just <project_id>.
    • "The request of resources only needs the number of nodes (–-nodes, -N) and the number of allocation-packs per node (--gres=gpu:number)." "Users should not indicate any other Slurm allocation option related to memory or CPU cores. Therefore, users should not use --ntasks, --cpus-per-task, --mem, etc."

PyTorch and Python

  • Guide: here
  • The idea is that we need to build PyTorch (and, I think, TensorFlow as well) from source to work with the AMD GPUs on Setonix
  • To make this simpler, prebuilt containers are available; we can fetch them through docker pull or load them with module load

Pyenv

  • I really liked pyenv, because the official PyTorch container has several complexities and issues (details in the next section); pyenv seemed simpler to me.
  • To install ROCm-compatible PyTorch, I can follow the official PyTorch guideline at https://pytorch.org/get-started/locally/, for example:
pip3 install torch --index-url https://download.pytorch.org/whl/rocm6.0
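
To sanity-check the install (on ROCm builds the torch.cuda API is backed by HIP, so this should print True on a GPU node):

python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"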

Pawsey-provided PyTorch

module load <preferred_pytorch_version> # e.g., pytorch/2.2.0-rocm5.7.3
python3 -m venv <path/to/venv> # create a new virtual environment

There is a problem here: the Python that the virtual environment symlinks to is not the one from the loaded PyTorch module. To verify, go to the bin directory of the virtual environment and run ls -l to show the symbolic links. An entry like python3 -> /usr/bin/python3 means the virtual environment is linked to the system Python, which we don't want. We can find the correct Python path by running which python3 after loading the PyTorch module (for example: /software/setonix/2023.08/containers/modules-long/quay.io/pawsey/pytorch/2.2.0-rocm5.7.3/bin/python3). Then we can update the symlink with the following command:

ln -sf <correct/path/to/python3> <path/to/venv>/bin/python3 # or, just python3 if you are in the bin directory
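
Or, assuming the PyTorch module is already loaded, the two steps can be combined:

ln -sf $(which python3) <path/to/venv>/bin/python3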

From here, it should work like a normal virtual environment. The remaining steps are:

source <path/to/venv>/bin/activate # activate the virtual environment
# edit <path/to/venv>/pyvenv.cfg and set "include-system-site-packages = true" to use the system packages, e.g., the loaded PyTorch; or in one line:
sed -i 's/include-system-site-packages = false/include-system-site-packages = true/' <path/to/venv>/pyvenv.cfg
# Install new packages as usual; pip will skip packages that already exist in the loaded PyTorch container.

Even with this approach, I could not get Jupyter notebooks to work with the virtual environment.

Mamba/Conda

  • I have tried Miniforge3. It worked well initially, but there's no ROCm-compatible PyTorch available through conda/mamba. Installing through pip within a conda environment could be a solution (sketched below). Another problem was that the installation consumed the limited inode quota of the /software/ directory, and I frequently hit "disk quota exceeded" errors.
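
A rough sketch of that pip-within-conda workaround; the environment path is a placeholder, and the wheel index is the same one used in the pyenv section:

mamba create --prefix $MYSOFTWARE/envs/torch-rocm python=3.11 -y # keep the env outside $HOME
mamba activate $MYSOFTWARE/envs/torch-rocm
pip3 install torch --index-url https://download.pytorch.org/whl/rocm6.0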

File system

  • As mentioned in this Pawsey documentation:
    • /software/projects/<project_id>/<user_id>/ is for installing software packages.
    • /scratch/<project_id>/<user_id>/ is for temporary storage.

A horrible reality: files in /scratch get deleted after 30 days of no access

  • As mentioned here, files in /scratch are deleted after 30 days of no access. The policy checks the last access time of the files, so even recently copied files will be deleted if their access timestamps are older than 30 days. ls -ltu lists files sorted by access time, and the touch command can update the access time (see the sketch below). Or, better, use Acacia for long-term storage.
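
For example (touch -a updates only the access time):

ls -ltu /scratch/<project_id>/<user_id> # list files, most recently accessed first
find /scratch/<project_id>/<user_id> -type f -exec touch -a {} + # refresh access times recursively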

Acacia

  • Quick start. It's important to save the access keys in the $HOME/.config/rclone/rclone.conf file. To do that, the corresponding client configure command is shown in the window that appears after clicking the "Create New Key" button. Feel free to customise the profile name. A few usage examples follow this list.
  • User guide
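
Once the keys are saved, usage looks like this sketch; it assumes the profile was named acacia, and <bucket_name> is a placeholder:

rclone mkdir acacia:<bucket_name> # create a bucket
rclone copy $MYSCRATCH/results acacia:<bucket_name>/results # upload a results folder
rclone ls acacia:<bucket_name> # verify the upload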

SLURM job submission
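
For batch jobs, here is a minimal sketch of a GPU job script, following the allocation rules quoted in the GPU Computing section; the module name, venv path, and train.py are placeholders:

#!/bin/bash
#SBATCH --account=<project_id>-gpu
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --gres=gpu:1
#SBATCH --time=01:00:00

module load pytorch/2.2.0-rocm5.7.3 # example version from the PyTorch section
source <path/to/venv>/bin/activate # the venv set up earlier
srun python3 train.py # hypothetical training script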

Important points

  • The home directory quota is only 1GB. Therefore, I should offload large files/folders from home to another directory. In particular, the .cache, .local, and/or .conda directories must live elsewhere (e.g., $MYSOFTWARE). But note that /software/ has an inode quota of 100K per user. Managing the cache and conda files through environment variables is described here. Alternatively (and better), I can create symbolic links in the home directory pointing to those resource-intensive directories:
mkdir -p $MYSOFTWARE/.cache && ln -s $MYSOFTWARE/.cache $HOME/.cache # create the real dir under $MYSOFTWARE, link it from home
mkdir -p $MYSOFTWARE/.local && ln -s $MYSOFTWARE/.local $HOME/.local # move any existing contents there first
  • If there are multiple projects, configure the default project name in $HOME/.pawsey_project so that the $MYSCRATCH and $MYSOFTWARE environment variables are set appropriately.
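
For example (assuming the file holds just the project name):

echo <project_id> > $HOME/.pawsey_project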


Course / Manuals / helpful resources
