GPU support in SWAN

Hi all,

First of all, I am glad to start using SWAN in my projects at CERN and would like to thank all the developers for the efforts made so far!

Looking at some of the previous posts on this forum, I saw a reference to GPU support in SWAN from a few months ago. Is there an approximate launch date for this yet? I am planning a year-long project which might benefit from GPUs, so it would help us to have even a vague idea of when this might happen.

Hi Christian,

Thank you for your kind words!

We are very close to having a prototype SWAN server with GPU support. This means you will be able to start a session with a software stack that includes CUDA and the GPU-enabled build of TensorFlow. When you do that, a GPU is made visible to your user session and, if you use TensorFlow from your notebook, it will offload computations to the GPU.
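For reference, here is a minimal check you could run from a notebook once such a session is started (a sketch assuming a TensorFlow 2.x build in the stack; older versions expose the same check under tf.config.experimental):

```python
import tensorflow as tf

# List the GPUs that TensorFlow can see in this session.
gpus = tf.config.list_physical_devices('GPU')
print("Visible GPUs:", gpus)

# A small computation; with a GPU visible, TensorFlow places it on the device.
with tf.device('/GPU:0' if gpus else '/CPU:0'):
    a = tf.random.normal((1024, 1024))
    b = tf.random.normal((1024, 1024))
    c = tf.matmul(a, b)
print("Result computed on:", c.device)
```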

We can let you know when the prototype is ready (it should happen in the next one or two weeks) so that you can try it out and give us feedback. In the meantime, we would like to know about your use case: what group you belong to, what you want to use the GPU from SWAN for, what libraries you need, etc.

Cheers,
Enric

Hello Enric,

Thanks for the good news!

Could you build pycuda and pyopencl packages?

Which type of GPU would be available?

Cheers,
Riccardo

Hi Riccardo,

So far we have a VM from IT for testing purposes with the following card:

NVIDIA Corporation GV100GL [Tesla V100 PCIe 32GB]

What is your use case for GPUs? Could you elaborate a bit on who would use them from SWAN and how?

Cheers,
Enric

Hi Enric,

In my case, as part of the B-train renovation (TE-MSC-MM), we are planning a neural network model using TensorFlow, which will have to be trained on a large amount of data. It is still very early days for us, but we are willing to participate and try out any prototypes.

Christian

Hi Christian,

Ok, I will then keep you posted about the prototype.

Cheers,
Enric

In my group there are several particle tracking codes being developed for GPUs, for which the V100 is particularly strong thanks to its memory capacity and double-precision performance. We primarily use Python to set up simulations and rely on pyopencl/pycuda (and recently cupy) for both development and production (as well as compiled code). The GPU effort has just started, so usage would be limited at the beginning, but I would find it useful to be able to run these codes for demonstration and development in SWAN.
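For illustration, the typical pattern is simply to move arrays to the device and compute there; a minimal sketch with cupy (just an example, the production codes also use hand-written pyopencl/pycuda kernels):

```python
import numpy as np
import cupy as cp

# Double-precision data on the host, then transferred to GPU memory.
x_host = np.random.rand(10_000_000)   # float64 on the CPU
x_dev = cp.asarray(x_host)            # copy to the GPU

# The reduction runs on the device; only the scalar comes back to the host.
result = cp.sum(x_dev * x_dev)
print(float(result))
```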

Hi Riccardo,

Thank you for the explanation. We will keep you posted as well about the prototype.

As for the inclusion of pyopencl and pycuda in the software stack, I would recommend that you open a ticket with the librarians:

https://sft.its.cern.ch/jira/projects/SPI

Please describe the packages that you need, why you need them and the potential number of users. Also, please specify that you would like those packages to be included in the Cuda 10 stack that SWAN is using.

Hello! What’s the status here?

Hi,

We have a test node, but it is currently in maintenance (it is being migrated to the Kubernetes infrastructure).

I have added you to the corresponding egroup so you have access. It should be available at the beginning of next week by logging into:

https://swan-k8s.cern.ch

(you will need an SSH tunnel, as it is not publicly visible)

Before starting your session, you will need to select the CUDA software stack (“Cuda 10 Python3”) and, if the GPU is not taken by anyone else at that moment, your session will start with the GPU attached.

Greetings, Enric.
This topic seems to be a bit outdated, but still:
is it now possible to run notebooks with PyTorch using CUDA via SWAN?
Maybe there is a specific software stack I should choose during the environment setup?
Thanks a lot!

Hello @arogachev

Welcome to the SWAN community forum!

We already have support for GPUs, and you can use PyTorch in our CUDA software stack.
The service is available at https://swan-k8s.cern.ch/,
but you have to be enabled as a beta tester. Do you want to be enabled?
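Once enabled, a quick way to confirm from a notebook that PyTorch sees the GPU is the standard check below (nothing SWAN-specific, just the usual torch.cuda API):

```python
import torch

# True if a GPU is attached to the session and the CUDA build of PyTorch is in use.
print(torch.cuda.is_available())

if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))      # e.g. the Tesla V100 mentioned above
    x = torch.randn(1024, 1024, device="cuda")
    y = x @ x.t()                             # matrix multiply on the GPU
    print(y.device)
```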

What is your use case? What are you planning to run there? We are asking because we would like to know more about the use cases of our users.

Cheers
Omar.

Thank you, @ozapatam.
Yes, I would definitely like to join the beta testing group.
In my case, I am going to run some GAN-training notebooks for the current simulation research project.

Hello @arogachev
I am sorry for my late reply.

I have added you to the SWAN egroup for GPU access. In a couple of hours,
please go to https://swan-k8s.cern.ch/ and select the CUDA stack to get a GPU.

Cheers
Omar.

Thanks a lot, the environment was added to the list; I'll try to test it ASAP.

By the way, how can my colleagues get access to these CUDA environments as well? Should they send some kind of email or make a post in this topic?

Hello @arogachev,

please ask them to open a ticket in ServiceNow stating that they want to test the GPUs on SWAN and what their use cases are:
https://cern.service-now.com/service-portal?id=functional_element&name=swan

Cheers
Omar.

Hi Omar,
Are there any time limits or notebook timeout settings that could prevent users from running long GPU experiments?
Thanks

Hi @arogachev

Yes, at the moment it is 6 hours, but I think it will be reduced.
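If the session limit becomes a constraint, a common workaround (not a SWAN feature, just a usual training pattern) is to checkpoint the training state periodically so a long run can be resumed in a fresh session. A minimal PyTorch-style sketch, with the path and objects as placeholders:

```python
import torch

# Hypothetical checkpoint location, e.g. somewhere on EOS; adjust to your own storage.
CKPT_PATH = "/eos/user/y/youruser/checkpoints/run.pt"

def save_checkpoint(model, optimizer, epoch, path=CKPT_PATH):
    # Save everything needed to continue training in a new session.
    torch.save({
        "epoch": epoch,
        "model_state": model.state_dict(),
        "optimizer_state": optimizer.state_dict(),
    }, path)

def load_checkpoint(model, optimizer, path=CKPT_PATH):
    # Restore model and optimizer state, and return the epoch to resume from.
    state = torch.load(path)
    model.load_state_dict(state["model_state"])
    optimizer.load_state_dict(state["optimizer_state"])
    return state["epoch"] + 1
```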

Cheers
Omar.

Is it possible to request an increase in “quota” somehow?

Hi,
I am not sure if this is the right thread, but I'll ask anyway.
I have a b-tagging training project running in SWAN, but I would like to try a setup with GPUs, given that I will use a much larger data sample soon.
Can I have access to a SWAN server with GPU? How?

Thank you!
Best wishes,
Helena