I am fairly new to SWAN. I have managed to read and process CSV data (<1 GB) from my EOS space in order to build an ML model, but now that I want to increase the data sample, I was wondering which Spark cluster configuration would be optimal.
From https://github.com/swan-cern/help/blob/master/spark/clusters.md I understand that Cloud Containers should be my choice for better performance, but I am not sure I got it right, since I seem to see the opposite when I choose that option, at least compared to Analytix.
I am not sure whether this is related to the fact that Cloud Containers does not offer version 97 of the software stack, so I had to choose 96 (or Bleeding Edge) instead. Also, I could not find any information about the QA Spark cluster, so any tips or documentation on the recommended option would be very welcome.
Thank you very much in advance.