Hey @etejedor ,
Thanks for the reply. Yes, I did go through these to understand the steps for initialising and scaling the cluster. What I am not sure about is the exact way to run code on the cluster and to make sure it is not running locally.
Let me describe my problem in a little more detail. I have two files that are supposed to run on the cluster. The first creates a fileset JSON and accumulates all the files from DAS; the code I used to do this is below:
```python
outputname = os.path.join("filesets", "EGamma0_Data_nanov15_fileset_Era_[B-J]")
with client:
    Data_ddc.do_preprocess(output_file=outputname,
                           file_exceptions=(OSError, lzma.LZMAError, DeserializationError))
```
This code ran successfully on the cluster.
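As a sanity check before the heavy step, I have also been looking at what the client reports about its workers. This is just a sketch of mine, not from any tutorial: `summarize_workers` is a hypothetical helper, but the dict shape it expects matches what `dask.distributed`'s `client.scheduler_info()` returns.

```python
def summarize_workers(scheduler_info):
    """Return (worker_count, total_threads) from a client.scheduler_info() dict.

    Hypothetical helper: if the count is 0, nothing can run remotely and
    dask will fall back to computing in the local process.
    """
    workers = scheduler_info.get("workers", {})
    n_workers = len(workers)
    total_threads = sum(w.get("nthreads", 0) for w in workers.values())
    return n_workers, total_threads

# Against a live client this would be:
#   n, threads = summarize_workers(client.scheduler_info())
#   print(n, "workers /", threads, "threads")
```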
The second file is the selection code, and this is where I run into the problem: I cannot tell whether the analysis is running on the cluster or not. The code I was guided to use is below:
```python
with ProgressBar():
    with gzip.open(FILESET_DATA_LOC, "rt") as file:
        fileset_full = json.load(file)

    # fileset_cleaned = filter_files(fileset_full, lambda x: not x[0].startswith("root://maite"))
    # fileset_cleaned = filter_files(fileset_full)
    fileset_ready = max_files(fileset_full, 10)  # test with 10 files; None for all
    # fileset_ready = change_sources(fileset_ready)

    analyze_func = partial(createSingleElectronCR, delayed=True, isMC=False)
    outputs_elecCR, reports_elecCR = apply_to_fileset(analyze_func, fileset_ready,
                                                      uproot_options=UPROOT_OPTIONS)
    # outputs_elecCR, reports_elecCR = apply_to_fileset(analyze_func, fileset_full, uproot_options=UPROOT_OPTIONS)
    # print(events.fields)

    print("Applying Selections...")
    # Selections: OBJECT PAIR REQUIREMENT --> delta_R + Opp sign + Vis Z mass (all files)
    t0 = time.time()
    coutputs_elecCR, creports_elecCR = dask.compute(outputs_elecCR, reports_elecCR)
    print("Done. Time taken:", time.time() - t0)
```
This code takes a very long time to run (I suspect it is running locally), and even if it is running on the cluster, I don't see the job when I run condor_q. Hence I would like to understand the process of running code on the cluster through SWAN a bit better.
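One check I have been experimenting with (my own sketch, not from any guide) is to submit a trivial task through the client and compare the hostname it reports with the local one; `where_am_i` is a name I made up, but `client.submit` is the standard `dask.distributed` call.

```python
import os
import socket

def where_am_i():
    """Return (hostname, pid).

    When this runs on the cluster it should report a worker node,
    not the host of the local SWAN session.
    """
    return socket.gethostname(), os.getpid()

# With a connected dask.distributed client, the comparison would be:
#   local_host, _ = where_am_i()
#   remote_host, _ = client.submit(where_am_i).result()
#   print(local_host, remote_host)  # different hostnames => work runs remotely
```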
Also, to be clear, I did run the following code before executing the main code in each of the files:
```python
from dask.distributed import Client
client = Client("tls://10.100.106.244:32694")
client
```
I understand that my problem is a bit specific, so let me know if you need any more details.
Thanks,
AG