Hi,
I am running a Coffea-based analysis on CERN SWAN (JupyterLab) using Dask + HTCondor workers, and I’m encountering persistent XRootD access issues.
Setup
- Platform: CERN SWAN
- Executor: coffea.processor.DaskExecutor
- Backend: HTCondor cluster (via SWAN)
- Data format: NanoAOD (MC, public datasets)
Minimal Reproducible Code
from dask.distributed import Client
from coffea import processor
from coffea.nanoevents import NanoAODSchema

client = Client("tls://<scheduler-address>")

location = "root://xrootd-cms.infn.it//"
file_set = {
    "tt_to_2l": {
        "treename": "Events",
        "files": [
            f"{location}store/mc/Run3Winter22NanoAOD/TTTo2L2Nu_CP5_13p6TeV_powheg-pythia8/NANOAODSIM/122X_mcRun3_2021_realistic_v9-v1/40000/5441585e-9d9b-4186-83ba-86e2053e1084.root",
        ],
    },
    "ttz_to_ml": {
        "treename": "Events",
        "files": [
            f"{location}store/mc/Run3Summer22NanoAODv13/TTZ_Zto2L_SMEFT_TuneCP5_13p6TeV_amcatnlo-pythia8/NANOAODSIM/133X_mcRun3_2022_realistic_ForNanov13_v1-v2/50000/e62b5483-0bfd-4d40-b3c8-c34437ac2a11.root",
        ],
    },
}


class CountEvents(processor.ProcessorABC):
    def process(self, events):
        dataset = events.metadata["dataset"]
        return {dataset: {"events": len(events)}}

    def postprocess(self, accumulator):
        return accumulator


runner = processor.Runner(
    executor=processor.DaskExecutor(client=client),
    schema=NanoAODSchema,
    skipbadfiles=False,
    savemetrics=True,
)

result = runner(fileset=file_set, processor_instance=CountEvents())
Error
OSError: File did not open properly:
[ERROR] Server responded with an error:
[3011] No servers have read access to the file
Observations
- The same file opens successfully in the SWAN notebook using:
    import uproot
    uproot.open("root://cms-xrd-global.cern.ch//store/mc/...")
- The failure occurs only when executed on Dask workers (HTCondor jobs)
- I tested:
  - root://cms-xrd-global.cern.ch//
  - root://xrootd-cms.infn.it//
  - Correct double-slash (//store/...) usage
- Disabling skipbadfiles reveals the full traceback
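To make the notebook-versus-worker difference concrete, the same open attempt could be run on every worker and its outcome collected. The following is only a minimal sketch: `probe_open` is a hypothetical helper (not part of Coffea, Dask, or uproot), and reporting `X509_USER_PROXY` assumes that a grid proxy, if one is in use at all, is exposed through that environment variable.

```python
import os
import socket


def probe_open(url, opener=None):
    """Try to open `url` and report where the attempt ran and what happened."""
    if opener is None:
        # Imported lazily so the function can be shipped to workers unchanged.
        import uproot

        opener = uproot.open

    report = {
        "host": socket.gethostname(),
        # Assumption: a proxy, if any, is advertised via X509_USER_PROXY.
        "x509_user_proxy": os.environ.get("X509_USER_PROXY"),
        "ok": False,
        "error": None,
    }
    try:
        with opener(url):
            report["ok"] = True
    except Exception as exc:
        # Record the failure instead of raising, so all workers report back.
        report["error"] = f"{type(exc).__name__}: {exc}"
    return report
```

Calling `probe_open(url)` in the notebook and `client.run(probe_open, url)` on the cluster (which returns one report per worker address) would show whether the workers see a different environment or fail with the same 3011 error on every host.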
Questions
- Is this due to worker site locality / network restrictions (i.e., workers cannot access certain replicas)?
- Are there recommended redirectors or configurations for SWAN + HTCondor + Coffea workflows?
- Is there a way to ensure replica-aware file selection (e.g., via Rucio/DAS integration)?
- Do SWAN HTCondor workers require any specific configuration or proxy setup for XRootD access to public MC datasets?
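As a stopgap for the redirector question, a plain fallback loop over the two redirectors already tested could be sketched as below. This is not a Rucio or DAS integration and knows nothing about replicas; `first_openable` is a hypothetical helper, and catching `OSError` assumes failures surface the same way as the error shown above.

```python
def first_openable(lfn, redirectors, opener=None):
    """Return (url, errors): the first URL that opens, plus failures seen before it."""
    if opener is None:
        # Imported lazily so the helper can also run on workers.
        import uproot

        opener = uproot.open

    errors = {}
    for redirector in redirectors:
        # Build root://<host>//store/... with exactly one double slash.
        url = f"root://{redirector}//{lfn.lstrip('/')}"
        try:
            with opener(url):
                return url, errors
        except OSError as exc:
            errors[url] = str(exc)
    # Nothing opened: the caller can inspect the collected error messages.
    return None, errors
```

It would be called with a logical file name starting at `/store/...` and a list such as `["cms-xrd-global.cern.ch", "xrootd-cms.infn.it"]`. If every redirector fails from the workers, the returned `errors` mapping at least shows whether the message differs per redirector.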
Any guidance or best practices for stable dataset access in this setup would be greatly appreciated.
Thanks!