XRootD "[3011] No servers have read access" error in SWAN (Coffea + DaskExecutor with HTCondor)

Hi,

I am running a Coffea-based analysis on CERN SWAN (JupyterLab) using Dask + HTCondor workers, and I’m encountering persistent XRootD access issues.

Setup

  • Platform: CERN SWAN

  • Executor: coffea.processor.DaskExecutor

  • Backend: HTCondor cluster (via SWAN)

  • Data format: NanoAOD (MC, public datasets)


Minimal Reproducible Code

from dask.distributed import Client
from coffea import processor
from coffea.nanoevents import NanoAODSchema

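# Connect to the Dask scheduler of the SWAN HTCondor cluster
client = Client("tls://<scheduler-address>")

# XRootD redirector prefix; the trailing "//" yields the required //store/... form
location = "root://xrootd-cms.infn.it//"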

file_set = {
    "tt_to_2l": {
        "treename": "Events",
        "files": [
            f"{location}store/mc/Run3Winter22NanoAOD/TTTo2L2Nu_CP5_13p6TeV_powheg-pythia8/NANOAODSIM/122X_mcRun3_2021_realistic_v9-v1/40000/5441585e-9d9b-4186-83ba-86e2053e1084.root",
        ],
    },
    "ttz_to_ml": {
        "treename": "Events",
        "files": [
            f"{location}store/mc/Run3Summer22NanoAODv13/TTZ_Zto2L_SMEFT_TuneCP5_13p6TeV_amcatnlo-pythia8/NANOAODSIM/133X_mcRun3_2022_realistic_ForNanov13_v1-v2/50000/e62b5483-0bfd-4d40-b3c8-c34437ac2a11.root",
        ],
    },
}

class CountEvents(processor.ProcessorABC):
    def process(self, events):
        dataset = events.metadata["dataset"]
        return {dataset: {"events": len(events)}}

    def postprocess(self, accumulator):
        return accumulator

runner = processor.Runner(
    executor=processor.DaskExecutor(client=client),
    schema=NanoAODSchema,
    skipbadfiles=False,
    savemetrics=True,
)

result = runner(fileset=file_set, processor_instance=CountEvents())

Error

OSError: File did not open properly:
[ERROR] Server responded with an error:
[3011] No servers have read access to the file

Observations

  • The same file opens successfully in the SWAN notebook using:

    import uproot
    uproot.open("root://cms-xrd-global.cern.ch//store/mc/...")
    
  • The failure occurs only when the code runs on the Dask workers (HTCondor jobs), not in the notebook itself

  • I tested:

    • root://cms-xrd-global.cern.ch//

    • root://xrootd-cms.infn.it//

    • Correct double-slash (//store/...) usage

  • Setting skipbadfiles=False (as in the code above) reveals the full traceback
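
To narrow this down, a check along the following lines could be run both in the notebook and on every worker, to compare the two environments. This is only a sketch: describe_worker_env is a name I made up, and I am assuming that X509_USER_PROXY and KRB5CCNAME are the credential variables that matter on the SWAN HTCondor workers, which I have not confirmed.

```python
import os
import socket


def describe_worker_env():
    """Report the host name and credential-related environment of this process."""
    # Assumption: these two variables are the relevant ones for XRootD auth.
    proxy = os.environ.get("X509_USER_PROXY")
    return {
        "hostname": socket.gethostname(),
        "x509_user_proxy": proxy,
        # True only if the variable is set and points at an existing file.
        "proxy_file_exists": bool(proxy) and os.path.isfile(proxy),
        "krb5ccname": os.environ.get("KRB5CCNAME"),
    }


# In the notebook:
print(describe_worker_env())

# On the workers, using the `client` from the code above. Client.run executes
# the function on every worker and returns a dict keyed by worker address:
# for address, info in client.run(describe_worker_env).items():
#     print(address, info)
```

If the proxy turned out to be present in the notebook but missing on the workers, that would point towards question 4 below rather than towards the choice of redirector.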


Questions

  1. Is this due to worker site locality / network restrictions (i.e., workers cannot access certain replicas)?

  2. Are there recommended redirectors or configurations for SWAN + HTCondor + Coffea workflows?

  3. Is there a way to ensure replica-aware file selection (e.g., via Rucio/DAS integration)?

  4. Do SWAN HTCondor workers require any specific configuration or proxy setup for XRootD access to public MC datasets?
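
Regarding questions 2 and 3, the stopgap I have in mind is to try each redirector in turn for a given logical file name and keep the first URL that opens. A minimal sketch follows; candidate_urls, first_openable and the opener argument are hypothetical names of mine (in practice opener would be something like uproot.open), and the redirector list contains only the two I tested.

```python
REDIRECTORS = (
    "root://cms-xrd-global.cern.ch/",
    "root://xrootd-cms.infn.it/",
)


def candidate_urls(lfn, redirectors=REDIRECTORS):
    """Build one XRootD URL per redirector, always in the //store/... form."""
    path = "/" + lfn.lstrip("/")
    return [f"{redirector.rstrip('/')}/{path}" for redirector in redirectors]


def first_openable(lfn, opener, redirectors=REDIRECTORS):
    """Return the first URL that `opener` accepts, or None if all of them fail."""
    for url in candidate_urls(lfn, redirectors):
        try:
            handle = opener(url)
        except OSError:
            # The error in this post surfaces as an OSError, so move on.
            continue
        # Close the handle if the opener returned something closable.
        close = getattr(handle, "close", None)
        if callable(close):
            close()
        return url
    return None
```

To be meaningful this would have to run on the workers (for example through client.submit), since the notebook can already open the files. It also only papers over the problem, which is why I would prefer a properly replica-aware approach if one exists.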

Any guidance or best practices for stable dataset access in this setup would be greatly appreciated.

Thanks!