CERN Accelerating science

Spark


Topic Replies Activity
Scheduling Spark notebooks with GitLab CI/CD

Dear SWAN users, we noticed interest in productionizing Spark notebooks for analytics pipelines, versioned with git. The proposed pipeline will consist of prototyping a Spark job in SWAN in interactive mode, committing…

2 December 4, 2019
About the Spark category 1 January 24, 2019
Reading from/writing to HDFS on NXCALS 9 December 5, 2019
PyRDF issues 32 December 4, 2019
Spark monitoring from notebook 5 November 29, 2019
Spark configuration 7 November 26, 2019
Multithreading in SWAN and Spark backends 2 November 12, 2019
Support for spark-root (like "LHCb open data" example) 4 October 21, 2019
EOS access on Spark clusters 10 October 17, 2019
Using Spark with ROOT DataFrames 7 April 9, 2019