How to save data to a file with the BE NXCALS cluster?

Hi all,

For quicker access to data I have previously queried from the NXCALS DB, I would like to save my pandas dataframe as a Parquet file.

I looked around but could not find any documentation on how to do this, especially on how to define the file path. I guess I should save the file in my personal space on CERNBox or HDFS, but I am not sure which is the better choice, so some advice here would also be appreciated. For your information, I tried to browse HDFS via the small elephant button, but I get the following error:

“HDFS Browser not available, no active hdfs namenode”


Currently I’m using the SW stack NXCals Python 3 and the cluster BE NXCALS.


Dear Diogo,

You can save a Spark dataframe as a Parquet file as below:

df.write.parquet("folder_name")

If what you have is a pandas dataframe, you first need to convert it to a Spark dataframe and then save it in the same way:

df = spark.createDataFrame(pdf)

Please note that Spark creates folder_name itself, as a directory containing the Parquet part files rather than as a single file. Also, thanks for reporting the issue with the HDFS browser; we will investigate and fix it soon.
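For reference, here is a minimal sketch of the full round trip (save now, reload quickly later). It assumes `spark` is the SparkSession already available in your notebook once connected to the cluster and `pdf` is your pandas dataframe. The helper names `save_pandas_as_parquet` and `load_parquet_as_pandas` are only illustrative, and the "overwrite" mode is my choice here, not a requirement.

```python
def save_pandas_as_parquet(spark, pdf, path):
    """Convert a pandas DataFrame to a Spark DataFrame and write it as Parquet."""
    sdf = spark.createDataFrame(pdf)
    # Spark creates `path` as a directory of part files.
    # mode("overwrite") replaces the directory if it already exists (assumed choice).
    sdf.write.mode("overwrite").parquet(path)


def load_parquet_as_pandas(spark, path):
    """Read a Parquet directory back and collect it into a pandas DataFrame."""
    # toPandas() collects all rows on the driver, so the data must fit in memory.
    return spark.read.parquet(path).toPandas()
```

You would then call `save_pandas_as_parquet(spark, pdf, "folder_name")` once, and `pdf = load_parquet_as_pandas(spark, "folder_name")` in later sessions. A relative path such as "folder_name" is normally resolved against the default filesystem of the Spark session, which on a cluster is usually your HDFS home directory, but please check where it ends up in your setup.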

Best Regards,