Support other languages such as Julia?

Perhaps I am wrong, but I understand that /registries is populated with all the packages that are imported in a Julia session, and that this changes very often. So making /registries read-only would mean that we know a priori all the packages that need to be imported. This is certainly possible (we more or less do it with Python), but does it make sense? Would it be much better?

The registries folder is populated with ALL the packages known to Julia (by default just the General registry, hosted on GitHub), but only with the information needed for version resolution, so <100MB in total.

The local stuff (i.e. the actual code of a package, which versions of which packages are installed in the default “project”, etc.) is stored by default in subfolders such as .julia/packages/ and .julia/environments/, under the depot path recorded at DEPOT_PATH[1].

btw, the empty environment variable translates to:

julia> DEPOT_PATH
3-element Vector{String}:
 "/eos/user/j/jiling/.julia"
 "/cvmfs/sft.cern.ch/lcg/releases/julia/1.6.0-82998/x86_64-centos7-gcc8-opt/local/share/julia"
 "/cvmfs/sft.cern.ch/lcg/releases/julia/1.6.0-82998/x86_64-centos7-gcc8-opt/share/julia"

so technically we could also maintain a read-only registries folder in one of the two /cvmfs locations, but I assume we don’t want any moving parts in /releases.
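To make the point about depot locations concrete, here is a minimal sketch that reports which depots in a colon-separated list already contain a registries folder. The helper name list_registry_depots is hypothetical and not part of Julia; the sketch assumes the depot list is passed in as a plain string and simply skips empty entries rather than expanding them the way Julia does.

```shell
# Sketch: print every depot in a colon-separated list that has a registries/ folder.
# list_registry_depots is a hypothetical helper, not something Julia provides.
list_registry_depots() {
    local depots="$1" depot
    local IFS=':'
    for depot in $depots; do
        # Skip empty entries; Julia expands those itself, this sketch does not.
        [ -n "$depot" ] || continue
        # -d follows symlinks, so a symlinked registries folder also counts.
        if [ -d "$depot/registries" ]; then
            printf '%s\n' "$depot"
        fi
    done
}
```

Called with the three entries shown above joined by colons, it would print only the depots where a registries folder (real or symlinked) is present.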

I see now. I think it would be possible, as you suggest, to regularly git clone the registries into a CVMFS location outside lcg/releases and have a softlink from the Julia package in lcg/releases to this location. I am assuming here that different versions and binaries of Julia can share the registries.

yes, registries are universally compatible. (A better PyPI, you could say, since one doesn’t need to download multiple full copies of the same set of packages to resolve versions.)

yeah, this sounds good! Let me know if I can help you test it in any way!

I have cloned the general registry in /cvmfs/sft.cern.ch/lcg/contrib/julia/registries
and created symlink in

 $ ls -ls  /cvmfs/sft.cern.ch/lcg/releases/julia/1.6.0-82998/x86_64-centos7-gcc8-opt/share/julia
total 6666
   5 drwxr-xr-x.  7 cvmfs cvmfs    4096 Apr 26 14:50 base
6436 -rw-r--r--.  1 cvmfs cvmfs 6589836 Apr 26 14:50 base.cache
 217 -rw-r--r--.  1 cvmfs cvmfs  221418 Apr 26 14:50 cert.pem
   4 -rwxr-xr-x.  1 cvmfs cvmfs    3853 Apr 26 14:50 julia-config.jl
   1 lrwxrwxrwx.  1 cvmfs cvmfs      47 May 25 10:54 registries -> /cvmfs/sft.cern.ch/lcg/contrib/julia/registries
   1 drwxr-xr-x.  3 cvmfs cvmfs      18 Apr 26 14:50 stdlib
   5 drwxr-xr-x. 15 cvmfs cvmfs    4096 Apr 26 14:50 test

It works, but I still find any command that invokes using <package> very slow.
I have not yet automated the procedure because I want to get confirmation that it is fine.
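For the automation, a minimal sketch could look like the following. The function names and the REGISTRY_URL variable are illustrative assumptions, the URL is the public General registry on GitHub, and whatever steps are needed to publish the change to CVMFS are deliberately left out.

```shell
#!/usr/bin/env bash
# Sketch only: keep a shared copy of the General registry up to date and linked.
# REGISTRY_URL and both function names are assumptions made for illustration.
REGISTRY_URL="${REGISTRY_URL:-https://github.com/JuliaRegistries/General.git}"

# Clone the registry into <registries_dir>/General, or fast-forward an existing clone.
update_registry() {
    local registries_dir="$1"
    local target="$registries_dir/General"
    if [ -d "$target/.git" ]; then
        git -C "$target" pull --ff-only --quiet
    else
        mkdir -p "$registries_dir"
        git clone --quiet "$REGISTRY_URL" "$target"
    fi
}

# Point <julia_share_dir>/registries at the shared registries directory.
# -n replaces an existing symlink instead of creating a link inside its target.
link_registry() {
    local registries_dir="$1" julia_share_dir="$2"
    ln -sfn "$registries_dir" "$julia_share_dir/registries"
}

# Example with the paths from this thread (run e.g. from a nightly cron job):
# update_registry /cvmfs/sft.cern.ch/lcg/contrib/julia/registries
# link_registry /cvmfs/sft.cern.ch/lcg/contrib/julia/registries \
#     /cvmfs/sft.cern.ch/lcg/releases/julia/1.6.0-82998/x86_64-centos7-gcc8-opt/share/julia
```

Since the symlink lives in the release area and only its target changes, the link step would normally be needed once per Julia installation, while the update step is the part worth scheduling.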


this is working great:

$ rm -rf ~/.julia
$ time (julia -e 'using Pkg; Pkg.add("CSV")')

real    0m31.067s
user    0m20.861s
sys     0m4.667s

before it was going way over 10 minutes.

this is expected, because the “actual” stuff (source code, binary deps) is still in ~/.julia/. It is just a common slowdown that everyone has to swallow, not just Julia:

bash-4.2$ time (julia -e 'using CSV')

real    0m1.486s
user    0m0.770s
sys     0m0.567s

bash-4.2$ time (python3 -c 'import matplotlib')

real    0m1.696s
user    0m0.453s
sys     0m0.305s

With the upcoming Julia v1.7 (currently at v1.7.0-beta4), the registry problem should be largely solved: Julia 1.7 downloads the registry and keeps it as a single file, instead of trying to write a large number of small files.