path.chain: Concise Structure for Chainable Paths
The path.chain package provides an intuitive and easy-to-use system of
nested objects that represents the levels of a directory
structure in the file system. It lets us create a nested structure in which
every leaf returns a path as a string.
Look at the path.chain
Sometimes one picture can say more than a thousand words, and this is exactly such a case.
Motivation
I was working on an ML project when we decided to keep the structure of the input data in a YAML config. The structure kept getting more complicated, and at some point our config took a form like this:
default:
  kData:
    kRoot: 'our/super/dir'
    kTerroir:
      kRoot: 'terroir'
      kSoils: 'soils.fst'
      kTemperature: 'temperature.fst'
      kRains: 'rains.fst'
    kWineQuality:
      kChemicalParams: 'chemical_params.fst'
      kContestResults: 'contest_results.fst'
For your information: the example above is entirely fictitious and has nothing to do with the actual project I was working on. Moreover, our project defined several times as many paths. As you can imagine, such a structure forced us to load data in the following manner:
config <- config::get(
  config = "default",
  file = "path/to/config",
  use_parent = FALSE
)
# Join the root directory, the subdirectory and the file name by hand
path <- file.path(
  config$kData$kRoot,
  config$kData$kTerroir$kRoot,
  config$kData$kTerroir$kSoils
)
vineyard_soils <- fst::read_fst(path)
Doesn’t it look redundant? So I wrote the path.chain package.
With it, we can perform the same action with less code:
library(path.chain)
vineyard_soils <- fst::read_fst(
  config$kData$kTerroir$kSoils
)
Isn’t that easier on the eyes?
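To make the idea concrete, here is a minimal base-R sketch of how a nested list of relative names could be turned into full paths. It is not path.chain's implementation: `resolve_paths` and its arguments are hypothetical names introduced only for illustration, and the list literal stands in for a config loaded from YAML.

```r
# Hypothetical helper, not part of path.chain: walk a nested list and
# prefix every leaf with the kRoot values collected on the way down.
resolve_paths <- function(node, parent = NULL, root_key = "kRoot") {
  # Leaf: a file name; join it with the accumulated directory prefix
  if (!is.list(node)) {
    return(if (is.null(parent)) node else file.path(parent, node))
  }
  # Extend the prefix with this level's root directory, if it defines one
  root <- node[[root_key]]
  prefix <- if (is.null(root)) {
    parent
  } else if (is.null(parent)) {
    root
  } else {
    file.path(parent, root)
  }
  # Resolve every child except the root entry itself
  children <- node[setdiff(names(node), root_key)]
  resolved <- lapply(children, resolve_paths, parent = prefix, root_key = root_key)
  # Keep the directory's own full path under the root key
  if (!is.null(root)) resolved[[root_key]] <- prefix
  resolved
}

# Stand-in for the YAML config shown earlier (shortened)
config <- list(
  kData = list(
    kRoot = "our/super/dir",
    kTerroir = list(kRoot = "terroir", kSoils = "soils.fst"),
    kWineQuality = list(kChemicalParams = "chemical_params.fst")
  )
)

paths <- resolve_paths(config)
paths$kData$kTerroir$kSoils
# "our/super/dir/terroir/soils.fst"
```

Because each leaf already carries its full path, renaming a file in the config changes only the string stored in the leaf, and code that reads `paths$kData$kTerroir$kSoils` does not have to change.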
If I want to modify the config, say, with the following change,
default:
  kData:
    kRoot: 'our/super/dir'
    kTerroir:
      kRoot: 'terroir'
      kSoils: 'vineyard_soils.fst' # <- This is the change
      kTemperature: 'temperature.fst'
      kRains: 'rains.fst'
    kWineQuality:
      kChemicalParams: 'chemical_params.fst'
      kContestResults: 'contest_results.fst'
the code still works.
What if we want to reconfigure our list of paths without changing
the code? That may break the desired behaviour of our scripts, but
with path.chain we can easily detect the cause by looking into the logs.
Simply use on_path_not_exists or on_validate_path:
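# Assumption: log_error() comes from the logger package
library(logger)
# file_ext() returns the extension without the dot, so compare against "fst"
on_validate_path(
  ~ if (tools::file_ext(.x) != "fst") print("Invalid file")
)
on_path_not_exists(~ log_error("Path {.x} does not exist"))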
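One detail worth keeping in mind when writing such handlers: `tools::file_ext()` returns the extension without the leading dot. The sketch below shows the two kinds of checks as plain base-R functions, independent of path.chain. The names `has_extension` and `check_path` are hypothetical and exist only for this illustration.

```r
# Hypothetical stand-alone checks, not part of path.chain
has_extension <- function(path, ext = "fst") {
  # tools::file_ext() drops the leading dot: "soils.fst" gives "fst"
  identical(tools::file_ext(path), ext)
}

check_path <- function(path, ext = "fst") {
  # Collect a message for every problem found with this path
  problems <- character(0)
  if (!has_extension(path, ext)) {
    problems <- c(problems, sprintf("Invalid file extension: %s", path))
  }
  if (!file.exists(path)) {
    problems <- c(problems, sprintf("Path %s does not exist", path))
  }
  problems
}

# A missing file with the wrong extension reports both problems
check_path("definitely/missing/dir/soils.csv")
```

Returning the messages instead of printing them keeps the checks easy to test. A handler could pass each message to whatever logging function the project already uses.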
To learn more, read the package documentation.