safetensors is a new, simple, fast, and safe file format for storing tensors. The design of the file format and its original implementation are being led
by Hugging Face, and it is being widely adopted in their popular 'transformers' framework. The safetensors R package is a pure-R implementation that can both read and write safetensors files.
The initial version (0.1.0) of safetensors is now on CRAN.
The main motivation for safetensors in the Python community is security. As noted
in the official documentation:
"The main rationale for this crate is to remove the need to use pickle on PyTorch which is used by default."
Pickle is considered an unsafe format, as the act of loading a Pickle file can
trigger the execution of arbitrary code. This has never been a concern for torch
for R users, since the Pickle parser included in LibTorch only supports a subset
of the Pickle format, which does not include executing code.
However, the file format has additional advantages over other commonly used formats, including:
Support for lazy loading: You can choose to read a subset of the tensors stored in the file.
Zero copy: Reading the file does not require more memory than the file itself.
(Technically, the current R implementation does make a single copy, but that can
be optimized out if we really need it at some point.)
Simple: Implementing the file format is simple and does not require complex dependencies.
This means that it is a good format for exchanging tensors between ML frameworks and
between different programming languages. For instance, you can write a safetensors file
in R and load it in Python, and vice versa.
There are additional advantages compared to other file formats common in this space, and
the upstream safetensors project provides a comparison table.
The safetensors format is described in the figure below. It is basically a header
containing some metadata, followed by raw tensor buffers.
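To make the layout concrete, here is a minimal base-R sketch that builds a tiny file by hand and reads it back. It assumes the layout given in the upstream format documentation: an 8-byte little-endian unsigned integer with the header length, a JSON header, then the raw tensor bytes. The tensor name, shape, and values are made up for illustration, and this is not how the safetensors package itself is implemented.

```r
# Hand-build a tiny safetensors-style file: one float32 tensor "x" of shape [2, 2].
# The tensor name, shape, and values are illustrative assumptions.
values <- c(1, 2, 3, 4)
header <- '{"x":{"dtype":"F32","shape":[2,2],"data_offsets":[0,16]}}'
header_raw <- charToRaw(header)

path <- tempfile(fileext = ".safetensors")
con <- file(path, "wb")
# 8 bytes: header length as an unsigned little-endian 64-bit integer.
# The length is small here, so it fits in the low 4 bytes; the high 4 are zero.
writeBin(length(header_raw), con, size = 4L, endian = "little")
writeBin(raw(4), con)
# N bytes: the JSON header describing dtype, shape, and byte offsets.
writeBin(header_raw, con)
# Remaining bytes: the raw tensor buffer, here four little-endian float32 values.
writeBin(values, con, size = 4L, endian = "little")
close(con)

# Read the pieces back in the same order.
con <- file(path, "rb")
len_bytes <- readBin(con, "raw", n = 8L)
header_len <- sum(as.numeric(len_bytes) * 256^(0:7))
header_back <- rawToChar(readBin(con, "raw", n = header_len))
x <- readBin(con, "numeric", n = 4L, size = 4L, endian = "little")
close(con)

cat(header_back, "\n")
print(x)
```

Because the header records the byte offsets of every tensor, a reader can skip straight to the tensors it needs, which is what makes lazy loading straightforward.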
safetensors can be installed from CRAN using:
install.packages("safetensors")
We can then write any named list of torch tensors:
#> List of 2
#>  $ x:Float
#>  $ y:Float
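A minimal sketch of the full round trip, assuming the torch and safetensors packages are installed. The tensor shapes and the temporary file path are illustrative choices, and passing framework = "torch" explicitly is an assumption about the loader's arguments that may not be needed in every version.

```r
library(torch)
library(safetensors)

# A named list of torch tensors; the shapes are arbitrary, for illustration only.
tensors <- list(
  x = torch_randn(10, 10),
  y = torch_ones(5, 5)
)
str(tensors)

# Write the list to a temporary safetensors file.
tmp <- tempfile(fileext = ".safetensors")
safe_save_file(tensors, tmp)

# Read it back as a named list of torch tensors.
loaded <- safe_load_file(tmp, framework = "torch")
str(loaded)
```

The same file can then be opened from any other framework that has a safetensors reader.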