Data Compression#
Projects that produce large amounts of data can easily exhaust their allocated
storage with uncompressed data, especially if you run many Batch Experiments.
It is therefore often useful to compress the data for such projects; that's
where this plugin comes in. Keep in mind that this plugin runs during stage 3,
so if you generate enough data during stage 2 to fill your disk, this plugin
can't help. In that case, look at
IExpRunShellCmdsGenerator and add
whatever commands are needed after each run to compress the data as it is
produced.
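As a rough illustration of the kind of post-run command you might add, the sketch below builds a shell command string that archives one run's output directory. The function name build_compress_cmd, its parameters, and the example paths are hypothetical; how the resulting string is returned from an IExpRunShellCmdsGenerator implementation is not shown here and should be taken from that interface's documentation. It also assumes a POSIX shell with tar available.

```python
import shlex
from pathlib import PurePosixPath


def build_compress_cmd(run_output_dir: str, remove_after: bool = False) -> str:
    """Build a shell command that archives one run's output directory.

    Hypothetical helper: the archive is written next to the directory as
    <dirname>.tar.gz, which is an assumption made for this sketch.
    """
    path = PurePosixPath(run_output_dir)
    archive = path.with_name(path.name + ".tar.gz")

    # -C changes into the parent directory so the archive stores relative paths.
    cmd = "tar -czf {} -C {} {}".format(
        shlex.quote(str(archive)),
        shlex.quote(str(path.parent)),
        shlex.quote(path.name),
    )
    if remove_after:
        # Only delete the uncompressed data if tar succeeded.
        cmd += " && rm -rf {}".format(shlex.quote(str(path)))
    return cmd
```

Chaining the removal with && means the uncompressed directory is deleted only when the archive was created successfully.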
This plugin operates at the file level for each Experimental Run. The
entire output tree is compressed to a .tar.gz file. Optionally, the
uncompressed data can be removed after compression with
--compress-remove-after. No
data is lost; it's all in the archive.
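The behaviour described above can be sketched with the Python standard library. This is a minimal illustration, not the plugin's actual implementation: the function name compress_run_output, the remove_after parameter, and the choice to place the archive next to the output directory are all assumptions made for the example.

```python
import shutil
import tarfile
from pathlib import Path


def compress_run_output(run_output_dir, remove_after=False):
    """Compress one run's output tree to <dirname>.tar.gz beside it.

    Hypothetical sketch: archive naming and location are assumptions.
    """
    run_output_dir = Path(run_output_dir)
    archive = run_output_dir.with_name(run_output_dir.name + ".tar.gz")

    # Add the whole tree under its own directory name so that extracting
    # the archive recreates the original layout.
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(run_output_dir, arcname=run_output_dir.name)

    if remove_after:
        # Mirrors the idea behind --compress-remove-after: the data now
        # lives in the archive, so the uncompressed tree can be deleted.
        shutil.rmtree(run_output_dir)

    return archive
```

Extracting the archive with tarfile.open(archive).extractall(...) or tar -xzf restores the original tree.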
Ordering Considerations#
Statistics Generation and/or Intra-Experiment Data Collation should precede
this plugin in the --proc chain if you want processed outputs to be included
in the archive in addition to raw outputs.
Usage#
This plugin can be selected by adding proc.compress to the list passed to
--proc.
Cmdline Interface#
sierra - CLI interface#
sierra [--compress-remove-after]
sierra Stage 3 options#
Options for processing experiment results
--compress-remove-after
If the proc.compress plugin is run, remove the uncompressed Raw Output Data files after compression. This can save a large amount of disk space. No data is lost because everything output by each Experimental Run is in the compressed archive.