sierra.plugins.proc.collate.plugin

Classes for collating data within a Batch Experiment.

Collation is the process of "lifting" data from Experimental Runs across all Experiments in a Batch Experiment into a single file (a reduce operation). This is needed to correctly calculate summary statistics for performance measures in stage 3: you can't simply pass the already-calculated stddev through those calculations, because comparing curves of stddev is not meaningful.
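A minimal sketch of the idea, using hypothetical in-memory data rather than SIERRA's actual file formats or API (the `runs` dictionary and the `collate` helper are illustrative assumptions): the raw per-run values for one performance measure are lifted into a single table, and the summary statistics are then computed from those raw values.

```python
import statistics

# Hypothetical raw outputs: one performance value per time step, per run.
runs = {
    "run_0": [1.0, 2.0, 3.0],
    "run_1": [3.0, 2.0, 5.0],
    "run_2": [2.0, 5.0, 4.0],
}


def collate(runs):
    """Lift one measure from every run into a single table.

    Each row is a time step; each column is one run (sorted by run name).
    """
    names = sorted(runs)
    n_steps = len(runs[names[0]])
    return [[runs[name][t] for name in names] for t in range(n_steps)]


collated = collate(runs)

# Summary statistics are computed from the collated raw values,
# not from statistics that were already reduced elsewhere.
means = [statistics.mean(row) for row in collated]
stddevs = [statistics.stdev(row) for row in collated]
```

Here each row of `collated` holds the values of all runs at one time step, so the mean and sample stddev at that step come directly from the raw data.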

Classes

ExpDataGatherer

Gather Raw Output Data files across all runs for Data Collation.

Functions

proc_batch_exp(→ None)

Generate Collated Output Data files for each experiment.

Module Contents

class sierra.plugins.proc.collate.plugin.ExpDataGatherer(*args, **kwargs)

Gather Raw Output Data files across all runs for Data Collation.

The configured output directory for each run is searched recursively for files to gather. To be eligible for gathering and later processing, files must:

  • Be non-empty

  • Have a suffix which is supported by the selected --storage plugin.

  • Have a name (last part of the absolute path, including extension) which matches a configured Product in a YAML file, e.g., a graph from the Graph Generation plugin.
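The three criteria above can be sketched as a recursive file filter. This is an illustration only, not SIERRA's implementation: `eligible_files`, `supported_suffixes`, and `product_names` are hypothetical names, and matching a Product by exact file name is an assumption about how the YAML configuration is compared.

```python
import tempfile
from pathlib import Path


def eligible_files(run_output_root, supported_suffixes, product_names):
    """Recursively find files that satisfy all three gathering criteria.

    supported_suffixes: suffixes the storage plugin can read (assumed set).
    product_names: file names of configured Products (assumed exact match).
    """
    root = Path(run_output_root)
    return sorted(
        p
        for p in root.rglob("*")
        if p.is_file()
        and p.stat().st_size > 0            # non-empty
        and p.suffix in supported_suffixes  # supported suffix
        and p.name in product_names         # matches a configured Product
    )


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "other").mkdir()
    (root / "sub" / "perf.csv").write_text("t,value\n0,1.0\n")
    (root / "other" / "perf.csv").write_text("")   # rejected: empty
    (root / "perf.txt").write_text("x")            # rejected: suffix
    (root / "unlisted.csv").write_text("x")        # rejected: not a Product
    found = [
        p.relative_to(root).as_posix()
        for p in eligible_files(root, {".csv"}, {"perf.csv"})
    ]
```

Only `sub/perf.csv` passes all three checks in this demonstration; each of the other files fails exactly one criterion.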

sierra.plugins.proc.collate.plugin.proc_batch_exp(main_config: dict, cmdopts: sierra.core.types.Cmdopts, pathset: sierra.core.batchroot.PathSet, criteria: sierra.core.variables.batch_criteria.XVarBatchCriteria) → None

Generate Collated Output Data files for each experiment.

Collated Output Data files are generated from Raw Output Data files across Experimental Runs. The data is gathered in parallel for each experiment for speed, unless disabled with --processing-parallelism.
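The per-experiment parallelism can be pictured with the following sketch. It is not SIERRA's code: `collate_experiment`, `process_batch`, and the `parallel` switch are hypothetical stand-ins, and the use of a thread pool is an assumption, since the page does not say which mechanism the plugin uses.

```python
from concurrent.futures import ThreadPoolExecutor


def collate_experiment(name):
    """Stand-in for the gather-and-collate work done for one experiment."""
    return f"{name}: collated"


def process_batch(experiments, parallel=True, max_workers=4):
    """Process every experiment, in parallel unless it is disabled."""
    if not parallel:
        # Serial fallback: one experiment after another.
        return [collate_experiment(exp) for exp in experiments]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, regardless of finish order.
        return list(pool.map(collate_experiment, experiments))


results = process_batch(["exp0", "exp1", "exp2"])
serial_results = process_batch(["exp0", "exp1", "exp2"], parallel=False)
```

Both paths produce the same results in the same order, so switching parallelism off changes only the speed.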