Storage (--storage)
Storage plugins tell SIERRA how to handle file I/O in stages 3-5. Specifically:
- How to read Raw Output Data from Experimental Runs.
- How to write Processed Output Data, Collated Output Data, etc., files to disk.
Each plugin can support any number of input formats, identified by file extension, and any number of output types. The table below summarizes this for the storage plugins that ship with SIERRA; to support additional formats, see New Storage Plugin (--storage).
| Plugin | Supported input formats | Allowed file extensions | Output type |
|---|---|---|---|
| CSV | | | pd.DataFrame |
| Apache Arrow | | | pd.DataFrame |
| GraphML | | | nx.Graph |
Other plugins in stages 3-5 may require a specific output format; see individual docs for details.
Tip
If you are creating a new storage plugin (see New Storage Plugin (--storage)), follow the Unix philosophy of doing one thing well: write several small plugins rather than one storage plugin that handles all of your custom types/formats.
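As an illustration of identifying formats by file extension while keeping each reader narrow, here is a minimal sketch using only the Python standard library. The class names, the `extensions` attribute, and `pick_reader()` are hypothetical and are not SIERRA's plugin interface; for the actual requirements, see New Storage Plugin (--storage).

```python
import csv
import json
import tempfile
from pathlib import Path


class CommaTableReader:
    """Hypothetical reader that handles exactly one format: comma-separated tables."""

    extensions = frozenset({".csv"})

    def read(self, path: Path) -> list:
        with path.open(newline="") as handle:
            return list(csv.DictReader(handle))


class JsonRecordsReader:
    """Hypothetical reader that handles exactly one format: JSON documents."""

    extensions = frozenset({".json"})

    def read(self, path: Path):
        return json.loads(path.read_text())


def pick_reader(path: Path, readers: list):
    """Return the first reader whose declared extensions include the file's suffix."""
    for reader in readers:
        if path.suffix.lower() in reader.extensions:
            return reader
    raise ValueError(f"no reader registered for {path.suffix!r}")


READERS = [CommaTableReader(), JsonRecordsReader()]

# Small demonstration on a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "run0.csv"
    sample.write_text("t,x\n0,1.5\n")
    rows = pick_reader(sample, READERS).read(sample)
```

Each reader stays small enough to test on its own, and supporting another format means adding another reader rather than growing an existing one.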
CSV
Select the CSV format for all data I/O in stages 3-5. This storage plugin can
be selected via --storage=storage.csv. It is the default storage plugin,
which SIERRA uses if none is specified on the command line.
Since this plugin produces pd.DataFrame objects, it is suitable for
processing numeric data.
Changed in version 1.3.28: The CSV files read by this plugin must be comma (,) separated. Previously,
they had to be semicolon (;) separated.
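Semicolon-separated files produced for earlier versions would presumably need converting before this plugin can read them. The following is a minimal sketch of such a conversion using only the Python standard library; the function name and file names are hypothetical, and this helper is not part of SIERRA.

```python
import csv
import tempfile
from pathlib import Path


def semicolon_to_comma(src: Path, dst: Path) -> int:
    """Copy a semicolon-separated file to a comma-separated one.

    Returns the number of rows written, header included.
    """
    rows = 0
    with src.open(newline="") as fin, dst.open("w", newline="") as fout:
        reader = csv.reader(fin, delimiter=";")
        # The csv module quotes any field that itself contains a comma,
        # which a plain text replace of ";" with "," would not do.
        writer = csv.writer(fout, delimiter=",", lineterminator="\n")
        for row in reader:
            writer.writerow(row)
            rows += 1
    return rows


# Small demonstration on a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp) / "old.csv"
    new = Path(tmp) / "new.csv"
    old.write_text("t;x\n0;1.5\n1;2.5\n")
    n_rows = semicolon_to_comma(old, new)
    converted = new.read_text()
```

Writing to a separate destination file, rather than converting in place, keeps the original data intact if the conversion is interrupted.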
Apache Arrow
Select the arrow format for all data I/O in
stages 3-5. This storage plugin can be selected via
--storage=storage.arrow.
Since this plugin produces pd.DataFrame objects, it is suitable for
processing numeric data.
GraphML
Select the GraphML format for all data I/O
in stages 3-5. This storage plugin can be selected via
--storage=storage.graphml.
Since this plugin produces nx.Graph objects, it is not suitable for
processing numeric data. E.g., running the Statistics Generation plugin
with this plugin selected will cause an error.
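To see why graph data does not fit numeric processing, it helps to look at what a GraphML document contains: nodes and edges rather than rows and columns of numbers. The sketch below is an illustration only. It parses a tiny hand-written GraphML sample with the standard library's `xml.etree.ElementTree` instead of producing nx.Graph objects as the plugin does, and the sample data and the `summarize_graphml()` name are hypothetical.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the GraphML format.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Hypothetical sample: three nodes connected in a line.
SAMPLE = """\
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="undirected">
    <node id="a"/>
    <node id="b"/>
    <node id="c"/>
    <edge source="a" target="b"/>
    <edge source="b" target="c"/>
  </graph>
</graphml>
"""


def summarize_graphml(text: str) -> dict:
    """List the node ids and (source, target) edge pairs in a GraphML string."""
    root = ET.fromstring(text)
    nodes = [node.get("id") for node in root.findall(".//g:node", NS)]
    edges = [
        (edge.get("source"), edge.get("target"))
        for edge in root.findall(".//g:edge", NS)
    ]
    return {"nodes": nodes, "edges": edges}


summary = summarize_graphml(SAMPLE)
```

The result is a set of identifiers and connections, with no numeric columns over which a statistic such as a mean could be computed.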