API Reference
CampaignManager
- class sem.CampaignManager(campaign_db, campaign_runner, check_repo=True)
This Simulation Execution Manager class can be used as an interface to execute simulations and access the results of simulation campaigns.
The CampaignManager class wraps up a DatabaseManager and a SimulationRunner, which are used internally but can also be accessed as public member variables.
- check_repo_ok()
Make sure that the ns-3 repository’s HEAD commit is the same as the one saved in the campaign database, and that the ns-3 repository is clean (i.e., no untracked or modified files exist).
- create_runner(script, runner_type='Auto', optimized=True, skip_configuration=False, max_parallel_processes=None)
Create a SimulationRunner from a string containing the desired class implementation, and return it.
- Parameters:
ns_path (str) – path to the ns-3 installation to employ in this SimulationRunner.
script (str) – ns-3 script that will be executed to run simulations.
runner_type (str) – implementation of the SimulationRunner to use. Value can be: SimulationRunner (for running sequential simulations locally), ParallelRunner (for running parallel simulations locally), GridRunner (for running simulations using a DRMAA-compatible parallel task scheduler). If Auto, automatically pick the best available runner (GridRunner if DRMAA is available, ParallelRunner otherwise).
optimized (bool) – whether to configure the runner to employ an optimized ns-3 build.
skip_configuration (bool) – whether to skip the configuration step, and only perform compilation.
- files_in_dictionary()
Parsing function that returns a dictionary containing one entry for each file. Typically used to perform parsing externally.
- get_missing_simulations(param_list, runs=None, with_time_estimate=False)
Return a list of the simulations among the required ones that are not available in the database.
- Parameters:
param_list (list) – a list of dictionaries containing all the parameter combinations.
runs (int) – an integer representing how many repetitions are wanted for each parameter combination, None if the dictionaries in param_list already feature the desired RngRun value.
with_time_estimate (bool) – a boolean representing …
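The bookkeeping behind this check can be pictured with the following minimal sketch. It is not SEM's implementation: `count_missing_runs` is a hypothetical helper that only counts how many repetitions are still needed for each combination, whereas the actual method returns the missing simulations themselves.

```python
def count_missing_runs(param_list, existing_params, runs):
    """Return (combination, missing_count) pairs for under-sampled combinations.

    Hypothetical helper: existing_params stands in for the 'params'
    dictionaries of the results already stored in the database.
    """
    missing = []
    for combination in param_list:
        # A stored result counts if it agrees on every requested parameter;
        # its RngRun value is ignored because it is not part of the request.
        available = sum(
            1 for stored in existing_params
            if all(stored.get(name) == value
                   for name, value in combination.items())
        )
        if available < runs:
            missing.append((combination, runs - available))
    return missing


existing = [
    {'a': 1, 'b': 3, 'RngRun': 0},
    {'a': 1, 'b': 3, 'RngRun': 1},
    {'a': 2, 'b': 3, 'RngRun': 0},
]
wanted = [{'a': 1, 'b': 3}, {'a': 2, 'b': 3}, {'a': 2, 'b': 4}]
still_needed = count_missing_runs(wanted, existing, runs=2)
```

In this example the first combination is already covered, the second needs one more run and the third needs two.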
- get_results_as_dataframe(result_parsing_function, columns=None, params=None, runs=None, param_columns='all', drop_constant_columns=False, parallel_parsing=False, verbose=False)
Return a Pandas DataFrame containing results parsed using a user-specified function.
If function_yields_multiple_results is False, result_parsing_function is expected to return a list of outputs for each parsed result, and columns should contain an equal number of labels describing the contents of the output list.
If function_yields_multiple_results is True, instead, result_parsing_function is expected to return multiple lists of outputs, as described by the labels in columns, for each result. In this case, each result in the database will yield a number of rows in the output dataframe that is equal to the length of the result_parsing_function output computed on that result.
- Parameters:
result_parsing_function (function) – user-defined function, taking a result dictionary as input and returning a list of outputs or a list of lists of outputs.
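As an illustration of a parsing function that returns a single list of outputs per result, consider the sketch below. The result layout follows the one documented for get_complete_results; the stdout format (two whitespace-separated numbers), the function name and the column labels are assumptions made only for this example.

```python
def parse_throughput_and_delay(result):
    """Return [throughput, delay] parsed from the simulation's stdout."""
    # Hypothetical stdout format: two whitespace-separated numbers.
    fields = result['output']['stdout'].split()
    return [float(fields[0]), float(fields[1])]


# One label for each value returned by the parsing function.
columns = ['throughput', 'delay']

# A hand-written result dictionary, standing in for a database entry.
example_result = {
    'params': {'nNodes': 4, 'RngRun': 0},
    'meta': {'elapsed_time': 1.2, 'id': 'hypothetical-id'},
    'output': {'stdout': '10.5 0.02\n', 'stderr': ''},
}

parsed = parse_throughput_and_delay(example_result)
```

Each parsed list would then populate one row, with the labels in columns naming the output values.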
- get_results_as_numpy_array(parameter_space, result_parsing_function, runs=None, extract_complete_results=True)
Return the results relative to the desired parameter space in the form of a numpy array.
- Parameters:
parameter_space (dict) – dictionary containing parameter/list-of-values pairs.
result_parsing_function (function) – user-defined function, taking a result dictionary as argument, that can be used to parse the result files and return a list of values.
runs (int) – number of runs to gather for each parameter combination.
- get_results_as_xarray(parameter_space, result_parsing_function, output_labels, runs=None, extract_complete_results=True)
Return the results relative to the desired parameter space in the form of an xarray data structure.
- Parameters:
parameter_space (dict) – The space of parameters to export.
result_parsing_function (function) – user-defined function, taking a result dictionary as argument, that can be used to parse the result files and return a list of values.
output_labels (list) – a list of labels to apply to the results dimensions, output by the result_parsing_function.
runs (int) – the number of runs to export for each parameter combination.
- get_space(current_result_list, current_query, param_space, result_parsing_function, runs=None, extract_complete_results=True)
Convert a parameter space specification to a nested array structure representing the space. In other words, if the parameter space is:
param_space = { 'a': [1, 2], 'b': [3, 4] }
the function will return a structure like the following:
[ [ {'a': 1, 'b': 3}, {'a': 1, 'b': 4} ], [ {'a': 2, 'b': 3}, {'a': 2, 'b': 4} ] ]
where the first dimension represents a, and the second dimension represents b. This nested-array structure can then be easily converted to a numpy array via np.array().
- Parameters:
current_query (dict) – the query to apply to the structure.
param_space (dict) – representation of the parameter space.
result_parsing_function (function) – user-defined function to call on results, typically used to parse data and output metrics.
runs (int) – the number of runs to query for each parameter combination.
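The nesting described above can be reproduced with a short recursive sketch. It only illustrates how the dimensions are laid out: `build_space` is a hypothetical helper, and unlike get_space it neither queries the database nor applies result_parsing_function at the leaves.

```python
def build_space(param_space, fixed=None):
    """Recursively expand a parameter space into nested lists.

    Each recursion level fixes one parameter, so the nesting depth of the
    returned structure equals the number of parameters.
    """
    fixed = dict(fixed or {})
    remaining = [name for name in param_space if name not in fixed]
    if not remaining:
        # Every parameter has a value: this is a leaf of the structure.
        return fixed
    name = remaining[0]
    return [build_space(param_space, {**fixed, name: value})
            for value in param_space[name]]


space = build_space({'a': [1, 2], 'b': [3, 4]})
```

Here `space[0][1]` is the leaf for a = 1 and b = 4, matching the layout shown above.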
- classmethod load(campaign_dir, ns_path=None, runner_type='Auto', optimized=True, check_repo=True, skip_configuration=False, max_parallel_processes=None)
Load an existing simulation campaign.
Note that specifying an ns-3 installation is not compulsory when using this method: existing results will be available, but in order to run additional simulations it will be necessary to specify a SimulationRunner object, and assign it to the CampaignManager.
- Parameters:
campaign_dir (str) – path to the directory containing the simulation campaign database to load.
ns_path (str) – path to the ns-3 installation to employ in this campaign.
runner_type (str) – implementation of the SimulationRunner to use. Value can be: SimulationRunner (for running sequential simulations locally), ParallelRunner (for running parallel simulations locally), GridRunner (for running simulations using a DRMAA-compatible parallel task scheduler).
optimized (bool) – whether to configure the runner to employ an optimized ns-3 build.
skip_configuration (bool) – whether to skip the configuration step, and only perform compilation.
- classmethod new(ns_path, script, campaign_dir, runner_type='Auto', overwrite=False, optimized=True, check_repo=True, skip_configuration=False, max_parallel_processes=None)
Create a new campaign from an ns-3 installation and a campaign directory.
This method will create a DatabaseManager, which will install a database in the specified campaign_dir. If a database is already available at the ns_path described in the specified campaign_dir and its configuration matches config, this instance is used instead. If the overwrite argument is set to True instead, the specified directory is wiped and a new campaign is created in its place.
Furthermore, this method will initialize a SimulationRunner, of type specified by the runner_type parameter, which will be locked on the ns-3 installation at ns_path and set up to run the desired script.
Finally, note that creation of a campaign requires a git repository to be initialized at the specified ns_path. This will allow SEM to save the commit at which the simulations are run, enforce reproducibility and avoid mixing results coming from different versions of ns-3 and its libraries.
- Parameters:
ns_path (str) – path to the ns-3 installation to employ in this campaign.
script (str) – ns-3 script that will be executed to run simulations.
campaign_dir (str) – path to the directory in which to save the simulation campaign database.
runner_type (str) – implementation of the SimulationRunner to use. Value can be: SimulationRunner (for running sequential simulations locally), ParallelRunner (for running parallel simulations locally), GridRunner (for running simulations using a DRMAA-compatible parallel task scheduler). Use Auto to automatically pick the best runner.
overwrite (bool) – whether to overwrite already existing campaign_dir folders. This deletes the directory if and only if it only contains files that were detected to be created by sem.
optimized (bool) – whether to configure the runner to employ an optimized ns-3 build.
skip_configuration (bool) – whether to skip the configuration step, and only perform compilation. NOTE: if skip_configuration=True and optimized=True, the build folder should be manually set to --out=build/optimized.
- run_missing_simulations(param_list, runs=None, condition_checking_function=None, callbacks=[], stop_on_errors=True)
Run the simulations from the parameter list that are not yet available in the database.
This function also makes sure that we have at least runs replications for each parameter combination.
Additionally, param_list can either be a list containing the desired parameter combinations or a dictionary containing multiple values for each parameter, to be expanded into a list.
- Parameters:
param_list (list, dict) – either a list of parameter combinations or a dictionary to be expanded into a list through the list_param_combinations function.
runs (int) – the number of runs to perform for each parameter combination. This parameter is only allowed if the param_list specification doesn’t feature an ‘RngRun’ key already.
callbacks (list) – list of objects extending CallbackBase to be triggered during the run.
stop_on_errors (bool) – whether or not to stop the execution of the simulations if an error occurs.
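A typical workflow combining new, run_missing_simulations and get_results_as_dataframe might look like the sketch below. It uses only the signatures documented on this page; the script name, the parameter names and the assumption that the parsing function receives the output entry described for get_complete_results are all hypothetical. The function is only defined here, since actually running it requires SEM and an ns-3 git checkout.

```python
def run_example_campaign(ns_path, campaign_dir):
    """Create a campaign, run what is missing, and return a DataFrame."""
    # Imported lazily: this needs SEM and an ns-3 installation to work.
    import sem

    campaign = sem.CampaignManager.new(
        ns_path,
        'hypothetical-script',   # assumed ns-3 script name
        campaign_dir,
        runner_type='Auto',
        overwrite=False,
    )

    # Dictionary form: list values are expanded into all combinations,
    # while scalar values stay fixed (parameter names are assumptions).
    param_space = {'nNodes': [2, 4], 'distance': 10}
    campaign.run_missing_simulations(param_space, runs=3)

    def stdout_length(result):
        # Assumes the result carries the 'output' entry with the stdout.
        return [len(result['output']['stdout'])]

    return campaign.get_results_as_dataframe(stdout_length,
                                             columns=['stdout_length'])
```

Calling `run_example_campaign` a second time with the same arguments should not rerun anything, since run_missing_simulations only executes simulations that are not yet in the database.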
- run_simulations(param_list, show_progress=True, callbacks: list = [], stop_on_errors=True)
Run several simulations specified by a list of parameter combinations.
Note: this function does not verify whether we already have the required simulations in the database - it just runs all the parameter combinations that are specified in the list.
- Parameters:
param_list (list) – list of parameter combinations to execute. Items of this list are dictionaries, with one key for each parameter, and a value specifying the parameter value (which can be either a string or a number).
show_progress (bool) – whether or not to show a progress bar with percentage and expected remaining time.
callbacks (list) – list of objects extending CallbackBase to be triggered during the run.
stop_on_errors (bool) – whether or not to stop the execution of the simulations if an error occurs.
- save_to_mat_file(parameter_space, result_parsing_function, filename, runs)
Save the results relative to the desired parameter space to a .mat file.
- Parameters:
parameter_space (dict) – dictionary containing parameter/list-of-values pairs.
result_parsing_function (function) – user-defined function, taking a result dictionary as argument, that can be used to parse the result files and return a list of values.
filename (path) – name of output .mat file.
runs (int) – number of runs to gather for each parameter combination.
SimulationRunner
- class sem.SimulationRunner(path, script, optimized=True, skip_configuration=False, max_parallel_processes=None)
The class tasked with running simulations and interfacing with the ns-3 system.
- configure_and_build(show_progress=True, optimized=True, skip_configuration=False)
Configure and build the ns-3 code.
- Parameters:
show_progress (bool) – whether or not to display a progress bar during compilation.
optimized (bool) – whether to use an optimized build. If False, use a standard configure.
skip_configuration (bool) – whether to skip the configuration step, and only perform compilation.
- get_build_output(process, build_program)
Parse the output of the ns-3 build process to extract the information that is needed to draw the progress bar.
- Parameters:
process – the subprocess instance to listen to.
- run_simulations(parameter_list, data_folder, callbacks: [<class 'sem.utils.CallbackBase'>] = None, stop_on_errors=False)
Run several simulations using a certain combination of parameters.
Yield results as simulations are completed.
- Parameters:
parameter_list (list) – list of parameter combinations to simulate.
data_folder (str) – folder in which to save subfolders containing simulation output.
callbacks (list) – list of callbacks to be triggered during the run.
stop_on_errors (bool) – if True, the whole campaign is stopped when a simulation outputs an error.
DatabaseManager
- class sem.DatabaseManager(db, campaign_dir)
This serves as an interface with the simulation campaign database.
A database can either be created from scratch or loaded, via the new and load @classmethods.
- get_all_values_of_all_params()
Return a dictionary containing all values that are taken by all available parameters.
Always returns the parameter list in alphabetical order.
- get_complete_results(params=None, result_id=None, files_to_load='.*')
Return available results, analogously to what get_results does, but also read the corresponding output files for each result, and incorporate them in the result dictionary under the output key, as a dictionary of filename: file_contents.
- Parameters:
params (dict) – parameter specification of the desired parameter values, as described in the get_results documentation.
In other words, results returned by this function will be in the form:
{
  'params': { 'param1': value1, 'param2': value2, ... 'RngRun': value3 },
  'meta': { 'elapsed_time': value4, 'id': value5 },
  'output': { 'stdout': stdout_as_string, 'stderr': stderr_as_string, 'file1': file1_as_string, ... }
}
Note that the stdout and stderr entries are always included, even if they are empty.
- get_config()
Return the configuration dictionary of this DatabaseManager’s campaign.
This is a dictionary containing the following keys:
script: the name of the script that is executed in the campaign.
params: a list of the command line parameters that can be used on the script.
commit: the commit at which the campaign is operating.
- get_next_values(values_list)
Given a list of integers, this method yields the lowest integers that do not appear in the list.
>>> import sem
>>> v = [0, 1, 3, 4]
>>> sem.DatabaseManager.get_next_values(v)
[2, 5, 6, …]
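The behavior in the example above can be sketched with a plain generator. This is an illustration rather than SEM's code: `next_values` is a hypothetical name, and starting the search from 0 is an assumption based on the example.

```python
from itertools import count, islice


def next_values(values_list):
    """Yield, in increasing order, the integers missing from values_list."""
    taken = set(values_list)      # set membership keeps each check O(1)
    for candidate in count():     # 0, 1, 2, ... without an upper bound
        if candidate not in taken:
            yield candidate


# The generator is infinite, so take only as many values as needed.
first_three = list(islice(next_values([0, 1, 3, 4]), 3))
```

Because the sequence never ends, callers consume it lazily, for instance with `islice` or `next`.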
- get_result_files(result)
Return a dictionary containing filename: filepath values for each output file associated with an id.
Result can be either a result dictionary (e.g., obtained with the get_results() method) or a result id.
- get_results(params=None, result_id=None)
Return all the results available from the database that fulfill some parameter combinations.
If params is None (or not specified), return all results.
If params is specified, it must be a dictionary specifying the result values we are interested in, with multiple values specified as lists.
For example, if the following params value is used:
params = { 'param1': 'value1', 'param2': ['value2', 'value3'] }
the database will be queried for results having param1 equal to value1, and param2 equal to value2 or value3.
Not specifying a value for all the available parameters is allowed: unspecified parameters are assumed to be ‘free’, and can take any value.
- Returns:
A list of results matching the query. Returned results have the same structure as results inserted with the insert_result method.
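The matching rule described above can be summarized in a few lines. The sketch below is a plain-Python illustration of the query semantics, not the database code SEM actually runs; `matches` and the sample data are hypothetical.

```python
def matches(result_params, query):
    """Check whether a result's parameters satisfy a get_results-style query."""
    for name, wanted in query.items():
        # A scalar is treated as a one-element list of allowed values.
        allowed = wanted if isinstance(wanted, list) else [wanted]
        if name not in result_params or result_params[name] not in allowed:
            return False
    # Parameters that the query does not mention are free.
    return True


stored = [
    {'params': {'param1': 'value1', 'param2': 'value2', 'RngRun': 0}},
    {'params': {'param1': 'value1', 'param2': 'value4', 'RngRun': 1}},
    {'params': {'param1': 'other', 'param2': 'value3', 'RngRun': 2}},
]
params = {'param1': 'value1', 'param2': ['value2', 'value3']}
selected = [r for r in stored if matches(r['params'], params)]
```

Only the first stored result is selected: the second fails on param2 and the third on param1, while RngRun is unconstrained.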
- have_same_structure(d1, d2)
Given two dictionaries (possibly with other nested dictionaries as values), this function checks whether they have the same key structure.
>>> from sem import DatabaseManager
>>> d1 = {'a': 1, 'b': 2}
>>> d2 = {'a': [], 'b': 3}
>>> d3 = {'a': 4, 'c': 5}
>>> DatabaseManager.have_same_structure(d1, d2)
True
>>> DatabaseManager.have_same_structure(d1, d3)
False
>>> d4 = {'a': {'c': 1}, 'b': 2}
>>> d5 = {'a': {'c': 3}, 'b': 4}
>>> d6 = {'a': {'c': 5, 'd': 6}, 'b': 7}
>>> DatabaseManager.have_same_structure(d1, d4)
False
>>> DatabaseManager.have_same_structure(d4, d5)
True
>>> DatabaseManager.have_same_structure(d4, d6)
False
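One way to implement such a check is a recursive comparison of key sets, as in the sketch below. It reproduces the outcomes of the examples above but is not necessarily how SEM implements the method; `same_structure` is a hypothetical name.

```python
def same_structure(d1, d2):
    """Return True if two dictionaries share the same (nested) key layout."""
    if set(d1) != set(d2):
        return False
    for key in d1:
        v1, v2 = d1[key], d2[key]
        # A nested dictionary on one side only means different structures.
        if isinstance(v1, dict) != isinstance(v2, dict):
            return False
        # Recurse only when both values are dictionaries.
        if isinstance(v1, dict) and not same_structure(v1, v2):
            return False
    return True
```

Values that are not dictionaries are never compared, which is why `{'a': 1, 'b': 2}` and `{'a': [], 'b': 3}` count as structurally equal.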
- insert_result(result)
Insert a new result in the database.
This function also verifies that the result dictionaries saved in the database have the following structure (with {‘a’: 1} representing a dictionary, ‘a’ a key and 1 its value):
{
  'params': { 'param1': value1, 'param2': value2, ... 'RngRun': value3 },
  'meta': { 'elapsed_time': value4, 'id': value5 }
}
Here, elapsed_time is a float representing the number of seconds the simulation execution took, and id is a UUID that uniquely identifies the result and is used to locate the output files in the campaign_dir/data folder.
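For reference, a dictionary with this layout could be assembled as in the sketch below. SEM builds these entries internally when it runs simulations, so `make_result` is purely a hypothetical helper, and the use of a random (version 4) UUID is an assumption.

```python
import time
import uuid


def make_result(params, run_simulation):
    """Time a simulation callable and wrap the outcome in a result dictionary."""
    start = time.perf_counter()
    run_simulation(params)
    elapsed = time.perf_counter() - start   # seconds, as a float
    return {
        'params': dict(params),
        'meta': {
            'elapsed_time': elapsed,
            'id': str(uuid.uuid4()),        # assumed UUID flavor
        },
    }


# A no-op callable stands in for the actual ns-3 execution.
result = make_result({'param1': 1, 'RngRun': 0}, lambda p: None)
```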
- classmethod load(campaign_dir)
Initialize from an existing database.
It is assumed that the database json file has the same name as its containing folder.
- Parameters:
campaign_dir (str) – The path to the campaign directory.
- classmethod new(script, commit, params, campaign_dir, overwrite=False)
Initialize a new class instance with a set configuration and filename.
The created database has the same name as the campaign directory.
- Parameters:
script (str) – the name of the ns-3 script that will be used in this campaign.
commit (str) – the commit of the ns-3 installation that is used to run the simulations.
params (list) – a list of the parameters that can be used on the script.
campaign_dir (str) – The path of the file where to save the DB.
overwrite (bool) – Whether or not existing directories should be overwritten.
Utils
- class sem.utils.CallbackBase(verbose: int = 0)
Base class for SEM callbacks.
- Parameters:
verbose (int) – verbosity level: 0 for no output, 1 for info messages, 2 for debug messages.
- init_callback(controlled_by_parent) → None
Initialize the callback.
- is_controlled_by_parent() → bool
Whether this runner is aware of all simulations (false) or it has been triggered by a multithread runner and is thus aware of a subset of all runs only (true).
- on_run_end(sim_uuid: str, return_code: int, sim_time: int) → bool
This method is called when each simulation run finishes. It is meant to be overridden by the callback user.
- Returns:
If the callback returns False, a run has failed.
- on_run_start(configuration, sim_uuid) → None
- Parameters:
configuration (dict) – dictionary representing the combination of parameters simulated in this specific run.
sim_uuid (str) – unique identifier string for the simulation. This value is used to name the result folder, and it is referenced in the result JSON file.
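A custom callback could be written along the following lines. So that the sketch runs without SEM installed, it defines a stand-in `CallbackBase` that mirrors only the method names documented here; in real code the class would extend sem.utils.CallbackBase instead. Treating a non-zero return code as a failure is an assumption of this example.

```python
class CallbackBase:
    """Stand-in mirroring the documented interface; not SEM's implementation."""

    def __init__(self, verbose=0):
        self.verbose = verbose
        self.controlled_by_parent = False

    def init_callback(self, controlled_by_parent):
        self.controlled_by_parent = controlled_by_parent

    def is_controlled_by_parent(self):
        return self.controlled_by_parent

    def on_run_start(self, configuration, sim_uuid):
        pass

    def on_run_end(self, sim_uuid, return_code, sim_time):
        return True


class FailureCounter(CallbackBase):
    """Hypothetical callback that records started and failed runs."""

    def __init__(self, verbose=0):
        super().__init__(verbose)
        self.started = []
        self.failed = []

    def on_run_start(self, configuration, sim_uuid):
        self.started.append(sim_uuid)

    def on_run_end(self, sim_uuid, return_code, sim_time):
        # Assumption: a non-zero return code marks a failed run.
        if return_code != 0:
            self.failed.append(sim_uuid)
            return False
        return True


counter = FailureCounter(verbose=1)
counter.on_run_start({'param1': 1, 'RngRun': 0}, 'run-a')
ok = counter.on_run_end('run-a', 0, 3)
counter.on_run_start({'param1': 1, 'RngRun': 1}, 'run-b')
failed = counter.on_run_end('run-b', 1, 2)
```

An instance of such a class would be passed in the callbacks list accepted by run_simulations and run_missing_simulations.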
- sem.utils.automatic_parser(result, dtypes={}, converters={})
Try and automatically convert strings formatted as tables into nested list structures.
Under the hood, this function essentially applies the genfromtxt function to all files in the output, and passes it the additional kwargs.
- Parameters:
result (dict) – the result to parse.
dtypes (dict) – a dictionary containing the dtype specification to perform parsing for each available filename. See the numpy genfromtxt documentation for more details on how to format these.
- sem.utils.compute_sensitivity_analysis(campaign, result_parsing_function, ranges, salib_sample_function=<function sample>, salib_analyze_function=<function analyze>, samples=100)
Compute sensitivity analysis on a campaign using the passed SALib sample and analyze functions.
- sem.utils.constant_array_parser(result)
Dummy parser, used for testing purposes.
- sem.utils.get_bounds(ranges)
Format bounds for SALib, starting from a dictionary of ranges for each parameter. The values for the parameters contained in ranges can be one of the following:
1. A dictionary containing min and max keys, describing a range of possible values for the parameter.
2. A list of allowed values for the parameter.
- sem.utils.get_command_from_result(script, result, debug=False)
Return the command that is needed to obtain a certain result.
- Parameters:
params (dict) – Dictionary containing parameter: value pairs.
debug (bool) – Whether the command should include the debugging template.
- sem.utils.list_param_combinations(param_ranges)
Create a list of all parameter combinations from a dictionary specifying desired parameter values as lists.
Example
>>> param_ranges = {'a': [1], 'b': [2, 3]}
>>> list_param_combinations(param_ranges)
[{'a': 1, 'b': 2}, {'a': 1, 'b': 3}]
Additionally, this function is robust in case values are not lists:
>>> param_ranges = {'a': 1, 'b': [2, 3]}
>>> list_param_combinations(param_ranges)
[{'a': 1, 'b': 2}, {'a': 1, 'b': 3}]
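The expansion shown in these examples amounts to a Cartesian product over the per-parameter value lists. The sketch below reproduces it with itertools.product; it illustrates the idea and is not SEM's implementation (`param_combinations` is a hypothetical name, and the ordering of the output for larger inputs is an assumption).

```python
from itertools import product


def param_combinations(param_ranges):
    """Expand a dict of value lists into a list of parameter dictionaries."""
    names = list(param_ranges)
    # Wrap scalars in a one-element list so every parameter is iterable.
    value_lists = [value if isinstance(value, list) else [value]
                   for value in param_ranges.values()]
    return [dict(zip(names, combination))
            for combination in product(*value_lists)]


from_lists = param_combinations({'a': [1], 'b': [2, 3]})
from_scalar = param_combinations({'a': 1, 'b': [2, 3]})
```

Both calls produce the same two combinations, since a scalar behaves like a list with a single value.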
- sem.utils.salib_param_values_to_params(ranges, values)
Convert SALib’s parameter specification to a SEM-compatible parameter specification.
- sem.utils.stdout_automatic_parser(result)
Try and automatically convert strings formatted as tables into a matrix.
Under the hood, this function essentially applies the genfromtxt function to the stdout.
- Parameters:
result (dict) – the result to parse.