utils
General utilities
Functions
| Name | Description |
|---|---|
| detect_age_anniversary | Detect people who crossed a specific age_anniversary; it does not have to be a birthday. |
| digitize_ages_1yr | Return the indices of the 1-year age bins in the range (0, tyd.max_age). |
| generate_age_bin_labels | Generates consistent age bin labels. |
| generate_unique_filename | Generate a unique filename, useful when running many simulations in a distributed manner. |
| get_age_distribution_un | Parse age distribution data from UN sources. |
| get_attr_vals | Get values of the attribute in attr_path |
| get_data_home | Return a path to the directory for default datasets. |
| get_dataset_names | Provides a list of available datasets in data_home |
| load_age_dist_un | Load UN population age distribution that can be used in place of Pakistan demographics |
| load_dataset | Load default dataset from typhoidsim data directory. |
| stratify_parameter_by_age | Returns a callable that, given an age, returns the value of a parameter assigned to the age bin the given age falls into. |
| test_cpu_performance | Normalize performance across CPUs |
| to_df | Export results as a Pandas dataframe. Available in newer starsim versions (i.e., 2.0). |
detect_age_anniversary
utils.detect_age_anniversary(sim, age_anniversary)

Detect people who crossed a specific age_anniversary. The anniversary does not have to be a birthday.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| sim | starsim.Sim object | the current simulation object | required |
| age_anniversary | float | the age we wish to detect | required |
Returns
| Name | Type | Description |
|---|---|---|
| reached_anniv | Boolean array | flags the people who crossed age_anniversary |
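The crossing test can be illustrated with a small standalone sketch. It assumes that an agent "crosses" an anniversary when their age was below it at the start of the time step and is at or above it at the end. The helper name `crossed_age`, the plain-list inputs, and the explicit `dt` argument are hypothetical; the real function takes a `starsim.Sim` object and may define the crossing differently.

```python
def crossed_age(ages, dt, age_anniversary):
    """Flag agents who reached `age_anniversary` during the last time step.

    ages: current ages in years (after aging by dt)
    dt: length of the time step in years (assumed to be known)
    """
    # An agent crossed the anniversary if they were younger than it one
    # step ago and are at least that old now.
    return [(age - dt) < age_anniversary <= age for age in ages]


ages_now = [4.95, 5.0, 5.05, 5.2, 64.99]
flags = crossed_age(ages_now, dt=0.1, age_anniversary=5.0)
print(flags)  # [False, True, True, False, False]
```

Because the comparison uses the step length rather than integer ages, the same test works for non-integer anniversaries such as 0.75 years.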
digitize_ages_1yr
utils.digitize_ages_1yr(ages)

Return the indices of the 1-year age bins in the range (0, tyd.max_age). The bin index is used as an integer representation of the agent's age.
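As a rough illustration of 1-year binning, the sketch below maps each age to the index of the bin [i, i + 1). The name `digitize_1yr`, the value `MAX_AGE = 100`, the zero-based indexing, and the clipping of out-of-range ages are all assumptions made for this example, not the library's documented behavior.

```python
import math

MAX_AGE = 100  # hypothetical stand-in for tyd.max_age


def digitize_1yr(ages, max_age=MAX_AGE):
    """Map each age in years to the index of its 1-year bin.

    Bin i covers [i, i + 1). Ages outside [0, max_age) are clipped
    into the first or last bin (an assumption of this sketch).
    """
    return [min(max(math.floor(age), 0), max_age - 1) for age in ages]


bins = digitize_1yr([0.0, 0.99, 1.0, 37.5, 250.0])
print(bins)  # [0, 0, 1, 37, 99]
```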
generate_age_bin_labels
utils.generate_age_bin_labels(age_bins, inclusive_range=False)

Generates consistent age bin labels.

This function creates age bin labels in the format "start-end". By default each label denotes the semi-open age interval [start, end). If inclusive_range is True, each label denotes the inclusive range [start, end-1].
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| age_bins | list | A list of age bins given as single age values. | required |
| inclusive_range | bool | whether the age bin labels should represent an inclusive age range [age_low, age_high] or a semi-open interval [age_low, age_high) | False |
Returns
| Name | Type | Description |
|---|---|---|
| | list | A list of strings representing each age bin in the format "start-end". |
Example
    generate_age_bin_labels([10, 20, 30, 40])
    ["10-20", "20-30", "30-40"]
    generate_age_bin_labels([10, 20, 30, 40], inclusive_range=True)
    ["10-19", "20-29", "30-39"]
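The labelling rule described above can be sketched in a few lines. The helper `make_age_bin_labels` is a hypothetical reimplementation written only from this description, assuming integer bin edges; it is not the library's source.

```python
def make_age_bin_labels(age_bins, inclusive_range=False):
    """Build "start-end" labels from consecutive bin edges."""
    labels = []
    # Pair each edge with the next one: (10, 20), (20, 30), ...
    for start, end in zip(age_bins[:-1], age_bins[1:]):
        # For an inclusive range, show the last whole age inside the bin.
        upper = end - 1 if inclusive_range else end
        labels.append(f"{start}-{upper}")
    return labels


semi_open = make_age_bin_labels([10, 20, 30, 40])
inclusive = make_age_bin_labels([10, 20, 30, 40], inclusive_range=True)
print(semi_open)  # ['10-20', '20-30', '30-40']
print(inclusive)  # ['10-19', '20-29', '30-39']
```

Note that n bin edges always yield n - 1 labels.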
generate_unique_filename
utils.generate_unique_filename(root_str='typhoidsim')

Generate a unique filename, useful when running many simulations in a distributed manner.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| root_str | str | the string that will be used at the beginning of any filename generated by this function. | 'typhoidsim' |
Returns
| Name | Type | Description |
|---|---|---|
| filename | str | a unique filename string without file extension. |
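One common way to obtain collision-free names across distributed workers is to combine a timestamp with a random UUID. The sketch below follows that approach; the helper `make_unique_filename` and the exact name layout are assumptions, and the library may build its filenames differently.

```python
import uuid
from datetime import datetime, timezone


def make_unique_filename(root_str="typhoidsim"):
    """Return a filename stem of the form <root>_<UTC timestamp>_<random hex>."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S")
    token = uuid.uuid4().hex  # 32 random hex characters
    return f"{root_str}_{stamp}_{token}"


first = make_unique_filename()
second = make_unique_filename("calibration")
print(first)
print(second)
```

The caller appends the extension, for example `first + ".csv"`, since the returned string has none.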
get_age_distribution_un
utils.get_age_distribution_un(loc_type='Low-and-middle-income countries')

Parse age distribution data from UN sources.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| loc_type | str | Which country grouping to extract the distribution from. Options include 'Low-and-Lower-middle-income countries' and 'Low-and-middle-income countries'. | 'Low-and-middle-income countries' |
Returns
| Name | Type | Description |
|---|---|---|
| df | pd.DataFrame | A dataframe (with columns age and value) that specifies the age distribution for this population. |
get_attr_vals
utils.get_attr_vals(sim, attr_path, attr_name)

Get the values of the attribute in attr_path.
get_data_home
utils.get_data_home(data_home=None)

Return a path to the directory for default datasets. This function is needed by load_dataset(), and avoids the problem of using relative paths.
If the data_home argument is not provided, it will use a directory specified by the TYPHOIDSIM_DATA environment variable (if it exists) or otherwise it will use the default data directory.
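The lookup order described above (explicit argument, then the TYPHOIDSIM_DATA environment variable, then a default directory) can be sketched as follows. The helper `resolve_data_home`, its `env` parameter (added only so the example can be exercised without touching the real environment), and the default location are hypothetical.

```python
import os
from pathlib import Path

# Hypothetical default; the library's actual default directory is not stated here.
DEFAULT_DATA_DIR = Path.home() / "typhoidsim_data"


def resolve_data_home(data_home=None, env=None):
    """Resolve the data directory: argument > TYPHOIDSIM_DATA > default."""
    env = os.environ if env is None else env
    if data_home is None:
        # Fall back to the environment variable, then to the default.
        data_home = env.get("TYPHOIDSIM_DATA") or DEFAULT_DATA_DIR
    # Returning an absolute path avoids surprises with relative paths.
    return Path(data_home).expanduser().resolve()


print(resolve_data_home(env={"TYPHOIDSIM_DATA": "/tmp/tyd_env"}))
print(resolve_data_home("/tmp/explicit", env={"TYPHOIDSIM_DATA": "/tmp/tyd_env"}))
```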
get_dataset_names
utils.get_dataset_names(data_home=None)

Provides a list of available datasets in data_home.
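Since datasets are stored as {name}.csv files, listing them amounts to collecting the CSV file stems in the data directory. The sketch below shows that idea; `list_dataset_names` and the file names in the demo are hypothetical, and the library may filter or order the names differently.

```python
import tempfile
from pathlib import Path


def list_dataset_names(data_home):
    """Return the sorted names (file stems) of all CSV files in data_home."""
    return sorted(path.stem for path in Path(data_home).glob("*.csv"))


# Demo with a throwaway directory and made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for fname in ("b_dataset.csv", "a_dataset.csv", "notes.txt"):
        (Path(tmp) / fname).write_text("age,value\n0,1\n")
    names = list_dataset_names(tmp)

print(names)  # ['a_dataset', 'b_dataset']
```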
load_age_dist_un
utils.load_age_dist_un(csv_file='un_pop_dist_bylocation.csv')

Load the UN population age distribution, which can be used in place of Pakistan demographics.
load_dataset
utils.load_dataset(ds_name, data_home=None, **kwargs)

Load default dataset from typhoidsim data directory.

This function provides access to a small collection of datasets that are useful for setting empirical distributions (e.g., demographics, or gallstone probabilities by age and gender), rather than hardcoding them.

The small datasets are expected to be simple tabular data saved in csv files. This function may apply a small amount of preprocessing, but it is not intended to be a full ingest and preprocessing pipeline. The csv files are expected to be in an already 'ingestable' form and are simply loaded with pandas.read_csv().
Use get_dataset_names to see a list of available datasets.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| ds_name | str | name of the dataset ({name}.csv). | required |
| data_home | str / Path | the directory in which to cache data. | None |
| kwargs | | additional keyword arguments passed through to pandas.read_csv. | {} |
Returns

The type of the returned data depends on the dataset: either df or arr.

| Name | Type | Description |
|---|---|---|
| df | pandas.DataFrame | tabular data, with some preprocessing applied (depends on the dataset) |
| arr | numpy.ndarray | array data |
stratify_parameter_by_age
utils.stratify_parameter_by_age(age_bin_edges, par_bin_values)

Returns a callable that, given an age, returns the value of a parameter assigned to the age bin the given age falls into.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| age_bin_edges | np.ndarray | The edges of the age bins. Should be in ascending order. | required |
| par_bin_values | np.ndarray | The parameter values assigned to each age bin. Should have length len(age_bin_edges) - 1. | required |
Returns
| Name | Type | Description |
|---|---|---|
| age_bin_function | callable | A function that takes an age and returns the parameter value for the bin that the age falls into. |
Example

    import numpy as np
    age_bin_edges = np.array([0, 2, 5, 120])
    par_bin_values = np.array([904.4, 240.9, 0.0])
    age_stratified_parameter = stratify_parameter_by_age(age_bin_edges, par_bin_values)
    age_stratified_parameter(25)  # should return 0.0
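The lookup behind such a callable can be sketched with a binary search over the bin edges. This standalone version uses the standard-library `bisect` module instead of NumPy. The name `stratify_by_age`, the left-closed bins [edge_i, edge_i+1), and the clipping of out-of-range ages into the first or last bin are assumptions of this sketch, not documented behavior.

```python
from bisect import bisect_right


def stratify_by_age(age_bin_edges, par_bin_values):
    """Return a function mapping an age to the value of its age bin."""
    edges = list(age_bin_edges)
    values = list(par_bin_values)
    if len(values) != len(edges) - 1:
        raise ValueError("par_bin_values must have len(age_bin_edges) - 1 entries")

    def age_bin_function(age):
        # bisect_right counts the edges <= age, so subtracting 1 gives the
        # index of the bin whose left edge is at or below the age.
        idx = bisect_right(edges, age) - 1
        # Clip ages below the first edge or at/above the last edge.
        idx = min(max(idx, 0), len(values) - 1)
        return values[idx]

    return age_bin_function


rate_by_age = stratify_by_age([0, 2, 5, 120], [904.4, 240.9, 0.0])
print(rate_by_age(1))   # 904.4
print(rate_by_age(2))   # 240.9
print(rate_by_age(25))  # 0.0
```

Validating the lengths up front turns a mismatch between edges and values into an immediate error rather than a wrong lookup later in a simulation.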
test_cpu_performance
utils.test_cpu_performance()

Normalize performance across CPUs.
to_df
utils.to_df(sim, sep='_')

Export results as a Pandas dataframe. Available in newer starsim versions (i.e., 2.0). Only 1D result arrays are saved; results from analyzers are discarded unless they are 1D arrays.