utils

General utilities

Functions

Name	Description
detect_age_anniversary	Detect people who crossed a specific age anniversary, which does not have to be a birthday.
digitize_ages_1yr	Returns the indices of the 1-year age bins in the range (0, tyd.max_age).
generate_age_bin_labels	Generates consistent age bin labels.
generate_unique_filename	Generate a unique filename, useful when running many simulations in a distributed manner.
get_age_distribution_un	Parse age distribution data from UN sources.
get_attr_vals	Get values of the attribute in attr_path
get_data_home	Return a path to the directory for default datasets.
get_dataset_names	Provides a list of available datasets in data_home
load_age_dist_un	Load UN population age distribution that can be used in place of Pakistan demographics
load_dataset	Load default dataset from typhoidsim data directory.
stratify_parameter_by_age	Returns a callable that, given an age, returns the value of a parameter assigned to the age bin that age falls into.
test_cpu_performance	Normalize performance across CPUs
to_df	Export results as a Pandas dataframe. Available in newer starsim versions (i.e., 2.0).

detect_age_anniversary

utils.detect_age_anniversary(sim, age_anniversary)

Detect people who crossed a specific age anniversary, which does not necessarily have to be a birthday.

Parameters

Name	Type	Description	Default
sim	starsim.Sim object	the current simulation object	required
age_anniversary	float	the age we wish to detect	required

Returns

Name	Type	Description
reached_anniv	Boolean array
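
The idea of "crossing" an age anniversary can be illustrated with a minimal sketch. This is not the library's implementation: `crossed_age` is a hypothetical helper, and it assumes that ages are stored in years and that every agent aged by exactly `dt` years during the last step.

```python
import numpy as np

def crossed_age(ages, dt, age_anniversary):
    """Hypothetical helper: flag agents who reached age_anniversary in the last step.

    Assumes `ages` are current ages in years and that each agent aged by
    `dt` years during the step (an assumption, not taken from the library).
    """
    ages = np.asarray(ages, dtype=float)
    previous_ages = ages - dt
    # An agent crossed the anniversary if they were below it before the step
    # and are at or above it now.
    return (previous_ages < age_anniversary) & (ages >= age_anniversary)

ages = np.array([4.0, 5.0, 5.25, 6.0])
reached_anniv = crossed_age(ages, dt=0.5, age_anniversary=5.0)
```

Because the comparison uses the anniversary as a threshold rather than testing for equality, it works for any time step and for non-integer ages such as 0.75.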

digitize_ages_1yr

utils.digitize_ages_1yr(ages)

This function returns the indices of the 1-year age bins in the range (0, tyd.max_age). The bin index is used as an integer representation of the agent’s age.
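
As a rough illustration of 1-year binning, the sketch below uses `numpy.digitize`. It is an assumption about how such binning could be done, not the library's code: `digitize_1yr` and `MAX_AGE` are hypothetical names, `MAX_AGE = 100` stands in for `tyd.max_age`, and placing ages beyond the range in the last bin is a choice made here for illustration.

```python
import numpy as np

MAX_AGE = 100  # hypothetical stand-in for tyd.max_age

def digitize_1yr(ages, max_age=MAX_AGE):
    """Hypothetical sketch: map ages in years to integer 1-year bin indices."""
    ages = np.asarray(ages, dtype=float)
    # Interior bin edges 1, 2, ..., max_age - 1. np.digitize returns 0 for
    # ages below the first edge and len(edges) for ages at or above the last,
    # so the resulting indices run from 0 to max_age - 1.
    bin_edges = np.arange(1, max_age)
    return np.digitize(ages, bin_edges)

age_bin_idx = digitize_1yr([0.5, 1.0, 34.9, 150.0])
```

With these edges, the bin index equals the agent's age in completed years, capped at `max_age - 1`.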

generate_age_bin_labels

utils.generate_age_bin_labels(age_bins, inclusive_range=False)

Generates consistent age bin labels.

This function creates age bin labels in the format “start-end”. By default, each label denotes the semi-open age interval [start, end). If inclusive_range is True, each label denotes the closed interval [start, end-1] instead.

Parameters

Name	Type	Description	Default
age_bins	list	A list of age bins given as single age values.	required
inclusive_range	bool	whether the age bin labels should represent an inclusive age range [age_low, age_high] or a semi-open interval [age_low, age_high)	`False`

Returns

Name	Type	Description
list		A list of strings representing each age bin in the format “start-end”.

Example

>>> generate_age_bin_labels([10, 20, 30, 40])
["10-20", "20-30", "30-40"]
>>> generate_age_bin_labels([10, 20, 30, 40], inclusive_range=True)
["10-19", "20-29", "30-39"]
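
The labelling rule described above can be sketched in a few lines. This is a hypothetical re-implementation for illustration only (`make_age_bin_labels` is not part of the library), and it assumes integer bin edges so that `end - 1` is meaningful.

```python
def make_age_bin_labels(age_bins, inclusive_range=False):
    """Hypothetical sketch of "start-end" label generation for integer age bins."""
    labels = []
    # Pair each bin edge with the next one: (10, 20), (20, 30), ...
    for start, end in zip(age_bins[:-1], age_bins[1:]):
        # Inclusive labels show the last age inside the bin, i.e. end - 1.
        upper = end - 1 if inclusive_range else end
        labels.append(f"{start}-{upper}")
    return labels

half_open = make_age_bin_labels([10, 20, 30, 40])
inclusive = make_age_bin_labels([10, 20, 30, 40], inclusive_range=True)
```

Note that N bin edges yield N - 1 labels, since each label spans two consecutive edges.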

generate_unique_filename

utils.generate_unique_filename(root_str='typhoidsim')

Generate a unique filename, useful when running many simulations in a distributed manner.

Parameters

Name	Type	Description	Default
root_str	str	the string that will be used at the beginning of any filename generated by this function.	`'typhoidsim'`

Returns

Name	Type	Description
filename	str	a unique filename string without file extension.
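
One common way to build such names is to combine a timestamp with a random token. The sketch below shows that approach; it is an assumption, not necessarily the scheme the library uses, and `make_unique_filename` is a hypothetical name.

```python
import datetime
import uuid

def make_unique_filename(root_str="typhoidsim"):
    """Hypothetical sketch: root string + timestamp + random token, no extension."""
    timestamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    # A short random token keeps names distinct even when several workers
    # start within the same second.
    token = uuid.uuid4().hex[:8]
    return f"{root_str}_{timestamp}_{token}"

first = make_unique_filename()
second = make_unique_filename()
```

The caller appends the extension, for example `first + ".csv"`, which matches the documented return value of a filename string without a file extension.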

get_age_distribution_un

utils.get_age_distribution_un(loc_type='Low-and-middle-income countries')

Parse age distribution data from UN sources.

Parameters

Name	Type	Description	Default
loc_type	str	which country grouping to extract the distribution from. Options include 'Low-and-Lower-middle-income countries' and 'Low-and-middle-income countries'.	`'Low-and-middle-income countries'`

Returns

Name	Type	Description
df	pd.DataFrame	A dataframe (with columns age and value) that specifies the age distribution for this population.

get_attr_vals

utils.get_attr_vals(sim, attr_path, attr_name)

Get values of the attribute in attr_path

get_data_home

utils.get_data_home(data_home=None)

Return a path to the directory for default datasets. This function is needed by load_dataset(), and avoids the problem of using relative paths.

If the data_home argument is not provided, it will use a directory specified by the TYPHOIDSIM_DATA environment variable (if it exists) or otherwise it will use the default data directory.
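
The lookup order described above (explicit argument, then the TYPHOIDSIM_DATA environment variable, then a default directory) can be sketched as follows. `resolve_data_home` and `DEFAULT_DATA_DIR` are hypothetical; in particular, the real default location is not stated here, so the path below is a placeholder.

```python
import os
from pathlib import Path

# Hypothetical default location; the library's actual default is not documented here.
DEFAULT_DATA_DIR = Path.home() / "typhoidsim_data"

def resolve_data_home(data_home=None):
    """Hypothetical sketch of the data_home lookup order."""
    if data_home is None:
        # Fall back to the environment variable, then to the default directory.
        data_home = os.environ.get("TYPHOIDSIM_DATA", DEFAULT_DATA_DIR)
    # Returning an absolute path avoids surprises from relative paths.
    return Path(data_home).expanduser().resolve()
```

Resolving to an absolute path means the result does not depend on the directory the script was launched from.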

get_dataset_names

utils.get_dataset_names(data_home=None)

Provides a list of available datasets in data_home

load_age_dist_un

utils.load_age_dist_un(csv_file='un_pop_dist_bylocation.csv')

Load UN population age distribution that can be used in place of Pakistan demographics

load_dataset

utils.load_dataset(ds_name, data_home=None, **kwargs)

Load default dataset from typhoidsim data directory.

This function provides access to a small collection of datasets that are useful for setting empirical distributions (e.g., demographics, or gallstone probabilities by age and gender), rather than hardcoding those values.

The datasets are expected to be simple tabular data saved in csv files. This function may apply a small amount of preprocessing, but it is not intended to be a full ingest and preprocessing pipeline. The csv files are expected to be in an already ‘ingestable’ form and are simply loaded with pandas.read_csv().

Use get_dataset_names to see a list of available datasets.

Parameters

Name	Type	Description	Default
ds_name	str	name of the dataset (`{name}.csv`).	required
data_home	str / `Path`	the directory in which to cache data.	`None`
kwargs		additional keyword arguments passed through to `pandas.read_csv`.	`{}`

Returns

The type of the returned data depends on the dataset.

Name	Type	Description
df	`pandas.DataFrame`	tabular data, with some preprocessing applied (depends on the dataset)
arr	`numpy.ndarray`	array data, returned instead of df for some datasets

stratify_parameter_by_age

utils.stratify_parameter_by_age(age_bin_edges, par_bin_values)

Returns a callable that, given an age, returns the value of a parameter assigned to the age bin the given age falls into.

Parameters

Name	Type	Description	Default
age_bin_edges	np.ndarray	The edges of the age bins. Should be in ascending order.	required
par_bin_values	np.ndarray	The parameter values assigned to each age bin. Should have length len(age_bin_edges) - 1.	required

Returns

Name	Type	Description
age_bin_function	callable	A function that takes an age and returns the parameter value for the bin that the age falls into.

Example

age_bin_edges = np.array([0, 2, 5, 120])
par_bin_values = np.array([904.4, 240.9, 0.0])
age_stratified_parameter = stratify_parameter_by_age(age_bin_edges, par_bin_values)
age_stratified_parameter(25)  # should return 0.0
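
To show how an age is mapped to its bin, here is a minimal sketch of the same idea. It is not the library's implementation: `stratify_by_age` is a hypothetical name, it uses the standard library's `bisect` rather than NumPy, and clamping out-of-range ages to the nearest bin is an assumption made for illustration.

```python
import bisect

def stratify_by_age(age_bin_edges, par_bin_values):
    """Hypothetical sketch: build a function mapping an age to its bin's parameter value."""
    edges = list(age_bin_edges)
    values = list(par_bin_values)
    if len(values) != len(edges) - 1:
        raise ValueError("par_bin_values must have one entry per age bin")

    def age_bin_function(age):
        # bisect_right counts the edges that are <= age, so subtracting 1
        # gives the index i of the bin [edges[i], edges[i + 1]).
        idx = bisect.bisect_right(edges, age) - 1
        # Assumption: ages outside the outer edges are clamped to the nearest bin.
        idx = min(max(idx, 0), len(values) - 1)
        return values[idx]

    return age_bin_function

age_stratified = stratify_by_age([0, 2, 5, 120], [904.4, 240.9, 0.0])
```

Here bins are half-open on the right, so an age of exactly 2 falls in the second bin and `age_stratified(2)` gives 240.9.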

test_cpu_performance

utils.test_cpu_performance()

Normalize performance across CPUs

to_df

utils.to_df(sim, sep='_')

Export results as a Pandas dataframe. Available in newer starsim versions (i.e., 2.0). Only 1D result arrays are saved; results from analyzers are discarded unless they are 1D arrays.