add_equity_data

site.SiteProblem.add_equity_data(
    equity_data,
    equity_col,
    common_col,
    label,
    direction='higher_is_worse',
    continuous_measure=False,
    n_bins=10,
    reverse=False,
    verbose=True,
)

Add a dataframe containing equity data to your problem.

This method associates demand points with an equity metric (such as the Index of Multiple Deprivation). If a continuous measure is provided, it is automatically discretized into deciles (or into as many quantile bins as the data supports, if fewer) to facilitate categorical plotting and comparative equity analysis.
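The association between demand points and the equity metric can be pictured as a join on a shared ID column. The sketch below is only an illustration in plain pandas: the frames, the column names (`lsoa_code`, `demand`, `imd_decile`) and the choice of a left join are assumptions, not a description of what SiteProblem does internally.

```python
import pandas as pd

# Hypothetical demand points keyed by an area code (illustrative names only).
demand = pd.DataFrame({
    "lsoa_code": ["E01", "E02", "E03"],
    "demand": [120, 80, 45],
})

# Hypothetical equity data sharing the same ID column (the role of common_col).
equity = pd.DataFrame({
    "lsoa_code": ["E01", "E02", "E03"],
    "imd_decile": [2, 7, 10],
})

# Assumed here: a left join, so every demand point is kept even if it has
# no matching equity record (its equity value would then be NaN).
joined = demand.merge(
    equity[["lsoa_code", "imd_decile"]],
    on="lsoa_code",
    how="left",
)
print(joined)
```

In this picture, `lsoa_code` plays the role of common_col and `imd_decile` the role of equity_col.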

Parameters

Name Type Description Default
equity_data str, pandas.DataFrame, or geopandas.GeoDataFrame The input data containing the equity metrics. Can be a filepath or an already loaded dataframe object. required
equity_col str The name of the column in equity_data containing the equity values or categories to be used. required
common_col str The name of the ID column used to join this data to the primary demand/spatial data in the SiteProblem. required
label str A human-readable label for the equity metric (e.g., ‘IMD Decile’, ‘Age Group’). This is used internally for auto-generating plot titles and table headers. required
direction (higher_is_better, higher_is_worse) Indicates whether higher values of equity_col represent a more or less advantaged group. This is stored as metadata and applied at analysis time; it does not modify the stored data. "higher_is_better": higher values indicate a more favourable equity position (e.g. IMD decile 10 = least deprived under the standard DLUHC 1–10 scale). "higher_is_worse": higher values indicate greater disadvantage (e.g. a raw IMD score, where a higher score means more deprived, or a custom scale where 1 = least deprived). Note: IMD deciles as published by DLUHC run from 1 (most deprived) to 10 (least deprived), so for pre-binned IMD decile columns use direction="higher_is_better". For raw IMD scores (higher = more deprived) use direction="higher_is_worse". "higher_is_better"
continuous_measure bool If True, treats equity_col as continuous numerical data and uses quantile-based discretization to convert it into deciles (1-10). The raw continuous data is preserved in a new column named {equity_col}_raw. False
reverse bool Only used when continuous_measure=True. Controls the direction of bin labelling relative to the raw values. False (default): lower raw values receive lower bin numbers. True: lower raw values receive higher bin numbers (i.e. the labelling is inverted). This is purely a binning convenience: for instance, to convert a raw IMD score (where lower = less deprived) into a decile where 10 = least deprived, matching the published DLUHC convention. It is independent of direction, which governs downstream analysis rather than how bins are labelled. False
verbose bool If True, output additional warnings and messages. True
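To make the interaction between continuous_measure and reverse concrete, here is a minimal sketch in plain pandas. The helper `bin_scores` and the sample data are hypothetical, and the logic is an assumption about how such binning could work, not the library's actual implementation.

```python
import pandas as pd

def bin_scores(df, equity_col, n_bins=10, reverse=False):
    """Illustrative only: quantile-bin a continuous column into labels 1..k."""
    out = df.copy()
    # Keep the untouched continuous values alongside the binned ones.
    out[f"{equity_col}_raw"] = out[equity_col]
    # labels=False returns integer bin codes starting at 0.
    codes = pd.qcut(out[equity_col], q=n_bins, labels=False, duplicates="drop")
    k = int(codes.max()) + 1   # number of bins actually produced
    bins = codes + 1           # shift so the lowest bin is 1
    if reverse:
        bins = k + 1 - bins    # lowest raw values now get the highest label
    # Assumes no missing values; NaN would make the integer cast fail.
    out[equity_col] = bins.astype(int)
    return out

# Made-up raw scores where a higher score means more deprived.
scores = pd.DataFrame({"imd_score": [float(v) for v in range(1, 21)]})

plain = bin_scores(scores, "imd_score")
flipped = bin_scores(scores, "imd_score", reverse=True)
print(plain.head(3))
print(flipped.head(3))
```

In `plain`, the lowest score lands in bin 1, so higher bins mean more deprived. In `flipped`, the lowest score lands in bin 10, which mirrors the published decile convention where 10 is least deprived.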

Raises

Name Type Description
ValueError If continuous_measure is True but the data cannot be meaningfully binned due to too many identical values.

Notes

When continuous_measure is True, pandas.qcut is used with duplicates='drop'. If the data is highly skewed with duplicate values, this may result in fewer than 10 bins. The method handles this dynamically to ensure the resulting categories always start at 1.
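The effect of duplicates='drop' on skewed data can be seen directly with pandas.qcut. The values below are made up for illustration, and adding 1 to the bin codes is just one assumed way of producing categories that start at 1.

```python
import pandas as pd

# Heavily skewed, made-up values: more than half are zero.
values = pd.Series([0, 0, 0, 0, 0, 0, 1, 2, 3, 4], dtype=float)

# Ask for 10 quantile bins. Several quantile edges coincide at 0, so
# duplicates="drop" merges them and fewer bins come back.
codes = pd.qcut(values, q=10, labels=False, duplicates="drop")

bins = codes + 1          # integer codes start at 0; shift to start at 1
n_bins = bins.nunique()   # fewer than the 10 requested

print(n_bins)
print(bins.tolist())
```

Without duplicates='drop', the same call would raise an error because the bin edges are not unique.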
