
Enforce k-anonymity on a synthetic dataset (output guarantee)
Source: R/enforce-kanon.R
enforce_kanon.Rd
Shapes the synthetic output so that no quasi-identifier combination appears
in fewer than k records. Direct identifiers are removed. Quasi-identifiers
are coarsened step-by-step and any residual cell still below k has its QI
values blanked (NA). Operates on the output only.
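The coarsen-then-blank idea described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the toy data, the helper cell_sizes(), the choice of quasi-identifiers, and the 10-year age banding are all assumptions made for illustration.

```r
# Toy data; the column names and the choice of QIs are illustrative only.
toy <- data.frame(
  age    = c(34, 34, 35, 35, 35, 71),
  region = c("N", "N", "N", "N", "N", "S"),
  income = c(41, 52, 38, 47, 45, 90),
  stringsAsFactors = FALSE
)
qi <- c("age", "region")  # assumed quasi-identifiers
k  <- 3

# Hypothetical helper: number of rows sharing each row's QI combination.
cell_sizes <- function(df, qi) {
  key <- do.call(paste, c(unname(as.list(df[qi])), sep = "|"))
  as.integer(table(key)[key])
}

sizes_before <- cell_sizes(toy, qi)

# One coarsening step (assumed rule): collapse age into 10-year bands.
coarse <- toy
coarse$age <- floor(coarse$age / 10) * 10
sizes_after <- cell_sizes(coarse, qi)

# Residual cells still below k: blank the QI values only, keep other columns.
out <- coarse
out[sizes_after < k, qi] <- NA
```

In this toy case coarsening merges the two age groups in region "N" into one cell of five rows, so only the single record in region "S" ends up with blanked quasi-identifiers, while its income value is kept.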
Arguments
- synthetic
A synthetic data frame.
- roles
A roles object/data frame with variable + disclosure_role.
- k
Minimum cell size (default 5).
- max_steps
Maximum coarsening iterations (default 6).
- max_suppress_frac
Feasibility backstop. If satisfying k over the quasi-identifier set would require blanking more than this fraction of rows, k-anonymity is treated as infeasible for the chosen QI set. In that case the coarsening/suppression is not applied (it would destroy the dataset), the synthetic output is returned populated, and a warning advises narrowing the quasi-identifiers or lowering k. Default 0.2.
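The max_suppress_frac backstop can be pictured with the base R sketch below. It only mirrors the rule as worded here (blank undersized cells unless that would affect more than the allowed fraction of rows); blank_or_skip() and cell_sizes() are hypothetical helpers, the toy data are made up, and the real function may apply the check at a different stage.

```r
# Hypothetical helper: number of rows sharing each row's QI combination.
cell_sizes <- function(df, qi) {
  key <- do.call(paste, c(unname(as.list(df[qi])), sep = "|"))
  as.integer(table(key)[key])
}

# Hypothetical helper: blank undersized cells, or skip if too many rows
# would be affected (the feasibility backstop).
blank_or_skip <- function(df, qi, k = 5, max_suppress_frac = 0.2) {
  below <- cell_sizes(df, qi) < k
  frac  <- mean(below)  # fraction of rows that would need blanking
  if (frac > max_suppress_frac) {
    warning(sprintf(
      "k = %s would blank %.0f%% of rows; narrow the quasi-identifiers or lower k.",
      format(k), 100 * frac
    ))
    return(df)  # returned populated, nothing blanked
  }
  df[below, qi] <- NA
  df
}

# Toy data; column names are illustrative only.
toy <- data.frame(
  age    = c(34, 34, 35, 35, 35, 71),
  region = c("N", "N", "N", "N", "N", "S"),
  stringsAsFactors = FALSE
)
qi <- c("age", "region")

# k = 2: only 1 of 6 rows is in an undersized cell, so it is blanked.
ok <- blank_or_skip(toy, qi, k = 2)

# k = 3: 3 of 6 rows would be blanked, which exceeds 0.2, so nothing is changed.
# The warning is silenced here only to keep the demo quiet.
skipped <- suppressWarnings(blank_or_skip(toy, qi, k = 3))
```

With the default of 0.2, the second call leaves the data untouched, which matches the documented behaviour of returning a populated output together with a warning rather than a mostly blanked one.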