manydist - Distance-Based Learning for Mixed-Type Data
Provides tools for constructing, computing, and using
distance measures for numerical, categorical, and mixed-type
data. The package implements a flexible framework in which
continuous and categorical components can be combined under
additive, commensurable, and association-aware specifications.
Supported methods include classical distances such as Gower,
Euclidean, Manhattan, and Mahalanobis-type distances;
categorical dissimilarities such as simple matching,
occurrence-frequency, and association-based measures; and
mixed-type presets designed to reduce biases due to variable
type, scale, distribution, redundancy, and number of
categories. The package also provides scaling options,
supervised and unsupervised distance constructions,
leave-one-variable-out tools for distance-based variable
importance, and integration with distance-based learning
workflows such as nearest-neighbour prediction, partitioning
around medoids, and spectral clustering. Methods are motivated
by van de Velden, Iodice D'Enza, Markos, and Cavicchia (2026)
<doi:10.1080/10618600.2026.2680181> and related work on
categorical and mixed-type dissimilarities.