This package provides methods for computing distances between rows of general
Tables.jl tables using the ecosystem
of scientific types available in DataScienceTraits.jl.
It follows the Distances.jl interface
as much as possible.
Rationale
A common task in statistics and machine learning consists of computing distances between observations
for different purposes (e.g. clustering, kernel methods). When the data is homogeneous, i.e. all the
attributes have the same scientific type, one can use packages such as Distances.jl
directly on the result of Tables.matrix(table). On the other hand, when the table is heterogeneous,
one must combine different distances for the various attributes using some weighting scheme.
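The heterogeneous case can be illustrated with a minimal sketch in plain Julia. It is not the implementation used by this package: the helper names `numdist`, `catdist` and `rowdistance`, the choice of per-attribute distances, and the equal weights are all assumptions made for illustration.

```julia
# Hypothetical per-attribute distances (illustrative only, not the package's choices).
numdist(x, y) = abs(x - y)            # for numeric values
catdist(x, y) = x == y ? 0.0 : 1.0    # for categorical labels

# Weighted sum of per-column distances between rows i and j of a column table.
function rowdistance(table::NamedTuple, dists, weights, i, j)
    total = 0.0
    for (col, d, w) in zip(values(table), dists, weights)
        total += w * d(col[i], col[j])
    end
    return total
end

# A small heterogeneous table: one numeric column and one categorical column.
table = (height = [1.0, 2.0, 4.0], label = ["A", "B", "A"])
dists = (numdist, catdist)   # one distance per column, in column order
weights = (0.5, 0.5)         # assumed equal weighting

# Pairwise distance matrix between all rows.
n = length(first(table))
D = [rowdistance(table, dists, weights, i, j) for i in 1:n, j in 1:n]
```

In this sketch the numeric column is used unscaled, so its range dominates the result. A real weighting scheme would usually normalize each attribute before combining.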
Installation
Get the latest stable release with Julia's package manager:
] add TableDistances
Usage
We follow the Distances.jl interface as much as possible:
julia> using TableDistances
julia> table = (a=1:3, b=rand(3), c=["A", "B", "C"], d=[1, 2, 4])
(a = 1:3, b = [0.7596581938450753, 0.6952806574889876, 0.6669145844749085], c = ["A", "B", "C"], d = [1, 2, 4])
julia> D = pairwise(TableDistance(), table)
3×3 Matrix{Float64}:
 0.0      1.09707   1.25
 1.09707  0.0       0.902927
 1.25     0.902927  0.0
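As a possible follow-up, the resulting matrix can be used like any other distance matrix. The sketch below finds the nearest other row for each row. It relies only on the `pairwise(TableDistance(), table)` call shown above plus Base Julia; the variable `nearest` is a name introduced here for illustration.

```julia
using TableDistances

table = (a=1:3, b=rand(3), c=["A", "B", "C"], d=[1, 2, 4])
D = pairwise(TableDistance(), table)

# For each row i, find the index of the closest other row.
# The diagonal (distance of a row to itself) is masked with Inf.
n = size(D, 1)
nearest = [argmin([i == j ? Inf : D[i, j] for j in 1:n]) for i in 1:n]
```

Because column `b` is random, the values in `D` and possibly the entries of `nearest` vary between runs.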
Contributing
Contributions are very welcome. Please open an issue if you have questions.