TY - JOUR
T1 - Data should be made as simple as possible but not simpler
T2 - The method chosen for dimensionality reduction and its parameters can affect the clustering of runners based on their kinematics
AU - Rivadulla, Adrian R.
AU - Chen, Xi
AU - Cazzola, Dario
AU - Trewartha, Grant
AU - Preatoni, Ezio
N1 - Publisher Copyright: © 2024 The Authors
PY - 2024/12/1
Y1 - 2024/12/1
N2 - Dimensionality reduction is a critical step for the efficacy and efficiency of clustering analysis. Despite the multiple available methods, biomechanists have often defaulted to Principal Component Analysis (PCA). We evaluated two PCA- and one autoencoder-based dimensionality reduction methods for their data compression and reconstruction capability, assessed their effect on the output of clustering runners based on kinematics, and discussed their implications for the biomechanical assessment of running technique. Eighty-four participants completed a 4-minute run at 12 km/h while trunk and lower-limb kinematics were collected. Data reconstruction quality was assessed for Direct PCA (PCA directly on original variables) and Fourier PCA (modelling time series as Fourier series and then applying PCA) using popular variance explained criteria; and a feedforward autoencoder (AE). Agglomerative hierarchical clustering was then applied and the agreement between the resulting partitions was assessed. Meaningful errors in the reconstructed signals were found when applying popular variance explained criteria, suggesting reconstruction error should be assessed to make a more informed decision about how many components to retain for further analysis. Direct PCA, Fourier PCA and AE yielded different clusters, warranting caution when comparing outcomes from studies that use different dimensionality reduction techniques: each method may be sensitive to different data features. Direct PCA retaining 99 % of the original variance emerged as the best compromise of data compression, reconstruction quality and cluster separability in our dataset. We encourage biomechanists to experiment with diverse dimensionality reduction methods to optimise clustering outcomes and enhance the real-world applicability of their findings.
AB - Dimensionality reduction is a critical step for the efficacy and efficiency of clustering analysis. Despite the multiple available methods, biomechanists have often defaulted to Principal Component Analysis (PCA). We evaluated two PCA- and one autoencoder-based dimensionality reduction methods for their data compression and reconstruction capability, assessed their effect on the output of clustering runners based on kinematics, and discussed their implications for the biomechanical assessment of running technique. Eighty-four participants completed a 4-minute run at 12 km/h while trunk and lower-limb kinematics were collected. Data reconstruction quality was assessed for Direct PCA (PCA directly on original variables) and Fourier PCA (modelling time series as Fourier series and then applying PCA) using popular variance explained criteria; and a feedforward autoencoder (AE). Agglomerative hierarchical clustering was then applied and the agreement between the resulting partitions was assessed. Meaningful errors in the reconstructed signals were found when applying popular variance explained criteria, suggesting reconstruction error should be assessed to make a more informed decision about how many components to retain for further analysis. Direct PCA, Fourier PCA and AE yielded different clusters, warranting caution when comparing outcomes from studies that use different dimensionality reduction techniques: each method may be sensitive to different data features. Direct PCA retaining 99 % of the original variance emerged as the best compromise of data compression, reconstruction quality and cluster separability in our dataset. We encourage biomechanists to experiment with diverse dimensionality reduction methods to optimise clustering outcomes and enhance the real-world applicability of their findings.
UR - https://www.scopus.com/pages/publications/85209407300
U2 - 10.1016/j.jbiomech.2024.112433
DO - 10.1016/j.jbiomech.2024.112433
M3 - Article
C2 - 39571422
AN - SCOPUS:85209407300
SN - 0021-9290
VL - 177
JO - Journal of Biomechanics
JF - Journal of Biomechanics
M1 - 112433
ER -