Extracted information quality, a comparative study in high and low dimensions

Leandro Ariza-Jimenez, Luisa F. Villa, Nicolas Pinel, O. Lucia Quintero

Research output: Contribution to journal, Article, peer-review

2 Scopus citations

Abstract

Uncovering interesting groups in either multidimensional or network spaces has become an essential mechanism for data exploration and understanding. Decision making requires relevant information as well as high quality in the conclusions retrieved from it. We presented a comparative study of two compact representations drawn from the same set of data objects, obtained by clustering the high-dimensional space and by clustering low-dimensional Barnes-Hut t-distributed stochastic neighbour embeddings (BH-SNE). There is no consensus on how the problem should be addressed or how these representations/models should be analysed, because they rest on different notions. We introduced a measure to compare their results and their capability to provide insights into the information retrieved. We considered low-dimensional embeddings as a potentially revealing strategy to uncover dynamics possibly not uncovered in big-data spaces. We demonstrated that a non-guided approach can be as revealing as a user-guided approach for data exploration, and presented coherent results indicating good uncertainty-modelling capability in terms of fuzziness and densities.

Original language: English
Pages (from-to): 214-241
Number of pages: 28
Journal: International Journal of Business Intelligence and Data Mining
Volume: 19
Issue number: 2
State: Published - 2021

Keywords

  • BH-SNE embeddings
  • Cluster fuzziness
  • Consistency
  • Decision making
  • High-dimensional clustering
  • Reliable information
