Abstract:
A large and continuously growing volume of textual geoscience data is stored but not fully utilized. Text mining enables the discovery and analysis of valuable information and reveals insights hidden in geological texts. This research uses text mining and visualization techniques to extract content words for the visual analysis of geological reports. The proposed framework enables researchers to quickly grasp key information and improves the transmission efficiency of geological reports. First, we implemented an improved keyword extraction algorithm that combines term frequency-inverse document frequency (TF-IDF) with word length to improve the accuracy of geological keyword extraction. Second, using word-level information analysis and multidimensional scaling analysis, we extracted and visualized the relative importance of, and the links between, content words that represent the key information of geoscience reports. Finally, keyword relevance and mutual clustering relations were visualized as graphs to provide an intuitive representation of the current state of the reports.
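
As a rough illustration of the first step, the sketch below weights a standard smoothed TF-IDF score by term length. The abstract does not state how the two signals are combined, so the multiplicative factor `1 + length_weight * len(term)`, the parameter `length_weight`, and the function names `tfidf_length_scores` and `top_keywords` are all assumptions made for this example, not the method used in the study.

```python
import math
from collections import Counter


def tfidf_length_scores(docs, length_weight=0.5):
    """Score every term of every tokenized document.

    score = tf * idf * (1 + length_weight * len(term))
    The length factor is an assumed form, not taken from the paper.
    """
    n_docs = len(docs)
    # Document frequency: how many documents contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        doc_scores = {}
        for term, count in counts.items():
            tf = count / total
            # Smoothed IDF, so terms found in every document keep a nonzero weight.
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1
            doc_scores[term] = tf * idf * (1 + length_weight * len(term))
        scores.append(doc_scores)
    return scores


def top_keywords(doc_scores, k=3):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    ranked = sorted(doc_scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]
```

Setting `length_weight=0` reduces the score to plain TF-IDF, which makes it easy to compare rankings with and without the length term. The intuition behind favoring longer terms is that domain-specific geological terms tend to be longer than general vocabulary, though how strongly to weight length would need to be tuned on real reports.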