On One Approach to Building an Information System for Processing Text Information Based on Semantic Groups


  • S. V. Mochenov
  • R. R. Akhmetgaleev




information system, text processing, semantic groups, text reduction, semantic component, selection of useful information


The paper considers an approach to text analysis based on building and using databases of parts of speech and other sentence constituents. The databases for Russian texts are compiled from expert assessments obtained by analyzing text corpora containing sentences of varying complexity. The work is motivated by the problem of automating the search for, and extraction of, the information a user needs to solve specific tasks. During analysis, various index arrays are formed. The system selects combinations of words in a sentence, compares them with the valid combinations stored in the database (thereby forming semantic groups, SGs), structures the sentence, and builds a hierarchical system of semantic groups. Examples show the detailed output of the software package. The same set of functional modules is used to analyze the main parts of the sentence (themes and rhemes). The results demonstrate that an information system for analyzing textual information can, in principle, be built on the outlined approach. When selecting SGs, the software package analyzes word combinations rather than individual prepositions, conjunctions, and other auxiliary elements of sentences. Dividing the text into SGs with the help of expert databases preserves the semantic component of the text more fully. In the future, the software system is intended to be extended to extract information useful to the user, reduce its volume, and shorten the time spent on search.
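The step of comparing word combinations with the valid combinations in the database can be illustrated with a minimal sketch. It is not the authors' software package: the tag names, the contents of VALID_COMBINATIONS, the function split_into_groups, and the greedy longest-match strategy are all assumptions made for illustration, and English words stand in for Russian ones.

```python
from typing import List, Tuple

# Hypothetical stand-in for the expert database: each entry is a sequence of
# part-of-speech tags that is allowed to form one semantic group.
VALID_COMBINATIONS = {
    ("NOUN",),
    ("VERB",),
    ("ADJ", "NOUN"),
    ("PREP", "NOUN"),
    ("PREP", "ADJ", "NOUN"),
}

# Longest combination in the database; limits the size of the search window.
MAX_LEN = max(len(combo) for combo in VALID_COMBINATIONS)


def split_into_groups(tagged: List[Tuple[str, str]]) -> List[List[str]]:
    """Greedily split a tagged sentence into groups of adjacent words.

    At each position the longest window whose tag sequence is found in
    VALID_COMBINATIONS is taken as one group. A word that starts no valid
    combination is kept as a single-word group (an assumption of this sketch).
    """
    groups: List[List[str]] = []
    i = 0
    while i < len(tagged):
        for n in range(min(MAX_LEN, len(tagged) - i), 0, -1):
            window = tagged[i:i + n]
            tags = tuple(tag for _, tag in window)
            if tags in VALID_COMBINATIONS:
                groups.append([word for word, _ in window])
                i += n
                break
        else:
            # No valid combination starts here: keep the word on its own.
            groups.append([tagged[i][0]])
            i += 1
    return groups


sentence = [
    ("experts", "NOUN"), ("analyze", "VERB"),
    ("complex", "ADJ"), ("sentences", "NOUN"),
    ("in", "PREP"), ("large", "ADJ"), ("texts", "NOUN"),
]
print(split_into_groups(sentence))
```

In this sketch the preposition "in" is never treated on its own; it is absorbed into the group "in large texts", which mirrors the abstract's point that word combinations are analyzed rather than individual auxiliary words. Building the hierarchy of groups and the index arrays is outside the scope of the sketch.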


