Extracting named entities from Russian-language documents with different expressiveness of structure
This work is devoted to solving the problem of recognizing named entities for Russian-language texts based on the CRF model. Two sets of data were considered: documents on refinancing with a good document structure, semi-structured texts of court records. The model was tested under various sets of t...
Saved in:
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: |
Yaroslavl State University
2023-12-01
|
Series: | Моделирование и анализ информационных систем |
Subjects: | |
Online Access: | https://www.mais-journal.ru/jour/article/view/1827 |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | This work is devoted to solving the problem of recognizing named entities for Russian-language texts based on the CRF model. Two sets of data were considered: documents on refinancing with a good document structure, semi-structured texts of court records. The model was tested under various sets of text features and CRF parameters (optimization algorithms). In average for all entities, the best F-measure value for structured documents was 0.99, and for semi-structured ones 0.86. |
---|---|
ISSN: | 1818-1015 2313-5417 |