HuSpaCy: an industrial-strength Hungarian natural language processing toolkit

Bibliographic details
Authors:	Orosz György; Szántó Zsolt; Berkecz Péter; Szabó Gergő; Farkas Richárd
Corporate author:	Magyar számítógépes nyelvészeti konferencia (18.) (2022) (Szeged)
Document type:	Book part
Published:	2022
Series:	Magyar Számítógépes Nyelvészeti Konferencia 18
Keywords:	Linguistics - computer applications
Subjects:	Natural sciences; Computer and information sciences; Humanities; Languages and literature
Online Access:	http://acta.bibl.u-szeged.hu/75865

Description
Abstract:	Although there are a couple of open-source language processing pipelines available for Hungarian, none of them satisfies the requirements of today’s NLP applications. A language processing pipeline should consist of close to state-of-the-art lemmatization, morphosyntactic analysis, entity recognition and word embeddings. Industrial text processing applications have to satisfy non-functional software quality requirements; moreover, frameworks supporting multiple languages are increasingly favored. This paper introduces HuSpaCy, an industry-ready Hungarian language processing toolkit. The presented tool provides components for the most important basic linguistic analysis tasks. It is open-source and is available under a permissive license. Our system is built upon spaCy’s NLP components, resulting in an easily usable, fast yet accurate application. Experiments confirm that HuSpaCy has high accuracy while maintaining resource-efficient prediction capabilities.
Extent/Physical description:	59-73
ISBN:	978-963-306-848-9
