An Analysis of Developing a Complete Machine-Learning-Based NLP Pipeline for the Uzbek Language
Keywords:
Uzbek language, natural language processing, machine learning, lemmatization, POS tagging, NLP pipeline
Abstract
This paper analyzes the problem of developing a complete pipeline (a chain of
processing stages) for the Uzbek language based on machine learning (ML) and
natural language processing (NLP). In the course of the analysis, components are
developed for tokenization, morphological analysis, part-of-speech (POS) tagging,
lemmatization, syntactic analysis, semantic modeling, and sentiment analysis.
Taking the agglutinative nature of Uzbek into account, a dedicated corpus and
models are created for this pipeline. Experimental results demonstrate the
effectiveness of this approach and confirm that it can be applied in translation,
automatic text analysis, and chatbot systems.
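To make the idea of chained stages concrete, the following is a minimal sketch of the first two steps (tokenization and a crude lemmatization) for Latin-script Uzbek. It is not the paper's method: the paper describes ML-based models, whereas this sketch substitutes a simple rule-based suffix stripper. The suffix lists, the function names (tokenize, analyze, run_pipeline), and the minimum stem length are all illustrative assumptions, and the rules will over- or under-strip many real words.

```python
import re

# Assumption: apostrophe-like characters seen in Latin-script Uzbek (o‘, g‘, ma’no).
APOSTROPHES = "'‘’ʻʼ"

# A word is a run of letters, optionally continued across apostrophes;
# numbers and single punctuation marks become separate tokens.
TOKEN_RE = re.compile(
    rf"[^\W\d_]+(?:[{APOSTROPHES}][^\W\d_]+)*|\d+|[^\w\s]"
)

# Hypothetical, deliberately tiny suffix inventories (longest first in each slot).
# Uzbek nouns attach suffixes in the order: stem + plural + possessive + case.
CASE = ("ning", "dan", "ni", "ga", "da")
POSSESSIVE = ("imiz", "ingiz", "im", "ing", "si", "i")
PLURAL = ("lar",)


def tokenize(text):
    """Split text into word, number, and punctuation tokens."""
    return TOKEN_RE.findall(text)


def strip_one(word, suffixes, min_stem=3):
    """Remove at most one suffix from the given slot, keeping a minimal stem."""
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)], suffix
    return word, None


def analyze(word):
    """Peel suffixes from the outside in: case, then possessive, then plural."""
    stem = word.lower()
    found = []
    for slot in (CASE, POSSESSIVE, PLURAL):
        stem, suffix = strip_one(stem, slot)
        if suffix:
            found.append(suffix)
    # Reverse so suffixes are listed in surface order (plural, possessive, case).
    return stem, list(reversed(found))


def run_pipeline(text):
    """Chain the stages: tokenize, then analyze every token that starts with a letter."""
    results = []
    for token in tokenize(text):
        if token[0].isalpha():
            lemma, suffixes = analyze(token)
            results.append({"token": token, "lemma": lemma, "suffixes": suffixes})
    return results


if __name__ == "__main__":
    for row in run_pipeline("O‘zbek tili, kitoblarimizdan!"):
        print(row)
```

Each stage takes the previous stage's output, so later components such as a POS tagger or sentiment classifier could be appended to run_pipeline in the same way. Peeling suffixes slot by slot mirrors the fixed suffix order of an agglutinative language, which is why a word like "kitoblarimizdan" decomposes into a stem plus plural, possessive, and case markers.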




