Services: lang-uk

Lang-uk microservices

Lang-uk microservices let you launch and use the main tools developed by our team. Technically, they are built with Swagger and Docker.

The following services are currently available:

Tokenization
Ukrainian, Russian, and English NER
Lemmatization with the nlp_uk library
Language identification with the WILD library

The lang-uk-ms project launches all microservices at once and makes them accessible through a web service API.

Example of an HTTP query:

$ curl -X POST -H "Content-Type: application/json" -d '{"text": "Несе Галя"}' http://localhost:8080/lang-detect/wiki/detect
[["uk",0.83333343],["bg",0.16666652]]
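The same query can be issued from a script. The following is a minimal Python sketch, assuming the service is reachable at the address used in the curl example and that it returns a list of [language code, score] pairs as in the sample response. The names DETECT_URL, build_request, best_language, and detect_language are illustrative and not part of the lang-uk code base.

```python
import json
import urllib.request

# Assumed endpoint, copied from the curl example; adjust host and port
# to match your own deployment.
DETECT_URL = "http://localhost:8080/lang-detect/wiki/detect"


def build_request(url, payload):
    """Build a JSON POST request object (nothing is sent yet)."""
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def best_language(candidates):
    """Return the (code, score) pair with the highest score.

    Assumes the response shape seen in the sample output:
    a list of [language_code, score] pairs.
    """
    code, score = max(candidates, key=lambda pair: pair[1])
    return code, score


def detect_language(text, url=DETECT_URL):
    """Send the text to the service and return the top-scoring language."""
    request = build_request(url, {"text": text})
    with urllib.request.urlopen(request) as response:
        candidates = json.loads(response.read().decode("utf-8"))
    return best_language(candidates)
```

With the services running locally, detect_language("Несе Галя") would return the highest-scoring pair from the response. The parsing step is kept in a separate function so that it can be checked without a running container.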

The docker scripts were developed by Mykhailo Chalyi.

NER microservice

The NER microservice annotates tokenized text using models trained with the MITIE library for Ukrainian, Russian, and English (depending on which Dockerfile you choose at launch). This microservice was developed by Mykhailo Chalyi.

Example of an HTTP query:

$ curl -X POST -H "Content-Type: application/json" -d '{ "tokens": ["Несе","Галя","воду",",","Коромисло","гнеться" ]}' http://localhost:8080/
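Because this service expects a list of tokens rather than raw text, a client has to serialize the tokens itself. The following Python sketch shows one way to do that. It assumes the address from the curl example and assumes that the reply is JSON, which the description above does not state. The names NER_URL, ner_payload, and annotate are hypothetical.

```python
import json
import urllib.request

# Assumed address, taken from the curl example.
NER_URL = "http://localhost:8080/"


def ner_payload(tokens):
    """Serialize a token list into the body format used in the curl example."""
    tokens = list(tokens)
    if not all(isinstance(token, str) for token in tokens):
        raise TypeError("every token must be a string")
    return json.dumps({"tokens": tokens}, ensure_ascii=False).encode("utf-8")


def annotate(tokens, url=NER_URL):
    """POST the tokens and return the decoded reply unchanged.

    Assumption: the service answers with JSON. The structure of the
    annotations is not documented here, so it is not interpreted.
    """
    request = urllib.request.Request(
        url,
        data=ner_payload(tokens),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Punctuation marks are passed as separate tokens, as in the curl example, so the input should come from a tokenizer rather than from splitting on whitespace.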

NLP_UK microservice

The microservice based on the NLP_UK library lemmatizes the input text with the dict_uk dictionary, tokenizing it in the process. It was developed by Andriy Rysin.

Example of an HTTP query:

$ curl -X POST -H "Content-Type: application/json" -d '{"text": "Сьогодні у продажі. 12-те зібрання творів 1969 р. І. П. Котляревського."}' http://localhost:8080/lemmatize/

Language identification microservice (WILD)

Using the wiki-lang-detect library, the WILD microservice identifies the language of the input text among 156 languages used on the Internet.

Example of an HTTP query:

$ curl -X POST -H "Content-Type: application/json" -d '{"text": "Несе Галя"}' http://localhost:8080/

Coherence estimation web service

The web service (Docker image) performs the following processing of Ukrainian texts:

Estimation of the coherence of a text (the text must contain more than 3 sentences).
Extraction of noun phrases.
Search for coreferent pairs (test mode).

The web service is implemented as a REST API. Data is processed by sending HTTP POST requests to the following endpoints:

Estimation of the coherence of a text: /api/get_coherence
Extraction of noun phrases: /api/get_phrases
Search for coreferent pairs: /api/get_coreferent_clusters

The request body has the format {"text": "<:text>"}. The server returns a JSON response whose structure depends on the endpoint.
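The three endpoints share one request format, so a single helper can serve all of them. The Python sketch below illustrates this. The endpoint paths are the ones listed above. The base URL and the Content-Type header are assumptions, since the host, port, and required headers of the container are not stated here. The names BASE_URL, ENDPOINTS, build_request, and call_service are hypothetical.

```python
import json
import urllib.request

# Assumption: the container is published on localhost:8080.
BASE_URL = "http://localhost:8080"

# Endpoint paths as listed in the description of the service.
ENDPOINTS = {
    "coherence": "/api/get_coherence",
    "phrases": "/api/get_phrases",
    "coreference": "/api/get_coreferent_clusters",
}


def build_request(task, text, base_url=BASE_URL):
    """Build the POST request for one of the three tasks (nothing is sent)."""
    url = base_url.rstrip("/") + ENDPOINTS[task]
    body = json.dumps({"text": text}, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        # Assumption: the service expects a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_service(task, text, base_url=BASE_URL):
    """Send the request and return the decoded JSON reply unchanged."""
    request = build_request(task, text, base_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as call_service("phrases", text) would return whatever JSON the noun phrase endpoint produces. The sketch does not interpret the reply, because its structure differs between endpoints.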

The web service was created by Artem Kramov.