Automatic translator from Portuguese (voice and text) to Portuguese sign language

Maeva de Brito1 and Nuno Soares Domingues2*

*Correspondence:
Nuno Soares Domingues,
nndomingues@gmail.com

Received: 21 March 2021; Accepted: 12 April 2021; Published: 26 April 2021.

One of the major focuses of technology, engineering, and computer science is to solve problems and improve quality of life in health. The relationship between health and technology has advanced greatly in recent decades, and further cooperation is expected. Even so, some citizens still face daily obstacles in coping with society as it is designed. One such obstacle is deafness or hearing impairment. In Portugal, there are about 100,000 to 150,000 people with some level of hearing loss, of whom around 30,000 use Portuguese Sign Language as their mother tongue (1, 2). The greatest difficulties they encounter are poor communication with hearing people and the need for a human interpreter. Communication became even more complicated during the COVID-19 pandemic, with the mandatory use of face masks, limits on the number of people per room, and social distancing. To reduce or even solve this problem, the authors developed a system that automatically translates voice and text into Portuguese sign language with captions. The sign language output is intended for deaf and hearing-impaired citizens, and the captions for those among them who do not understand sign language. Programming tools were developed to build a translator that meets these requirements. This article describes the developments and achievements. For the proof of concept, the translator started with 16 images in the database and reached a confidence level between 70% and 90%. These results are an incentive for further development, and the authors continue to improve the tool.

Keywords: hearing impaired, communication, COVID-19, Portuguese sign language, automatic translator

Introduction

Disability is an evolving concept that results from the interaction between people with disabilities and behavioral, orientation, or spatial barriers that prevent a person’s full and effective participation in society on an equal basis with others (3). One of these disabilities is deafness or hearing impairment. In Portugal, there are about 100,000 to 150,000 people with some level of hearing loss, of whom around 30,000 use Portuguese sign language as their mother tongue (1, 2).

To overcome the existing gaps, it is often necessary to have empathy and think about what can be done to improve services for disabled citizens. Drawing on the experience of being hearing impaired, and on negative experiences that intensified with the mandatory use of face masks during the COVID-19 pandemic, it made sense to create an automatic translator from voice to text and from text to Portuguese sign language (LGP), which should be useful for people with the same limitations.

The difference between hearing-impaired and deaf people lies in the depth of hearing loss and how well they can communicate (4). People with hearing impairment can hear certain sounds, but with difficulty; to communicate, they use speech together with medical devices such as hearing aids or cochlear implants, depending on the degree of hearing loss (1). People with profound hearing loss, who cannot hear even with hearing aids or cochlear implants, use sign language to communicate (1).

According to several authors, the greatest difficulties encountered by deaf and hard-of-hearing people are poor communication with hearing people, since the hearing community knows little about the deaf community’s own characteristics, namely LGP, and the lack of access to properly trained interpreters (5).

In addition to the problem of poor communication between hearing-impaired and hearing people, the COVID-19 pandemic, which began in 2019, brought even more complications with regard to communication (6). The mandatory use of face masks to prevent the transmission of the SARS-CoV-2 virus had a negative impact in this context, since many deaf and hearing-impaired people, even those who use sign language, need to see facial expressions and to read lips (6).

Informatics, programming, and machine learning have a key role in improving the communication of hearing-impaired and deaf people in society. The work presented here is one example: an automatic translator to written text, intended for hearing-impaired people who do not know Portuguese sign language (LGP), and to LGP, intended for people who can only communicate through sign language.

Portuguese sign language is the sign language used by the Portuguese deaf community, which is characterized as a form of communication through hand movements and facial and body expressions that have their own vocabulary and grammar (1).

Portuguese sign language has a very specific sentence structure, quite different from the structure used in the Portuguese language (LP), as it does not use linguistic connectors and the verb is always used in the infinitive (7). Thus, LGP follows a subject-object-verb (SOV) syntax, whereas LP has a subject-verb-object (SVO) sentence structure.

In Portugal, according to the Portuguese Association of the Deaf, there are about 100,000 to 150,000 people with some level of hearing loss, of whom about 30,000 use LGP as their mother tongue (1, 2).

In the last decade, there has been a special interest in the development of machine translators due to the evolution of technology, as well as a greater focus on promoting the social inclusion of the deaf community, making communication between deaf and hearing people more effective.

In 2016, a bidirectional translator of Portuguese sign language (called VirtualSign) was developed, a model that facilitates the access of deaf and hearing-impaired people to digital content, especially educational and learning content (8).

At an international level, it is worth mentioning the automatic translator HandTalk, which uses an avatar named “Hugo”. It is considered the largest automatic translation platform for sign language in the world, and it was elected in 2013 by the United Nations (UN) as the best social app in the world (9). However, a questionnaire answered by the deaf community in Portugal (Appendix A and Appendix B) showed that automatic translators that use avatars are not in fact the best method of communication, because they compromise facial and body expression, which is a crucial element in the phonology of sign languages.

Thus, this work set out to create an automatic translator that does not use an avatar, so as not to compromise the facial and body expression inherent to LGP. A database was created containing several images of LGP gestures. When the system receives the word “nurse”, for instance, the image of the “nurse” gesture appears together with its caption. To achieve this, different programming languages were used to build code that receives the word “nurse” as input through voice recognition and writes it in the text field. The code then searches for this word in the library and, if it finds it, shows on the screen the image of the corresponding gesture, with the caption above the image and its confidence level.

Materials and methods

To develop the voice-text and text-LGP machine translator, the authors used a set of tools that enabled its proper functioning. Figure 1 presents the flowchart of the process, showing all the steps taken to develop the translator.

Figure 1. Flowchart of the developed automatic translator.

Each block of the flowchart, which receives as input a word or a set of words (sentence) entered by the user, is briefly explained below.

Text input

This stage uses a microphone to receive the input from the user and uses the Google Recognition tool to convert audio to text. Figure 2 shows the interface of the translation system.

Figure 2. Interface of the developed translation system.

Sentence recognition

1. If the spoken sentence is not recognized: “Sentence not found” will be displayed on the screen.

2. If it is recognized: The recognized sentence and its confidence level are displayed on the screen, and these two variables are stored in the computer’s RAM. Figure 3 shows the sentence “O enfermeiro chegou ao hospital” (“The nurse arrived at the hospital”), which is spoken in SOV order as “Enfermeiro hospital chegar”.

Figure 3. Presentation of the images corresponding to the words spoken into the microphone.
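The two branches of this step can be sketched in Python. This is only a minimal sketch under stated assumptions, not the authors' code: the article does not name the library behind the “Google Recognition tool”, so the sketch assumes the third-party SpeechRecognition package, and it assumes the raw response is a dictionary with an "alternative" list whose first entry carries "transcript" and "confidence". The function names and the "pt-PT" language code are hypothetical.

```python
from typing import Any, Optional, Tuple

NOT_FOUND_MESSAGE = "Sentence not found"  # message named in the article


def best_hypothesis(response: Any) -> Optional[Tuple[str, float]]:
    """Return (sentence, confidence) from a raw recognizer response, or None.

    Assumed shape: {"alternative": [{"transcript": str, "confidence": float}, ...]}.
    Anything else (for example an empty list) counts as "not recognized".
    """
    if not isinstance(response, dict):
        return None
    alternatives = response.get("alternative") or []
    if not alternatives:
        return None
    top = alternatives[0]
    transcript = top.get("transcript", "").strip()
    if not transcript:
        return None
    # The confidence field may be missing; fall back to 0.0 in that case.
    return transcript, float(top.get("confidence", 0.0))


def describe(response: Any) -> str:
    """Text to show on screen for one recognizer response."""
    result = best_hypothesis(response)
    if result is None:
        return NOT_FOUND_MESSAGE
    sentence, confidence = result
    return f"{sentence} ({confidence:.0%})"


def listen_once(language: str = "pt-PT") -> Any:
    """Capture one utterance (assumes SpeechRecognition and PyAudio are installed)."""
    import speech_recognition as sr  # third-party; imported lazily on purpose

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    try:
        # show_all=True requests the raw response rather than only the best text
        return recognizer.recognize_google(audio, language=language, show_all=True)
    except (sr.UnknownValueError, sr.RequestError):
        return None  # treated the same as an unrecognized sentence
```

Calling `describe(listen_once())` would then produce either the recognized sentence with its confidence as a percentage or the “Sentence not found” message. Keeping the parsing separate from the microphone capture lets the two branches be checked without audio hardware.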

Split function

This function splits a string into an ordered list of substrings, puts those substrings into an array, and returns the array. In this case, the function splits the sentence variable into several variables (words), as exemplified next with the sentence “enfermeiro hospital chegar” (“nurse hospital arrive”). If only one word is spoken into the microphone, it just returns the word itself.

Input (text): enfermeiro hospital chegar

Output (function SPLIT): “enfermeiro,” “hospital,” “chegar”
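The same input and output can be reproduced with Python's built-in `str.split`. This is an illustrative sketch only; the article does not state which language's split function the translator uses, and the function name `split_sentence` is hypothetical.

```python
from typing import List


def split_sentence(sentence: str) -> List[str]:
    """Split a recognized sentence into its words, preserving their order."""
    # With no argument, str.split() splits on any run of whitespace
    # and ignores leading and trailing spaces.
    return sentence.split()


print(split_sentence("enfermeiro hospital chegar"))
# ['enfermeiro', 'hospital', 'chegar']
print(split_sentence("enfermeira"))
# ['enfermeira']
```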

Database (BD)

After the sentence is divided into words by the SPLIT function, each stored word is compared with the words in the library list (the database library holds the words and their respective representative images, if they exist in the BD). The database presents 21 images; however, only 16 images are being recognized, because the words “Ainda não,” “Não há,” “Nós,” “Difícil” and “ÓS” are recognized by the microphone but cannot be matched to their respective images.

Existence of the word in a file name.jpg

1. If the word doesn’t exist in the BD: The screen shows the “image not found” symbol, as shown in Figure 4.

Figure 4. Symbol for the image not found.

2. If the word exists in the BD: The screen displays the image whose file name matches the word found, as in the example of the word “enfermeira”, shown in Figure 5.

Figure 5. Presentation of the “enfermeira” image corresponding to the text recognized by the microphone.
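The lookup described in the two cases above can be sketched as follows. This is a minimal sketch, assuming the library is a folder in which each gesture image is stored as `<word>.jpg` in lowercase; the folder layout, the placeholder file name `not_found.jpg`, and the function names are hypothetical and do not come from the article.

```python
from pathlib import Path
from typing import List

NOT_FOUND_IMAGE = "not_found.jpg"  # hypothetical file name for the Figure 4 symbol


def image_for(word: str, library: Path) -> Path:
    """Return the image whose file name is the word, or the not-found symbol."""
    candidate = library / f"{word.lower()}.jpg"
    if candidate.is_file():
        return candidate
    return library / NOT_FOUND_IMAGE


def images_for_sentence(words: List[str], library: Path) -> List[Path]:
    """Look up every word of a split sentence, keeping the spoken order."""
    return [image_for(word, library) for word in words]
```

For the words `["enfermeiro", "hospital", "chegar"]`, `images_for_sentence` would return one path per word, with the placeholder standing in for any word that has no image in the folder.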

Results and discussion

To evaluate the machine translation system, we applied a questionnaire that proposes a set of tasks: speaking five sentences into the translator’s microphone and then answering seven questions to evaluate the translator and its performance.

The selected sentences are in SOV structure, since the translator cannot change the sentence structure from SVO to SOV. Thus, the translator can recognize the voice, convert it to text, and display on the screen the image corresponding to that text. The evaluation is based on the translator’s goal: voice-to-text-to-image conversion.

The sentences selected and the output of each sentence are presented as follows:

1. LGP: FILHA TUA ENFERMEIRA (LP: A tua filha é enfermeira).

2. LGP: EU CHOCOLATE NÃO GOSTAR (LP: Eu não gosto de chocolate).

3. LGP: FILHO MEU DOENTE (LP: O meu filho está doente).

4. LGP: ELES LISBOA CHEGAR (LP: Eles chegaram a Lisboa).

5. LGP: ENFERMEIRO HOSPITAL CHEGAR (LP: O enfermeiro chegou ao hospital).

The questionnaire was answered by three people from the Associação de Surdos do Concelho de Sintra. Two were born deaf, and both use cochlear implants. The third was born hearing but became profoundly deaf after a measles infection; he does not use any hearing aid because he did not adapt to one. All respondents use LGP to communicate in their daily lives. The outputs for the five sentences are shown in Figures 6 to 10.

Figure 6. Representation in LGP of the sentence “FILHA TUA ENFERMEIRA.”

Figure 7. Representation in LGP of the sentence “EU CHOCOLATE NÃO GOSTAR.”

Figure 8. Representation in LGP of the sentence “FILHO MEU DOENTE.”

Figure 9. Representation in LGP of the sentence “ELES LISBOA CHEGAR.”

Figure 10. Representation in LGP of the sentence “ENFERMEIRO HOSPITAL CHEGAR.”

The results obtained are presented in Table 1.

Table 1. Results of the questionnaires answered by three people from the Associação de Surdos do Concelho de Sintra.

Given the answers obtained, the respondents assessed the automatic translator positively and considered it an asset to the deaf community (Table 1).

Constructive criticism concerned the images, which could have a more prominent outline to highlight the gesture and could be slightly larger, so that the gesture remains visible when the device presenting the images is farther away. Respondents also suggested adding the inverse translation (from LGP to text), so that hearing people can understand what is being signed.

Respondents, despite finding the proposed translation system “very good,” mentioned that they prefer LGP interpreters or video translation systems, knowing that both have some limitations. Interpreters are their first choice; however, the service must always be paid for, and the State does not help financially, which makes it economically very difficult. They also mentioned that, as described in the literature, SNS 24 has communication channels for deaf people through video call and webchat, but according to the respondents these channels are constantly busy, and they end up unable to reach any interpreter.

Respondents shared the same opinion about avatar-based machine translators. According to them, these translators are often very confusing and cannot show emotion. This is the case of the VirtualSign avatar: respondents say they cannot understand what the avatar is signing, because the arrangement of the hands in space is very confusing. The HandTalk (Libras) translator, however, is one that respondents agree is fairly good and understandable.

Thus, it can be concluded that the translation system developed had a good evaluation by the respondents, both at the translation level and at the level of understanding of the figures.

Limitations of the study

The major limitation of this study was the scarce (practically non-existent) information available on LGP and its structure. More studies and coherent information on this topic are needed.

The translation system developed also has some limitations, namely:

1. Of the 23 images in the database library, only 16 are working correctly. The expressions “Ainda não” and “Não há” are each represented by a single gesture in sign language; however, the translator “reads” them as two distinct words and looks for two images (“Ainda” and “não”, for example). The words “Nós”, “Difícil” and “ÓS” are recognized by the microphone but are not matched with their respective images, as they are accented words.

2. This translator cannot switch the text recognized by the microphone from the Portuguese sentence structure (SVO) to the sign language structure (SOV).

3. It is not possible to write directly into the text box to translate.
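The first limitation could in principle be addressed by normalizing accents before the lookup and by trying multi-word expressions before single words. The sketch below illustrates that idea only; it is not part of the developed translator. It assumes the mismatch comes from diacritics differing between the recognized text and the library keys, which the article does not confirm, and the dictionary layout, file names, and function names are hypothetical.

```python
import unicodedata
from typing import Dict, List


def normalize(text: str) -> str:
    """Lowercase and strip diacritics, e.g. 'Difícil' -> 'dificil'."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def match_entries(words: List[str], library: Dict[str, str], max_len: int = 2) -> List[str]:
    """Greedy longest-first matching of words against normalized library keys.

    `library` maps a normalized expression to an image file name.
    Unmatched words map to "not_found.jpg" (hypothetical placeholder).
    """
    images: List[str] = []
    i = 0
    while i < len(words):
        # Try the longest expression starting at position i first.
        for size in range(min(max_len, len(words) - i), 0, -1):
            key = normalize(" ".join(words[i:i + size]))
            if key in library:
                images.append(library[key])
                i += size
                break
        else:
            # No expression of any length matched at this position.
            images.append("not_found.jpg")
            i += 1
    return images


library = {"ainda nao": "ainda_nao.jpg", "nos": "nos.jpg", "dificil": "dificil.jpg"}
print(match_entries(["Ainda", "não", "Nós", "Difícil"], library))
# ['ainda_nao.jpg', 'nos.jpg', 'dificil.jpg']
```

With this approach, “Ainda não” is matched as one expression before “Ainda” and “não” are tried separately, and accented words match keys stored without accents.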

Conclusion and future prospects

The last decade has been marked by strong growth in technology, which has had a positive impact on all sectors, including the health sector. However, there are still many obstacles to overcome in adapting society to people with disabilities, including hearing impairment.

Few works at a national level have tried to create a system that could help this community, but two should be highlighted: “PE2LGP: From Text to Sign Language (and vice-versa)” by the student Ruben Santos from Tecnico de Lisboa, and the “Real-Time Bidirectional Translator of Portuguese Sign Language”, a project led by Professor Paulo Escudeiro from Instituto Superior de Engenharia do Porto. Both projects have some limitations, namely in the facial and body expressions inherent to LGP and at the linguistic level, because although LGP has been an official language in Portugal since 1997, there is very little information about it.

Thus, this study aimed to create an automatic voice and text translator for LGP that could help hearing-impaired people without resorting to avatars, so as not to compromise the facial and body expressions inherent to LGP. First, to understand what deaf people and people who use LGP as a means of communication think about translators, a questionnaire was administered to this community. It received 183 answers, of which 39 were from people with some degree of deafness. From this questionnaire, it was concluded that most respondents agree that a translator is a good means of inclusion for the deaf community, and that the image-based translator is the type respondents find easiest to understand.

In this way, an automatic translator was developed using several programming languages, together with a database library that stores the 23 images of LGP gestures. The translator interface shows a microphone button; after clicking it, the user can say the word or set of words to translate. The translator receives the word as input through voice recognition and writes it in the text field. The code then looks for the word in the library and, if it finds it, shows on the screen the image of the corresponding gesture, with the caption above the image and its confidence level.

Despite the great progress that the health and technology sectors have made in society in recent decades, there are still many gaps when it comes to adapting society to people with disabilities, of whatever kind. Disabled people need adapted services so that they can have access to them without feeling frustrated, put aside, or even thinking that the problem lies within themselves as individuals, when in fact the problem lies in the lack of means and funds on the part of institutions that fail to adapt services for disabled people.

The translation system was evaluated through a questionnaire answered by three deaf people from the Associação de Surdos do Concelho de Sintra, who use LGP as a means of communication in their daily life. They judged that the translator performs well and will be useful for the deaf community. The first part of the questionnaire consisted of five sentences that the respondents had to say into the microphone. After this stage, they answered seven questions about the functioning and understandability of the translator. The answers were quite similar: respondents classified the translator as “very good” and as a useful translation system for the deaf community.

In future studies, it would be essential to overcome the limitations presented above. It would also be interesting to use this translator as a means of learning LGP, offering didactic games for children and dictionaries with the words and the respective gestures to be performed. At a more advanced stage, a translation module from LGP to text could be added, making the translator bidirectional.

References

1. Gaspar LR. IF2LGP - Intérprete automático de fala em língua portuguesa para língua gestual portuguesa. (2015). Available online at: http://hdl.handle.net/10400.8/2541 (Accessed March 13, 2022).

2. National Institute for Rehabilitation. Inquérito nacional às incapacidades, deficiências e desvantagens: síntese. (1996). Available online at: https://www.inr.pt/cadernos (Accessed March 28, 2022).

3. National Institute for Rehabilitation. Glossário - INR, I.P. Instituto Nacional Para a Reabilitação. (2021). Available online at: https://www.inr.pt/glossario (Accessed March 3, 2022).

4. World Health Organization [WHO]. World report on hearing. In World Health Organization. (2021). Available online at: https://www.who.int/publications/i/item/world-report-on-hearing (Accessed December 27, 2022).

5. Araujo De Oliveira YC, Deysny De Matos Celino S, Cavalcanti Costa GM. Comunicação como ferramenta essencial para assistência à saúde dos surdos. Rev Saude Colet. (2015) 25:307–20. doi: 10.1590/S0103-73312015000100017

6. ten Hulzen RD, Fabry DA. Impact of hearing loss and universal face masking in the COVID-19 era. Mayo Clin Proc. (2020) 95:2069–72. doi: 10.1016/j.mayocp.2020.07.027

7. Goncalves, et al. (2021)

8. Escudeiro P, Escudeiro N, Reis R, Lopes J, Norberto M, Baltasar AB, et al. Virtual sign - a real time bidirectional translator of Portuguese sign language. Procedia Comput Sci. (2015) 67:252–62. doi: 10.1016/j.procs.2015.09.269

9. Hand Talk. Discover the largest Sign Language translation platform in the world. (2022). Available online at: https://www.handtalk.me/br/sobre/ (Accessed May 2, 2022).
