At a time when many TV presenters all behave the same way, like robots, technology experts are experimenting with real robots. The tech media outlet Ferra.ru, part of Rambler Group, has added a digital TV presenter named Elena to its news coverage. Created using neural networks, she can present news stories, generating a complete video from text alone while mimicking human facial expressions and emotions. The first episode featuring Elena is already available on the Ferra.ru website. Anyone who watches it will notice the conventional female image chosen for the presenter role and her slightly slow but capable reading of the news text.
It must be said that the digital presenter's intonation compares favorably with the noisy delivery of her colleagues on some channels, and even with that of bloggers who struggle to attract attention.
Elena is a double of the digital presenter created in 2019 in Sberbank's Robotics Lab with the involvement of the Speech Technology Center (STC) group of companies, which provided the avatar's text-to-speech technology.
“It is unique in its use of sophisticated neural network models for continuous generation of speech audio from text. This makes the synthetic speech smooth and expressive, and thanks to a powerful linguistic processor, the text is read according to all the rules of the language, even in complex cases,” the press release explains.
Elena's image is also the product of neural network models trained on recordings of a real person.
“Obviously, this is only the beginning: there are still many relevant research tasks involving the generation of photorealistic digital characters, high-quality modeling of body animation (including facial expressions and gestures), and different styles (clothing, hair, makeup). A separate major task is making these technologies run fast, ideally in real time,” says Nicholas Simon, head of the virtual character development department at SberDevices.
“The use of neural networks has allowed us to take the quality of generated speech to a new level. With flexible settings, we have added a whole layer of new ways to control it: natural changes in pace, tone of speech, and reading style. In the near future, this technology will be able to accurately simulate human emotions and fully compete with professional announcers,” says Dmitry Dyrmovsky, General Director of the Speech Technology Center (STC) group of companies.