At first glance, a set of photographs of people released by chipmaker Nvidia does not look particularly impressive. They seem to be regular people, until you learn that the photos are fakes and those people never existed. They look quite real, though, down to the minutest details, from facial lines and creases to freckles and skin tones. More than that, their expressions are remarkably genuine. Each face looks as if it has a personal history and individuality, an aliveness and vibrancy. Yet the entire lot has been created by artificial intelligence (AI). A collection of these images and a research paper outlining the broad method used to create them can be found online.
At the heart of these fake faces is a technology called Generative Adversarial Networks, or GANs, given a new twist by Nvidia researchers. The system is made up of two neural networks: one that creates images based on the data it has been fed (the generator) and a second that evaluates them (the discriminator, or adversary). The discriminator checks the images coming from the generator to judge how lifelike they are, and the generator uses this feedback to improve its output. But there is more to it.
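The adversarial back-and-forth described above can be sketched in a toy one-dimensional setting. This is a minimal illustration of the GAN idea, not Nvidia's model: the "generator" here is just a linear map on noise, the "discriminator" a logistic classifier, and both are updated with hand-derived gradients; all names and numbers are illustrative.

```python
import numpy as np

# Toy 1-D GAN sketch: generator g(z) = a*z + b tries to mimic data ~ N(4, 1);
# discriminator D(x) = sigmoid(w*x + c) tries to tell real from fake.
rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

a, b = 1.0, 0.0   # generator parameters (slope, offset)
w, c = 0.0, 0.0   # discriminator parameters (weight, bias)
lr, batch = 0.05, 64

for step in range(2000):
    # Discriminator update: push D(real) toward 1 and D(fake) toward 0.
    real = rng.normal(4.0, 1.0, batch)
    z = rng.normal(0.0, 1.0, batch)
    fake = a * z + b
    d_real = sigmoid(w * real + c)
    d_fake = sigmoid(w * fake + c)
    # Gradients of -log D(real) - log(1 - D(fake)) w.r.t. w and c.
    gw = np.mean((d_real - 1.0) * real + d_fake * fake)
    gc = np.mean((d_real - 1.0) + d_fake)
    w -= lr * gw
    c -= lr * gc

    # Generator update: push D(fake) toward 1, i.e. fool the discriminator.
    z = rng.normal(0.0, 1.0, batch)
    fake = a * z + b
    d_fake = sigmoid(w * fake + c)
    # Gradients of -log D(fake) w.r.t. a and b (chain rule through D).
    ga = np.mean((d_fake - 1.0) * w * z)
    gb = np.mean((d_fake - 1.0) * w)
    a -= lr * ga
    b -= lr * gb

samples = a * rng.normal(0.0, 1.0, 1000) + b
# The generated mean starts at 0 and drifts toward the real mean of 4.
print(float(samples.mean()))
```

The key point is the alternation: each side's loss depends on the other's current parameters, so the discriminator's feedback is exactly what drives the generator's improvement.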
Nvidia researchers have come up with a style-transfer method that enables GANs to separate out, without any human intervention, the multitude of characteristics that make up a face (face shape, facial features, eyes, nose, hair, pose and so on). Later on, the system can recombine these characteristics at random to generate new images.
"Our generator thinks of an image as a collection of 'styles', where each style controls the effects at a particular scale," the researchers said in a video. In Nvidia's demonstration, one can see sliders adjusting the strengths of sets of characteristics to produce stunning variations.
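The "styles at a particular scale" idea can be sketched as follows. This is a hypothetical illustration of style mixing, not Nvidia's actual code: the scale names, vector sizes and `mix` function are invented for the example, echoing how coarse scales tend to control pose and face shape while fine scales control texture.

```python
import numpy as np

rng = np.random.default_rng(1)
# Illustrative scale bands; in StyleGAN-like models these correspond to
# progressively finer layers of the generator.
SCALES = ["coarse", "middle", "fine"]

def sample_styles():
    """One random style vector per scale, one set per imaginary face."""
    return {s: rng.normal(size=8) for s in SCALES}

def mix(source_a, source_b, crossover="middle"):
    """Take styles up to and including `crossover` from A, the rest from B."""
    cut = SCALES.index(crossover)
    return {s: (source_a if i <= cut else source_b)[s]
            for i, s in enumerate(SCALES)}

face_a, face_b = sample_styles(), sample_styles()
hybrid = mix(face_a, face_b, crossover="coarse")
# The hybrid keeps A's coarse styles (e.g. pose, face shape) and takes
# B's middle and fine styles (e.g. features, skin texture).
assert np.allclose(hybrid["coarse"], face_a["coarse"])
assert np.allclose(hybrid["fine"], face_b["fine"])
```

Because each scale's style is a separate knob, sliding or swapping one band changes only the characteristics it controls, which is what makes the recombination in the demo possible.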
The concept of GANs was introduced four years ago by researchers from the University of Montreal. However, Nvidia says its photographs are much better than the ones produced by current GANs. The company has done similar experiments with several objects such as cars, but creating such realistic yet fake human faces could cause trouble. The research paper does not detail possible applications, but one can imagine the models in stock photographs being replaced with these deepfakes. The advertising industry should benefit as well: whenever anonymity is required, or a company prefers not to splash big money on human models, AI-generated creations could step in. Another potential area is high-graphics games. Nvidia already makes hardware for such games, and the ability to create even more realistic images could take things to another level.
Read My Lips
Google and a research group from the University of Oxford have been working on an AI system that can read lips, apparently with better success than humans. Lip-reading is not a common skill, but people with hearing impairments often learn it and, along with hearing aids, manage to understand what others are saying. Things could become a lot easier if Google provides access to the technology.
Interestingly, the deep learning system was trained on 5,000 hours of BBC television shows, comprising around 118,000 sentences, according to a study titled Lip Reading Sentences in the Wild, and the results show the system outperforming humans at this skill. The study envisages applications such as dictating to a phone in a noisy environment, translation and transcription, redubbing films, resolving multi-talker speech and generally improving speech recognition.