Google may have taught an AI how to doodle, but drawing something more complex is tough for a computer. Imagine asking a computer to draw a “yellow bird with black wings and a short beak”; it sounds more than a little tricky.
Researchers at Microsoft, though, have been developing an AI-based technology to do just that. It generates images from text descriptions with a surprising amount of accuracy, according to the most recent paper posted by the team.
The system doesn’t find an existing image based on your input, but creates a real drawing. “If you go to Bing and you search for a bird, you get a bird picture. But here, the pictures are created by the computer, pixel by pixel, from scratch,” said principal researcher Xiaodong He in a statement. “These birds may not exist in the real world — they are just an aspect of our computer’s imagination of birds.”
While the current form of this drawing technology isn’t perfect, it’s not hard to imagine a future where it could function as a sketch assistant for painters and interior designers or a tool to refine photos based on voice input. Farther out, researcher He imagines animated movies generated from a written script.
The team began its research into computer vision and natural language processing with CaptionBot, an AI system that automatically writes captions for photos, and then built SeeingAI, a system that answers questions people ask about images and can be especially helpful for blind users. The current drawing technology consists of two parts: a generator that creates images and a discriminator that judges the quality of the images generated; together they form what is known as a Generative Adversarial Network (GAN).
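The generator–discriminator tug-of-war can be sketched in a few lines. This is a deliberately tiny illustration with made-up dimensions and function names, not the team's actual model: a generator maps random noise plus a caption embedding to a flat "image," and a discriminator scores how plausible that image is for the caption.

```python
import numpy as np

rng = np.random.default_rng(0)

def generator(noise, text_embedding, w):
    """Map noise + a text embedding to a flat toy 'image' with pixels in [-1, 1]."""
    x = np.concatenate([noise, text_embedding])
    return np.tanh(w @ x)

def discriminator(image, text_embedding, v):
    """Return a probability that the image is a real match for the description."""
    x = np.concatenate([image, text_embedding])
    return 1.0 / (1.0 + np.exp(-(v @ x)))   # sigmoid

# toy sizes: 8-d noise, 4-d caption embedding, 16-pixel "image"
w = rng.normal(size=(16, 12)) * 0.1   # generator weights
v = rng.normal(size=20) * 0.1         # discriminator weights

noise = rng.normal(size=8)
text = rng.normal(size=4)             # stand-in for an embedded caption

fake = generator(noise, text, w)
score = discriminator(fake, text, v)
```

During training the generator's weights are updated to push `score` toward 1 (fooling the judge), while the discriminator's weights are updated to push it toward 0 for generated images; that adversarial pressure is what drives image quality up.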
The drawing bot was trained on pairs of images and captions, which teach the AI to learn what words go with which images. The team also created a mathematical representation of human attention, which is what we all use when we draw pictures from complex descriptions: a red wing, a sharp beak, a yellow wing. “Attention is a human concept; we use math to make attention computational,” said He.
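Making attention computational, as He describes, usually means a softmax over word similarities: each region of the image being drawn asks which caption words matter most to it right now. The sketch below is a generic illustration of that idea with toy sizes, not the researchers' published formulation.

```python
import numpy as np

def word_attention(region_query, word_embeddings):
    """For one image region, weight each caption word by relevance (softmax)."""
    scores = word_embeddings @ region_query        # similarity score per word
    weights = np.exp(scores - scores.max())        # numerically stable softmax
    weights /= weights.sum()
    context = weights @ word_embeddings            # weighted mix of word vectors
    return weights, context

rng = np.random.default_rng(1)
words = rng.normal(size=(5, 4))   # 5 caption words ("a", "red", "wing", ...), 4-d each
query = rng.normal(size=4)        # feature vector for one image region

weights, context = word_attention(query, words)
```

The weights sum to one, so the region's `context` vector is dominated by the words it currently attends to, e.g. "red wing" while drawing a wing.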
Alibaba’s AI Bot Outguns Humans In Reading Comprehension Test
The AI model developed by Alibaba’s Institute of Data Science and Technologies blazed past the SQuAD (Stanford Question Answering Dataset) test – one of the most reliable reading comprehension tests for evaluating a machine’s language skills – in a contest that pitted it against human rivals.
Alibaba’s AI scored a cumulative 82.44 Exact Match (EM) points, edging out its human competitors, who managed to put 82.304 points on the scoreboard.
According to a report published in the South China Morning Post – also owned by Alibaba – this achievement marks the first instance when a machine has beaten its human counterparts in a reading comprehension test.
When it comes to the net F1 scores in the SQuAD assessment, Alibaba’s AI model topped the chart with 88.607 points, positioning itself higher than similar systems developed by Microsoft and Facebook. The results are truly impressive because language comprehension has traditionally been regarded as a weak point of AI systems.
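The two SQuAD metrics mentioned above are easy to compute in simplified form. EM checks whether a predicted answer matches the reference answer exactly, while F1 gives partial credit for overlapping words. The sketch below illustrates the idea; the official SQuAD scorer additionally normalizes away articles and punctuation, which is omitted here.

```python
def exact_match(prediction, truth):
    """1 if the answers match after casefolding and trimming, else 0 (simplified EM)."""
    return int(prediction.strip().lower() == truth.strip().lower())

def f1_score(prediction, truth):
    """Token-overlap F1: harmonic mean of precision and recall over answer words."""
    pred_tokens = prediction.lower().split()
    true_tokens = truth.lower().split()
    remaining = list(true_tokens)
    common = 0
    for token in pred_tokens:
        if token in remaining:          # count each reference token at most once
            remaining.remove(token)
            common += 1
    if common == 0:
        return 0.0
    precision = common / len(pred_tokens)
    recall = common / len(true_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, predicting "evaporation of water" against the reference "water evaporation" scores 0 on EM but 0.8 on F1, which is why a system's F1 total runs higher than its EM total, as in the scores above.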
That shortcoming severely limits AI systems’ ability to carry on a truly productive conversation with a person, rather than just crunching numbers and processing information.
According to Si Luo, chief scientist of natural language processing at Alibaba’s research arm, the recent results open up new avenues for deploying AI systems in customer assistance jobs, thanks to their improved language processing capabilities.
“We believe the underlying technology can be gradually applied to numerous applications such as customer service, museum tutorials, and online responses to inquiries from patients, freeing up human efforts in an unprecedented way,” Luo said.
The results achieved by Alibaba’s deep neural network model indicate that AI systems will soon be able to answer objective questions like “what causes rain” by processing the vast amount of information at their disposal and responding with the most contextually accurate and precise answer.