Alibaba’s AI Bot Outguns Humans In Reading Comprehension Test
The AI model developed by Alibaba’s Institute of Data Science and Technologies blazed past the SQuAD (Stanford Question Answering Dataset) test, one of the most reliable reading comprehension tests for evaluating a machine’s language skills, in a contest that pitted it against human rivals.
Alibaba’s AI scored 82.44 Exact Match (EM) points, narrowly outscoring its human competitors, who managed to put up 82.304 points on the scoreboard.
According to a report published in the South China Morning Post, which is also owned by Alibaba, this achievement marks the first time a machine has beaten its human counterparts in a reading comprehension test.
When it comes to the net F1 scores in the SQuAD assessment, Alibaba’s AI model topped the chart with 88.607 points, positioning itself higher than similar systems developed by Microsoft and Facebook. The results are truly impressive because language comprehension has traditionally been regarded as a weak point of AI systems.
This shortcoming severely limits their ability to carry on a truly productive conversation with a person, rather than just crunching numbers and processing information.
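For readers unfamiliar with the two metrics quoted above: Exact Match counts an answer as correct only if it matches a reference answer after light normalization, while F1 gives partial credit for overlapping words. The sketch below follows the way SQuAD v1.1 scoring is commonly described (lowercase, strip punctuation and articles, then compare). The function names are illustrative assumptions, and this is not Alibaba’s or Stanford’s actual code.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, otherwise 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def best_score(metric, prediction: str, references: list[str]) -> float:
    """A question may have several reference answers; keep the best score."""
    return max(metric(prediction, ref) for ref in references)


if __name__ == "__main__":
    # Hypothetical answers, invented for illustration only.
    print(exact_match("The gravity.", "gravity"))
    print(f1_score("under gravity", "gravity"))
```

A leaderboard figure such as 82.44 EM would then be the average of these per-question scores over the whole test set, expressed as a percentage.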
According to Si Luo, a chief scientist of natural language processing at Alibaba’s research arm, the recent results will open up new avenues for deploying AI systems in customer assistance jobs, thanks to their improved language processing capabilities.
“We believe the underlying technology can be gradually applied to numerous applications such as customer service, museum tutorials, and online response to inquiries from patients, freeing up human efforts in an unprecedented way.”
The results achieved by Alibaba’s deep neural network model indicate that AI systems will soon be able to answer objective questions like ‘what causes rain’ by processing the vast amount of information at their disposal, and responding with the most contextually accurate and precise answer.