OTTAWA — The first whispers of the COVID-19 pandemic came in postings on WeChat by Dr. Li Wenliang, the young Wuhan ophthalmologist who was first scolded by Chinese authorities for warning of the strange pneumonia cases at his hospital in December 2019, then contracted it himself and died in February 2020.
Those posts, meant only as advice for some of Li’s medical-school classmates to be careful, quickly spread more widely, even ahead of official alerts within China.
Talking Point
The first public signs of the next pandemic will probably be somewhere in social media posts, as they were for what became COVID-19. But the sheer quantity of information to examine makes sifting it an all-but-impossible challenge.
Canada’s Global Public Health Intelligence Network (GPHIN), the surveillance unit in the Public Health Agency of Canada (PHAC) that looks for emerging health threats around the world, from new germs to contaminated pharmaceuticals, didn’t see Li’s posts, because the algorithms it uses to trawl the internet for worrying health news cover just a sliver of social media. For everything else, the system relies on the eyes of just 13 human analysts.
A major review of GPHIN, released in July, noted that algorithmic scans of open source material from around the world are critical to GPHIN’s work, but also that 84 per cent of GPHIN’s automated inputs are from Factiva, a news archive maintained by Dow Jones. Most of the rest are from other news sources and aggregators—and those inputs are probably not good enough, the reviewers wrote: “A recurring theme in the panel’s conversations was the rise of social media as a source of public health information. Integrating more social media data as inputs for the GPHIN system should be a key future consideration.”
The review followed extensive reporting by The Globe and Mail on how the network’s intelligence didn’t provoke the rapid response the looming COVID-19 pandemic demanded.
A handful of social media sources considered significant—tweets from the World Health Organization and provincial health authorities, for instance—are already ingested alongside traditional news, spokesperson Anna Maddison said by email. But otherwise, “GPHIN analysts manually scan various official and unofficial social media accounts in their respective languages on a daily basis.”
The review found that cranking up that capacity won’t be easy: “As analysts are already working at capacity, further social media curation will need to be supported either through additional resources or through automation.”
Even the current system spits out about 3,500 items a day for the analysts to look over, once duplicates and dross, like mentions of “Bieber fever,” are filtered out.
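To make that filtering step concrete, here is a minimal sketch in Python of dropping duplicates and known noise before items reach an analyst. It is an illustration only, not GPHIN's actual pipeline: the function names and the exclusion list are hypothetical, and the only phrase taken from the reporting is "Bieber fever."

```python
import re

# Hypothetical exclusion list; the only phrase named in the article is "Bieber fever".
NOISE_PHRASES = ("bieber fever",)


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so near-identical copies compare equal."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def filter_items(items):
    """Keep the first copy of each item and drop any item containing a noise phrase."""
    seen = set()
    kept = []
    for item in items:
        key = normalize(item)
        if key in seen:
            continue  # duplicate of something already processed
        seen.add(key)
        if any(phrase in key for phrase in NOISE_PHRASES):
            continue  # figurative "illness" that is not a health signal
        kept.append(item)
    return kept
```

A real system would need fuzzier matching than this, since wire stories are often republished with small edits that defeat exact comparison.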
How GPHIN might automate work now done by its analysts is the hard part. It’s one thing to see Li’s posts now for what they were, something else to find a needle in a line of haystacks in real time. Especially when there’s hardly ever a needle.
Diana Inkpen, a computer-science professor and director of the Natural Language Processing Lab at the University of Ottawa, said that with careful training, neural networks—algorithmic systems that can “learn” from previous mistakes and imprecisions—are very good at spotting unusual patterns.
Some of her work is on detecting signs of serious mental illness through social media posts such as tweets. She stressed in an interview that this is not yet done in real time, though eventually algorithms could keep electronic eyes out for people in active crises. Inkpen said she’s talked with people at PHAC about that prospect.
“My take on the public health agency is they are very interested and forward-looking. They just aren’t sure how to do it,” she said of the agency’s interest in using social media for epidemiological surveillance.
Early in the COVID-19 pandemic, Toronto-based BlueDot staked a claim to fame by fairly accurately predicting the path of the novel coronavirus outside China. Alex Demarsh, BlueDot’s senior director of outbreak science, said GPHIN’s people are some of the best.
“They have analysts who are just brilliant, and often that have been doing this for longer than anyone else. I think that’s the key value-add for GPHIN,” Demarsh told The Logic.
Demarsh, who was consulted in the GPHIN review, has been at BlueDot since March, but he spent 10 years as an epidemiologist and biostatistician at PHAC, including two as a senior epidemiologist at GPHIN.
He said experts around the world were alert to the possibility of a novel coronavirus emerging in China at some point. BlueDot, GPHIN and another surveillance network called ProMED all recognized, almost simultaneously at the very end of 2019, that a new illness had appeared in Wuhan, because they were paying attention to Taiwanese news, he said.
“In China, we’re pretty bullish on Taiwan as an indirect route, because they have obvious interest in keeping up on events on the mainland, so they can have good sources,” Demarsh said.
BlueDot was able to take an extra step and say that this “pneumonia of unknown aetiology” could become a threat outside China, by examining airline-ticket data.
Drawing on social media could get around problems with censored or state-directed media, and with local sources that just aren’t what they used to be. Not every place a new illness might crop up has eyes on it like Taiwan’s.
Sifting the ephemera of people’s digital lives for public health insights isn’t new. Google made a splash in 2008 with a tool to predict influenza outbreaks based on internet searches for terms like “flu symptoms.” Often the first thing people do when they feel a bit off is look up their symptoms—before they see doctors, and long before they get tested for flu (which most patients never are, anyway).
Google Flu Trends’s performance was uneven. It did wonderfully some years, but largely missed the 2009 H1N1 influenza pandemic: that one began, unusually, in California in April, when flu season is usually ending. Google Flu missed in 2013, as well, massively overestimating flu cases compared to reality. In 2015, Google closed public access.
Tweets are attractive sources because they’re almost all public, Inkpen said: anyone can search Twitter, including in bulk. The company offers a mechanism, the Twitter API, that imposes some restrictions, but is fundamentally meant to make the task easy.
(Twitter itself has access to all the content it carries, and Inkpen said it would be best placed to analyze the material. Twitter Canada spokesperson Cam Gordon agreed the subject is important, but said, “We’re not doing interviews on this kind of topic currently.”)
The European Centre for Disease Prevention and Control has released an open source tool called epitweetr that scans tweets for specified search terms and throws up alerts when those terms appear with statistically unexpected frequency. But you have to tell it what words to look for, and it has no magical way of determining where the tweets it examines originated.
Maddison said the GPHIN team checked epitweetr out last winter and is “now examining how best to incorporate the use of this tool in its day-to-day monitoring.”
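The idea of alerting when a term shows up with "statistically unexpected frequency" can be sketched in a few lines of Python. This is not epitweetr's own algorithm. It is a simple stand-in that compares each day's count of a search term against the mean and standard deviation of the preceding days. The function name, the seven-day baseline and the threshold of three standard deviations are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def flag_spikes(daily_counts, baseline_days=7, threshold=3.0):
    """Return the indexes of days whose count is far above the trailing baseline.

    A day is flagged when its count exceeds
    baseline mean + threshold * baseline standard deviation.
    """
    alerts = []
    for day in range(baseline_days, len(daily_counts)):
        baseline = daily_counts[day - baseline_days:day]
        mu = mean(baseline)
        # Floor the spread at 1.0 so a perfectly flat baseline
        # does not turn every tiny uptick into an alert.
        sigma = max(stdev(baseline), 1.0)
        if daily_counts[day] > mu + threshold * sigma:
            alerts.append(day)
    return alerts


# Made-up daily counts of tweets matching one search term.
counts = [10, 12, 9, 11, 10, 13, 11, 12, 40, 11]
print(flag_spikes(counts))
```

With these invented numbers, only the jump to 40 stands out. The sketch shares the limits described above: someone still has to choose the search term, and the counts carry no information about where the tweets came from.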
One trouble for surveillance efforts is that tweets are so brief that extracting meaning from them through automated means is difficult, and figuring out whether that meaning has value is even harder.
“This short-message problem in natural language processing is well known. Just the nature of social media communication is non-standard language,” Demarsh said. “People have tried it for years, and it’s seen as the next frontier in event-based surveillance. But to me, there are more promising sources of early intelligence.”
On the other hand, someone who’s tweeting about being sick or noticing a lot of ill people is likely to say something relevant more than once.
“There is a lot of redundancy. So even if the system would get some messages wrongly classified, there will be more material,” Inkpen said. “It’s not going to be 100 per cent accurate. But no AI system is. We see that we can get 80, 90 per cent accuracy,” she said.
That would still mean a lot of false positives for human analysts to check out. There’s also the risk of bad actors flooding social media with bogus messages meant to make observers think a new plague is spreading when none is—a risk raised in the GPHIN review—though Inkpen said fraudulent posts are often fairly easily detected by AIs because of their repetitive language and odd posting times.
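A rough calculation shows why even 90 per cent accuracy leaves analysts with a pile of false alarms when real signals are rare. The Python sketch below rests on assumptions that do not come from Inkpen or the review: the volume of 100,000 posts, the 100 truly relevant ones, and the reading of "90 per cent accuracy" as 90 per cent of relevant posts caught and 90 per cent of irrelevant posts correctly dismissed are all hypothetical.

```python
def expected_alerts(total_posts, relevant_posts, sensitivity, specificity):
    """Estimate true alerts, false alerts and the share of alerts that are real."""
    true_alerts = relevant_posts * sensitivity
    false_alerts = (total_posts - relevant_posts) * (1 - specificity)
    precision = true_alerts / (true_alerts + false_alerts)
    return true_alerts, false_alerts, precision


# Hypothetical day: 100,000 posts scanned, 100 of them truly relevant.
true_alerts, false_alerts, precision = expected_alerts(100_000, 100, 0.9, 0.9)
print(f"true alerts: {true_alerts:.0f}")
print(f"false alerts: {false_alerts:.0f}")
print(f"share of alerts that are real: {precision:.1%}")
```

Under those assumptions the classifier raises about 90 genuine alerts against roughly 9,990 false ones, so fewer than one alert in a hundred is real. The repetition Inkpen describes helps recover missed signals, but it does nothing to shrink the stack of false positives.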
Besides the challenge of extracting meaning from social posts, some of the potential hot spots for infectious diseases, such as East Africa, don’t have deep social media ecosystems, Demarsh said.
“Language-specific radio networks are a common way of getting local news information there. So at BlueDot we’re keen—if we can get it, collect at the appropriate scale and process voice information from those radio networks, it’s a pretty exciting source,” he said.
Older forms of social media might also be useful: “A hypothesis I have is that discussion forums amongst relative specialists or enthusiasts, or geographically focused discussion boards—I think those are potentially quite useful in this space,” he said.
Inkpen is more optimistic about tweets but agreed that there are plenty of other potentially useful inputs, such as Reddit forums.
Demarsh distinguishes between detecting new illnesses and assessing them as threats. GPHIN detected that people were getting sick with something strange in China at the very end of 2019—it just didn’t convey a clear assessment that those illnesses warranted global concern.
“That’s a key learning for BlueDot: more emphasis on the ‘assessment’ part of the detect-assess-respond epidemic-intelligence cycle,” he said. The next step is translating that assessment into language other officials and the public can act on.
Specialists, he said, “can understand it at a technical level, we can pull out useful hidden information, connect it to accurate forecasting models and get a good sense for ourselves. [But] we need to communicate that clearly. And the task isn’t done until it’s communicated and understood, and people [are] empowered to act.”