How Twitter Feels The Earth Move

In the 1830s, Samuel Morse developed the code that bears his name to transmit messages via dots and dashes.

Strung together, the bits and pieces communicated words and conveyed information far more quickly over far greater distances than ever before. Twitter allows 140 characters or fewer per tweet, and yet the volume and velocity of tweets don’t just tell individual stories; they aggregate to provide viable, robust information.

Just as Google could predict the next flu outbreak weeks before the Centers for Disease Control and Prevention (CDC), Twitter makes the call on earthquakes. With the correct algorithms, tweets provide data points that can both detect quakes and verify quake information from mechanical sensors, or stand in for sensors where locations are too remote to have them.

How the USGS uses Twitter data to track earthquakes

#DataStories is where we interview people doing interesting work with Twitter data. This week we’re speaking with Paul Earle and Michelle Guy of the USGS on how they use Twitter data to monitor earthquakes.

After the disastrous Sichuan earthquake in 2008, people turned to Twitter to share firsthand information about the earthquake. What amazed many was the impression that Twitter was faster at reporting the earthquake than the U.S. Geological Survey (USGS), the official government organization in charge of tracking such events.

This Twitter activity wasn’t a big surprise to the USGS. The USGS National Earthquake Information Center (NEIC) processes data from about 2,000 real-time earthquake sensors, with the majority based in the United States. That leaves a lot of empty space in the world with no sensors. On the other hand, there are hundreds of millions of people using Twitter who can report earthquakes. At first, the USGS staff were a bit skeptical that Twitter could be used as a detection system for earthquakes, but when they looked into it, they were surprised at the effectiveness of Twitter data for detection.

USGS staffers Paul Earle, a seismologist, and Michelle Guy, a software developer, teamed up to look at how Twitter data could be used for earthquake detection and verification. Working with data from Twitter’s Public API, they decided to apply the same time-series event detection method they use when detecting earthquakes. This gave them a baseline for earthquake-related chatter, but they decided to dig in even further. They found that people Tweeting about actual earthquakes kept their Tweets really short, even just to ask, “earthquake?” Concluding that people who are experiencing earthquakes aren’t very chatty, they started filtering out Tweets with more than seven words. They also recognized that people sharing links or the size of the earthquake were significantly less likely to be offering firsthand reports, so they filtered out any Tweets sharing a link or a number. Ultimately, this filtered stream proved to be very effective at determining when earthquakes occurred globally.
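The three filtering rules described above (at most seven words, no link, no number) can be sketched in a few lines of Python. This is only an illustration, not the USGS code: the function name, the keyword list, and the regular expressions are assumptions made for the example, and the real system's tokenization and multilingual keyword handling are not described in the interview.

```python
import re

MAX_WORDS = 7  # from the article: Tweets with more than seven words are dropped

# Assumed patterns: a crude link check and "contains any digit" as the number check.
LINK_PATTERN = re.compile(r"https?://|www\.", re.IGNORECASE)
NUMBER_PATTERN = re.compile(r"\d")

# Hypothetical keyword list; the interview does not publish the actual terms.
KEYWORDS = ("earthquake", "terremoto", "temblor")


def looks_firsthand(text: str) -> bool:
    """Return True if a Tweet passes the short / no-link / no-number filter."""
    lowered = text.lower()
    if not any(keyword in lowered for keyword in KEYWORDS):
        return False  # not about earthquakes at all
    if len(text.split()) > MAX_WORDS:
        return False  # too chatty to be a likely firsthand report
    if LINK_PATTERN.search(text):
        return False  # links suggest a shared news item
    if NUMBER_PATTERN.search(text):
        return False  # numbers suggest a reported magnitude
    return True


if __name__ == "__main__":
    for tweet in ["earthquake?", "Magnitude 6.0 earthquake near Napa"]:
        print(tweet, "->", looks_firsthand(tweet))
```

A bare "earthquake?" passes, while a Tweet quoting a magnitude is rejected by the number rule even though it is short.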

USGS Modeling Twitter Data to Detect Earthquakes

While I was at the USGS office in Golden, Colo., interviewing Michelle and Paul, three earthquakes happened in a relatively short time. Using Twitter data, their system was able to pick up on an aftershock in Chile within one minute and 20 seconds, and it took only 14 Tweets from the filtered stream to trigger an email alert. The other two earthquakes, off Easter Island and Indonesia, weren’t picked up because they were not widely felt.
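The interview does not name the time-series event detection method the team borrowed from seismology. One widely used seismic trigger is the short-term-average over long-term-average (STA/LTA) ratio, and the sketch below applies that idea to per-interval counts of filtered Tweets purely as an illustration. The window lengths, threshold, and floor value are made-up parameters, not USGS settings.

```python
from typing import Optional, Sequence


def sta_lta_trigger(
    counts: Sequence[int],
    sta_len: int = 2,       # assumed: recent intervals averaged for the short term
    lta_len: int = 10,      # assumed: earlier intervals averaged for the background
    threshold: float = 5.0, # assumed: ratio at which an alert would fire
    min_lta: float = 1.0,   # floor so a silent background cannot divide by zero
) -> Optional[int]:
    """Return the index of the first interval that triggers, or None."""
    for end in range(lta_len + sta_len, len(counts) + 1):
        short_window = counts[end - sta_len:end]
        long_window = counts[end - sta_len - lta_len:end - sta_len]
        sta = sum(short_window) / sta_len
        lta = max(sum(long_window) / lta_len, min_lta)
        if sta / lta >= threshold:
            return end - 1  # index of the most recent interval in the short window
    return None


if __name__ == "__main__":
    # Hypothetical counts of filtered Tweets per interval: quiet chatter, then a burst.
    tweet_counts = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 7, 9]
    print(sta_lta_trigger(tweet_counts))
```

Keeping the long-term window strictly before the short-term one means a sudden burst does not inflate its own baseline, which is why a handful of Tweets against a quiet background can be enough to trigger.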

USGS map of earthquakes

On any given day, the NEIC processes about 70 earthquakes, but only a small handful of these might be felt. They might take place in the ocean, deep in the earth, or away from populated areas. Twitter data can be crucial in helping identify earthquakes felt by humans, and can trigger an alert typically in under two minutes. The 2014 earthquake in Napa was detected by USGS in 29 seconds using Twitter data, likely due to the tech-savvy population that dominates the area. (Origin time was 2014-08-24 10:20:44 UTC and Twitter data detection time was 2014-08-24 10:21:13.)
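The 29-second figure follows directly from the two timestamps quoted above. A quick check in Python, assuming the detection time is also in UTC (the article states the zone only for the origin time):

```python
from datetime import datetime, timezone

# Timestamps from the article; the detection time zone is assumed to be UTC.
origin_time = datetime(2014, 8, 24, 10, 20, 44, tzinfo=timezone.utc)
detection_time = datetime(2014, 8, 24, 10, 21, 13, tzinfo=timezone.utc)

latency_seconds = (detection_time - origin_time).total_seconds()
print(latency_seconds)  # 29.0
```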

The USGS monitors for earthquakes in many languages, and the words used can be a clue as to the magnitude and location of the earthquake. Chile has two words for earthquakes: terremoto and temblor; terremoto is used to indicate a bigger quake. This one in Chile started with people asking if it was a terremoto, but others realized that it was a temblor.

As the USGS team notes, Twitter data augments their own detection work on felt earthquakes. If they’re getting reports of an earthquake in a populated area but no Tweets from there, that’s a good indicator to them that it’s a false alarm. It’s also very cost-effective for the USGS, because they use Twitter’s Public API and open-source software such as Kibana and Elasticsearch to help determine when earthquakes occur.

Next, the USGS team says that they want to determine whether they can feed Twitter-based detections into seismic algorithms, and whether that can speed up alerts even more.

Thanks to Paul Earle and Michelle Guy of the USGS for taking the time to speak with us.