Robert Munro / Rob Munro
I'm a computational linguist working in communication technologies. This covers a broad area of technology and development, from crowdsourcing and machine learning for extracting rich information from natural language, to the installation of supporting infrastructure.
Some past work:
Global Viral Forecasting. In 2011 I worked at Global Viral Forecasting (now Metabiota) as the Chief Technology Officer for EpidemicIQ, a system that tracks disease outbreaks worldwide. The goal is to predict and prevent future epidemics. Google Flu Trends found that you can predict flu outbreaks by simply modeling the symptoms that people choose to search for. Imagine what you could predict if you modeled all the world's available medical information and reports.
Crowdsourcing language and cognition. Language and cognition tasks that used to cost thousands of dollars and take several months can now be completed in a matter of hours for a few dollars. While I originally worked in commercial crowdsourcing applications, and more recently in social development, many of the most exciting applications are in scientific research.
In July 2011, I helped run the Workshop on Crowdsourcing Technologies for Language and Cognition Studies, for the first time bringing together the researchers who are embracing these new technologies and strategies. We are already seeing the beginning of a paradigm shift in language research back to empirically savvy approaches. Crowdsourcing technologies are set to become one of the leading tools in this new wave of research methodologies.
Mission 4636. I coordinated the translation, geolocation and categorization of emergency text messages sent in Haiti in the wake of the January 12, 2010 earthquake. This was the only emergency response service available to people within Haiti during this critical period. The primary emergency responders were the US Military, who for the most part did not speak Haitian Kreyol or know the locations of addresses in Haiti. Working with more than 1000 Kreyol- and French-speaking volunteers from 49 countries, we created a system that turned raw text messages in Haitian Kreyol into categorized English messages with precise coordinates, with an average turnaround of just 10 minutes. According to the responders, this saved hundreds of lives and directed the first aid to tens of thousands.
In total, we processed more than 80,000 messages. It was the first time that crowdsourcing had been used for real-time humanitarian relief and it is still the largest deployment of humanitarian crowdsourcing to date.
Classifying and extracting meaning from short message communications with machine learning and natural language processing. This project was the focus of my Ph.D. It looked at methods for automatically classifying text messages (SMS) in low-resource languages, and for extracting information such as locations and the names of people.
A new architecture was developed that adapts to the variation in the language by combining subword models with incremental learning over streaming data. By looking at messages in Chichewa, Kreyol, Pashto, Urdu and Sindhi we were able to combine linguistic models with spatial and temporal information to identify the topics of messages with high accuracy and confidence.
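As a rough illustration only, the idea of pairing subword features with incremental learning can be sketched as below. This is not the architecture described above: the character n-gram features, the perceptron-style update, and every name in the code (char_ngrams, StreamingSubwordClassifier) are assumptions chosen for brevity, and the spatial and temporal information is left out entirely.

```python
from collections import defaultdict


def char_ngrams(text, n_min=2, n_max=4):
    """Return character n-grams of a message, with ^ and $ marking its edges.

    Subword features like these tolerate spelling variation better than
    whole-word features, because variant spellings still share most n-grams.
    """
    padded = f"^{text.lower()}$"
    return [padded[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(padded) - n + 1)]


class StreamingSubwordClassifier:
    """Hypothetical online multiclass perceptron over character n-grams."""

    def __init__(self):
        # weights[label][ngram] -> float
        self.weights = defaultdict(lambda: defaultdict(float))
        self.labels = set()

    def predict(self, text):
        """Return the best-scoring known label, or None before any training."""
        if not self.labels:
            return None
        feats = char_ngrams(text)
        scores = {label: sum(self.weights[label].get(f, 0.0) for f in feats)
                  for label in self.labels}
        # Sorting first makes tie-breaking deterministic.
        return max(sorted(scores), key=scores.get)

    def update(self, text, label):
        """Learn from one labeled message as it arrives in the stream.

        Returns the prediction made before the update, so accuracy can be
        tracked on the stream itself.
        """
        predicted = self.predict(text)
        self.labels.add(label)
        if predicted != label:
            for f in char_ngrams(text):
                self.weights[label][f] += 1.0
                if predicted is not None:
                    self.weights[predicted][f] -= 1.0
        return predicted


if __name__ == "__main__":
    # Made-up English messages, purely for demonstration.
    clf = StreamingSubwordClassifier()
    clf.update("need water", "supplies")
    clf.update("injured leg", "medical")
    print(clf.predict("injured"))
```

Because the model is updated one message at a time, it never needs to see the full data set in advance, which is the property that matters for streaming messages.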
Pakreport. I developed modules that allow Pakreport's information management component to outsource the value-adding tasks of translation, geolocation and categorization to volunteers working with CrowdFlower.
This means that work is cross-checked among multiple workers, so the information is not susceptible to the potential errors of any one volunteer. This ensures data quality for the aid agencies using the service, and it means that volunteers can help without fear of accidentally introducing bad information.
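One simple way to picture this kind of cross-checking is a majority vote over the answers that several volunteers give for the same message. The sketch below is an assumption for illustration, not a description of how CrowdFlower or the Pakreport modules aggregate work; the function name aggregate_judgments and the min_agreement threshold are hypothetical.

```python
from collections import Counter


def aggregate_judgments(judgments, min_agreement=0.5):
    """Combine several volunteers' answers for one message by majority vote.

    Returns (label, agreement). The label is None when no answer is given by
    more than `min_agreement` of the volunteers, so that a message without a
    clear majority can be sent to further volunteers instead of being
    accepted on weak evidence.
    """
    if not judgments:
        return None, 0.0
    counts = Counter(judgments)
    label, votes = counts.most_common(1)[0]
    agreement = votes / len(judgments)
    if agreement <= min_agreement:
        return None, agreement
    return label, agreement


if __name__ == "__main__":
    # Three hypothetical volunteers categorize the same report.
    label, agreement = aggregate_judgments(["medical", "medical", "food"])
    print(label, round(agreement, 2))
```

With a rule like this, a single mistaken answer is outvoted rather than passed on to the aid agencies.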
Reported Speech in Matses. In late 2009 I had the privilege to live with the Matses and study their language. The Matses people live in a remote corner of the Peruvian Amazon and only gave up their prior nomadic lifestyle in 1969, making theirs an under-studied and endangered language.
Reported speech in Matses is unlike that of any other language. If someone says, "I will go there tomorrow", you can quote that person directly (they said "I will go there tomorrow"), but you cannot rephrase it from your own spatio-temporal or interpersonal point of view (they said "they will come here today"). However, you are otherwise free to paraphrase (they said "I will canoe there in the morning") or extract (where did they say "I will go"?). This challenges some of the fundamental assumptions about cross-linguistic semantic constraints and raises interesting questions about the possibilities for how we encode the world we perceive.