Geospatial crowdsourced data fitness analysis for spatial data infrastructure based disaster management actions

PhD Thesis


Koswatte, Saman. 2017. Geospatial crowdsourced data fitness analysis for spatial data infrastructure based disaster management actions. PhD Thesis, Doctor of Philosophy. University of Southern Queensland. https://doi.org/10.26192/5c09fc67f0cd3
Title: Geospatial crowdsourced data fitness analysis for spatial data infrastructure based disaster management actions
Type: PhD Thesis
Author: Koswatte, Saman
Supervisors: McDougall, Kevin; Liu, Xiaoye
Institution of Origin: University of Southern Queensland
Qualification Name: Doctor of Philosophy
Number of Pages: 194
Year: 2017
Digital Object Identifier (DOI): https://doi.org/10.26192/5c09fc67f0cd3
Abstract

The reporting of disasters has shifted from official media reports to citizen reporters at the disaster scene. This kind of crowd-based reporting, whether related to disasters or other events, is often referred to as 'Crowdsourced Data' (CSD). CSD are freely and widely available thanks to current technological advancements. However, the quality of CSD is often problematic because it is created by citizens of varying skills and backgrounds. CSD is generally considered unstructured, and its quality remains poorly defined. Moreover, CSD may lack location information, and any locations that are available may be incomplete or of poor quality. Traditional data quality assessment methods and parameters are also often incompatible with CSD because it is unstructured, undocumented and missing metadata. Although other research has identified credibility and relevance as possible CSD quality assessment indicators, the available assessment methods for these indicators are still immature.

In the 2011 Australian floods, citizens and disaster management administrators used the Ushahidi Crowd-mapping platform and the Twitter social media platform extensively to communicate flood-related information, including hazards, evacuations, help services, road closures and property damage. This research designed a CSD quality assessment framework and tested the quality of the Ushahidi Crowdmap and Twitter data from the 2011 Australian floods. In particular, it explored location availability and location quality assessment, semantic extraction of hidden location toponyms, and the analysis of the credibility and relevance of reports. The research followed a Design Science (DS) research method, which is often utilised in Information Science (IS) based research.

The location availability of the Ushahidi Crowdmap and Twitter data was assessed, and the quality of the available locations was evaluated by comparing them against three datasets: Google Maps, OpenStreetMap (OSM) and the Queensland Department of Natural Resources and Mines' (QDNRM) road data. Missing locations were semantically extracted using Natural Language Processing (NLP) and gazetteer lookup techniques. The credibility of the Ushahidi Crowdmap dataset was assessed using a naive Bayesian Network (BN) model of the kind commonly utilised in spam email detection. CSD relevance was assessed by adapting Geographic Information Retrieval (GIR) relevance assessment techniques, which are also utilised in the IT sector. Thematic and geographic relevance were assessed using a Term Frequency – Inverse Document Frequency Vector Space Model (TF-IDF VSM) and NLP based on semantic gazetteers.
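As an illustration of the thematic relevance step described above, the following is a minimal sketch of TF-IDF vector space ranking with cosine similarity. It is not the thesis implementation: the function names, the smoothed IDF formula, the sample reports and the query are all assumptions chosen for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF (an assumed variant) so ubiquitous terms keep a small weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (f / len(toks)) * idf[t] for t, f in tf.items()})
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_relevance(query, reports):
    """Return (score, report_index) pairs, most relevant first."""
    vectors, idf = tfidf_vectors(reports)
    q_tf = Counter(tokenize(query))
    q_len = sum(q_tf.values())
    # Query terms unseen in the reports carry no weight.
    q_vec = {t: (f / q_len) * idf[t] for t, f in q_tf.items() if t in idf}
    scores = [(cosine(q_vec, v), i) for i, v in enumerate(vectors)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))


# Hypothetical reports and query, invented for this sketch.
reports = [
    "Road closed due to flood water near the bridge",
    "Lovely sunny day at the beach",
    "Flood evacuation centre open at the showgrounds",
]
ranking = rank_by_relevance("flood road closure", reports)
for score, index in ranking:
    print(f"{score:.3f}  {reports[index]}")
```

In this sketch a report sharing more of the weighted query terms ranks higher, and a report sharing none scores zero. Geographic relevance, which the abstract assesses with semantic gazetteers, is not covered here.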

Results of the CSD location comparison showed that the combined use of non-authoritative and authoritative data improved location determination. The semantic location analysis improved the location availability of the tweets and Crowdmap data to some extent; however, the quality of the new locations remained uncertain. The credibility analysis revealed that spam email detection approaches are feasible for CSD credibility detection. However, it was critical to train the model in a controlled environment using structured training, including modified training samples. The use of GIR techniques for CSD relevance analysis provided promising results. A separate relevance-ranked list of the same CSD was prepared through manual analysis. The two lists generally agreed, which indicated the system's potential to analyse relevance in a similar way to humans.
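One common way to quantify how closely a system-generated ranking agrees with a manually prepared one is Spearman's rank correlation. The abstract does not state which agreement measure was used, so the sketch below is only an assumed illustration; the function name and the report identifiers are hypothetical.

```python
def spearman_rho(ranking_a, ranking_b):
    """Spearman's rho for two rankings of the same items, with no ties.

    Each ranking is a list of item identifiers ordered from most to least
    relevant. Returns 1.0 for identical order and -1.0 for reversed order.
    """
    if len(set(ranking_a)) != len(ranking_a) or set(ranking_a) != set(ranking_b):
        raise ValueError("rankings must contain the same unique items")
    n = len(ranking_a)
    if n < 2:
        raise ValueError("need at least two items to compare rankings")
    position_b = {item: i for i, item in enumerate(ranking_b)}
    # Sum of squared differences between each item's two rank positions.
    d_squared = sum((i - position_b[item]) ** 2 for i, item in enumerate(ranking_a))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical report identifiers, invented for this sketch.
system_ranking = ["r3", "r1", "r4", "r2"]
manual_ranking = ["r3", "r4", "r1", "r2"]
rho = spearman_rho(system_ranking, manual_ranking)
print(round(rho, 3))
```

A value near 1 would indicate that the automated and manual lists order the reports similarly, while a value near 0 would indicate little agreement.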

This research showed that CSD fitness analysis can potentially improve the accuracy, reliability and currency of CSD and may be utilised to fill information gaps in authoritative sources. The integrated and autonomous CSD qualification framework presented provides a guide for flood disaster first responders and could be adapted to support other forms of emergencies.

Keywords: crowdsourced data; relevance; semantics; geographic information retrieval; natural language processing; disaster management
ANZSRC Field of Research 2020:
469999. Other information and computing sciences not elsewhere classified
370903. Natural hazards
401302. Geospatial information systems and geospatial data modelling
Byline Affiliations: School of Civil Engineering and Surveying
Permalink: https://research.usq.edu.au/item/q4wyz/geospatial-crowdsourced-data-fitness-analysis-for-spatial-data-infrastructure-based-disaster-management-actions

Download files


Published Version
Koswatte_2017_whole.pdf
File access level: Anyone

Related outputs

Crowd-Assisted Flood Disaster Management
Koswatte, S., McDougall, K. and Liu, X. 2023. "Crowd-Assisted Flood Disaster Management." Singh, Vijay P., Yadav, Shalini, Yadav, Krishna Kumar, Corzo, Gerald Augusto, Munoz-Arriola, Francisco and Yadava, Ram Narayan (eds.) Application of Remote Sensing and GIS in Natural Resources and Built Infrastructure Management. Switzerland. Springer. pp. 39-55
Relevance assessment of crowdsourced data (CSD) using semantics and geographic information retrieval (GIR) techniques
Koswatte, Saman, McDougall, Kevin and Liu, Xiaoye. 2018. "Relevance assessment of crowdsourced data (CSD) using semantics and geographic information retrieval (GIR) techniques." ISPRS International Journal of Geo-Information. 7 (7), pp. 1-18. https://doi.org/10.3390/ijgi7070256
VGI and crowdsourced data credibility analysis using spam email detection techniques
Koswatte, Saman, McDougall, Kevin and Liu, Xiaoye. 2018. "VGI and crowdsourced data credibility analysis using spam email detection techniques." International Journal of Digital Earth: a new journal for a new vision. 11 (5), pp. 520-532. https://doi.org/10.1080/17538947.2017.1341558
Semantic location extraction from crowdsourced data
Koswatte, Saman, McDougall, Kevin and Liu, Xiaoye. 2016. "Semantic location extraction from crowdsourced data." 23rd International Society for Photogrammetry and Remote Sensing (ISPRS 2016). Prague, Czech Republic 12 - 19 Jul 2016 Germany. https://doi.org/10.5194/isprsarchives-XLI-B2-543-2016
SDI and crowdsourced spatial information management automation for disaster management
Koswatte, S., McDougall, K. and Liu, X. 2015. "SDI and crowdsourced spatial information management automation for disaster management." Survey Review. 47 (344), pp. 307-315. https://doi.org/10.1179/1752270615Y.0000000008
SDI and crowdsourced spatial information management automation for disaster management
Koswatte, Saman, McDougall, Kevin and Liu, Xiaoye. 2014. "SDI and crowdsourced spatial information management automation for disaster management." FIG Commission 3 Workshop 2014. Bologna, Italy 04 - 07 Nov 2014