Award number: 1845789
Award Duration: February 1, 2019 to January 31, 2024
Award Title: CAREER: Towards Spatial Data Systems Support for the Internet of Things
PI: Mohamed Sarwat
Students:
Jia Yu
Yuhan Sun
Ankita Sharma
Kanchan Chowdhury
Venkata Vamshikrishna Meduri
Project Goals:
The overarching goal of the project is to develop graph database systems technology that can efficiently store, manage, and execute real-time spatial / spatio-temporal graph queries on linked IoT data and support scalable processing of large-scale IoT data. To achieve that, the research effort in this project includes the design and development of novel spatial data storage, indexing, processing and management techniques that scale to the ever-increasing volume and fast rate of data collected from IoT devices. As opposed to existing big spatial data systems and numerical frameworks, a novel IoT data abstraction method will extend recently developed big spatial data systems to provide an Application Programming Interface (API) for development of IoT applications. The newly developed method takes into account not only the spatial distribution of the data, but also the physical and mathematical characteristics of signals generated by each sensor. Furthermore, the new IoT abstraction method includes novel query operators as well as query optimization strategies, which can efficiently evaluate a hybrid workload that involves classic spatial and spatio-temporal data processing and digital signal processing operations on IoT data. The new method also bridges the gap between IoT devices and the spatial data system by designing a middleware that integrates the IoT devices streaming data to the central system with the requirements of applications accessing such IoT data. Another outcome of the project is a graph query processor that can optimize and evaluate general-purpose spatial predicates such as spatial range, join and K-Nearest Neighbors (KNN) as well as temporal predicates in a graph query issued on linked IoT data in real time, even as new things and observations are regularly added to the IoT graph.
Publications:
- [IEEE MDM 2019] Building a Large-Scale Microscopic Road Network Traffic Simulator in Apache Spark
Zishan Fu, Jia Yu, Mohamed Sarwat.
To appear in proceedings of the IEEE International Conference on Mobile Data Management, in Hong Kong, China June 2019
- [IEEE ICDE 2019] GeoSparkViz in Action: A Data System with built-in support for Geospatial Visualization
Jia Yu and Mohamed Sarwat.
To appear in proceedings of the IEEE International Conference on Data Engineering, in Macau, China April 2019 (Demo Track)
- [IEEE ICDE 2019] Demonstrating Spindra: A Geographic Knowledge Graph Management System
Yuhan Sun, Jia Yu, and Mohamed Sarwat.
To appear in proceedings of the IEEE International Conference on Data Engineering, in Macau, China April 2019 (Demo Track)
- [IEEE MDM 2019] An Automated Framework for Explaining Facts Extracted From Mobility Datasets
Anique Tahir, Yuhan Sun and Mohamed Sarwat.
To appear in proceedings of the IEEE International Conference on Mobile Data Management, in Hong Kong, China June 2019
- [SSTD 2019] Demonstrating GeoSparkSim: A Scalable Microscopic Road Network Traffic Simulator Based on Apache Spark
Zishan Fu, Jia Yu, and Mohamed Sarwat
To appear in proceedings of the International Symposium on Spatial and SpatioTemporal Databases, in Vienna, Austria August 2019 (Demo Track)
Best Demonstration Paper Award Runner-Up
- [GeoInformatica 2019] Spatial Data Management in Apache Spark: The GeoSpark Perspective and Beyond [Source Code]
Jia Yu, Zongsi Zhang and Mohamed Sarwat.
in Springer International Journal on Advances of Computer Science for Geographic Information Systems, Volume 23, Number 1, Pages 37-78, January 2019
- [GeoInformatica 2019] A Spatially-Pruned Vertex Expansion Operator in Graph Database Systems
Yuhan Sun and Mohamed Sarwat
in Springer International Journal on Advances of Computer Science for Geographic Information Systems, Volume 23, Number 3, Pages 397-423
- Spatio-Social Data
Yuhan Sun and Mohamed Sarwat.
In the Encyclopedia of Big Data Technologies, 2019, Editors: Timos Sellis, Aamir Cheema
Software Downloads
We released a new version of GeoSpark as open source software (GitHub repo: https://github.com/DataSystemsLab/GeoSpark), which has attracted many users and contributors from both industry and academia. Currently, GeoSpark is downloaded more than 10,000 times per month and its open source community is growing steadily. For instance, Uber, Lyft, Apple, MoBike, American Family Insurance and Facebook are using GeoSpark to power their geospatial analytics applications. Our team also gave an academic tutorial on “Geospatial Data Management in Apache Spark” at ICDE 2019 and a practical tutorial on “Spatial Data Wrangling using GeoSpark” at the ACM SIGSPATIAL Spatial API workshop. Recently, GeoSpark was featured by Databricks (the main company maintaining Apache Spark) in an official blog article about “Processing Geospatial Data at Scale” as a popular Spark-based framework used by Spark customers.
Presentations
“Optimizing Systems For Geolocation Data – the works” Early Career Distinguished Speaker at IEEE MDM 2019
“Geospatial Data Management in Apache Spark” in IEEE ICDE 2019
“Spatial Data Wrangling using GeoSpark – A Step By Step Tutorial” at ACM SIGSPATIAL SpatialAPI workshop 2019
Awards/Prizes
- Mohamed Sarwat (the PI) was named Early Career Distinguished Lecturer by the IEEE Mobile Data Management Community
- Best Demonstration Paper Runner Up at SSTD 2019
Highlights/Press Releases
Acknowledgement: “This material is based upon work supported by the National Science Foundation under Grant No. 1845789.”
Disclaimer: “Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.”
Date of last update: February 10th 2020