GIS vector-based spatial data overlay processing is much more complex than raster data processing. GIS data files can be huge, and overlaying them is computationally intensive. Little work has been done on processing large volumes of vector geospatial data through parallel or distributed computing, and none of it on cloud platforms. We have created the Crayons system, which we believe to be the first parallel framework on a cloud platform for overlay analysis of two GIS layers of polygonal data in GML format. Targeting the Windows Azure cloud platform was a challenge because it currently lacks support for traditional distributed computing infrastructures such as MPI or MapReduce. This paper presents the basic design of the Crayons framework and explores the amount of parallelism available in GIS computation over Azure. We show how the computation underlying this application can be effectively partitioned into independent tasks, and how Azure communication and storage mechanisms can be used to distribute these tasks among processors (Azure workers). We report how much scalability the Azure platform affords to the various computational and I/O phases, and point out bottlenecks in both the algorithms and the Azure platform. Our experimental results show excellent speedups for the basic overlay computation, highlight a possible need for a new, distributed representation and storage of GIS files, and promise further scalability over larger clouds and data files.
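
The idea of partitioning an overlay into independent tasks and handing them to workers through a shared queue can be illustrated with a small sketch. This is not the Crayons implementation: polygons are reduced to axis-aligned bounding boxes, Python's in-process `queue.Queue` and threads stand in for an Azure queue and Azure workers, and every name here (`make_tasks`, `run_overlay`, and so on) is hypothetical.

```python
import queue
import threading

# Assumption: each "polygon" is just a bounding box (min_x, min_y, max_x, max_y).
# A real overlay would clip actual polygon geometry read from GML files.


def boxes_intersect(a, b):
    """True if two bounding boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def box_intersection(a, b):
    """Overlap region of two boxes that are known to intersect."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))


def make_tasks(base_layer, overlay_layer):
    """Build one independent task per base polygon.

    A task pairs a base polygon id with the ids of the overlay polygons
    whose boxes intersect it, so no task depends on another task's output.
    """
    tasks = []
    for base_id, base_box in base_layer.items():
        candidates = [
            oid for oid, obox in overlay_layer.items()
            if boxes_intersect(base_box, obox)
        ]
        if candidates:
            tasks.append((base_id, candidates))
    return tasks


def worker(task_queue, base_layer, overlay_layer, results, lock):
    """Pull tasks until the queue is empty (stand-in for an Azure worker)."""
    while True:
        try:
            base_id, candidates = task_queue.get_nowait()
        except queue.Empty:
            return
        output = [
            (base_id, oid, box_intersection(base_layer[base_id], overlay_layer[oid]))
            for oid in candidates
        ]
        with lock:
            results.extend(output)


def run_overlay(base_layer, overlay_layer, num_workers=4):
    """Enqueue all tasks, then let the workers drain the queue in parallel."""
    task_queue = queue.Queue()
    for task in make_tasks(base_layer, overlay_layer):
        task_queue.put(task)
    results = []
    lock = threading.Lock()
    threads = [
        threading.Thread(
            target=worker,
            args=(task_queue, base_layer, overlay_layer, results, lock),
        )
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)
```

Because each task carries everything needed to process one base polygon, workers never coordinate with each other; the shared queue is the only point of contact, which is the property that lets such a design scale with the number of workers.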

Learn more about Crayons at Dimos' Project Page

Last edited Nov 12, 2011 at 12:20 AM by dinwal, version 5