Monday, June 11, 2012

I need my proprietary GIS for this hard-core geospatial analysis, right?


Bill Dollins blogged recently that he doesn't see much difference in capability between open source and proprietary geospatial tools.   


I think Web/GIS developers would agree.  But I'm not so sure that geospatial analysts would agree.


I agree, if we're talking about building and using map apps -- displaying points, lines, and polygons, and routine map functions like routing, thematic mapping, and interactive display of the map and underlying data.


But what about hard-core geospatial analysis?  Case at hand:


I'm working on an emergency management application that estimates the age-group populations that may be affected by an emergency event.   The only population and demographics data I have is census block groups.  I create a buffer around the event, then clip the census block groups that intersect the buffer.




The problem is that for many of the included block groups, only a small portion lies within the event zone.  So, only a fraction of the population in each of those groups should be included.


I was ready to write the function for my Python script that would calculate the percentage of the area of  each block group that got clipped.  If only 15% of the area got clipped, I would grab only 15% of the population counts for that block group to add to my population total.
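
For illustration, a minimal sketch of that area-weighting calculation in plain Python could look like the following. It is only a sketch under stated assumptions: the function names (`polygon_area`, `circle_buffer`, `clip_to_convex`, `estimate_population`) are hypothetical, the buffer is approximated by a 64-sided polygon, block groups are assumed to be simple polygons without holes, and coordinates are assumed to be in a projected system with linear units such as meters. A real script would more likely lean on a geometry library than on hand-rolled clipping.

```python
import math

def polygon_area(pts):
    """Unsigned area of a simple polygon (shoelace formula)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def circle_buffer(cx, cy, radius, segments=64):
    """Approximate a circular buffer as a counter-clockwise convex polygon."""
    return [(cx + radius * math.cos(2 * math.pi * i / segments),
             cy + radius * math.sin(2 * math.pi * i / segments))
            for i in range(segments)]

def clip_to_convex(subject, clip):
    """Sutherland-Hodgman clipping of `subject` against a CCW convex `clip`."""
    def inside(p, a, b):
        # True when p lies on or to the left of the directed edge a -> b.
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def crossing(p, q, a, b):
        # Point where segment p-q crosses the infinite line through a and b.
        dx1, dy1 = q[0] - p[0], q[1] - p[1]
        dx2, dy2 = b[0] - a[0], b[1] - a[1]
        t = ((a[0] - p[0]) * dy2 - (a[1] - p[1]) * dx2) / (dx1 * dy2 - dy1 * dx2)
        return (p[0] + t * dx1, p[1] + t * dy1)

    output = list(subject)
    for a, b in zip(clip, clip[1:] + clip[:1]):
        if not output:
            break
        current, output = output, []
        for p, q in zip(current, current[1:] + current[:1]):
            if inside(q, a, b):
                if not inside(p, a, b):
                    output.append(crossing(p, q, a, b))
                output.append(q)
            elif inside(p, a, b):
                output.append(crossing(p, q, a, b))
    return output

def estimate_population(block_groups, buffer_poly):
    """Sum each block group's population scaled by its clipped-area fraction.

    `block_groups` is a list of (polygon_points, population) pairs.
    """
    total = 0.0
    for pts, population in block_groups:
        full = polygon_area(pts)
        clipped = clip_to_convex(pts, buffer_poly)
        if full > 0 and len(clipped) >= 3:
            total += population * polygon_area(clipped) / full
    return total
```

With this sketch, a block group that has 15% of its area inside the buffer contributes 15% of its population count, which is exactly the uniform-density assumption described above.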


Then I saw Esri's announcement that ArcGIS 10.1 now includes an areal interpolation tool.




Seems like the perfect fit for what I need to do.  I can create a grid of 100 x 100 meter polygons and re-assign portions of the population counts to these much smaller and regularly shaped grid polygons.  And save this data layer for use any time I need a more precise population estimate.
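
To make the grid idea concrete, here is a rough sketch of spreading block group populations onto 100 x 100 meter cells by simple area weighting. This is not the method Esri's tool uses, and it assumes population is evenly distributed within each block group. The names `shoelace_area`, `clip_halfplane`, `clip_to_cell`, and `spread_to_grid` are hypothetical, polygons are assumed to be simple and without holes, and coordinates are assumed to be in meters.

```python
import math

def shoelace_area(pts):
    """Unsigned area of a simple polygon."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def clip_halfplane(pts, axis, bound, keep_greater):
    """Keep the part of `pts` where coord[axis] >= bound (or <= bound)."""
    def inside(p):
        return p[axis] >= bound if keep_greater else p[axis] <= bound

    out = []
    for p, q in zip(pts, pts[1:] + pts[:1]):
        if inside(p) != inside(q):
            # The edge crosses the boundary line; add the crossing point.
            t = (bound - p[axis]) / (q[axis] - p[axis])
            out.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
        if inside(q):
            out.append(q)
    return out

def clip_to_cell(pts, xmin, ymin, xmax, ymax):
    """Clip a polygon to an axis-aligned rectangle, one side at a time."""
    pts = clip_halfplane(pts, 0, xmin, True)
    pts = clip_halfplane(pts, 0, xmax, False)
    pts = clip_halfplane(pts, 1, ymin, True)
    pts = clip_halfplane(pts, 1, ymax, False)
    return pts

def spread_to_grid(block_groups, cell=100.0):
    """Return {(col, row): population} for square cells of size `cell`.

    `block_groups` is a list of (polygon_points, population) pairs.
    """
    grid = {}
    for pts, population in block_groups:
        full = shoelace_area(pts)
        if full == 0:
            continue
        xs = [x for x, _ in pts]
        ys = [y for _, y in pts]
        # Only visit cells that overlap this block group's bounding box.
        for col in range(math.floor(min(xs) / cell), math.ceil(max(xs) / cell)):
            for row in range(math.floor(min(ys) / cell), math.ceil(max(ys) / cell)):
                piece = clip_to_cell(pts, col * cell, row * cell,
                                     (col + 1) * cell, (row + 1) * cell)
                if len(piece) >= 3:
                    share = population * shoelace_area(piece) / full
                    if share > 0:
                        grid[(col, row)] = grid.get((col, row), 0.0) + share
    return grid
```

Summing the values of the returned dictionary gives back the original population total, so the grid can be saved once and then queried by selecting the cells that fall inside any later event buffer.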


Can any open source geospatial tools do this?  I admit that I don't know.


I think it would be good to do something like what Tobin Bradley did to evaluate the importance of different elements of the Google Maps API.  Maybe do the same with Esri's ArcToolbox.  How much of this tool set is present in open source geospatial software?  How important are the missing tools?


Has anyone besides Esri done an assessment like this?



5 comments:

  1. Don,

    I'll admit to being more of a developer than an analyst but I'm pretty sure you can do that with QGIS. Others would need to chime in about how.

    While I said that I couldn't perceive a meaningful difference between proprietary and open-source tools, I will admit that either may be more suitable than the other for any particular task. Obviously, requirements matter.

    Thanks for stopping by my blog.

    Bill

  2. This is definitely something that can be done with open source tools. Using a spatially enabled database, this could be done with SQL queries alone. The simplest way to approach the first question would be to "intersect" (the term differs depending on the tool you are using) the original polygons with the buffer, then compare the size of each intersected polygon to the size of its original polygon. That gives you the ratio to use to determine the resulting population. This could also be done using GRASS on the desktop, but it would be much slower than using the spatial database directly.

    The issue with this simplified analysis, as well as the Esri interpolation tools, is that they both completely ignore the varying population density within the census block groups, especially in larger block groups. The ideal approach would be to combine land use/land cover data to create a population density surface that is much closer to reality. Take a look at the research here for more information: http://www.acsu.buffalo.edu/~lewang/pdf/lewang_sample10.pdf

  3. If Don's question had been about clipping then calc'ing new values proportional to area, then it would be easy for even any lightweight GIS package to handle it, and Don would hardly need to ask at all. (Shoot... for a coder even less than that; just data table access and two geometric functions.) But then we all agree that doing this "ignores the varying density within" each polygon, as Adam put it.

    But that's not what Don's asking. And I need to disagree with Adam's "as well as" part. I mean, the sole purpose of area interpolation is to *not* ignore variations within. Better put, it's guessing what the variations within might be based on the values/density of surrounding areas (think Focal Mean). Of course you need to play an active role in this guesswork, choosing parameters during rasterizing and running the stats that give this guesswork the best chance at being close to right. In other words, you better know your data. And thanks Adam for the idea of mixing in some land use data to further improve the surface. Awesome idea.

    If your GIS package can do vector-to-raster, and run some basic raster ops like Focal Mean and other basic map algebra, then that might do it. I don't know enough to clearly answer "which ones", but hope that helps anyway.

    (Disclaimer: I work for Esri, and I don't pretend to know the algorithms behind the area interp tools mentioned in this article. I'm guessing some kind of focal statistics are in the mix somewhere.)

  4. QGIS already includes a create Vector Grid or create Point Grid tool; use that to generate the 100x100m polygons. Census TIGER already has the land area per square mile, so the rest is to write a Python script that iterates through each block, calculates the proportional population for each 100x100m polygon, and adds that value to each polygon. I probably would use a point grid instead, since selection would be faster.

  5. In the absolute worst case, you could perform each of the necessary component operations with the painful-to-use GRASS UI. Waaaaaaay the heck easier would be to write an elegant SQL statement encompassing the lot of it, though I realize this requires an understanding of PostGIS structure and functions. Somewhere in between is a click-lots-of-buttons procedure that leverages the QGIS vector tools and some of the richer GDAL-based plugins; this looks very similar to the procedure you've used in ArcMap/ArcToolBox. All of the above options are open-source.

    Open-source analysis tools are up to almost any "geoprocessing" you can do in ArcGIS. They're not well-publicized and in many cases there aren't good examples of their application. Open-source web and visualization tools have gotten a lot more press because the demand is there. The community of geospatial analysts can drive a similar discussion if a critical mass is reached, demanding open-source tools and training.
