Showing posts with label streams. Show all posts

Monday, September 26, 2011

Reading Shapefiles from the Cloud

In a previous post, I wrote about saving shapefiles to file-like objects using PyShp and demonstrated how to save a shapefile to a zip file. As of version 1.1.2, PyShp can also read from Python file-like objects, including zip files. Both the Reader object and the Writer.save() method accept keyword arguments that can be file-like objects, allowing you to read and write shapefiles without any disk activity.

In this post, we'll read a shapefile directly from a zip file on a server, entirely in memory.

Normally, to read a shapefile from the file system, you just pass the name of the file to the Reader object as a string:

import shapefile
r = shapefile.Reader("myshapefile")

If you use the keywords shp, shx, and dbf instead, you can pass file-like objects. This example reads a shapefile from a zip file hosted on a website.

import urllib2
import zipfile
import StringIO
import shapefile

cloudshape = urllib2.urlopen("http://pyshp.googlecode.com/files/GIS_CensusTract.zip")
memoryshape = StringIO.StringIO(cloudshape.read())
zipshape = zipfile.ZipFile(memoryshape)
shpname, shxname, dbfname, prjname = zipshape.namelist()
cloudshp = StringIO.StringIO(zipshape.read(shpname))
cloudshx = StringIO.StringIO(zipshape.read(shxname))
clouddbf = StringIO.StringIO(zipshape.read(dbfname))
r = shapefile.Reader(shp=cloudshp, shx=cloudshx, dbf=clouddbf)
print r.bbox
# [-89.8744162216216, 30.161122135135138, -89.1383837783784, 30.661213864864862]

If you only need to read one of the three file types, you can pass just that one keyword. Some attributes, such as Reader.shapeName, will not be available using this method.
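
The unpacking of zipshape.namelist() in the example above depends on the archive listing its members in exactly that order. A minimal sketch of a more defensive lookup, matching members by extension instead, is shown here. It is written for Python 3, where io.BytesIO takes the place of StringIO, and the helper name members_by_extension and the sample archive contents are made up for illustration.

```python
import io
import os
import zipfile


def members_by_extension(archive):
    """Map lower-cased extensions (e.g. '.shp') to member names in a ZipFile."""
    found = {}
    for name in archive.namelist():
        ext = os.path.splitext(name)[1].lower()
        # Keep the first member seen for each extension
        found.setdefault(ext, name)
    return found


# Build a small stand-in archive in memory (placeholder bytes, not real shapefile data)
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    for name in ("tracts.prj", "tracts.dbf", "tracts.SHX", "tracts.shp"):
        z.writestr(name, b"placeholder")

buf.seek(0)
with zipfile.ZipFile(buf) as archive:
    names = members_by_extension(archive)
    # Wrap each component in its own seekable buffer, as in the example above
    parts = {ext: io.BytesIO(archive.read(names[ext]))
             for ext in (".shp", ".shx", ".dbf")}
```

The resulting buffers could then be passed to the Reader through the shp, shx, and dbf keywords regardless of how the archive happens to be ordered.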

File-like objects provide a lot of power. However, it is important to note that not all file-like objects implement all of the file methods. In the above example, the object returned by urllib2.urlopen() does not provide the seek method needed by the zipfile module. Similarly, the ZipFile read() method returns the raw contents as a string rather than a file-like object. To get around both issues, we copy the data into StringIO (or cStringIO) buffers in memory to ensure compatibility. If the data is potentially too big to hold in memory, you can use the tempfile module to temporarily store the shapefile data on disk.
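
As a rough illustration of the seek issue and the tempfile fallback, the sketch below uses a made-up class, ForwardOnlyStream, to stand in for a network response that can only be read forward. It is written for Python 3 (io.BytesIO instead of StringIO); the helper name to_seekable, its spill_to_disk flag, and the 64 KB chunk size are arbitrary choices for this example.

```python
import io
import shutil
import tempfile
import zipfile


class ForwardOnlyStream(io.RawIOBase):
    """Hypothetical stand-in for a download stream that cannot seek."""

    def __init__(self, data):
        super().__init__()
        self._inner = io.BytesIO(data)

    def readable(self):
        return True

    def seekable(self):
        return False

    def readinto(self, b):
        # Fill the caller's buffer with the next chunk; 0 signals end of stream
        chunk = self._inner.read(len(b))
        b[:len(chunk)] = chunk
        return len(chunk)


def to_seekable(stream, spill_to_disk=False):
    """Copy a stream into a target that supports seek().

    The target is an in-memory buffer by default, or a temporary
    file on disk when the data may be too large for memory.
    """
    target = tempfile.TemporaryFile() if spill_to_disk else io.BytesIO()
    shutil.copyfileobj(stream, target, 64 * 1024)
    target.seek(0)
    return target


# Build a small zip archive in memory to play the role of the download
payload = io.BytesIO()
with zipfile.ZipFile(payload, "w") as z:
    z.writestr("example.dbf", b"placeholder bytes")

download = ForwardOnlyStream(payload.getvalue())
can_seek_before = download.seekable()

# The temporary file is removed automatically when it is closed
with to_seekable(download, spill_to_disk=True) as seekable_copy:
    with zipfile.ZipFile(seekable_copy) as archive:
        member_names = archive.namelist()
```

Copying in chunks keeps memory use flat when the temporary file is used, at the cost of the disk activity the in-memory approach avoids.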

Saturday, August 20, 2011

Create a Zipped Shapefile

Shapefiles consist of at least three files, so zipping these files is a common means of moving them around, especially in web applications. You can use PyShp and Python's zipfile module to create a zipped shapefile without ever saving the shapefile to disk (or the zip file, for that matter).

Python's zipfile module allows you to write files straight from buffer objects, including those from Python's StringIO or cStringIO modules. For web applications that return the zipped shapefile as part of an HTTP response, you can also write the zip file itself to a file-like object instead of to disk. The example in this post writes the zip file to disk.
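
For the web application case, a minimal sketch of writing the zip file itself to an in-memory buffer might look like the following. It is written for Python 3, so io.BytesIO stands in for StringIO, and the function name zip_to_bytes and the placeholder contents are illustrative only; in practice the values would come from the buffers that PyShp filled.

```python
import io
import zipfile


def zip_to_bytes(members):
    """Return the bytes of a zip archive built from a {name: bytes} mapping."""
    buffer = io.BytesIO()
    # ZIP_DEFLATED needs zlib; ZIP_STORED would skip compression
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as z:
        for name, data in members.items():
            z.writestr(name, data)
    # The archive is complete only after the ZipFile has been closed
    return buffer.getvalue()


zipped = zip_to_bytes({
    "myshape.shp": b"shp bytes",
    "myshape.shx": b"shx bytes",
    "myshape.dbf": b"dbf bytes",
})
# zipped now holds the whole archive and could be used as a response body
```

Reading the buffer only after the ZipFile is closed matters because the archive's central directory is written at close time.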

In Python, file-like objects provide a powerful way to re-route complex data structures from the disk to other targets such as a database, in-memory data structures, or serialized objects. In most other programming languages, file-like objects are called "streams" and work in a similar fashion. So this post also demonstrates writing shapefiles to file-like objects, using a zip file as the target.

Normally, when you save a shapefile, you call the writer.save method, which writes three files to disk. To use file-like objects, you call a separate save method for each file: writer.saveShp, writer.saveShx, and writer.saveDbf.

import zipfile
import StringIO
import shapefile

# Set up buffers for saving
shp = StringIO.StringIO()
shx = StringIO.StringIO()
dbf = StringIO.StringIO()

# Make a point shapefile
w = shapefile.Writer(shapefile.POINT)
w.point(90.3, 30)
w.point(92, 40)
w.point(-122.4, 30)
w.point(-90, 35.1)
w.field('FIRST_FLD')
w.field('SECOND_FLD','C','40')
w.record('First','Point')
w.record('Second','Point')
w.record('Third','Point')
w.record('Fourth','Point')

# Save shapefile components to buffers
w.saveShp(shp)
w.saveShx(shx)
w.saveDbf(dbf)

# Save shapefile buffers to zip file 
# Note: zlib must be available for
# ZIP_DEFLATED to compress.  Otherwise
# just use ZIP_STORED.
z = zipfile.ZipFile("myshape.zip", "w", zipfile.ZIP_DEFLATED)
z.writestr("myshape.shp", shp.getvalue())
z.writestr("myshape.shx", shx.getvalue())
z.writestr("myshape.dbf", dbf.getvalue())
z.close()

If you've been using PyShp for a while, make sure you have the latest version. The file-like object save feature was uploaded to the PyShp subversion repository on Aug. 20, 2011, at revision 30.

You can download PyShp here.

You can download the sample script above here.