Decompressing gzip data returned by a web server with Python

When I captured packets with tcpdump before, any gzip-compressed payload could not be restored to the original data directly. I have been learning Python recently and noticed it ships with a gzip module, so today I tried decompressing the compressed response returned by a web server. Tested it and it works.

#!/usr/bin/env python  
import sys  
import urllib2  
import gzip  
from StringIO import StringIO  

def gunziptxt(data):  
    # Wrap the gzip'd bytes in a StringIO so GzipFile can read them like a file.  
    buf = StringIO(data)  
    of = gzip.GzipFile(fileobj=buf, mode="rb")  
    outdata = of.read()  
    of.close()  
    return outdata  

url = "http://127.0.0.1/index.html"  
request = urllib2.Request(url, headers={'User-agent': "python urllib browser",  
                                        'Accept-Encoding': 'gzip'})  
try:  
    response = urllib2.urlopen(request, timeout=5)  
    data = response.read()  
except urllib2.URLError:  
    print "get %s response failed" % url  
    sys.exit(1)  

print "headers:\n", response.info()  
if response.info().get("content-encoding") == 'gzip':  
    print "http response is gzip"  
    outdata = gunziptxt(data)  
    # Re-compress the plain text into a StringIO buffer just to measure  
    # the gzip'd length with len().  
    lbuf = StringIO()  
    with gzip.GzipFile(mode='wb', fileobj=lbuf) as inf:  
        inf.write(outdata)  
    gziplen = len(lbuf.getvalue())  
    print "gzip %d and gunzip %d" % (gziplen, len(outdata))  
else:  
    print "http response is not gzip"  
    outdata = data  
print "http response:\n", outdata
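As a side note, on Python 3 the same round trip is much shorter, because gzip.compress and gzip.decompress operate on byte strings directly, with no StringIO wrapper needed. A minimal sketch (the sample text is made up, not from a real server):

```python
import gzip

# Stand-in for the decompressed body a server might send.
original = b"hello gzip " * 100

compressed = gzip.compress(original)    # like writing into the StringIO buffer
restored = gzip.decompress(compressed)  # like gunziptxt(data) above

print("gzip %d and gunzip %d" % (len(compressed), len(restored)))
assert restored == original
```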

To get the length of the compressed data, you first have to write the gzip output into a StringIO buffer; only then can you call len() on its value.
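The same idea carries over to the tcpdump case that motivated this: once you have reassembled the raw gzip bytes of an HTTP body from a capture, zlib can decode them incrementally, which is convenient when the stream arrives split across packets. A sketch, where the byte string is a stand-in for real packet payloads (16 + MAX_WBITS tells zlib to expect a gzip header rather than a zlib one):

```python
import gzip
import zlib

# Stand-in for a gzip'd HTTP body reassembled from captured packets.
body = gzip.compress(b"<html>hello from tcpdump</html>")

# wbits = 16 + MAX_WBITS selects gzip framing instead of zlib framing.
d = zlib.decompressobj(16 + zlib.MAX_WBITS)

# Feed the data chunk by chunk, as it might arrive packet by packet.
out = b""
for i in range(0, len(body), 8):
    out += d.decompress(body[i:i + 8])

print(out)
```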