Which University

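Strange things happen all the time these days. Today at Zhujiang Road metro station, a serious-looking middle-aged man suddenly walked up and timidly asked me whether this direction went to the railway station. I thought, this poor guy has it rough; how did he even get in here if he can't read the signs? I quickly said yes.
He then asked whether I was going to the railway station too. I figured it was probably his first time here and he didn't know the place, so I said yes again.
That should have been the end of it: just follow me. But he would not let it go and carried on: “Are you a student?” I instinctively slipped my hand into my pocket, gripped my wallet, and told the most pathetic lie of my life: yes. Friends, if you think the conversation ended there, you are wrong. He kept pressing and asked which school I went to. I was just about to lie truthfully when he blurted out “which university” (please trust my listening skills; they are about on par with his spoken English). The only reason I did not fall over on the spot was that I had eaten a big lunch and had the strength.
And so a marvelous conversation began from that damned “which university”, with the man talking nonstop:
“How's your English? Have you passed CET-4?”
“What's your major? Which major?”
“What about your speaking English? How is your spoken English?”
“Do you know how to say sunflower? Sang-fu-la-wo.”
“What about yueji (月季)?”
“You know rou-si, right? Rose. Add China to it: China rose is yueji.”
(The train arrived. I couldn't stand it any longer and boarded through the next door over; he followed and carried on.)
“You know toilet, right? Guess what toilet water is?”
At this point a seat finally opened up. Of course he would not let me sit down, and he did not forget to guide me patiently to the answer: “It's floral water (花露水).”

Fortunately the lesson ended at floral water. Before getting off I checked my phone and my wallet. Hmm, so this was a free trial lesson; fine, fine. Apparently I look like someone with market value for English training.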

OAuth Step by Step

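I have been looking into OAuth authentication lately. The main advantages of OAuth are:

  • The user does not have to hand a username and password to the third-party application in order to let it access protected resources;
  • The resource provider gets finer-grained control over third-party applications.

In the whole OAuth protocol, the base string used to generate the signature is the part that is easiest to get wrong. It consists of the HTTP method name, the URL-encoded request URL, and the request parameter list.
The parameter list contains every parameter except oauth_signature, sorted by parameter name and URL-escaped.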

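import urllib

def to_signature_key(method, url, data):
	# Sort by parameter name; values are assumed to contain only unreserved characters
	keys = sorted(data.keys())
	encoded = urllib.quote("&".join([key+"="+data[key] for key in keys]), safe="~")
	# Base string: METHOD & encoded URL & encoded parameter string
	return "&".join([method, urllib.quote(url, safe="~"), encoded])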

With this general method for generating the signature base string, the rest can be done step by step following the OAuth specification.
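As a cross-check of the base string layout described above, here is a minimal Python 3 sketch (the listings in this post are Python 2). It assumes the normalization rules of RFC 5849, section 3.4.1, where every name and value is percent-encoded before sorting. The function names, URL, and parameter values are hypothetical and only for illustration.

```python
# Python 3 sketch of an OAuth 1.0 signature base string (assumed rules: RFC 5849, section 3.4.1)
from urllib.parse import quote


def percent_encode(text):
    # Only unreserved characters (letters, digits, "-", ".", "_", "~") stay unescaped
    return quote(text, safe="~")


def signature_base_string(method, url, params):
    # Encode every name and value, sort the pairs, then join them as name=value
    pairs = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    normalized = "&".join(k + "=" + v for k, v in pairs)
    # Final layout: METHOD & encoded URL & encoded parameter string
    return "&".join([method.upper(), percent_encode(url), percent_encode(normalized)])


# Hypothetical values, for illustration only
example = signature_base_string(
    "get",
    "http://example.com/request_token",
    {"oauth_nonce": "123", "oauth_consumer_key": "key", "note": "a b"},
)
print(example)
```

Note how the space in "a b" ends up double-encoded as %2520: it is encoded once as a value and once more when the whole parameter string is encoded.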

First, obtain a Request Token. This step usually uses the API Key and API Key Secret registered with the resource provider.

import base64
import hashlib
import hmac
import httplib
import random
import time

def request_token_params(consumer_key, consumer_secret, path, method='GET'):
	# OAuth parameters needed for the request token call
	data={}
	data['oauth_consumer_key']=consumer_key
	data['oauth_signature_method']='HMAC-SHA1'
	data['oauth_timestamp']=str(int(time.time()))
	data['oauth_nonce']=''.join([str(random.randint(0,9)) for i in range(10)])
	print data

	msg = to_signature_key(method, path, data)
	print msg

	# No token secret exists yet, but the signing key still ends with "&"
	signed = base64.b64encode(hmac.new(consumer_secret+"&", msg, hashlib.sha1).digest())
	print signed
	data['oauth_signature']=signed
	return data

def result2dict(result_string):
	# Parse a "k1=v1&k2=v2" response body into a dict
	d = {}
	for p in result_string.split('&'):
		key, _, value = p.partition('=')
		d[key] = value
	return d

conn = httplib.HTTPConnection("www.douban.com", 80)

params = request_token_params(consumer_key, consumer_secret, request_token_path)
conn.request('GET', request_token_path+"?"+urllib.urlencode(params))
res = conn.getresponse().read()
print res
request_token = result2dict(res)

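This step yields an unauthorized Request Token and Request Token Secret. One detail to watch: when computing the HMAC signature, the “&” must still be appended to the key even though only one secret is available at this point.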
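The detail about the trailing “&” can be illustrated in isolation. The following is a Python 3 sketch, not the code used in this post: signing_key and sign are hypothetical helper names, the secrets are made up, and the secrets are percent-encoded before joining, as RFC 5849 describes (the Python 2 listings here skip that encoding, which makes no difference for secrets made of plain letters and digits).

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def signing_key(consumer_secret, token_secret=""):
    # The "&" separator is always present, even when the token secret is empty
    return quote(consumer_secret, safe="~") + "&" + quote(token_secret, safe="~")


def sign(base_string, consumer_secret, token_secret=""):
    # HMAC-SHA1 over the base string, then base64 of the raw 20-byte digest
    key = signing_key(consumer_secret, token_secret).encode("utf-8")
    digest = hmac.new(key, base_string.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


# Hypothetical secrets, for illustration only
print(signing_key("consumer-secret"))             # consumer-secret&
print(signing_key("consumer-secret", "tok-sec"))  # consumer-secret&tok-sec
```

The first key is what the Request Token step needs; the second form applies once a token secret exists.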

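The second step asks the user to authorize the Request Token: open a browser and direct the user to the provider's authorization page, passing the Request Token obtained in the previous step as a parameter.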
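A rough Python 3 sketch of this redirect step follows. The endpoint URL, the token value, and the build_authorize_url helper are all hypothetical; the real authorization address comes from the provider's documentation, and whether a callback parameter is accepted at this step depends on the provider and the OAuth revision it implements.

```python
from urllib.parse import urlencode


def build_authorize_url(authorize_endpoint, request_token, callback=None):
    # The unauthorized request token travels as the oauth_token query parameter
    query = {"oauth_token": request_token}
    if callback:
        # Optional (assumption): where the provider sends the user after approval
        query["oauth_callback"] = callback
    return authorize_endpoint + "?" + urlencode(query)


# Hypothetical endpoint and token, for illustration only
url = build_authorize_url("https://provider.example/authorize", "abc123")
print(url)  # https://provider.example/authorize?oauth_token=abc123
# From a desktop script, the user could then be sent there with:
#   import webbrowser; webbrowser.open(url)
```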

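The third step exchanges the authorized Request Token for an Access Token. It is similar to the first step, except that the signing key now combines the API Key Secret and the Request Token Secret.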

def access_token_params(consumer_key, consumer_secret, oauth_token, oauth_secret, path, method='GET'):
	data={}
	data['oauth_consumer_key']=consumer_key
	data['oauth_signature_method']='HMAC-SHA1'
	data['oauth_timestamp']=str(int(time.time()))
	data['oauth_nonce']=''.join([str(random.randint(0,9)) for i in range(10)])
	data['oauth_token'] = oauth_token

	msg = to_signature_key(method, path, data)
	print msg

	signed = base64.b64encode(hmac.new(consumer_secret+"&"+oauth_secret, msg, hashlib.sha1).digest())
	print signed
	data['oauth_signature']=signed
	return data

params = access_token_params(consumer_key, consumer_secret, request_token['oauth_token'],
	request_token['oauth_token_secret'], access_token_path)
conn.request('GET', access_token_path+"?"+urllib.urlencode(params))
res = conn.getresponse().read()
print res
access_token = result2dict(res)
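As an aside on parsing these token responses: in Python 3 the standard library can do the splitting and also undo percent-encoding in the values. This is only a sketch; parse_token_response is a hypothetical name and the response body below is made up.

```python
from urllib.parse import parse_qsl


def parse_token_response(body):
    # parse_qsl splits on "&" and "=" and decodes percent-encoded values
    return dict(parse_qsl(body, keep_blank_values=True))


# Hypothetical response body, for illustration only
tokens = parse_token_response("oauth_token=abc&oauth_token_secret=x%3Dy")
print(tokens["oauth_token"], tokens["oauth_token_secret"])  # abc x=y
```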

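This step returns at least an Access Token and an Access Token Secret, which are the tokens finally used to access protected resources. Taking Douban's implementation as an example, the OAuth parameters should be sent in the HTTP header along with the request.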

def oauth_header(consumer_key, consumer_secret, oauth_token, oauth_secret, path, realm):
	data = access_token_params(consumer_key, consumer_secret, oauth_token, oauth_secret, path, method="POST")
	header_string = ','.join([key+'="'+data[key]+'"' for key in data.keys()])
	return 'OAuth realm="'+realm+'",'+header_string

posturl = 'http://api.douban.com/miniblog/saying'

content = """<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns:ns0="http://www.w3.org/2005/Atom" xmlns:db="http://www.douban.com/xmlns/">
<content>li lei ju le han mei mei</content>
</entry>
"""

header = {}
header['Authorization'] = oauth_header(consumer_key, consumer_secret,
		access_token['oauth_token'], access_token['oauth_token_secret'],
		posturl, "http://api.douban.com")
header['Content-Type'] = 'application/atom+xml'
print header

conn.request('POST', posturl, content, header)
res = conn.getresponse().read()

print res

conn.close()

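In this step, the URL used to build the signature base string is the address of the protected resource being accessed, while the signed parameter list still contains only the OAuth parameters.
The generated Authorization header looks like this: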

Authorization: OAuth realm="http://api.douban.com",
    oauth_nonce="8735717688",
    oauth_timestamp="1262613619",
    oauth_consumer_key="0bc081a01168b263234184e0343a1729",
    oauth_signature_method="HMAC-SHA1",
    oauth_token="5fb836c37543ad691f28a44a5fcb083b",
    oauth_signature="jk6p5qaXVPrGQctSzpO5jjYHfDk="

With this header, all protected resources within the authorized scope can be accessed for a certain period of time.
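For comparison, here is a Python 3 sketch of assembling such a header with each value percent-encoded, which matters when a base64 signature contains “+”, “/”, or “=”. It follows the header format of RFC 5849, section 3.5.1, as an assumption; whether a given provider expects encoded or raw values should be checked against its documentation. The helper name, realm, and parameter values are hypothetical, and the sorting is only there to make the output deterministic.

```python
from urllib.parse import quote


def build_authorization_header(realm, oauth_params):
    # Each value is percent-encoded and wrapped in double quotes
    parts = ['realm="' + realm + '"']
    for name in sorted(oauth_params):
        parts.append(name + '="' + quote(oauth_params[name], safe="~") + '"')
    return "OAuth " + ", ".join(parts)


# Hypothetical parameter values, for illustration only
header = build_authorization_header(
    "http://api.example.com",
    {"oauth_token": "tok", "oauth_signature": "a+b/c="},
)
print(header)
```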

The code is a bit ugly, but it should be relatively easy to follow.

Happy new year with Yan 0.4

I am glad to release Yan 0.4 on the last night of 2009. It’s an important release that brings improvements to both the code and the project itself. There are major changes in all modules. Packages have been rearranged and renamed to be more comprehensible. Let’s dive into the changeset:

Changeset

  • ApiKey database: Derby / H2 support; HSQL is still the default because it’s the fastest (#9)
  • Captcha provider classes are now loaded dynamically at runtime. They are no longer managed by the Guice container. (#15)
  • The useless cache module was completely removed (#10)
  • The ApiKey CRUD UI has been replaced by RESTful GET/PUT/DELETE interfaces
  • For text-based captchas, the question is returned directly in the ticket; there is no need (and no way) to retrieve it from the browser (#8)
  • Resources have been separated from the project files (#14 #18)
  • Added a test to make sure the /ticket and /validate requests are sent by the same application (#17)
  • Standardized error output: HTTP errors for the client (browser), a selected error object for the application (#24 #26)
  • The /captcha/ request (invoked by the client directly) now accepts configuration parameters
  • An ApiKey is now bound to a specified domain; the /captcha request’s referer is checked (#27)
  • New captcha provider introduced: Tiled Image Captcha (#12)
  • JMX monitoring support on EhCache (#33)
  • A great deal of code improvements and bug fixes

Interface changes

The object returned by /ticket has its attribute “url” renamed to “data”.

Screenshots

Use visualvm or jconsole to monitor ehcache status and statistics (enabled in 0.4 by default):

The new captcha provider in the Ruby and Python sample applications (Sample Code):

Download & Deploy

Since 0.4, Yan packages are available for direct download. Please refer to the download page and grab both yan-0.4.war and yan-resource.tar.bz2. Just drop the war package into your servlet container, then extract the resource package to your disk. Don’t forget to set the environment variable:
export YAN_RESOURCE=/path/to/your/resource

Start the servlet container in the same context, then browse to http://localhost:8080/yan-0.4/ to see the test page and emulate the captcha process.

Retrieve the code

Clone the Mercurial repository from bitbucket.org:

$ hg clone https://sunng@bitbucket.org/sunng/yan/

You will get a copy of the whole code repository (because Mercurial is a distributed version control system). You start on the default branch, which maintains the code of 0.4. If you want to see the latest work on Yan, switch to the development branch with

$ hg update 0.5-dev

Yan 0.5 is already on the way.

Reporting Issues

Issue reports and patches are always welcome. Check the issue tracker on bitbucket; you will find the new features in 0.5 there.

Thanks for your support!