Set up a Jython development environment

This article walks through the basic steps to set up a standalone environment for Jython using a few popular, standard tools.

0. Install essential tools

Taking Ubuntu Maverick as an example, make sure the following packages are installed on your system before you start.

Essential Python tools (Python ships with most Linux distributions by default)
sudo apt-get install python-pip python-virtualenv

Java
sudo apt-get install openjdk-6-jdk openjdk-6-jre

Jython
The Jython package in the Ubuntu repository is obsolete. Download the Jython installer from the official website instead. A full installation is recommended here.
http://jython.org/downloads.html

As an optional step, I often move the Jython directory to /usr/local/share/ and use a symbolic link to put Jython on the PATH:
sudo ln -s /usr/local/share/jython/bin/jython /usr/local/bin/

Install additional Java tools
sudo apt-get install ant ivy

1. Create a standalone environment

Jython 2.5 is fully compatible with virtualenv, so you can create a standalone environment just as you would with Python:
virtualenv -p /usr/local/bin/jython jython-env
cd jython-env

List the directory and you will see these files and directories:

  • bin/
  • cachedir/
  • Lib/
  • jython.jar
  • registry

Source the activate script:
source bin/activate

2. Download and install Python packages

The Python environment is now ready, so you can install whatever Python packages you want. Install bottle, for example:
pip install bottle

3. Download and install Java dependencies

This is the step where Jython differs from plain Python. The most important feature of Jython is that Java libraries are available to Python code, so we need some additional tools to manage Java dependencies. Maven, the popular Java lifecycle management tool, is not suitable here because it depends tightly on its archetypes. Gradle is a powerful build tool with DSL support, but it uses Groovy as the scripting language for its build files, and bringing Groovy into a Jython project seems terribly strange.

So I prefer the traditional Ant way of managing dependencies. First of all, create a build file with the following content:
build.xml

<project xmlns:ivy="antlib:org.apache.ivy.ant" name="jython-env" default="resolve">
    <target name="resolve" description="retrieve dependencies">
        <ivy:retrieve pattern="javalib/[artifact]-[revision].[ext]" type="jar"/>
    </target>
</project>

Then create an Ivy file to declare the dependencies:
ivy.xml

<ivy-module version="2.0">
    <info organisation="info.sunng" module="jython-env" />
    <dependencies>
        <dependency org="commons-lang" name="commons-lang" rev="2.5"/>
    </dependencies>
</ivy-module>

As an optional step, you can also override the default settings so that Ivy checks the local Maven repository first.
ivysettings.xml

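<ivysettings>
    <settings defaultResolver="maven-or-not"/>
    <resolvers>
        <chain name="maven-or-not" returnFirst="true">
            <!-- Assumes the local Maven repository is in its default location, ~/.m2/repository -->
            <filesystem name="maven-local" m2compatible="true">
                <artifact pattern="${user.home}/.m2/repository/[organisation]/[module]/[revision]/[module]-[revision].[ext]"/>
            </filesystem>
            <ibiblio name="ibiblio" m2compatible="true" />
        </chain>
    </resolvers>
</ivysettings>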

That is all the configuration; now you can download everything. With the default Ivy installation on Ubuntu, you have to specify the path to the Ivy jar on the ant command line:
ant -lib /usr/share/java/ivy.jar

The downloaded jars will appear in javalib/.

4. Reconfigure the startup script

The default Jython startup script is not friendly to external Java dependencies. To include jars, you have to pass awkward command-line parameters like this:
jython -Dpython.path=javalib/commons-lang-2.5.jar:javalib/…

To get rid of this, first rename bin/jython to bin/startJython (following the Groovy naming convention).

Then create a new bin/jython script that wraps the old one:

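#!/bin/bash
# Wrapper around the original launcher, which was renamed to startJython.

JYTHON_CMD="startJython"
JAVA_LIBS="javalib/*.jar"
PYTHON_PATH=""

# Build a colon-separated list of every jar under javalib/.
for lib in $JAVA_LIBS
do
    # Add the ':' separator only when the path is already non-empty.
    PYTHON_PATH="${PYTHON_PATH:+$PYTHON_PATH:}$lib"
done

#echo "$PYTHON_PATH"
# Pass any extra command-line arguments through to the real launcher.
"$JYTHON_CMD" -Dpython.path="$PYTHON_PATH" "$@"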
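
The same path-building logic can be sketched in Python, which is handy if you would rather collect the jars from a startup module than from a shell script. This is only a sketch under assumptions: the helper names find_jars, build_python_path and add_jars_to_sys_path are made up for this example, and the jars are assumed to live in javalib/ as above. Jython can generally import Java packages from jars placed on sys.path, but check that against your Jython version before relying on it.

```python
import glob
import os
import sys


def find_jars(lib_dir="javalib"):
    """Return the jar files directly under lib_dir, sorted for a stable order."""
    return sorted(glob.glob(os.path.join(lib_dir, "*.jar")))


def build_python_path(jars):
    """Join jar paths with the platform path separator (':' on Linux)."""
    return os.pathsep.join(jars)


def add_jars_to_sys_path(jars):
    """Append jars that are not yet on sys.path and return the ones added."""
    added = [jar for jar in jars if jar not in sys.path]
    sys.path.extend(added)
    return added


if __name__ == "__main__":
    # Print the value the wrapper script passes as -Dpython.path.
    print(build_python_path(find_jars()))
```

Unlike the shell loop, an empty javalib/ gives an empty string here rather than the literal pattern javalib/*.jar.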

5. Run

That completes the setup. Type jython to start the interactive shell:
jython

Jython 2.5.1 (Release_2_5_1:6813, Sep 26 2009, 13:47:54)
[OpenJDK Server VM (Sun Microsystems Inc.)] on java1.6.0_20
Type "help", "copyright", "credits" or "license" for more information.
>>> from org.apache.commons.lang import BitField
>>> import bottle

Both the Python and the Java dependencies are accessible. Now it’s time to get started.

My first HTTP server, soldat-http

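One day in the winter of 2008 I was chatting online with someone I didn't know. He said he used to be interested in frameworks and the like, but that had changed, and his goal at the time was to write an HTTP server. I could agree with his point of view, but it didn't particularly resonate with me. This spring the project started. One weekend our lead took leave and went back to Hangzhou, leaving the rest of us staring at each other with integration testing to do. When problems came up I found I understood none of it, and I joked that I had to understand the lead's NIO framework before I changed jobs. Unfortunately I left before I got to read much of it. Luckily the lead open-sourced it, so I can still sneak a look now. I also vaguely remember that in the winter of 2009 I set myself a similar goal, something like writing an HTTP server within a year.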

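That's why I say that writing your own programs is driven by sentiment: it's like those people who keep insisting they will never write another WordPress plugin, and yet new versions keep coming out.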

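Well, in the end the goal wasn't really achieved. The current version merely works. With 2010 about to end, I'm hurrying to put it out as a small consolation: well, at least this year wasn't a complete blank.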
[Screenshot, 12/28/2010, 09:17:59 PM]

Next I will keep improving it and the underlying event-driven framework, and I also intend to build a WSGI implementation for Jython.

Hudson tips

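A Hudson project can be in one of three states: Failed, Success, or Unstable. When unit tests fail, Hudson does not fail the whole build. It marks the build as unstable and carries on executing the post-build scripts and actions, which introduces some uncertainty into the integrated build. To turn this behavior off, add -Dmaven.test.failure.ignore=false to the Maven options, or set it globally under Manage Hudson -> Configure Hudson -> Global MAVEN_OPTS. This approach comes from: http://stackoverflow.com/questions/1004540/fail-hudson-build-on-single-unit-test-failure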

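In addition, we use Hudson to deploy the development environment automatically. Install the SSH plugin and the SCP plugin from Hudson's plugin list. On the packaging project, add a post-build action that uses the SCP plugin to upload the packaged archive to the development machine. I originally wanted to configure a build script on the same project to run an automated deployment script on the development machine, but it turned out that the SSH plugin's step runs before the SCP step, and that order cannot be configured! No matter: create a new freestyle project that only uses the SSH plugin to run the remote script, then add a post-build action -> build other project to the packaging project and fill in the name of the new project, which makes it a downstream project of the packaging project.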

The post is brought to you by lekhonee v0.7

Bason: A BSON Serialization Code Generator

Bason is a code generator for serializing objects to BSON and deserializing them back. Unlike the traditional reflection-based approach, Bason uses an annotation processor to generate a serialization manager at compile time. You only add Bason as a compile-time dependency and can drop it at runtime.

To use Bason, you simply add annotations to your JavaBeans:

/**
 *
 */

package info.sunng.bason.example;


import java.util.Date;

import info.sunng.bason.annotations.BsonAlias;
import info.sunng.bason.annotations.BsonDocument;
import info.sunng.bason.annotations.BsonIgnore;

/**
 * @author SunNing
 *
 * @since Aug 18, 2010
 */

@BsonDocument
public class Passenger {

    private double packageWeight;

    private long ticketId;

    private String name;

    private Date createdDate;

    private Flight flight;

    /**
     * @return the packageWeight
     */

    @BsonIgnore
    public double getPackageWeight() {
        return packageWeight;
    }

    /**
     * @param packageWeight the packageWeight to set
     */

    public void setPackageWeight(double packageWeight) {
        this.packageWeight = packageWeight;
    }

    /**
     * @return the ticketId
     */

    @BsonAlias("ticket")
    public long getTicketId() {
        return ticketId;
    }

    /**
     * @param ticketId the ticketId to set
     */

    public void setTicketId(long ticketId) {
        this.ticketId = ticketId;
    }

    /**
     * @return the name
     */

    public String getName() {
        return name;
    }

    /**
     * @param name the name to set
     */

    public void setName(String name) {
        this.name = name;
    }

    /**
     * @param createdDate the createdDate to set
     */

    public void setCreatedDate(Date createdDate) {
        this.createdDate = createdDate;
    }

    /**
     * @return the createdDate
     */

    public Date getCreatedDate() {
        return createdDate;
    }

    /**
     * @param flight the flight to set
     */

    public void setFlight(Flight flight) {
        this.flight = flight;
    }

    /**
     * @return the flight
     */

    public Flight getFlight() {
        return flight;
    }

}
  • @BsonDocument marks the bean for processing by the Bason processor. Serialization and deserialization support for the bean will be added to the manager. The bean must follow the JavaBean specification, with a getter and a setter for each property.
  • @BsonAlias on a getter lets you specify the name used in the BSON document instead of the default JavaBean property name.
  • @BsonIgnore on a getter marks the property as transient, so it is skipped during serialization and deserialization.
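
The effect of the three annotations can be modeled in a few lines of Python. This is not Bason's code, and Bason has no Python API. It is only a sketch of the mapping rules described above, with a plain dict standing in for a BSON document. The names ALIASES, IGNORED, to_document and from_document are hypothetical.

```python
# Hypothetical stand-ins for the annotations on Passenger.
ALIASES = {"ticketId": "ticket"}   # plays the role of @BsonAlias("ticket")
IGNORED = {"packageWeight"}        # plays the role of @BsonIgnore


def to_document(properties, aliases=ALIASES, ignored=IGNORED):
    """Map bean properties to document keys, skipping ignored properties."""
    document = {}
    for name, value in properties.items():
        if name in ignored:
            continue
        # Use the alias when one is declared, else the property name itself.
        document[aliases.get(name, name)] = value
    return document


def from_document(document, property_names, aliases=ALIASES, ignored=IGNORED):
    """Inverse mapping: read each non-ignored property back by its key."""
    return {name: document[aliases.get(name, name)]
            for name in property_names if name not in ignored}


passenger = {"packageWeight": 20.5, "ticketId": 42, "name": "Alice"}
doc = to_document(passenger)
print(doc)
```

The resulting document has the keys ticket and name and no entry for packageWeight, which matches the shape of the generated toBson method for Passenger.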

Then you need a bason.properties file at the root of the classpath, which looks like this:

bason.managerClassName=info.sunng.bason.BasonManager

This is where you specify the manager class name. If you use Bason in multiple modules, each module must use a different name.

Taking a Maven configuration as an example:

    <dependencies>
        <dependency>
            <groupId>info.sunng.bason</groupId>
            <artifactId>bason-annotation</artifactId>
            <scope>compile</scope>
        </dependency>
        <dependency>
            <groupId>info.sunng.bason</groupId>
            <artifactId>bason-internal</artifactId>
            <scope>compile</scope>
        </dependency>
        <dependency>
            <groupId>org.mongodb</groupId>
            <artifactId>mongo-java-driver</artifactId>
        </dependency>
    </dependencies>

    <build>
        <finalName>${project.artifactId}</finalName>

        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>

                <configuration>
                    <source>1.6</source>
                    <target>1.6</target>

                    <compilerArguments>
                        <processor>info.sunng.bason.internal.BasonProcessor</processor>
                    </compilerArguments>
                </configuration>
            </plugin>
        </plugins>
    </build>

When everything is ready, run mvn compile to generate the manager source file. By default, in a standard Maven project, the generated file is placed at:
/bason-example/target/generated-sources/annotations/

package info.sunng.bason;
import org.bson.*;
import javax.annotation.Generated;
@Generated({"info.sunng.bason.BasonManager"})
public final class BasonManager{
    public static final BSONObject toBson(info.sunng.bason.example.Passenger o){
        if (o == null) {
            throw new NullPointerException();
        }
        BSONObject bson = new BasicBSONObject();
        bson.put("ticket",o.getTicketId());
        bson.put("name",o.getName());
        bson.put("createdDate",o.getCreatedDate());
        bson.put("flight",toBson(o.getFlight()));
        return bson;
    }
    public static final info.sunng.bason.example.Passenger fromBson(info.sunng.bason.example.Passenger o, BSONObject bson){
        if (o == null || bson == null) {
            throw new NullPointerException();
        }
        o.setTicketId((java.lang.Long)bson.get("ticket"));
        o.setName((java.lang.String)bson.get("name"));
        o.setCreatedDate((java.util.Date)bson.get("createdDate"));
        o.setFlight(fromBson(new info.sunng.bason.example.Flight(), (BSONObject)bson.get("flight")));
        return o;
    }
    public static final BSONObject toBson(info.sunng.bason.example.Flight o){
        if (o == null) {
            throw new NullPointerException();
        }
        BSONObject bson = new BasicBSONObject();
        bson.put("company",o.getCompany());
        bson.put("flightId",o.getFlightId());
        return bson;
    }
    public static final info.sunng.bason.example.Flight fromBson(info.sunng.bason.example.Flight o, BSONObject bson){
        if (o == null || bson == null) {
            throw new NullPointerException();
        }
        o.setCompany((java.lang.String)bson.get("company"));
        o.setFlightId((java.lang.String)bson.get("flightId"));
        return o;
    }
}

The project is hosted at
http://github.com/sunng87/bason

If you have any ideas, just let me know.

RPC, Serialization and Schema

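The backend of the Candy (糖果) project is written in Java, and I am responsible for developing the service gateway (let's call it sergent for now). Services are declared as Java interfaces plus annotations and are integrated with Spring. Java objects are serialized to JSON and XML (via jackson and castor) to talk to external systems. Dedicated JSON Schema and XML Schema files are optional. Interaction between systems relies on concise documentation and manual confirmation.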

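An RPC framework is an important tool for communication across processes and systems, and it includes parts such as remote invocation, network transport, and serialization and deserialization. Popular tools include Facebook's thrift, Google's Protobuf, and avro, which was originally under the Hadoop project. thrift covers everything: remote invocation, serialization and deserialization, networking, and so on. Protobuf itself is a serialization and deserialization library, with many third-party RPC implementations available. avro, besides serialization and deserialization, now also includes IPC implementations such as an HTTP Server and a SocketServer. As for serialization formats, Thrift supports JSON and a binary protocol. Protobuf itself supports only binary, although third-party implementations of other formats exist. avro natively supports both binary and JSON.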

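In terms of efficiency, binary serialization is faster than text serialization. There is a tpc project (thrift-protobuf-compare) on Google Code (recently moved to github). According to this project's latest comparison results, which differ from the earlier ones: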

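protobuf has become the fastest of the three frameworks, followed by thrift and avro. This time avro even takes longer than the text-based jackson, mainly in deserialization.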

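However, binary protocols usually require a Schema. thrift, protobuf, and avro each define their own Schema format, and there is no unified standard comparable to XSD or Json Schema. In other words, whenever you need to transmit an object, you have to write a Schema file for it. The usual practice is to write the Schema first and then generate Java source with a command-line tool or an automated build tool. That is fine for a new system, but retrofitting an old system this way is rather troublesome. Binary protocols are also inconvenient to debug, which is why thrift, protobuf, and avro have each gained a JSON implementation. Among text serialization formats, JSON beats XML across the board.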

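All things considered, it is hard to say that any solution is perfect. Binary protocols are efficient, but the cost of retrofitting and of writing Schemas is not small, and you also face the risk of your core Model being held hostage by a specific framework. Text protocols are easy to develop with: no Schema is needed and a plain POJO can be serialized and deserialized directly, but they fall behind binary formats in both time and space.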
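
The trade-off can be illustrated with Python's standard library. This sketch does not use thrift, protobuf, or avro. It only contrasts a self-describing text encoding (json) with a fixed binary layout (struct) whose format string plays the role of a Schema. The record and its field names are made up for the example.

```python
import json
import struct

# Hypothetical record; the field names are for illustration only.
record = {"ticket": 42, "weight": 20.5, "name": "Alice"}

# Text: self-describing, no schema needed, field names travel with the data.
text = json.dumps(record).encode("utf-8")

# Binary: the format string is the schema both sides must agree on.
# "<" little-endian, "q" 64-bit int, "d" 64-bit float, "5s" 5-byte string.
SCHEMA = "<qd5s"
binary = struct.pack(SCHEMA, record["ticket"], record["weight"],
                     record["name"].encode("utf-8"))

# Decoding the binary form is impossible without the same schema.
ticket, weight, name = struct.unpack(SCHEMA, binary)
decoded = {"ticket": ticket, "weight": weight, "name": name.decode("utf-8")}

print(len(text), len(binary))
```

The packed form is 21 bytes (8 + 8 + 5), well under the size of the JSON text, but nothing in those bytes tells a reader which field is which. The fixed 5-byte name also shows how rigid a naive layout is, which is the kind of problem the real frameworks' Schema languages solve.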

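Addendum
Judging from the tpc project's results, kryo beats every competitor in both time and space. On top of that, kryo's API is very concise, and it can serialize POJOs without a Schema file. It sounds almost too perfect, so it looks like sergent should borrow from it in the future.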

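Addendum, 2010-06-14
I found that avro now also has ReflectDatumReader and ReflectDatumWriter, which can map objects and generate the Schema internally and automatically via reflection. Worth a try.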