Squealer (a test framework for Pig scripts) is using jip

Squealer, by Mark Roddy, is a framework written in Jython for testing your Apache Pig scripts. It now uses jip to resolve its Java dependencies. Given how many dependencies Hadoop pulls in, jip can make setting up Squealer much easier.

To get started, it is recommended to create a standalone Jython environment for Squealer:
virtualenv -p /usr/local/bin/jython --no-site-packages squealer-env

Activate the environment
cd squealer-env
. bin/activate

Install jip with pip
pip install jip

Download Squealer from its Bitbucket project page and extract it somewhere. Install it with jip:
jython setup.py install

Dependencies will be downloaded from Maven Central. Just wait for the download to finish.

Start a Jython interpreter with 'jython-all' and you can now import squealer.

jip.embed: On-the-fly classpath resolution for Jython

jip 0.7 introduces a module called jip.embed, which lets you add libraries to your code at runtime, right where you declare them. With jip.embed, you don’t have to download jars manually and append them to your -Dpython.path. Just open your editor, import jip.embed, write your code, then save and run it.

Code example:

import jip.embed
jip.embed.require('commons-lang:commons-lang:2.6')
from org.apache.commons.lang import StringUtils

print StringUtils.reverse('jip rocks')

Output:

skcor pij

(jip prints some log output here the first time the dependencies are resolved)
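
The string passed to require() is a Maven coordinate in the form groupId:artifactId:version. As a rough sketch of what such a coordinate points at, the snippet below maps it to the conventional Maven repository layout. This is only an illustration, not jip's own code; the helper name jar_url and the repository root constant are assumptions made for this example.

```python
# Sketch: map a Maven coordinate to the conventional repository path.
# Illustration only; jar_url and MAVEN_CENTRAL are names made up here.

MAVEN_CENTRAL = 'https://repo1.maven.org/maven2'  # assumed repository root


def jar_url(coordinate, repo=MAVEN_CENTRAL):
    """Return the jar URL for a 'groupId:artifactId:version' coordinate."""
    group_id, artifact_id, version = coordinate.split(':')
    group_path = group_id.replace('.', '/')  # dots in groupId become folders
    filename = '%s-%s.jar' % (artifact_id, version)
    return '/'.join([repo, group_path, artifact_id, version, filename])


print(jar_url('commons-lang:commons-lang:2.6'))
```

A real resolver also has to read the artifact's POM and follow transitive dependencies, which this sketch leaves out.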

So please check out the newly released jip 0.7:
https://github.com/sunng87/jip

For virtualenv users, you can install the full-featured jip via pip:
$ pip install jip

To install jip globally, download the package from the Python Cheese Shop and run:
$ jython setup.py install

jip 0.5.1 released

I just rolled out jip 0.5.1 as a bugfix version of 0.5. It has been published to PyPI, and you can install it with pip install jip in your virtualenv.

Since 0.5, these new features are available:

  • User-Agent 'jip/0.5' is added to HTTP requests.
  • New command `freeze`, just like pip's.
  • Improved jar downloading. By default, at most 3 jars are downloaded in parallel.
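
To give an idea of what "at most 3 in parallel" means in practice, here is a minimal sketch of bounded parallel downloading with a semaphore. It is not jip's implementation; fetch_all, fake_fetch and MAX_PARALLEL are hypothetical names, and the "download" is just a short sleep.

```python
import threading
import time

MAX_PARALLEL = 3  # mirrors the default limit mentioned above


def fetch_all(jar_names, fetch, limit=MAX_PARALLEL):
    """Run fetch(name) for every name, with at most `limit` running at once."""
    slots = threading.Semaphore(limit)
    results = {}
    results_lock = threading.Lock()

    def worker(name):
        with slots:  # blocks while `limit` fetches are already in flight
            value = fetch(name)
        with results_lock:
            results[name] = value

    threads = [threading.Thread(target=worker, args=(n,)) for n in jar_names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def fake_fetch(name):
    """Stand-in for a real HTTP download: sleeps, then returns the name length."""
    time.sleep(0.01)
    return len(name)


print(sorted(fetch_all(['a.jar', 'bb.jar', 'ccc.jar', 'dddd.jar'], fake_fetch).items()))
```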

In 0.5.1, bugs were fixed:

  • Repositories defined in pom are now included for install command.
  • Placeholder resolving of #{pom.groupId} is corrected. (instance)
  • urllib2.URLError is caught to prevent a traceback dump.

For any problems or feature requests, please refer to the GitHub issue tracker.

Don't repeat yourself: distribute your Jython package with jip.dist

As a new feature in jip 0.4, we can use some helpers from jip.dist to simplify package distribution. With jip.dist, you can define Java dependencies for your Jython package. In an environment with jip, the dependencies will be installed automatically when a user gets your package with pip.

There are two different approaches to choose from.

Approach 1, Define dependencies in POM

This is the standard maven way. To describe your jython package and its dependencies, create a pom.xml in your project. The directory hierarchy looks like:

├── app
│   ├── module1
│   │   ├── __init__.py
│   ├── core.py
│   ├── __init__.py
├── LICENSE
├── MANIFEST.in
├── pom.xml
├── README
└── setup.py

In pom.xml, just add dependencies as you do with Maven.

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">

    ...

    <dependencies>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-api</artifactId>
            <version>1.6.1</version>
        </dependency>

        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
            <version>1.6.1</version>
        </dependency>

        ...

    </dependencies>

    <repositories>
        <repository>
            <id>sonatype-oss-sonatype</id>
            <url>http://oss.sonatype.org/content/repositories/snapshots/</url>
        </repository>
    </repositories>
</project>

You can also define repositories in pom.xml if you use a custom repository (for example, jboss or java.net).
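
To make it concrete what information a tool can pull out of such a POM, here is a small sketch that reads the dependencies and repositories with the standard library's ElementTree. It is an illustration under my own assumptions, not how jip parses POMs; read_pom and SAMPLE_POM are names made up for the example.

```python
import xml.etree.ElementTree as ET

# Maven POM elements live in this XML namespace.
POM_NS = {'m': 'http://maven.apache.org/POM/4.0.0'}

SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-api</artifactId>
      <version>1.6.1</version>
    </dependency>
  </dependencies>
  <repositories>
    <repository>
      <id>sonatype-oss-sonatype</id>
      <url>http://oss.sonatype.org/content/repositories/snapshots/</url>
    </repository>
  </repositories>
</project>"""


def read_pom(xml_text):
    """Return (dependencies, repositories) as lists of tuples."""
    root = ET.fromstring(xml_text)
    deps = [
        (d.findtext('m:groupId', namespaces=POM_NS),
         d.findtext('m:artifactId', namespaces=POM_NS),
         d.findtext('m:version', namespaces=POM_NS))
        for d in root.findall('m:dependencies/m:dependency', POM_NS)
    ]
    repos = [
        (r.findtext('m:id', namespaces=POM_NS),
         r.findtext('m:url', namespaces=POM_NS))
        for r in root.findall('m:repositories/m:repository', POM_NS)
    ]
    return deps, repos


dependencies, repositories = read_pom(SAMPLE_POM)
print(dependencies)
print(repositories)
```

The sketch ignores parent POMs, property placeholders and dependency scopes, all of which a real resolver has to handle.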

Remember to add pom.xml to your MANIFEST.in, to ensure the file is packaged into the source package:

include pom.xml

Approach 2, Define dependencies with Python

You may be tired of endless XML configuration. jip allows you to define dependencies in Python, just like Gradle does with Groovy.

In your setup.py, add something like:

requires_java = {
    'dependencies':[
        ## (groupId, artifactId, version)
        ('org.slf4j', 'slf4j-api', '1.6.1'),
        ('org.slf4j', 'slf4j-log4j12', '1.6.1'),
        ('info.sunng.soldat', 'soldat', '1.0-SNAPSHOT'),
        ('org.apache.mina', 'mina-core', '2.0.2')
    ],
    'repositories':[
        ('sonatype-oss-snapshot', 'http://oss.sonatype.org/content/repositories/snapshots/')
    ]
}

Then pass it to setup(). The keyword argument requires_java is jip-specific.

setup(
    ...
    requires_java=requires_java,
    ...)
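
The two approaches carry the same information. As a sketch of that correspondence (not something jip itself does, as far as this post says), the snippet below turns a requires_java dictionary into the equivalent POM elements. The function name to_pom_fragment is hypothetical, and the dictionary is a shortened copy of the one above.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the requires_java structure shown above.
requires_java = {
    'dependencies': [
        ('org.slf4j', 'slf4j-api', '1.6.1'),
        ('org.apache.mina', 'mina-core', '2.0.2'),
    ],
    'repositories': [
        ('sonatype-oss-snapshot',
         'http://oss.sonatype.org/content/repositories/snapshots/'),
    ],
}


def to_pom_fragment(spec):
    """Build a <project> element that mirrors a requires_java dictionary."""
    project = ET.Element('project')
    deps = ET.SubElement(project, 'dependencies')
    for group_id, artifact_id, version in spec.get('dependencies', []):
        dep = ET.SubElement(deps, 'dependency')
        ET.SubElement(dep, 'groupId').text = group_id
        ET.SubElement(dep, 'artifactId').text = artifact_id
        ET.SubElement(dep, 'version').text = version
    repos = ET.SubElement(project, 'repositories')
    for repo_id, url in spec.get('repositories', []):
        repo = ET.SubElement(repos, 'repository')
        ET.SubElement(repo, 'id').text = repo_id
        ET.SubElement(repo, 'url').text = url
    return project


print(ET.tostring(to_pom_fragment(requires_java)).decode('ascii'))
```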

Use jip’s setup wrapper

To use jip’s power, the only change needed is to import setup() from jip.dist instead of from setuptools or distutils.

from jip.dist import setup

Then publish your Jython package to the Python Cheese Shop:
$ jython setup.py sdist upload

Internally, jip uses setuptools, so you can still run jython setup.py develop.

jip 0.4 is available under the MIT License, so you are free to use jip.dist in your code.

For your users

You should write a guide telling users to run your Jython application within a virtualenv and to install jip as a prerequisite:
$ pip install jip

Then simply install your package with pip:
$ pip install <your-package-name>

No additional step required!

So please just release your Jython package with jip!

Enhanced jip to simplify Jython module distribution

As you might have noticed, the installation of gefr is too complex and requires several manual actions. Users have to remember the long Maven coordinates to resolve dependencies. wtf!

So I have been working all day to simplify the process.

Currently, gefr’s approach is to upload pom.xml to a public Maven repository (Sonatype OSS) and the source to PyPI. pip finds the source and installs it; jip finds the plain POM and resolves it. I have to distribute them separately because once pip finishes the installation, the source package is erased and jip can never find the POM.

It would be better to invoke jip right before pip exits. I did some investigation into post-install scripts and, luckily enough, distutils allows you to override the default install command. We can use that to invoke jip within the scope of the setup script, which makes sense.

Another problem is the original design of the .jip configuration file. Since jip 0.2, .jip has been environment-scoped: we define some custom repositories and the whole environment shares them. But if we have multiple projects in the same environment, jip may waste time looking for a public artifact in a private repository. Even worse, jip may load an invalid artifact from a private repo. In the new design, private repositories are defined in pom.xml and are project-scoped, just like in most Java build tools.

With the new jip, to distribute a Jython package, you write a POM the same way you would for Java, specifying the dependencies and any custom repositories. Then modify your MANIFEST.in to include it in your source package. Finally, in setup.py, define a new install command that calls jip and pass that command to setup(). With the new approach, the end user only needs 'pip install'. Super easy!
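
The general shape of such a custom install command looks roughly like the sketch below. It only shows the override mechanism; the hook resolve_java_dependencies is a hypothetical placeholder for whatever call into jip the setup script would make, and the class name is made up as well.

```python
try:
    from setuptools.command.install import install
except ImportError:  # fall back to plain distutils where setuptools is absent
    from distutils.command.install import install


def resolve_java_dependencies():
    """Hypothetical hook: a real setup script would call into jip here."""
    print('resolving Java dependencies declared in pom.xml')


class InstallWithJavaDeps(install):
    """Run the normal install steps, then the dependency hook."""

    def run(self):
        install.run(self)            # the default install behaviour
        resolve_java_dependencies()  # post-install step, still inside setup.py


# In setup.py the class is registered through the cmdclass argument:
# setup(..., cmdclass={'install': InstallWithJavaDeps})
```

Because the hook runs inside the install command, pom.xml is still available in the unpacked source package at that point.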

Given the new usage of jip (as a library), I am also considering migrating from GPL to LGPL or MIT. I have little knowledge about conflicts between licenses, so if you have ideas or concerns, feel free to let me know.