Monthly Archives: May 2009

Updating RubyGems in OSX 10.5.7

When recently trying to install Sinatra via RubyGems, I got a message that RubyGems was out of date. I figured that gem would be smart enough to have an easy upgrade command. Naturally there is:

gem update --system

I only found this by searching Google, which also turned up a series of pages warning to be careful with gem update --system because it can kill existing gems (http://puctuatedproductivity.com/2007/11/01/unistalling-ruby-installed-by-source-on-os-x, http://thenoobonrails.blogspot.com/2008/06/doing-gem-update-system-might-lose-all.html), so I was a bit nervous.  Since I only use Ruby periodically and I'm lazy enough to make Larry Wall proud, I figured I'd take a punt on just using gem update --system.  Turns out it just works, and I've kept all my old gems.  Hooray.  Given that the posts describing problems are old, I assume either that their authors did things differently from me or that things have been fixed since then. So, if you need to update RubyGems because of the message:

ERROR: Error installing sinatra:
fastthread requires RubyGems version >= 1.2

or similar, just use gem update --system

A Review of 5 Java JSON Libraries

 

json.org lists 18 different Java libraries for working with JSON (Flexjson gets a double mention). These provide varying levels of functionality, from the simplest (the default org.json packages) to more comprehensive solutions like XStream and Jackson. Join me for a quick review of some of these, focusing on those that have friendly licenses and meet my requirements.  If you are lazy, you can fast forward to my summary.

My Requirements

  1. Serialises and Deserialises JSON
  2. Lightweight and Simple
  3. Runs on Java 1.4
  4. Friendly license

The contenders

  1. org.json
  2. Jackson
  3. XStream
  4. JsonMarshaller
  5. JSON.simple

Serialises and Deserialises JSON

This might sound like an obvious requirement, but I’ve seen at least one library which was completely focused on spitting out JSON, without any support for reading JSON. I’m actually using this as a pre-requisite for inclusion in my comparison. If a library can’t read AND write JSON, I’m not going to consider it.

Lightweight

I’ll begin by stating that my actual use case is to operate within a plugin for EditLive!. I don’t need an all-singing, all-dancing JSON serialisation/deserialisation library. There are some very cool libraries out there that do awesome stuff, but all I need to do is read and write JSON data.

Coupled with this, I want to keep the memory footprint pretty low, so I want to work with Java streams without necessarily pulling in the whole serialised object if I don’t need it.
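
To make the streaming requirement concrete, here is a minimal sketch of what writing to a stream looks like: each element goes straight to a java.io.Writer, so the complete document never exists as one String in memory. The class and method names are hypothetical and this is not the API of any library reviewed below. It is also written for a current JDK (generics, String.format), not for 1.4.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical illustration of stream-oriented output; not the API of any
// library reviewed in this post.
class StreamingJsonSketch {

    // Writes the items as a JSON array of strings, one element at a time,
    // so the complete document is never held in memory as a single String.
    static void writeStringArray(Iterator<String> items, Writer out) throws IOException {
        out.write('[');
        boolean first = true;
        while (items.hasNext()) {
            if (!first) {
                out.write(',');
            }
            writeString(items.next(), out);
            first = false;
        }
        out.write(']');
    }

    // Writes one JSON string literal, escaping quotes, backslashes and
    // control characters.
    static void writeString(String value, Writer out) throws IOException {
        out.write('"');
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '"':  out.write("\\\""); break;
                case '\\': out.write("\\\\"); break;
                case '\n': out.write("\\n"); break;
                case '\r': out.write("\\r"); break;
                case '\t': out.write("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the four-digit hex escape.
                        out.write(String.format("\\u%04x", (int) c));
                    } else {
                        out.write(c);
                    }
            }
        }
        out.write('"');
    }

    public static void main(String[] args) throws IOException {
        // A StringWriter keeps the demo self-contained; a FileWriter or a
        // socket-backed Writer would be used the same way.
        StringWriter out = new StringWriter();
        writeStringArray(Arrays.asList("a", "say \"hi\"").iterator(), out);
        System.out.println(out);
    }
}
```

A library with real streaming support does the same thing for arbitrary values, and offers the mirror image for reading from a java.io.Reader.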

Runs on Java 1.4

Yep, it’s still out there. Thankfully Java 1.4.2 has reached its EOL, but businesses can still request patches, and there are most definitely still Ephox clients running on this JRE, even though more recent JREs work so much better. (Side note: if you have the option of upgrading your JRE to Java 6, please do it, the children in Africa will be much happier. Every time someone runs up a 1.4 JRE a puppy dies.) 1.4 is in its final death throes, but it is still kicking.

Friendly License

For Ephox to make money from the product/component that uses JSON (gotta think about the $$$ at the end of the day), I’ll need to make sure that the license is non-viral and Enterprise friendly. Apache license good. GPL bad. (sorry FSF)

Assessment 

So having run through the requirements, we can now consider the options. For each library, I’ll provide a simple table.

The metrics I’m using to judge the libraries are included in the table. The crudest metric that I’ve got is the number of classes. I’m more than happy to admit that this is a very crude way to measure how lightweight a library is, but it does provide an OK rough heuristic, particularly given that there are order-of-magnitude differences.
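
One way to get a number like this is to count the .class entries in a library's jar. The sketch below does that with the standard java.util.jar classes. The helper name is hypothetical, and the figures in this post may well have been counted differently (for example from source files), so treat it as an illustration of the heuristic, not as the method behind the tables.

```java
import java.io.IOException;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper, not part of any library reviewed here.
class JarClassCounter {

    // Counts the entries in a jar whose names end in ".class".
    // Inner and anonymous classes are included, since each one
    // is compiled to its own .class file.
    static int countClasses(String jarPath) throws IOException {
        JarFile jar = new JarFile(jarPath);
        try {
            int count = 0;
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (!entry.isDirectory() && entry.getName().endsWith(".class")) {
                    count++;
                }
            }
            return count;
        } finally {
            jar.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the path to a library jar as the first argument.
        System.out.println(countClasses(args[0]) + " classes");
    }
}
```

Because inner classes are counted too, this tends to give a higher number than counting top-level source files would.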

org.json

The granddaddy of them all. This comes pretty close to being a reference implementation. It provides a nice simple API (7 classes), doesn’t try to do any magic, and just makes sense. I’ve used it before when working with small amounts of data. Unfortunately it doesn’t provide any streaming goodness.

url http://www.JSON.org/java/index.html
classes 7
Streaming support No
Friendly License Yes
Java 1.4 Yes

Jackson

Jackson advertises itself as a fast, powerful, conformant JSON processor. It provides heaps of features, and looks to be a good tool for reading and writing JSON in a variety of ways (see the Jackson tutorial for more). The drawback of Jackson for my purposes is that it isn’t exactly svelte at around 250 classes.

url http://jackson.codehaus.org/
classes ~250
Streaming support Yes
Friendly License Yes
Java 1.4 Yes

XStream

XStream gets a mention because it’s cool :). I haven’t really considered it because it provides more of a direct object serialisation format, which isn’t quite what I’m looking for. Also, its heritage as an XML serialisation format shows, and it likes Java 5 much better. The ability to go directly between JavaBeans and JSON is cool, but I don’t need this magic or the 200+ classes that come with it.

url http://xstream.codehaus.org/
classes >200
Streaming support Yes
Friendly License Yes
Java 1.4 Yes

Json Marshaller

Json Marshaller sells itself (it almost sounds like a boilerplate project description by now) as a “Fast, Lightweight, Easy to Use and Type Safe JSON marshalling library for Java”. It’s been under consistent active development for a number of years, and looks to be headed in the right direction. Unfortunately the current version has three deal-breaking flaws for my environment at the moment.

  1. It requires Java 5
  2. It has a dependency on ASM (the developers are looking to remove this dependency)
  3. While it hasn’t quite piled on the bulk of XStream or Jackson, it still has a few too many classes for me to consider

These constraints make it not quite fit for my purposes, but like all decisions, it depends on your own situation.

url http://code.google.com/p/jsonmarshaller/
classes ~50
Streaming support Yes
Friendly License Yes
Java 1.4 No

JSON.simple

JSON.simple advertises itself as “a simple Java toolkit for JSON”. It provides reading and writing to JSON streams. It’s lightweight and focused on generating JSON from Java code. The critical feature it provides is support for Java IO readers and writers.

url http://code.google.com/p/json-simple/
classes 12
Streaming support Yes
Friendly License Yes
Java 1.4 Yes

Summary 

For the interested, here’s a table that summarises my findings.

                  | org.json | Jackson | XStream | Json Marshaller | JSON.simple
Classes           | 7        | ~250    | >200    | ~50             | 12
Streaming support | No       | Yes     | Yes     | Yes             | Yes
Friendly license  | Yes      | Yes     | Yes     | Yes             | Yes
Java 1.4          | Yes      | Yes     | Yes     | No              | Yes

 

Conclusion

If you are looking for a simple lightweight Java library that reads and writes JSON, and supports Streams, JSON.simple is probably a good match. It does what it says on the box in 12 classes, and works on legacy (1.4) JREs.

Choosing a data storage format

In case you haven’t noticed, XML is not a silver bullet (Google xml+silver+bullet). It is not, and should not be, an automatic choice when thinking of a data storage format. The ubiquitous libraries for working with XML are often hard to use, and are often overkill for a simple storage format. In today’s world, I’d suggest that the following options should be considered (at least briefly).

  1. Native Object Serialisation
  2. Custom format
  3. XML – Extensible Markup Language
  4. YAML – YAML Ain’t Markup Language (obviously created by geeks, given the recursive name)
  5. JSON – JavaScript Object Notation

Join me in having a look at these formats, and I’ll let you know some of the issues to consider. The main problem I’m solving is for data that belongs to your own application. I’m not considering databases or interoperability.

Native Object Serialisation

Consider this briefly before running away. I’m particularly familiar with the idea of Java object serialisation. I’ve used Prevayler in the past, storing both Java objects and XML (so while I’m having a dig at Java object serialisation in general, I’m not specifically having a go at Prevayler).

While native object serialisation is often easy to use, it has costs: it makes the content unreadable by humans, couples the data storage to your implementation language, and can create object migration issues. These costs will typically outweigh the benefits. Even if there were no other reason, having human-readable data to aid debugging would be reason enough to avoid native object serialisation.
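
A small sketch makes these costs visible. It serialises a hypothetical settings class (invented for this example) with Java's built-in ObjectOutputStream and looks at the raw bytes: the output is binary, starts with the serialisation stream's magic number, and embeds the Java class name, which is what ties the stored data to the implementation.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical settings class, used only for this illustration.
class EditorSettings implements Serializable {
    private static final long serialVersionUID = 1L;
    String fontName = "Courier";
    int fontSize = 12;
}

class NativeSerialisationSketch {

    // Serialises the object with Java's built-in mechanism and returns the raw bytes.
    static byte[] serialise(Serializable value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(value);
        out.close();
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = serialise(new EditorSettings());
        // The stream starts with the binary magic number 0xACED, followed by
        // class metadata and field values; it is not meant to be read or
        // edited by hand.
        System.out.printf("%d bytes, starting with %02X %02X%n",
                data.length, data[0] & 0xFF, data[1] & 0xFF);
    }
}
```

Renaming or restructuring EditorSettings later is where the migration issues show up, because previously stored bytes still describe the old class.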

Custom Format

The use of a custom simple text format should not be discarded out of hand. The lack of any third party dependencies is a useful feature, and should be considered. That said, if you have a library that does the parsing for you, that should not be sneezed at.
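
As a rough idea of how little code a custom format can need, here is a sketch of a reader for a "key=value per line" format using only the standard library. The format and the class name are invented for this illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical "key=value per line" format, invented for this illustration.
class SimpleFormatSketch {

    // Reads one key=value pair per line. Blank lines and lines starting
    // with '#' are skipped; a line without '=' is rejected.
    static Map<String, String> read(Reader source) throws IOException {
        Map<String, String> values = new LinkedHashMap<String, String>();
        BufferedReader reader = new BufferedReader(source);
        String line;
        while ((line = reader.readLine()) != null) {
            String trimmed = line.trim();
            if (trimmed.length() == 0 || trimmed.startsWith("#")) {
                continue;
            }
            int split = trimmed.indexOf('=');
            if (split < 0) {
                throw new IOException("Expected key=value but found: " + line);
            }
            String key = trimmed.substring(0, split).trim();
            String value = trimmed.substring(split + 1).trim();
            values.put(key, value);
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        String text = "# editor settings\nfont = Courier\nsize = 12\n";
        System.out.println(read(new StringReader(text)));
    }
}
```

The trade-off is that this has no escaping, nesting or lists. Once the data needs any of those, the hand-written parser grows quickly and a library starts to look better.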

XML

As Wikipedia says, “XML is a general-purpose specification for creating custom mark-up languages” (Wikipedia on XML). Parsers and tools exist for many platforms and environments, which makes it a useful tool when you want to share information between different environments. While it is a good tool, the syntax is verbose and can be hard for humans to read.

XML has influenced the birth of two more recent notations which are useful for data storage: YAML and JSON.

YAML

YAML purports to be “a human friendly data serialization standard for all programming languages” (Yaml.org). It has a well defined specification (YAML Spec), and makes for an easy to understand data storage format. Implementations of YAML exist for a wide range of languages, including Java, C++, Ruby and JavaScript. It’s been around for a while, and has a decent amount of uptake. If it wasn’t for JSON, it would probably be a good default choice.

JSON

At first glance JSON seems much less suitable than YAML for languages other than JavaScript. The kicker against it is that it has “JavaScript” in the name, which has always made people feel icky. That said, it does make for a good cross platform format, it is human readable, and is implemented on a wide range of platforms (Json.org).

JSON also has the advantage of mindshare, and is slightly more familiar to developers than YAML. Every developer who has had anything to do with the web has done stuff with JavaScript, so the basic format will be familiar to them. Also in JSON’s favour is the fact that JSON and YAML are syntactically very close (see Redhanded); JSON appears to be very close to a subset of YAML (Ajaxian). In addition, if there is any possibility that you will be playing with JavaScript, JSON is a very good option because of the native support in JavaScript.

These factors combine to make JSON an excellent choice.

Summary

Tim Bray makes a good case for this being an automatic choice based on your circumstances (http://www.tbray.org/ongoing/When/200x/2006/12/21/JSON). You’ll still need to think about the pros and cons of the different technologies for your situation (see http://webignition.net/articles/xml-vs-yaml-vs-json-a-study-to-find-answers/), but you’ll often find that JSON is a good format to use for data storage.