What's new in Commons Lang 2.4?

Commons Lang 2.4 is out, and the obvious question is: "So what? What's changed?".

This article aims to briefly cover the changes and save you from having to dig through each JIRA issue to see what went on in the year of development between Lang 2.3 and 2.4.

Deprecations

First, let us start with a couple of deprecations. As you can see in the release notes, we chose to deprecate the ObjectUtils.appendIdentityToString(StringBuffer, Object) method as its null handling did not match its design (see LANG-360 for more details). Users should use ObjectUtils.identityToString(StringBuffer, Object) instead.

We also deprecated DateUtils.add(java.util.Date, int, int). It should have been private from the beginning; please let us know if you actually use it.

The build

Before we move on, a quick note on the build: we built 2.4 using Maven 2 and Java 1.4. We also tested that the Ant build passed the tests successfully under Java 1.3, and that the classes compiled under Java 1.2. As it's been so long, we stopped building a Java 1.1-compatible jar. Most importantly, it should be a drop-in replacement for Lang 2.3, but we recommend testing first, of course. Also, for those of you who work within an OSGi framework, the jar should be ready for OSGi. Now... time to move on.

New classes

Three new classes were added, so let's cover those next.

Firstly, we added an IEEE754rUtils class to the org.apache.commons.lang.math package. This candidate for ugly name of the month was needed to add IEEE-754r semantics for some of the NumberUtils methods. The relevant part of that IEEE specification in this case is the NaN handling for min and max methods, and you can read more about it in LANG-381.
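
To make the NaN difference concrete, here is a minimal sketch that uses only the JDK. The class NaNMinSketch and the method ieee754rMin are hypothetical names invented for this illustration; they show the semantics described in LANG-381 and are not the library's source. Math.min returns NaN as soon as either operand is NaN, whereas the IEEE-754r rule returns the other operand when exactly one of them is NaN.

```java
class NaNMinSketch {
  // Hypothetical helper: IEEE-754r style min, which ignores a single NaN operand.
  static double ieee754rMin(double a, double b) {
    if (Double.isNaN(a)) {
      return b; // still NaN if both operands are NaN
    }
    if (Double.isNaN(b)) {
      return a;
    }
    return Math.min(a, b);
  }

  public static void main(String[] args) {
    System.out.println(Math.min(Double.NaN, 1.0));    // NaN
    System.out.println(ieee754rMin(Double.NaN, 1.0)); // 1.0
  }
}
```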

Second and third on our newcomers list are the ExtendedMessageFormat class and its peer FormatFactory interface, both found in the org.apache.commons.lang.text package.

Together they allow you to take the java.text.MessageFormat class further and insert your own formatting elements.

By way of an example, imagine that we have a need for custom formatting of an employee identification number or EIN. Suppose, simplistically, that our EIN is composed of a two-character department code followed by a four-digit number, and that it is customary within our organization to render the EIN with a hyphen following the department identifier. Here we'll represent the EIN as a simple String (of course in real life we would likely create a class composed of department and number). We can create a custom Format class:


import java.text.FieldPosition;
import java.text.Format;
import java.text.ParsePosition;
import java.util.Arrays;

public class EINFormat extends Format {
  private char[] idMask;

  public EINFormat() {
  }
  public EINFormat(char maskChar) {
    idMask = new char[4];
    Arrays.fill(idMask, maskChar);
  }
  public StringBuffer format(Object obj, StringBuffer toAppendTo, FieldPosition pos) {
    String ein = (String) obj; //assume or assert length >= 2
    // append to the supplied buffer and return it, as Format expects
    toAppendTo.append(ein.substring(0, 2)).append('-');
    if (idMask == null) {
      return toAppendTo.append(ein.substring(2));
    }
    return toAppendTo.append(idMask);
  }
  public Object parseObject(String source, ParsePosition pos) {
    int idx = pos.getIndex();
    int endIdx = idx + 7;
    if (source == null || source.length() < endIdx) {
      pos.setErrorIndex(idx);
      return null;
    }
    if (source.charAt(idx + 2) != '-') {
      pos.setErrorIndex(idx);
      return null;
    }
    pos.setIndex(endIdx);
    // drop the hyphen to recover the raw EIN
    return new StringBuffer(source.substring(idx, endIdx)).deleteCharAt(2).toString();
  }
}
Our custom EIN format is made available for MessageFormat-style processing by a FormatFactory implementation:

import java.text.Format;
import java.util.Locale;
import org.apache.commons.lang.text.FormatFactory;

public class EINFormatFactory implements FormatFactory {
  public static final String EIN_FORMAT = "ein";
  public Format getFormat(String name, String arguments, Locale locale) {
    if (EIN_FORMAT.equals(name)) {
      if (arguments == null || "".equals(arguments)) {
        return new EINFormat();
      }
      // the first character of the format style becomes the mask character
      return new EINFormat(arguments.charAt(0));
    }
    return null;
  }
}
Now you simply provide a java.util.Map<String, FormatFactory> registry (keyed by format type) to ExtendedMessageFormat:

new ExtendedMessageFormat("EIN: {0,ein}", Collections.singletonMap(EINFormatFactory.EIN_FORMAT, new EINFormatFactory()));
As expected, this will render a String EIN "AA9999" as: "EIN: AA-9999".

If we wanted to trigger the EIN masking code, we could supply the mask character as the format style in the pattern:

new ExtendedMessageFormat("EIN: {0,ein,#}", Collections.singletonMap(EINFormatFactory.EIN_FORMAT, new EINFormatFactory()));
This should render "AA9999" as: "EIN: AA-####".

You can also use ExtendedMessageFormat to override any or all of the built-in formats supported by java.text.MessageFormat. Finally, note that because ExtendedMessageFormat extends MessageFormat it should work in most cases as a true drop-in replacement.

New methods

There were 58 new methods added to existing Commons Lang classes. Going through them one at a time would be dull, and fortunately there are some nice groupings that we can discuss instead:

CharSet getInstance(String[]) adds an additional builder method by which you can build a CharSet from multiple sets of characters at the same time. If you weren't aware of the CharSet class, it holds a set of characters created by a simple pattern language allowing constructs such as "a-z" and "^a" (everything but 'a'). It's most used by the CharSetUtils class, and came out of CharSetUtils.translate, a simple variant of the UNIX tr command.

ClassUtils canonical name methods are akin to the non 'Canonical' methods, except they work with the more human readable type names such as int[] rather than the JVM versions such as [I. This makes them useful for parsing input from developers' configuration files.
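
As a small illustration of the two naming schemes, the sketch below uses only the JDK (Class.getCanonicalName() requires Java 5 or later). The class name CanonicalNameSketch is made up for this example; it does not call ClassUtils itself, it simply shows the JVM form next to the human readable form that the canonical name methods work with.

```java
class CanonicalNameSketch {
  public static void main(String[] args) {
    // JVM-style name, as returned by Class.getName()
    System.out.println(int[].class.getName());               // [I
    // Human readable name, as returned by Class.getCanonicalName()
    System.out.println(int[].class.getCanonicalName());      // int[]

    System.out.println(String[][].class.getName());          // [[Ljava.lang.String;
    System.out.println(String[][].class.getCanonicalName()); // java.lang.String[][]
  }
}
```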

ClassUtils toClass(Object[]) is very easy to explain - it calls getClass() on each Object in the array and returns an array of Class objects.

ClassUtils wrapper->primitive conversions are the reflection of the pre-existing primitiveToWrapper methods. Again easy to explain, they turn an array of wrapper classes such as Integer.class into an array of the corresponding primitive classes such as int.class.

ObjectUtils identityToString(StringBuffer, Object) is the StringBuffer variant of the pre-existing identityToString method. In case you've not met that before, it produces the toString that would have been produced by an Object if it hadn't been overridden.
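
For readers who have not met the identity form before, here is a rough JDK-only sketch of the string in question. IdentityToStringSketch and appendIdentity are hypothetical names for this illustration, and the sketch assumes a non-null argument; it is not the library's implementation.

```java
class IdentityToStringSketch {
  // Hypothetical helper: append the class name, '@' and the identity hash code in hex,
  // which is what Object.toString() yields when it has not been overridden.
  static StringBuffer appendIdentity(StringBuffer buffer, Object object) {
    return buffer.append(object.getClass().getName())
        .append('@')
        .append(Integer.toHexString(System.identityHashCode(object)));
  }

  public static void main(String[] args) {
    Object plain = new Object();
    String identity = appendIdentity(new StringBuffer(), plain).toString();
    // For a class that does not override toString(), the two forms agree.
    System.out.println(plain.toString().equals(identity)); // true
    // String overrides toString(), so only the identity form shows the class name.
    System.out.println(appendIdentity(new StringBuffer(), "text"));
  }
}
```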

StringEscapeUtils CSV methods are a new addition to our range of simple parser/printers. These, as you would expect, escape and unescape CSV text as per RFC 4180.
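
The RFC 4180 quoting rule for a single field is short enough to sketch with the JDK alone. CsvEscapeSketch and escapeField are hypothetical names for this illustration, not the StringEscapeUtils source: a field that contains a comma, a double quote or a line break is wrapped in double quotes, and any embedded double quote is doubled.

```java
class CsvEscapeSketch {
  // Hypothetical helper: quote one CSV field following the RFC 4180 rules.
  static String escapeField(String field) {
    boolean needsQuotes = field.indexOf(',') >= 0 || field.indexOf('"') >= 0
        || field.indexOf('\n') >= 0 || field.indexOf('\r') >= 0;
    if (!needsQuotes) {
      return field;
    }
    // Double any embedded quotes, then wrap the whole field in quotes.
    return "\"" + field.replace("\"", "\"\"") + "\"";
  }

  public static void main(String[] args) {
    System.out.println(escapeField("plain"));        // plain
    System.out.println(escapeField("a,b"));          // "a,b"
    System.out.println(escapeField("say \"hi\""));   // "say ""hi"""
  }
}
```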

StringUtils has a host of new methods, as always, and we'll leave these for later.

WordUtils abbreviate finds the first space after the lower limit and abbreviates the text.

math.IntRange/LongRange.toArray turn the range into an array of primitive int/longs contained in the range.

text.StrMatcher.isMatch(char[], int) is a helper method for checking whether there was a match with the StrMatcher objects.

time.DateFormatUtils format(Calendar, ...) provides Calendar variants of the pre-existing format methods. If these are new to you, they are helper methods for formatting a date.

time.DateUtils getFragment* methods are used to splice the time element out of Date. If you have 2008/12/13 14:57, then these could, for example, pull out the 13.
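
To show what a "fragment" means for that date, here is a sketch built on java.util.Calendar alone. DateFragmentSketch and minutesOfDay are hypothetical names for this illustration; the sketch reproduces by hand the kind of values the getFragment* methods are described as returning, assuming the local time zone, and does not call DateUtils.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

class DateFragmentSketch {
  // Hypothetical helper: minutes elapsed since the start of the given day.
  static int minutesOfDay(Calendar cal) {
    return cal.get(Calendar.HOUR_OF_DAY) * 60 + cal.get(Calendar.MINUTE);
  }

  public static void main(String[] args) {
    // 2008/12/13 14:57 in the default time zone
    Calendar cal = new GregorianCalendar(2008, Calendar.DECEMBER, 13, 14, 57);
    System.out.println(cal.get(Calendar.DAY_OF_MONTH)); // 13: the day within the month
    System.out.println(minutesOfDay(cal));              // 897: 14 * 60 + 57 minutes within the day
  }
}
```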

time.DateUtils setXxx methods round off our walk through the methods - the setXxx variant of the existing addXxx helper methods.

StringUtils methods

The StringUtils class is a little large, isn't it? Sorry, but it's gotten bigger.

Hopefully the new methods are in many cases self-describing. Rather than spend a lot of time describing them, we'll let you read the Javadoc of the ones that interest you.

What's fixed in Lang 2.4?

In addition to new things, there are the bugfixes. As you can tell from the release notes, there are a good few - 24 in fact according to JIRA. Here are some of the interesting ones:

  • LANG-393 - We fixed EqualsBuilder so that it understands that BigDecimals are equal even when they think they're not. It seems very likely that usually you will want "29.0" and "29.00" to be equal, even if BigDecimal disagrees.
  • LANG-380 - Chances are you'll know if you met this one. Fraction.reduce had an infinite loop if the numerator was 0.
  • LANG-369, LANG-367, LANG-334 - Threading bugs - we improved how things work in concurrency situations for ExceptionUtils, FastDateFormat and Enum.
  • LANG-346 - DateUtils.round was getting things wrong for minutes and seconds.
  • LANG-328 - LocaleUtils.toLocale was broken if there was no country code defined.
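
The BigDecimal behavior behind LANG-393 is easy to reproduce with the JDK alone. The sketch below (BigDecimalEqualitySketch and sameValue are names invented for this example) shows why equals and compareTo disagree for "29.0" and "29.00"; it illustrates the JDK semantics only and makes no claim about how EqualsBuilder is implemented internally.

```java
import java.math.BigDecimal;

class BigDecimalEqualitySketch {
  // Hypothetical helper: numeric equality that ignores the scale.
  static boolean sameValue(BigDecimal a, BigDecimal b) {
    return a.compareTo(b) == 0;
  }

  public static void main(String[] args) {
    BigDecimal a = new BigDecimal("29.0");
    BigDecimal b = new BigDecimal("29.00");
    System.out.println(a.equals(b));     // false: the scales differ (1 versus 2)
    System.out.println(sameValue(a, b)); // true: the numeric values are the same
  }
}
```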

So long, farewell...

Hopefully that was all of interest. Don't forget to download Lang 2.4, or, for the Maven repository users, upgrade your <version> tag to 2.4. Please feel free to raise any questions you might have on the mailing lists, and report bugs or enhancements in the issue tracker.