This page lists the known limitations and problems of Apache
Commons Compress™ grouped by the archiving/compression
format they apply to.
General
- Several implementations of decompressors and unarchivers will
invoke skip on the underlying InputStream, which
may throw an IOException in some stream
implementations. One known case where this happens is when using
System.in as input. If you encounter an
exception with a message like "Illegal seek" we recommend you
wrap your stream in a SkipShieldingInputStream
from our utils package before passing it to Compress.
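As a sketch of this workaround, the following wraps System.in in a SkipShieldingInputStream before handing it to a TarArchiveInputStream; the choice of TAR here is only an illustration, and the program simply lists entry names:

```java
import java.io.IOException;

import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import org.apache.commons.compress.utils.SkipShieldingInputStream;

public class ListTarFromStdin {
    public static void main(String[] args) throws IOException {
        // SkipShieldingInputStream emulates skip() with read(),
        // so the underlying stream is never asked to seek.
        try (TarArchiveInputStream in = new TarArchiveInputStream(
                new SkipShieldingInputStream(System.in))) {
            TarArchiveEntry entry;
            while ((entry = in.getNextTarEntry()) != null) {
                System.out.println(entry.getName());
            }
        }
    }
}
```

This could then be run as, for example, `cat archive.tar | java ListTarFromStdin`.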
- Commons Compress prior to 1.21 cannot be built on JDK 14 or newer.
7Z
- the format requires the otherwise optional XZ for Java
library.
- only File is supported as input/output,
not streams. Starting with Compress 1.13
SeekableByteChannel is supported as well.
- In Compress 1.7
ArchiveStreamFactory will not auto-detect 7z
archives; starting with 1.8 it will throw a
StreamingNotSupportedException when reading from
a 7z archive.
- Encryption, solid compression and header compression
are only supported when reading archives.
- Commons Compress 1.12 and earlier didn't support writing
LZMA.
- Several of the "methods" supported by 7z are not
implemented in Compress.
- No support for writing multi-volume archives. Such
archives can be read by simply concatenating the parts, for
example by using
MultiReadOnlySeekableByteChannel.
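A minimal sketch of reading such an archive might look as follows; the part file names used here are hypothetical and assume the usual `.001`/`.002` volume naming:

```java
import java.io.File;
import java.io.IOException;
import java.nio.channels.SeekableByteChannel;

import org.apache.commons.compress.archivers.sevenz.SevenZArchiveEntry;
import org.apache.commons.compress.archivers.sevenz.SevenZFile;
import org.apache.commons.compress.utils.MultiReadOnlySeekableByteChannel;

public class ListMultiVolume7z {
    public static void main(String[] args) throws IOException {
        // Concatenate the volume parts into one logical read-only channel.
        SeekableByteChannel channel = MultiReadOnlySeekableByteChannel.forFiles(
                new File("archive.7z.001"),   // example part names
                new File("archive.7z.002"));
        // SevenZFile accepts a SeekableByteChannel since Compress 1.13.
        try (SevenZFile sevenZ = new SevenZFile(channel)) {
            SevenZArchiveEntry entry;
            while ((entry = sevenZ.getNextEntry()) != null) {
                System.out.println(entry.getName());
            }
        }
    }
}
```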
- Support for some BCJ filters and the DELTA filter has
been added with Compress 1.8. Because of a known bug in
version 1.4 of the XZ for Java
library, archives using BCJ filters will cause an
AssertionError when read. If you need support
for BCJ filters you must use XZ for Java 1.5 or later.
AR
- AR archives cannot contain directories - this is a
limitation of the format rather than one of Compress'
implementation.
- file names longer than 16 characters are only fully
supported using the BSD dialect; the GNU/SVR4 dialect is only
supported when reading archives.
ARJ
- read-only support
- no support for compression, encryption or multi-volume
archives
Brotli
- the format requires the otherwise optional Google Brotli dec
library.
- read-only support
- CompressorStreamFactory is not able to auto-detect
streams using Brotli compression.
BZIP2
Versions of Compress prior to 1.4.1 are vulnerable to a
possible denial of service attack, see the Security Reports page for details.
CPIO
We are not aware of any problems.
DEFLATE
- CompressorStreamFactory is not able to auto-detect
streams using DEFLATE compression.
DEFLATE64
- CompressorStreamFactory is not able to auto-detect
streams using DEFLATE64 compression.
- read-only support
DUMP
- read-only support
- only the new-fs format is supported
- the only compression algorithm supported is zlib
GZIP
We are not aware of any problems.
JAR
- JAR archives are special ZIP archives; all limitations of ZIP apply to JAR as well.
- ArchiveStreamFactory cannot tell JAR
archives from ZIP archives and will not auto-detect
JARs.
- Compress doesn't provide special access to the archive's
MANIFEST
LZ4
- In theory LZ4 compressed streams can contain literals and
copies of arbitrary length while Commons Compress only
supports sizes up to 2^63 - 1 (i.e. ≈ 9.2
EB).
LZMA
- the format requires the otherwise optional XZ for Java
library.
- Commons Compress 1.12 and earlier only support reading
the format.
PACK200
- Pack200 support in Commons Compress prior to 1.21 relies on the
Pack200 class of the Java classlib. Java 14
removed support and thus Pack200 will not work at all when
running on Java 14 or later.
Starting with Commons Compress 1.21 the classlib
implementation is no longer used at all, instead Commons
Compress contains the pack200 code of the retired Apache
Harmony™ project.
SNAPPY
- Commons Compress 1.13 and earlier only support reading
the format.
TAR
- sparse files could not be read in versions prior to
Compress 1.20
- sparse files cannot be written
- only a subset of the GNU and POSIX extensions are
supported
- In Compress 1.6
TarArchiveInputStream could
fail to read the full contents of an entry unless the stream
was wrapped in a buffering stream.
XZ
- the format requires the otherwise optional XZ for Java
library.
Z
- Prior to Compress 1.8.1
CompressorStreamFactory was not able to
auto-detect streams using .Z compression.
- read-only support
ZIP
- ZipArchiveInputStream is limited and may
even return false contents in some cases; use
ZipFile whenever possible. See the ZIP
documentation page for details. This limitation is a
result of streaming data vs using random access and not a
limitation of Compress' specific implementation.
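A minimal sketch of the recommended ZipFile approach, here just listing each entry's name and size from a hypothetical archive.zip:

```java
import java.io.File;
import java.io.IOException;
import java.util.Enumeration;

import org.apache.commons.compress.archivers.zip.ZipArchiveEntry;
import org.apache.commons.compress.archivers.zip.ZipFile;

public class ListZipEntries {
    public static void main(String[] args) throws IOException {
        // ZipFile reads the central directory via random access, so it sees
        // the authoritative entry metadata that a pure streaming read cannot.
        try (ZipFile zip = new ZipFile(new File("archive.zip"))) {
            Enumeration<ZipArchiveEntry> entries = zip.getEntries();
            while (entries.hasMoreElements()) {
                ZipArchiveEntry entry = entries.nextElement();
                System.out.println(entry.getName()
                        + " (" + entry.getSize() + " bytes)");
            }
        }
    }
}
```

Entry contents, when needed, can be obtained per entry via `zip.getInputStream(entry)`.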
- only a subset of compression methods are supported,
including the most common STORED and DEFLATEd. IMPLODE,
SHRINK, DEFLATE64 and BZIP2 support is read-only.
- no support for encryption
- no support for multi-volume archives prior to Compress 1.20
- It is currently not possible to write split archives with
more than 64k segments. When creating split archives with more
than 100 segments you will need to adjust the file names as
ZipArchiveOutputStream assumes extensions will be
three characters long.
- In versions prior to Compress 1.6
ZipArchiveEntries read from an archive will
contain non-zero millisecond values when using Java 8 or later rather
than the expected two-second granularity.
- Compress 1.7 has a known bug where the very first entry
of an archive will not be read correctly by
ZipArchiveInputStream if it used the STORED
method.
- ZipArchiveEntry#getLastModifiedDate uses
ZipEntry#getTime under the covers, which may
return different times for the same archive when using
different versions of Java.
- In versions of Compress prior to 1.16 a specially crafted
ZIP archive can be used to cause an infinite loop inside of
Compress' extra field parser used by the
ZipFile
and ZipArchiveInputStream classes. This can be
used to mount a denial of service attack against services
that use Compress' zip package. See the Security Reports page for
details.
Zstandard
- the format requires the otherwise optional Zstandard JNI
library.