Class ZipArchiveInputStream

  • All Implemented Interfaces:
    java.io.Closeable, java.lang.AutoCloseable, InputStreamStatistics
    Direct Known Subclasses:
    JarArchiveInputStream

    public class ZipArchiveInputStream
    extends ArchiveInputStream
    implements InputStreamStatistics
    Implements an input stream that can read Zip archives.

    As of Apache Commons Compress it transparently supports Zip64 extensions and thus individual entries and archives larger than 4 GB or with more than 65536 entries.

    The ZipFile class is preferred when reading from files, because ZipArchiveInputStream cannot read the central directory header before returning entries. In particular, ZipArchiveInputStream:

    • may return entries that are not part of the central directory at all and shouldn't be considered part of the archive.
    • may return several entries with the same name.
    • will not return internal or external attributes.
    • may return incomplete extra field data.
    • may return unknown sizes and CRC values for entries until the next entry has been reached if the archive uses the data descriptor feature.
    See Also:
    ZipFile
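
    The last limitation in the list above (sizes unknown until the entry has been read) is a property of streaming over any archive that uses data descriptors. The following is a minimal sketch of that effect, assuming the JDK's java.util.zip.ZipInputStream as a dependency-free stand-in; it does not use ZipArchiveInputStream itself, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Illustration only: uses the JDK's zip streams, not Commons Compress.
final class DataDescriptorSketch {

    /** Builds a one-entry archive in memory. No size is set up front, so a data descriptor is written. */
    static byte[] buildArchive(byte[] payload) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("hello.txt"));
            zip.write(payload);
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Returns the entry size as reported before and after the entry data has been consumed. */
    static long[] sizesBeforeAndAfter(byte[] archive) {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry = zip.getNextEntry();
            long before = entry.getSize();      // the local file header carries no size
            byte[] buffer = new byte[256];
            while (zip.read(buffer) != -1) {
                // drain the entry so the data descriptor gets parsed
            }
            long after = entry.getSize();       // filled in from the data descriptor
            return new long[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] archive = buildArchive("hello".getBytes(StandardCharsets.UTF_8));
        long[] sizes = sizesBeforeAndAfter(archive);
        System.out.println("before=" + sizes[0] + ", after=" + sizes[1]);
    }
}
```

    When the archive is available as a file, reading the central directory first avoids this two-step behaviour, which is why ZipFile is the preferred choice there.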
    • Constructor Summary

      Constructors 
      Constructor Description
      ZipArchiveInputStream​(java.io.InputStream inputStream)
      Create an instance using UTF-8 encoding
      ZipArchiveInputStream​(java.io.InputStream inputStream, java.lang.String encoding)
      Create an instance using the specified encoding
      ZipArchiveInputStream​(java.io.InputStream inputStream, java.lang.String encoding, boolean useUnicodeExtraFields)
      Create an instance using the specified encoding
      ZipArchiveInputStream​(java.io.InputStream inputStream, java.lang.String encoding, boolean useUnicodeExtraFields, boolean allowStoredEntriesWithDataDescriptor)
      Create an instance using the specified encoding
    • Method Summary

      Modifier and Type Method Description
      private boolean bufferContainsSignature​(java.io.ByteArrayOutputStream bos, int offset, int lastRead, int expectedDDLen)
      Checks whether the current buffer contains the signature of a "data descriptor", "local file header" or "central directory entry".
      private int cacheBytesRead​(java.io.ByteArrayOutputStream bos, int offset, int lastRead, int expecteDDLen)
      If the last read bytes could hold a data descriptor and an incomplete signature then save the last bytes to the front of the buffer and cache everything in front of the potential data descriptor into the given ByteArrayOutputStream.
      boolean canReadEntryData​(ArchiveEntry ae)
      Whether this class is able to read the given entry.
      private static boolean checksig​(byte[] signature, byte[] expected)  
      void close()  
      private void closeEntry()
      Closes the current ZIP archive entry and positions the underlying stream to the beginning of the next entry.
      private boolean currentEntryHasOutstandingBytes()
      If the compressed size of the current entry is included in the entry header and there are any outstanding bytes in the underlying stream, then this returns true.
      private void drainCurrentEntryData()
      Reads all data of the current entry that hasn't been read yet from the underlying stream.
      private int fill()  
      private void findEocdRecord()
      Reads forward until the signature of the "End of central directory" record is found.
      private long getBytesInflated()
      Get the number of bytes Inflater has actually processed.
      long getCompressedCount()  
      ArchiveEntry getNextEntry()
      Returns the next Archive Entry in this Stream.
      ZipArchiveEntry getNextZipEntry()  
      long getUncompressedCount()  
      private boolean isApkSigningBlock​(byte[] suspectLocalFileHeader)
      Checks whether this might be an APK Signing Block.
      private boolean isFirstByteOfEocdSig​(int b)  
      static boolean matches​(byte[] signature, int length)
      Checks if the signature matches what is expected for a zip file.
      private void processZip64Extra​(ZipLong size, ZipLong cSize)
      Records whether a Zip64 extra is present and sets the size information from it if sizes are 0xFFFFFFFF and the entry doesn't use a data descriptor.
      private void pushback​(byte[] buf, int offset, int length)  
      int read​(byte[] buffer, int offset, int length)  
      private void readDataDescriptor()  
      private int readDeflated​(byte[] buffer, int offset, int length)
      Implementation of read for DEFLATED entries.
      private void readFirstLocalFileHeader​(byte[] lfh)
      Fills the given array with the first local file header and deals with splitting/spanning markers that may prefix the first LFH.
      private int readFromInflater​(byte[] buffer, int offset, int length)
      Potentially reads more bytes to fill the inflater's buffer and reads from it.
      private void readFully​(byte[] b)  
      private void readFully​(byte[] b, int off)  
      private int readOneByte()
      Reads bytes by reading from the underlying stream rather than the (potentially inflating) archive stream - which read(byte[], int, int) would do.
      private int readStored​(byte[] buffer, int offset, int length)
      Implementation of read for STORED entries.
      private void readStoredEntry()
      Caches a stored entry that uses the data descriptor.
      private void realSkip​(long value)
      Skips bytes by reading from the underlying stream rather than the (potentially inflating) archive stream - which skip(long) would do.
      long skip​(long value)
      Skips over and discards value bytes of data from this input stream.
      private void skipRemainderOfArchive()
      Reads the stream until it finds the "End of central directory record" and consumes it as well.
      private boolean supportsCompressedSizeFor​(ZipArchiveEntry entry)
      Whether the compressed size for the entry is either known or not required by the compression method being used.
      private boolean supportsDataDescriptorFor​(ZipArchiveEntry entry)
      Whether this entry requires a data descriptor this library can work with.
      • Methods inherited from class java.io.InputStream

        available, mark, markSupported, nullInputStream, read, readAllBytes, readNBytes, readNBytes, reset, transferTo
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • zipEncoding

        private final ZipEncoding zipEncoding
        The zip encoding to use for file names and the file comment.
      • encoding

        final java.lang.String encoding
      • useUnicodeExtraFields

        private final boolean useUnicodeExtraFields
        Whether to look for and use Unicode extra fields.
      • in

        private final java.io.InputStream in
        Wrapped stream, will always be a PushbackInputStream.
      • inf

        private final java.util.zip.Inflater inf
        Inflater used for all deflated entries.
      • buf

        private final java.nio.ByteBuffer buf
        Buffer used to read from the wrapped stream.
      • closed

        private boolean closed
        Whether the stream has been closed.
      • hitCentralDirectory

        private boolean hitCentralDirectory
        Whether the stream has reached the central directory - and thus found all entries.
      • lastStoredEntry

        private java.io.ByteArrayInputStream lastStoredEntry
        When reading a stored entry that uses the data descriptor, this stream has to read the full entry and cache it. This is the cache.
      • allowStoredEntriesWithDataDescriptor

        private boolean allowStoredEntriesWithDataDescriptor
        Whether the stream will try to read STORED entries that use a data descriptor.
      • uncompressedCount

        private long uncompressedCount
        Count of decompressed bytes for the current entry.
      • lfhBuf

        private final byte[] lfhBuf
      • skipBuf

        private final byte[] skipBuf
      • shortBuf

        private final byte[] shortBuf
      • wordBuf

        private final byte[] wordBuf
      • twoDwordBuf

        private final byte[] twoDwordBuf
      • entriesRead

        private int entriesRead
      • USE_ZIPFILE_INSTEAD_OF_STREAM_DISCLAIMER

        private static final java.lang.String USE_ZIPFILE_INSTEAD_OF_STREAM_DISCLAIMER
        See Also:
        Constant Field Values
      • LFH

        private static final byte[] LFH
      • CFH

        private static final byte[] CFH
      • DD

        private static final byte[] DD
      • APK_SIGNING_BLOCK_MAGIC

        private static final byte[] APK_SIGNING_BLOCK_MAGIC
      • LONG_MAX

        private static final java.math.BigInteger LONG_MAX
    • Constructor Detail

      • ZipArchiveInputStream

        public ZipArchiveInputStream​(java.io.InputStream inputStream)
        Create an instance using UTF-8 encoding
        Parameters:
        inputStream - the stream to wrap
      • ZipArchiveInputStream

        public ZipArchiveInputStream​(java.io.InputStream inputStream,
                                     java.lang.String encoding)
        Create an instance using the specified encoding
        Parameters:
        inputStream - the stream to wrap
        encoding - the encoding to use for file names, use null for the platform's default encoding
        Since:
        1.5
      • ZipArchiveInputStream

        public ZipArchiveInputStream​(java.io.InputStream inputStream,
                                     java.lang.String encoding,
                                     boolean useUnicodeExtraFields)
        Create an instance using the specified encoding
        Parameters:
        inputStream - the stream to wrap
        encoding - the encoding to use for file names, use null for the platform's default encoding
        useUnicodeExtraFields - whether to use InfoZIP Unicode Extra Fields (if present) to set the file names.
      • ZipArchiveInputStream

        public ZipArchiveInputStream​(java.io.InputStream inputStream,
                                     java.lang.String encoding,
                                     boolean useUnicodeExtraFields,
                                     boolean allowStoredEntriesWithDataDescriptor)
        Create an instance using the specified encoding
        Parameters:
        inputStream - the stream to wrap
        encoding - the encoding to use for file names, use null for the platform's default encoding
        useUnicodeExtraFields - whether to use InfoZIP Unicode Extra Fields (if present) to set the file names.
        allowStoredEntriesWithDataDescriptor - whether the stream will try to read STORED entries that use a data descriptor
        Since:
        1.1
    • Method Detail

      • getNextZipEntry

        public ZipArchiveEntry getNextZipEntry()
                                        throws java.io.IOException
        Throws:
        java.io.IOException
      • readFirstLocalFileHeader

        private void readFirstLocalFileHeader​(byte[] lfh)
                                       throws java.io.IOException
        Fills the given array with the first local file header and deals with splitting/spanning markers that may prefix the first LFH.
        Throws:
        java.io.IOException
      • processZip64Extra

        private void processZip64Extra​(ZipLong size,
                                       ZipLong cSize)
        Records whether a Zip64 extra is present and sets the size information from it if sizes are 0xFFFFFFFF and the entry doesn't use a data descriptor.
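
        The rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: the class, method and parameter names are hypothetical, and a null zip64Size stands for "no Zip64 extra field present".

```java
// Illustration only: a simplified model of choosing between header and Zip64 sizes.
final class Zip64SizeSketch {

    /** 0xFFFFFFFF in a 32-bit size field signals that the real value lives in the Zip64 extra field. */
    static final long ZIP64_MAGIC = 0xFFFFFFFFL;

    /**
     * Returns the size to use for an entry.
     *
     * @param headerSize         the 32-bit size read from the local file header
     * @param usesDataDescriptor whether the entry defers its sizes to a data descriptor
     * @param zip64Size          the size from the Zip64 extra field, or null if there is none
     */
    static long effectiveSize(long headerSize, boolean usesDataDescriptor, Long zip64Size) {
        boolean deferToExtra = !usesDataDescriptor
                && headerSize == ZIP64_MAGIC
                && zip64Size != null;
        return deferToExtra ? zip64Size : headerSize;
    }

    public static void main(String[] args) {
        System.out.println(effectiveSize(ZIP64_MAGIC, false, 5_000_000_000L));
        System.out.println(effectiveSize(1234L, false, null));
    }
}
```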
      • getNextEntry

        public ArchiveEntry getNextEntry()
                                  throws java.io.IOException
        Description copied from class: ArchiveInputStream
        Returns the next Archive Entry in this Stream.
        Specified by:
        getNextEntry in class ArchiveInputStream
        Returns:
        the next entry, or null if there are no more entries
        Throws:
        java.io.IOException - if the next entry could not be read
      • canReadEntryData

        public boolean canReadEntryData​(ArchiveEntry ae)
        Whether this class is able to read the given entry.

        May return false if the entry uses encryption or a compression method that hasn't been implemented yet.

        Overrides:
        canReadEntryData in class ArchiveInputStream
        Parameters:
        ae - the entry to test
        Returns:
        true if this class can read the data of the given entry, false otherwise.
        Since:
        1.1
      • read

        public int read​(byte[] buffer,
                        int offset,
                        int length)
                 throws java.io.IOException
        Overrides:
        read in class java.io.InputStream
        Throws:
        java.io.IOException
      • getCompressedCount

        public long getCompressedCount()
        Specified by:
        getCompressedCount in interface InputStreamStatistics
        Returns:
        the amount of raw or compressed bytes read by the stream
        Since:
        1.17
      • getUncompressedCount

        public long getUncompressedCount()
        Specified by:
        getUncompressedCount in interface InputStreamStatistics
        Returns:
        the amount of decompressed bytes returned by the stream
        Since:
        1.17
      • readStored

        private int readStored​(byte[] buffer,
                               int offset,
                               int length)
                        throws java.io.IOException
        Implementation of read for STORED entries.
        Throws:
        java.io.IOException
      • readDeflated

        private int readDeflated​(byte[] buffer,
                                 int offset,
                                 int length)
                          throws java.io.IOException
        Implementation of read for DEFLATED entries.
        Throws:
        java.io.IOException
      • readFromInflater

        private int readFromInflater​(byte[] buffer,
                                     int offset,
                                     int length)
                              throws java.io.IOException
        Potentially reads more bytes to fill the inflater's buffer and reads from it.
        Throws:
        java.io.IOException
      • close

        public void close()
                   throws java.io.IOException
        Specified by:
        close in interface java.lang.AutoCloseable
        Specified by:
        close in interface java.io.Closeable
        Overrides:
        close in class java.io.InputStream
        Throws:
        java.io.IOException
      • skip

        public long skip​(long value)
                  throws java.io.IOException
        Skips over and discards value bytes of data from this input stream.

        This implementation may end up skipping over some smaller number of bytes, possibly 0, if and only if it reaches the end of the underlying stream.

        The actual number of bytes skipped is returned.

        Overrides:
        skip in class java.io.InputStream
        Parameters:
        value - the number of bytes to be skipped.
        Returns:
        the actual number of bytes skipped.
        Throws:
        java.io.IOException - if an I/O error occurs.
        java.lang.IllegalArgumentException - if value is negative.
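
        The contract above (skip by consuming data, return fewer bytes only at end of stream, reject negative values) can be sketched against a plain java.io.InputStream. This is an assumption-laden illustration with hypothetical names, not the method's actual implementation; IOException is wrapped in UncheckedIOException only to keep the sketch short.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Illustration only: skipping by reading into a scratch buffer.
final class SkipByReadingSketch {

    /** Skips up to value bytes and returns how many were actually skipped. */
    static long skipFully(InputStream in, long value) {
        if (value < 0) {
            throw new IllegalArgumentException("value must not be negative");
        }
        byte[] scratch = new byte[1024];
        long skipped = 0;
        try {
            while (skipped < value) {
                int chunk = (int) Math.min(scratch.length, value - skipped);
                int read = in.read(scratch, 0, chunk);
                if (read == -1) {
                    break; // end of stream: report the smaller count
                }
                skipped += read;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return skipped;
    }

    public static void main(String[] args) {
        InputStream in = new ByteArrayInputStream(new byte[10]);
        System.out.println(skipFully(in, 4));
        System.out.println(skipFully(in, 100));
    }
}
```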
      • matches

        public static boolean matches​(byte[] signature,
                                      int length)
        Checks if the signature matches what is expected for a zip file. Does not currently handle self-extracting zips which may have arbitrary leading content.
        Parameters:
        signature - the bytes to check
        length - the number of bytes to check
        Returns:
        true, if this stream is a zip archive stream, false otherwise
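
        A signature check of this kind compares the first bytes of the data against known ZIP record signatures. The sketch below is a simplified illustration that only tests for two well-known signatures from the ZIP format; the set of signatures the real method accepts may be larger, and the class and method names are hypothetical.

```java
// Illustration only: a reduced signature check, not the library's matches method.
final class ZipSignatureSketch {

    // "PK\003\004": signature that starts a local file header.
    private static final byte[] LOCAL_FILE_HEADER = {0x50, 0x4B, 0x03, 0x04};
    // "PK\005\006": end of central directory record, which is what an empty archive starts with.
    private static final byte[] END_OF_CENTRAL_DIRECTORY = {0x50, 0x4B, 0x05, 0x06};

    /** Returns true if the first four of the given bytes match one of the signatures above. */
    static boolean looksLikeZip(byte[] signature, int length) {
        if (signature == null || length < 4 || signature.length < 4) {
            return false;
        }
        return startsWith(signature, LOCAL_FILE_HEADER)
                || startsWith(signature, END_OF_CENTRAL_DIRECTORY);
    }

    private static boolean startsWith(byte[] data, byte[] expected) {
        for (int i = 0; i < expected.length; i++) {
            if (data[i] != expected[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] header = {0x50, 0x4B, 0x03, 0x04, 0x14, 0x00};
        System.out.println(looksLikeZip(header, header.length));
    }
}
```

        Because only the leading bytes are inspected, a self-extracting archive with leading executable content would not be recognised by such a check.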
      • checksig

        private static boolean checksig​(byte[] signature,
                                        byte[] expected)
      • closeEntry

        private void closeEntry()
                         throws java.io.IOException
        Closes the current ZIP archive entry and positions the underlying stream to the beginning of the next entry. All per-entry variables and data structures are cleared.

        If the compressed size of this entry is included in the entry header, then any outstanding bytes are simply skipped from the underlying stream without uncompressing them. This allows an entry to be safely closed even if the compression method is unsupported.

        In case we don't know the compressed size of this entry or have already buffered too much data from the underlying stream to support uncompression, then the uncompression process is completed and the end position of the stream is adjusted based on the result of that process.

        Throws:
        java.io.IOException - if an error occurs
      • currentEntryHasOutstandingBytes

        private boolean currentEntryHasOutstandingBytes()
        If the compressed size of the current entry is included in the entry header and there are any outstanding bytes in the underlying stream, then this returns true.
        Returns:
        true, if current entry is determined to have outstanding bytes, false otherwise
      • drainCurrentEntryData

        private void drainCurrentEntryData()
                                    throws java.io.IOException
        Reads all data of the current entry that hasn't been read yet from the underlying stream.
        Throws:
        java.io.IOException
      • getBytesInflated

        private long getBytesInflated()
        Get the number of bytes Inflater has actually processed.

        For Java versions before Java 7, the getBytes* methods in Inflater/Deflater seem to return unsigned ints rather than longs, so the value starts over at 0 once it reaches 2^32.

        The stream knows how many bytes it has read, but not how many the Inflater actually consumed - it should be between the total number of bytes read for the entry and the total number minus the last read operation. Here we just try to make the value close enough to the bytes we've read by assuming the number of bytes consumed must be smaller than (or equal to) the number of bytes read but not smaller by more than 2^32.
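
        The correction described above can be sketched as a small pure function: keep adding 2^32 to the wrapped counter for as long as the result does not exceed the number of bytes read for the entry. The names below are hypothetical and the code is an illustration of the heuristic, not a copy of the private method.

```java
// Illustration only: undoing a 32-bit wrap-around of a byte counter.
final class InflatedBytesSketch {

    private static final long TWO_EXP_32 = 1L << 32;

    /**
     * @param reportedByInflater the possibly wrapped counter value
     * @param bytesReadForEntry  how many bytes the stream has read for the current entry
     * @return the largest value of the form reported + k * 2^32 that does not exceed
     *         bytesReadForEntry, or the reported value unchanged if no such k >= 1 exists
     */
    static long correctWrappedCount(long reportedByInflater, long bytesReadForEntry) {
        long corrected = reportedByInflater;
        while (corrected + TWO_EXP_32 <= bytesReadForEntry) {
            corrected += TWO_EXP_32;
        }
        return corrected;
    }

    public static void main(String[] args) {
        long read = TWO_EXP_32 + 100;
        System.out.println(correctWrappedCount(5, read) == TWO_EXP_32 + 5);
    }
}
```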

      • fill

        private int fill()
                  throws java.io.IOException
        Throws:
        java.io.IOException
      • readFully

        private void readFully​(byte[] b)
                        throws java.io.IOException
        Throws:
        java.io.IOException
      • readFully

        private void readFully​(byte[] b,
                               int off)
                        throws java.io.IOException
        Throws:
        java.io.IOException
      • readDataDescriptor

        private void readDataDescriptor()
                                 throws java.io.IOException
        Throws:
        java.io.IOException
      • supportsDataDescriptorFor

        private boolean supportsDataDescriptorFor​(ZipArchiveEntry entry)
        Whether this entry requires a data descriptor this library can work with.
        Returns:
        true if allowStoredEntriesWithDataDescriptor is true, the entry doesn't require any data descriptor or the method is DEFLATED or ENHANCED_DEFLATED.
      • supportsCompressedSizeFor

        private boolean supportsCompressedSizeFor​(ZipArchiveEntry entry)
        Whether the compressed size for the entry is either known or not required by the compression method being used.
      • readStoredEntry

        private void readStoredEntry()
                              throws java.io.IOException
        Caches a stored entry that uses the data descriptor.
        • Reads a stored entry until the signature of a local file header, central directory header or data descriptor has been found.
        • Stores all entry data in lastStoredEntry.
        • Rewinds the stream to position it at the data descriptor.
        • Reads the data descriptor.

        After calling this method the entry should know its size, the entry's data is cached and the stream is positioned at the next local file or central directory header.

        Throws:
        java.io.IOException
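
        The first step of the procedure above, looking for one of three record signatures in buffered data, can be sketched as a plain byte-array scan. This is a minimal illustration with hypothetical names; it ignores buffering across reads, which the real method has to handle.

```java
// Illustration only: finding the first ZIP record signature in a buffer.
final class SignatureScanSketch {

    private static final byte[][] SIGNATURES = {
        {0x50, 0x4B, 0x03, 0x04}, // local file header
        {0x50, 0x4B, 0x01, 0x02}, // central directory file header
        {0x50, 0x4B, 0x07, 0x08}, // data descriptor
    };

    /** Returns the index of the first signature within the first length bytes, or -1. */
    static int indexOfSignature(byte[] buffer, int length) {
        for (int i = 0; i + 4 <= length; i++) {
            for (byte[] sig : SIGNATURES) {
                if (buffer[i] == sig[0] && buffer[i + 1] == sig[1]
                        && buffer[i + 2] == sig[2] && buffer[i + 3] == sig[3]) {
                    return i;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3, 0x50, 0x4B, 0x07, 0x08, 9};
        System.out.println(indexOfSignature(data, data.length));
    }
}
```

        A scan like this cannot tell a real record from entry data that happens to contain the same four bytes, which is an inherent limitation of reading STORED entries without a known size.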
      • bufferContainsSignature

        private boolean bufferContainsSignature​(java.io.ByteArrayOutputStream bos,
                                                int offset,
                                                int lastRead,
                                                int expectedDDLen)
                                         throws java.io.IOException
        Checks whether the current buffer contains the signature of a "data descriptor", "local file header" or "central directory entry".

        If it contains such a signature, reads the data descriptor and positions the stream right after the data descriptor.

        Throws:
        java.io.IOException
      • cacheBytesRead

        private int cacheBytesRead​(java.io.ByteArrayOutputStream bos,
                                   int offset,
                                   int lastRead,
                                   int expecteDDLen)
        If the last read bytes could hold a data descriptor and an incomplete signature then save the last bytes to the front of the buffer and cache everything in front of the potential data descriptor into the given ByteArrayOutputStream.

        Data descriptor plus incomplete signature (3 bytes in the worst case) can be 20 bytes max.

      • pushback

        private void pushback​(byte[] buf,
                              int offset,
                              int length)
                       throws java.io.IOException
        Throws:
        java.io.IOException
      • skipRemainderOfArchive

        private void skipRemainderOfArchive()
                                     throws java.io.IOException
        Reads the stream until it finds the "End of central directory record" and consumes it as well.
        Throws:
        java.io.IOException
      • findEocdRecord

        private void findEocdRecord()
                             throws java.io.IOException
        Reads forward until the signature of the "End of central directory" record is found.
        Throws:
        java.io.IOException
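
        Reading forward to a signature can be sketched as a byte-at-a-time matcher over a plain java.io.InputStream. The code below is an illustration with hypothetical names, not the private method itself; it relies on the fact that the signature bytes "PK\005\006" contain the first byte only once, so a mismatch only needs to check whether the current byte restarts the match.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Illustration only: scanning a stream for the end of central directory signature.
final class EocdScanSketch {

    private static final int[] EOCD_SIG = {0x50, 0x4B, 0x05, 0x06};

    /**
     * Consumes bytes until the signature has been read completely.
     *
     * @return true if the signature was found (the stream is then positioned right after it),
     *         false if the end of the stream was reached first
     */
    static boolean skipPastEocdSignature(InputStream in) {
        int matched = 0;
        try {
            int b;
            while ((b = in.read()) != -1) {
                if (b == EOCD_SIG[matched]) {
                    matched++;
                    if (matched == EOCD_SIG.length) {
                        return true;
                    }
                } else {
                    // Restart, but let the current byte count if it is the first signature byte.
                    matched = (b == EOCD_SIG[0]) ? 1 : 0;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return false;
    }

    public static void main(String[] args) {
        byte[] data = {1, 0x50, 0x4B, 0x50, 0x4B, 0x05, 0x06, 9};
        System.out.println(skipPastEocdSignature(new ByteArrayInputStream(data)));
    }
}
```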
      • realSkip

        private void realSkip​(long value)
                       throws java.io.IOException
        Skips bytes by reading from the underlying stream rather than the (potentially inflating) archive stream - which skip(long) would do. Also updates bytes-read counter.
        Throws:
        java.io.IOException
      • readOneByte

        private int readOneByte()
                         throws java.io.IOException
        Reads bytes by reading from the underlying stream rather than the (potentially inflating) archive stream - which read(byte[], int, int) would do. Also updates bytes-read counter.
        Throws:
        java.io.IOException
      • isFirstByteOfEocdSig

        private boolean isFirstByteOfEocdSig​(int b)
      • isApkSigningBlock

        private boolean isApkSigningBlock​(byte[] suspectLocalFileHeader)
                                   throws java.io.IOException
        Checks whether this might be an APK Signing Block.

        Unfortunately, the APK signing block does not start with a signature; it ends with one instead. It starts with a length, so this method parses the suspected length, skips ahead far enough, looks for the signature and returns true if it is found.

        Parameters:
        suspectLocalFileHeader - the bytes read from the underlying stream in the expectation that they would hold the local file header of the next entry.
        Returns:
        true if this looks like an APK signing block
        Throws:
        java.io.IOException
        See Also:
        https://source.android.com/security/apksigning/v2
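
        The check described above can be sketched on an in-memory byte array, assuming the block layout from the linked Android documentation: an 8-byte little-endian size that excludes the size field itself, the payload, the size repeated, and the 16-byte magic "APK Sig Block 42" at the very end. The names are hypothetical, and the real method works on the stream rather than on a complete array.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustration only: recognising an APK signing block held fully in memory.
final class ApkSigningBlockSketch {

    private static final byte[] MAGIC = "APK Sig Block 42".getBytes(StandardCharsets.US_ASCII);
    private static final int SIZE_FIELD = 8;

    /** Returns true if data starts with something shaped like an APK signing block. */
    static boolean looksLikeSigningBlock(byte[] data) {
        if (data.length < SIZE_FIELD) {
            return false;
        }
        long size = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN).getLong(0);
        // The block must at least hold the repeated size and the magic, and must fit in data.
        if (size < SIZE_FIELD + MAGIC.length || size > data.length - SIZE_FIELD) {
            return false;
        }
        int magicStart = (int) (SIZE_FIELD + size - MAGIC.length);
        byte[] tail = Arrays.copyOfRange(data, magicStart, magicStart + MAGIC.length);
        return Arrays.equals(tail, MAGIC);
    }

    /** Builds a synthetic block with a zero-filled payload, for demonstration purposes. */
    static byte[] sampleBlock(int payloadLength) {
        long size = (long) payloadLength + SIZE_FIELD + MAGIC.length;
        ByteBuffer block = ByteBuffer.allocate((int) (SIZE_FIELD + size))
                .order(ByteOrder.LITTLE_ENDIAN);
        block.putLong(size);
        block.put(new byte[payloadLength]);
        block.putLong(size);
        block.put(MAGIC);
        return block.array();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeSigningBlock(sampleBlock(4)));
    }
}
```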