Class ZipArchiveInputStream
- java.lang.Object
  - java.io.InputStream
    - org.apache.commons.compress.archivers.ArchiveInputStream
      - org.apache.commons.compress.archivers.zip.ZipArchiveInputStream
- All Implemented Interfaces:
java.io.Closeable, java.lang.AutoCloseable, InputStreamStatistics
- Direct Known Subclasses:
JarArchiveInputStream
public class ZipArchiveInputStream extends ArchiveInputStream implements InputStreamStatistics
Implements an input stream that can read Zip archives.
As of Apache Commons Compress it transparently supports Zip64 extensions and thus individual entries and archives larger than 4 GB or with more than 65536 entries.
The ZipFile class is preferred when reading from files, as ZipArchiveInputStream is limited by not being able to read the central directory header before returning entries. In particular, ZipArchiveInputStream
- may return entries that are not part of the central directory at all and shouldn't be considered part of the archive.
- may return several entries with the same name.
- will not return internal or external attributes.
- may return incomplete extra field data.
- may return unknown sizes and CRC values for entries until the next entry has been reached if the archive uses the data descriptor feature.
- See Also:
ZipFile
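The last limitation in the list above applies to any forward-only ZIP reader. The following is a minimal sketch that uses the JDK's own java.util.zip.ZipInputStream as a stand-in (an assumption made so the example runs without Commons Compress on the classpath; it is not this class). The class and method names are hypothetical. It writes one entry with a data descriptor and shows that the size is unknown until the entry has been drained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration class; uses the JDK reader, not ZipArchiveInputStream.
final class DataDescriptorSketch {

    /** Builds a one-entry archive in memory; sizes are not set up front, so a data descriptor is written. */
    static byte[] buildArchive(byte[] payload) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bos)) {
            zos.putNextEntry(new ZipEntry("hello.txt"));
            zos.write(payload);
            zos.closeEntry();
        }
        return bos.toByteArray();
    }

    /** Returns {size reported right after the header, size reported after reading the entry to its end}. */
    static long[] sizesBeforeAndAfter(byte[] archive) throws IOException {
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry = zin.getNextEntry();
            long before = entry.getSize();
            byte[] scratch = new byte[256];
            while (zin.read(scratch) != -1) {
                // discard the data; only the metadata matters here
            }
            long after = entry.getSize();
            return new long[] {before, after};
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] archive = buildArchive("hello world".getBytes(StandardCharsets.UTF_8));
        long[] sizes = sizesBeforeAndAfter(archive);
        System.out.println("before=" + sizes[0] + " after=" + sizes[1]);
    }
}
```

A reader that has access to the central directory, such as ZipFile, does not need to wait for the data descriptor, which is one reason it is preferred for files.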
-
-
Nested Class Summary
Nested Classes
- private class ZipArchiveInputStream.BoundedInputStream
Bounded input stream adapted from commons-io.
- private static class ZipArchiveInputStream.CurrentEntry
Structure collecting information for the entry that is currently being read.
-
Field Summary
Fields
- private boolean allowStoredEntriesWithDataDescriptor
Whether the stream will try to read STORED entries that use a data descriptor.
- private static byte[] APK_SIGNING_BLOCK_MAGIC
- private java.nio.ByteBuffer buf
Buffer used to read from the wrapped stream.
- private static byte[] CFH
- private static int CFH_LEN
- private boolean closed
Whether the stream has been closed.
- private ZipArchiveInputStream.CurrentEntry current
The entry that is currently being read.
- private static byte[] DD
- (package private) java.lang.String encoding
- private int entriesRead
- private boolean hitCentralDirectory
Whether the stream has reached the central directory - and thus found all entries.
- private java.io.InputStream in
Wrapped stream, will always be a PushbackInputStream.
- private java.util.zip.Inflater inf
Inflater used for all deflated entries.
- private java.io.ByteArrayInputStream lastStoredEntry
When reading a stored entry that uses the data descriptor this stream has to read the full entry and caches it.
- private static byte[] LFH
- private static int LFH_LEN
- private byte[] lfhBuf
- private static java.math.BigInteger LONG_MAX
- private byte[] shortBuf
- private byte[] skipBuf
- private static long TWO_EXP_32
- private byte[] twoDwordBuf
- private long uncompressedCount
Count decompressed bytes for current entry.
- private static java.lang.String USE_ZIPFILE_INSTEAD_OF_STREAM_DISCLAIMER
- private boolean useUnicodeExtraFields
Whether to look for and use Unicode extra fields.
- private byte[] wordBuf
- private ZipEncoding zipEncoding
The zip encoding to use for file names and the file comment.
-
Constructor Summary
Constructors
- ZipArchiveInputStream(java.io.InputStream inputStream)
Create an instance using UTF-8 encoding.
- ZipArchiveInputStream(java.io.InputStream inputStream, java.lang.String encoding)
Create an instance using the specified encoding.
- ZipArchiveInputStream(java.io.InputStream inputStream, java.lang.String encoding, boolean useUnicodeExtraFields)
Create an instance using the specified encoding.
- ZipArchiveInputStream(java.io.InputStream inputStream, java.lang.String encoding, boolean useUnicodeExtraFields, boolean allowStoredEntriesWithDataDescriptor)
Create an instance using the specified encoding.
-
Method Summary
Methods
- private boolean bufferContainsSignature(java.io.ByteArrayOutputStream bos, int offset, int lastRead, int expectedDDLen)
Checks whether the current buffer contains the signature of a "data descriptor", "local file header" or "central directory entry".
- private int cacheBytesRead(java.io.ByteArrayOutputStream bos, int offset, int lastRead, int expecteDDLen)
If the last read bytes could hold a data descriptor and an incomplete signature then save the last bytes to the front of the buffer and cache everything in front of the potential data descriptor into the given ByteArrayOutputStream.
- boolean canReadEntryData(ArchiveEntry ae)
Whether this class is able to read the given entry.
- private static boolean checksig(byte[] signature, byte[] expected)
- void close()
- private void closeEntry()
Closes the current ZIP archive entry and positions the underlying stream to the beginning of the next entry.
- private boolean currentEntryHasOutstandingBytes()
If the compressed size of the current entry is included in the entry header and there are any outstanding bytes in the underlying stream, then this returns true.
- private void drainCurrentEntryData()
Reads all data of the current entry from the underlying stream that has not been read yet.
- private int fill()
- private void findEocdRecord()
Reads forward until the signature of the "End of central directory" record is found.
- private long getBytesInflated()
Get the number of bytes Inflater has actually processed.
- long getCompressedCount()
- ArchiveEntry getNextEntry()
Returns the next Archive Entry in this Stream.
- ZipArchiveEntry getNextZipEntry()
- long getUncompressedCount()
- private boolean isApkSigningBlock(byte[] suspectLocalFileHeader)
Checks whether this might be an APK Signing Block.
- private boolean isFirstByteOfEocdSig(int b)
- static boolean matches(byte[] signature, int length)
Checks if the signature matches what is expected for a zip file.
- private void processZip64Extra(ZipLong size, ZipLong cSize)
Records whether a Zip64 extra is present and sets the size information from it if sizes are 0xFFFFFFFF and the entry doesn't use a data descriptor.
- private void pushback(byte[] buf, int offset, int length)
- int read(byte[] buffer, int offset, int length)
- private void readDataDescriptor()
- private int readDeflated(byte[] buffer, int offset, int length)
Implementation of read for DEFLATED entries.
- private void readFirstLocalFileHeader(byte[] lfh)
Fills the given array with the first local file header and deals with splitting/spanning markers that may prefix the first LFH.
- private int readFromInflater(byte[] buffer, int offset, int length)
Potentially reads more bytes to fill the inflater's buffer and reads from it.
- private void readFully(byte[] b)
- private void readFully(byte[] b, int off)
- private int readOneByte()
Reads bytes by reading from the underlying stream rather than the (potentially inflating) archive stream, which read(byte[], int, int) would do.
- private int readStored(byte[] buffer, int offset, int length)
Implementation of read for STORED entries.
- private void readStoredEntry()
Caches a stored entry that uses the data descriptor.
- private void realSkip(long value)
Skips bytes by reading from the underlying stream rather than the (potentially inflating) archive stream, which skip(long) would do.
- long skip(long value)
Skips over and discards value bytes of data from this input stream.
- private void skipRemainderOfArchive()
Reads the stream until it finds the "End of central directory record" and consumes it as well.
- private boolean supportsCompressedSizeFor(ZipArchiveEntry entry)
Whether the compressed size for the entry is either known or not required by the compression method being used.
- private boolean supportsDataDescriptorFor(ZipArchiveEntry entry)
Whether this entry requires a data descriptor this library can work with.
-
Methods inherited from class org.apache.commons.compress.archivers.ArchiveInputStream
count, count, getBytesRead, getCount, pushedBackBytes, read
-
-
-
-
Field Detail
-
zipEncoding
private final ZipEncoding zipEncoding
The zip encoding to use for file names and the file comment.
-
encoding
final java.lang.String encoding
-
useUnicodeExtraFields
private final boolean useUnicodeExtraFields
Whether to look for and use Unicode extra fields.
-
in
private final java.io.InputStream in
Wrapped stream, will always be a PushbackInputStream.
-
inf
private final java.util.zip.Inflater inf
Inflater used for all deflated entries.
-
buf
private final java.nio.ByteBuffer buf
Buffer used to read from the wrapped stream.
-
current
private ZipArchiveInputStream.CurrentEntry current
The entry that is currently being read.
-
closed
private boolean closed
Whether the stream has been closed.
-
hitCentralDirectory
private boolean hitCentralDirectory
Whether the stream has reached the central directory - and thus found all entries.
-
lastStoredEntry
private java.io.ByteArrayInputStream lastStoredEntry
When reading a stored entry that uses the data descriptor this stream has to read the full entry and caches it. This is the cache.
-
allowStoredEntriesWithDataDescriptor
private boolean allowStoredEntriesWithDataDescriptor
Whether the stream will try to read STORED entries that use a data descriptor.
-
uncompressedCount
private long uncompressedCount
Count decompressed bytes for current entry
-
LFH_LEN
private static final int LFH_LEN
- See Also:
- Constant Field Values
-
CFH_LEN
private static final int CFH_LEN
- See Also:
- Constant Field Values
-
TWO_EXP_32
private static final long TWO_EXP_32
- See Also:
- Constant Field Values
-
lfhBuf
private final byte[] lfhBuf
-
skipBuf
private final byte[] skipBuf
-
shortBuf
private final byte[] shortBuf
-
wordBuf
private final byte[] wordBuf
-
twoDwordBuf
private final byte[] twoDwordBuf
-
entriesRead
private int entriesRead
-
USE_ZIPFILE_INSTEAD_OF_STREAM_DISCLAIMER
private static final java.lang.String USE_ZIPFILE_INSTEAD_OF_STREAM_DISCLAIMER
- See Also:
- Constant Field Values
-
LFH
private static final byte[] LFH
-
CFH
private static final byte[] CFH
-
DD
private static final byte[] DD
-
APK_SIGNING_BLOCK_MAGIC
private static final byte[] APK_SIGNING_BLOCK_MAGIC
-
LONG_MAX
private static final java.math.BigInteger LONG_MAX
-
-
Constructor Detail
-
ZipArchiveInputStream
public ZipArchiveInputStream(java.io.InputStream inputStream)
Create an instance using UTF-8 encoding.
- Parameters:
inputStream - the stream to wrap
-
ZipArchiveInputStream
public ZipArchiveInputStream(java.io.InputStream inputStream, java.lang.String encoding)
Create an instance using the specified encoding.
- Parameters:
inputStream - the stream to wrap
encoding - the encoding to use for file names, use null for the platform's default encoding
- Since:
- 1.5
-
ZipArchiveInputStream
public ZipArchiveInputStream(java.io.InputStream inputStream, java.lang.String encoding, boolean useUnicodeExtraFields)
Create an instance using the specified encoding.
- Parameters:
inputStream - the stream to wrap
encoding - the encoding to use for file names, use null for the platform's default encoding
useUnicodeExtraFields - whether to use InfoZIP Unicode Extra Fields (if present) to set the file names.
-
ZipArchiveInputStream
public ZipArchiveInputStream(java.io.InputStream inputStream, java.lang.String encoding, boolean useUnicodeExtraFields, boolean allowStoredEntriesWithDataDescriptor)
Create an instance using the specified encoding.
- Parameters:
inputStream - the stream to wrap
encoding - the encoding to use for file names, use null for the platform's default encoding
useUnicodeExtraFields - whether to use InfoZIP Unicode Extra Fields (if present) to set the file names.
allowStoredEntriesWithDataDescriptor - whether the stream will try to read STORED entries that use a data descriptor
- Since:
- 1.1
-
-
Method Detail
-
getNextZipEntry
public ZipArchiveEntry getNextZipEntry() throws java.io.IOException
- Throws:
java.io.IOException
-
readFirstLocalFileHeader
private void readFirstLocalFileHeader(byte[] lfh) throws java.io.IOException
Fills the given array with the first local file header and deals with splitting/spanning markers that may prefix the first LFH.
- Throws:
java.io.IOException
-
processZip64Extra
private void processZip64Extra(ZipLong size, ZipLong cSize)
Records whether a Zip64 extra is present and sets the size information from it if sizes are 0xFFFFFFFF and the entry doesn't use a data descriptor.
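The rule described above can be pictured with a small sketch. This is a simplified illustration written from the one-sentence description, not the library's code; the class name, method name and the use of a nullable Long for "no Zip64 extra present" are all assumptions.

```java
// Hypothetical illustration of the 0xFFFFFFFF sentinel rule; not the library's implementation.
final class Zip64SizeSketch {

    /** In a 32-bit size field this value means "the real size is in the Zip64 extra field". */
    static final long ZIP64_MAGIC = 0xFFFFFFFFL;

    /**
     * Picks the size to use for an entry.
     *
     * @param headerSize         size read from the local file header (unsigned 32-bit value)
     * @param zip64Size          size from the Zip64 extra field, or null if no such field exists
     * @param usesDataDescriptor whether the entry defers its sizes to a data descriptor
     */
    static long effectiveSize(long headerSize, Long zip64Size, boolean usesDataDescriptor) {
        if (!usesDataDescriptor && zip64Size != null && headerSize == ZIP64_MAGIC) {
            return zip64Size;
        }
        // Either the header value is a real size, or the size is not known yet.
        return headerSize;
    }

    public static void main(String[] args) {
        System.out.println(effectiveSize(ZIP64_MAGIC, 5_000_000_000L, false));
        System.out.println(effectiveSize(1234L, null, false));
    }
}
```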
-
getNextEntry
public ArchiveEntry getNextEntry() throws java.io.IOException
Description copied from class: ArchiveInputStream
Returns the next Archive Entry in this Stream.
- Specified by:
getNextEntry in class ArchiveInputStream
- Returns:
- the next entry, or null if there are no more entries
- Throws:
java.io.IOException - if the next entry could not be read
-
canReadEntryData
public boolean canReadEntryData(ArchiveEntry ae)
Whether this class is able to read the given entry.
May return false if it is set up to use encryption or a compression method that hasn't been implemented yet.
- Overrides:
canReadEntryData in class ArchiveInputStream
- Parameters:
ae - the entry to test
- Returns:
- This implementation always returns true.
- Since:
- 1.1
-
read
public int read(byte[] buffer, int offset, int length) throws java.io.IOException
- Overrides:
read in class java.io.InputStream
- Throws:
java.io.IOException
-
getCompressedCount
public long getCompressedCount()
- Specified by:
getCompressedCount in interface InputStreamStatistics
- Returns:
- the amount of raw or compressed bytes read by the stream
- Since:
- 1.17
-
getUncompressedCount
public long getUncompressedCount()
- Specified by:
getUncompressedCount in interface InputStreamStatistics
- Returns:
- the amount of decompressed bytes returned by the stream
- Since:
- 1.17
-
readStored
private int readStored(byte[] buffer, int offset, int length) throws java.io.IOException
Implementation of read for STORED entries.
- Throws:
java.io.IOException
-
readDeflated
private int readDeflated(byte[] buffer, int offset, int length) throws java.io.IOException
Implementation of read for DEFLATED entries.
- Throws:
java.io.IOException
-
readFromInflater
private int readFromInflater(byte[] buffer, int offset, int length) throws java.io.IOException
Potentially reads more bytes to fill the inflater's buffer and reads from it.
- Throws:
java.io.IOException
-
close
public void close() throws java.io.IOException
- Specified by:
close in interface java.lang.AutoCloseable
- Specified by:
close in interface java.io.Closeable
- Overrides:
close in class java.io.InputStream
- Throws:
java.io.IOException
-
skip
public long skip(long value) throws java.io.IOException
Skips over and discards value bytes of data from this input stream.
This implementation may end up skipping over some smaller number of bytes, possibly 0, if and only if it reaches the end of the underlying stream.
The actual number of bytes skipped is returned.
- Overrides:
skip in class java.io.InputStream
- Parameters:
value - the number of bytes to be skipped.
- Returns:
- the actual number of bytes skipped.
- Throws:
java.io.IOException - if an I/O error occurs.
java.lang.IllegalArgumentException - if value is negative.
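The contract above (fewer bytes skipped only at end of stream, IllegalArgumentException for a negative argument) is what a skip built on top of read naturally provides. The following is a minimal sketch under that assumption; the class name, method name and buffer size are hypothetical and this is not the library's code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration of a read-based skip with the contract described above.
final class SkipSketch {

    /** Skips up to value bytes by reading and discarding them; stops early only at end of stream. */
    static long skipByReading(InputStream in, long value) throws IOException {
        if (value < 0) {
            throw new IllegalArgumentException("negative skip: " + value);
        }
        byte[] scratch = new byte[1024];
        long skipped = 0;
        while (skipped < value) {
            int want = (int) Math.min(scratch.length, value - skipped);
            int n = in.read(scratch, 0, want);
            if (n == -1) {
                break; // end of stream: report the smaller count
            }
            skipped += n;
        }
        return skipped;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[10]);
        System.out.println(skipByReading(in, 4));
        System.out.println(skipByReading(in, 100));
    }
}
```

Reading instead of seeking matters for compressed entries, because the number of uncompressed bytes to discard cannot be translated into an offset in the underlying stream.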
-
matches
public static boolean matches(byte[] signature, int length)
Checks if the signature matches what is expected for a zip file. Does not currently handle self-extracting zips which may have arbitrary leading content.
- Parameters:
signature - the bytes to check
length - the number of bytes to check
- Returns:
- true, if this stream is a zip archive stream, false otherwise
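A signature check of this kind compares the leading bytes against the well-known ZIP record signatures. The sketch below is an illustration only: it tests for the local file header signature (PK 03 04) and the end of central directory signature (PK 05 06, which begins an empty archive). The set of signatures the library actually accepts may be larger, and the class and method names are hypothetical.

```java
// Hypothetical illustration of a leading-signature check; not the library's matches() implementation.
final class ZipSignatureSketch {

    private static final byte[] LOCAL_FILE_HEADER = {0x50, 0x4B, 0x03, 0x04};
    private static final byte[] END_OF_CENTRAL_DIRECTORY = {0x50, 0x4B, 0x05, 0x06};

    /** Returns true if the first four of the given bytes look like the start of a ZIP archive. */
    static boolean looksLikeZip(byte[] signature, int length) {
        if (signature == null || length < 4 || signature.length < 4) {
            return false;
        }
        return startsWith(signature, LOCAL_FILE_HEADER)
                || startsWith(signature, END_OF_CENTRAL_DIRECTORY);
    }

    private static boolean startsWith(byte[] data, byte[] expected) {
        for (int i = 0; i < expected.length; i++) {
            if (data[i] != expected[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeZip(new byte[] {0x50, 0x4B, 0x03, 0x04}, 4));
    }
}
```

Because only the leading bytes are inspected, a self-extracting archive that begins with executable code is rejected by such a check, which matches the limitation noted above.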
-
checksig
private static boolean checksig(byte[] signature, byte[] expected)
-
closeEntry
private void closeEntry() throws java.io.IOException
Closes the current ZIP archive entry and positions the underlying stream to the beginning of the next entry. All per-entry variables and data structures are cleared.
If the compressed size of this entry is included in the entry header, then any outstanding bytes are simply skipped from the underlying stream without uncompressing them. This allows an entry to be safely closed even if the compression method is unsupported.
In case we don't know the compressed size of this entry or have already buffered too much data from the underlying stream to support uncompression, then the uncompression process is completed and the end position of the stream is adjusted based on the result of that process.
- Throws:
java.io.IOException - if an error occurs
-
currentEntryHasOutstandingBytes
private boolean currentEntryHasOutstandingBytes()
If the compressed size of the current entry is included in the entry header and there are any outstanding bytes in the underlying stream, then this returns true.
- Returns:
- true, if current entry is determined to have outstanding bytes, false otherwise
-
drainCurrentEntryData
private void drainCurrentEntryData() throws java.io.IOException
Reads all data of the current entry from the underlying stream that has not been read yet.
- Throws:
java.io.IOException
-
getBytesInflated
private long getBytesInflated()
Get the number of bytes Inflater has actually processed.
For Java versions before Java 7, the getBytes* methods in Inflater/Deflater seem to return unsigned ints, which start over with 0 at 2^32, rather than longs.
The stream knows how many bytes it has read, but not how many the Inflater actually consumed - it should be between the total number of bytes read for the entry and the total number minus the last read operation. Here we just try to make the value close enough to the bytes we've read by assuming the number of bytes consumed must be smaller than (or equal to) the number of bytes read but not smaller by more than 2^32.
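The correction described above amounts to adding multiples of 2^32 to the wrapped counter for as long as the result does not exceed the number of bytes read from the stream. A minimal sketch of that arithmetic follows; the class, method and parameter names are hypothetical, and it is meant as an illustration of the reasoning rather than a copy of the library's code.

```java
// Hypothetical illustration of undoing a 32-bit wrap-around in a byte counter.
final class InflaterCountSketch {

    static final long TWO_EXP_32 = 1L << 32;

    /**
     * @param wrappedCount        counter value that may have started over at 2^32
     * @param bytesReadFromStream bytes the stream knows it has read for the entry
     * @return the largest value congruent to wrappedCount modulo 2^32 that does not exceed
     *         bytesReadFromStream, or wrappedCount itself if no correction applies
     */
    static long correctWrappedCount(long wrappedCount, long bytesReadFromStream) {
        long corrected = wrappedCount;
        if (bytesReadFromStream >= TWO_EXP_32) {
            while (corrected + TWO_EXP_32 <= bytesReadFromStream) {
                corrected += TWO_EXP_32;
            }
        }
        return corrected;
    }

    public static void main(String[] args) {
        System.out.println(correctWrappedCount(100L, TWO_EXP_32 + 200L));
    }
}
```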
-
fill
private int fill() throws java.io.IOException
- Throws:
java.io.IOException
-
readFully
private void readFully(byte[] b) throws java.io.IOException
- Throws:
java.io.IOException
-
readFully
private void readFully(byte[] b, int off) throws java.io.IOException
- Throws:
java.io.IOException
-
readDataDescriptor
private void readDataDescriptor() throws java.io.IOException
- Throws:
java.io.IOException
-
supportsDataDescriptorFor
private boolean supportsDataDescriptorFor(ZipArchiveEntry entry)
Whether this entry requires a data descriptor this library can work with.
- Returns:
- true if allowStoredEntriesWithDataDescriptor is true, the entry doesn't require any data descriptor or the method is DEFLATED or ENHANCED_DEFLATED.
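The return condition above is a plain boolean expression over the entry's general purpose flags and compression method. The sketch below spells it out with raw values from the ZIP format (general purpose bit 3 marks a data descriptor; method 8 is DEFLATED and method 9 is ENHANCED_DEFLATED). The class and method names are hypothetical, and the expression follows the wording of the documentation rather than the library source.

```java
// Hypothetical illustration of the documented return condition.
final class DataDescriptorSupportSketch {

    static final int DEFLATED = 8;
    static final int ENHANCED_DEFLATED = 9;
    /** General purpose bit 3: sizes and CRC follow the entry data in a data descriptor. */
    static final int DATA_DESCRIPTOR_FLAG = 1 << 3;

    static boolean supportsDataDescriptor(int generalPurposeFlags, int method,
                                          boolean allowStoredEntriesWithDataDescriptor) {
        boolean usesDataDescriptor = (generalPurposeFlags & DATA_DESCRIPTOR_FLAG) != 0;
        return allowStoredEntriesWithDataDescriptor
                || !usesDataDescriptor
                || method == DEFLATED
                || method == ENHANCED_DEFLATED;
    }

    public static void main(String[] args) {
        // A STORED entry (method 0) with a data descriptor and the option switched off.
        System.out.println(supportsDataDescriptor(DATA_DESCRIPTOR_FLAG, 0, false));
    }
}
```

Deflated data marks its own end, so the reader can locate the data descriptor without knowing the compressed size in advance. Stored data has no such marker, which is why those entries need the explicit opt-in.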
-
supportsCompressedSizeFor
private boolean supportsCompressedSizeFor(ZipArchiveEntry entry)
Whether the compressed size for the entry is either known or not required by the compression method being used.
-
readStoredEntry
private void readStoredEntry() throws java.io.IOException
Caches a stored entry that uses the data descriptor.
- Reads a stored entry until the signature of a local file header, central directory header or data descriptor has been found.
- Stores all entry data in lastStoredEntry.
- Rewinds the stream to position at the data descriptor.
- Reads the data descriptor.
After calling this method the entry should know its size, the entry's data is cached and the stream is positioned at the next local file or central directory header.
- Throws:
java.io.IOException
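The first step listed for readStoredEntry, searching the raw bytes for one of three signatures, can be sketched as a linear scan. This is an illustration under stated assumptions: the class and method names are hypothetical, the scan works on a complete in-memory array, and it ignores the case of a signature split across two buffer fills, which the real method has to handle.

```java
// Hypothetical illustration of scanning for a ZIP record signature in raw bytes.
final class SignatureScanSketch {

    /**
     * Returns the index of the first local file header (PK 03 04), central directory
     * header (PK 01 02) or data descriptor (PK 07 08) signature within the first
     * length bytes of data, or -1 if none is found.
     */
    static int indexOfSignature(byte[] data, int length) {
        for (int i = 0; i + 4 <= length; i++) {
            if (data[i] != 0x50 || data[i + 1] != 0x4B) {
                continue;
            }
            byte b2 = data[i + 2];
            byte b3 = data[i + 3];
            boolean localFileHeader = b2 == 0x03 && b3 == 0x04;
            boolean centralDirectoryHeader = b2 == 0x01 && b3 == 0x02;
            boolean dataDescriptor = b2 == 0x07 && b3 == 0x08;
            if (localFileHeader || centralDirectoryHeader || dataDescriptor) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3, 0x50, 0x4B, 0x07, 0x08, 0, 0, 0, 0};
        System.out.println(indexOfSignature(data, data.length));
    }
}
```

A scan like this cannot tell a real signature from entry data that happens to contain the same four bytes, for example a ZIP stored inside a ZIP, so the result of this approach is a best guess.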
-
bufferContainsSignature
private boolean bufferContainsSignature(java.io.ByteArrayOutputStream bos, int offset, int lastRead, int expectedDDLen) throws java.io.IOException
Checks whether the current buffer contains the signature of a "data descriptor", "local file header" or "central directory entry".
If it contains such a signature, reads the data descriptor and positions the stream right after the data descriptor.
- Throws:
java.io.IOException
-
cacheBytesRead
private int cacheBytesRead(java.io.ByteArrayOutputStream bos, int offset, int lastRead, int expecteDDLen)
If the last read bytes could hold a data descriptor and an incomplete signature then save the last bytes to the front of the buffer and cache everything in front of the potential data descriptor into the given ByteArrayOutputStream.
Data descriptor plus incomplete signature (3 bytes in the worst case) can be 20 bytes max.
-
pushback
private void pushback(byte[] buf, int offset, int length) throws java.io.IOException
- Throws:
java.io.IOException
-
skipRemainderOfArchive
private void skipRemainderOfArchive() throws java.io.IOException
Reads the stream until it finds the "End of central directory record" and consumes it as well.
- Throws:
java.io.IOException
-
findEocdRecord
private void findEocdRecord() throws java.io.IOException
Reads forward until the signature of the "End of central directory" record is found.
- Throws:
java.io.IOException
-
realSkip
private void realSkip(long value) throws java.io.IOException
Skips bytes by reading from the underlying stream rather than the (potentially inflating) archive stream, which skip(long) would do. Also updates bytes-read counter.
- Throws:
java.io.IOException
-
readOneByte
private int readOneByte() throws java.io.IOException
Reads bytes by reading from the underlying stream rather than the (potentially inflating) archive stream, which read(byte[], int, int) would do. Also updates bytes-read counter.
- Throws:
java.io.IOException
-
isFirstByteOfEocdSig
private boolean isFirstByteOfEocdSig(int b)
-
isApkSigningBlock
private boolean isApkSigningBlock(byte[] suspectLocalFileHeader) throws java.io.IOException
Checks whether this might be an APK Signing Block.
Unfortunately the APK signing block does not start with some kind of signature, it rather ends with one. It starts with a length, so what we do is parse the suspect length, skip ahead far enough, look for the signature and if we've found it, return true.
- Parameters:
suspectLocalFileHeader - the bytes read from the underlying stream in the expectation that they would hold the local file header of the next entry.
- Returns:
- true if this looks like an APK signing block
- Throws:
java.io.IOException
- See Also:
- https://source.android.com/security/apksigning/v2
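The "parse the length, skip ahead, look for the trailing signature" idea can be sketched against a byte array. The layout assumed here is the one published for APK Signature Scheme v2: a little-endian 64-bit block size that does not count the size field itself, the payload, the same size repeated, and the 16-byte magic "APK Sig Block 42" at the very end. The class and method names are hypothetical, and unlike the private method above this sketch works on data that is already fully in memory.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical in-memory illustration; the real method works on a stream.
final class ApkSigningBlockSketch {

    static final byte[] MAGIC = "APK Sig Block 42".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the bytes starting at offset form a block that ends with the magic. */
    static boolean looksLikeApkSigningBlock(byte[] data, int offset) {
        if (offset < 0 || data.length - offset < 8) {
            return false;
        }
        long declared = ByteBuffer.wrap(data, offset, 8).order(ByteOrder.LITTLE_ENDIAN).getLong();
        // The declared size must cover at least the trailing size field and the magic,
        // and must not point past the end of the available data.
        if (declared < 8 + MAGIC.length || declared > data.length - offset - 8) {
            return false;
        }
        int end = offset + 8 + (int) declared;
        byte[] tail = Arrays.copyOfRange(data, end - MAGIC.length, end);
        return Arrays.equals(tail, MAGIC);
    }

    /** Builds a syntactically valid block with a zero-filled payload, for demonstration only. */
    static byte[] buildBlock(int payloadLength) {
        long size = payloadLength + 8L + MAGIC.length;
        ByteBuffer buf = ByteBuffer.allocate((int) (8 + size)).order(ByteOrder.LITTLE_ENDIAN);
        buf.putLong(size);
        buf.position(buf.position() + payloadLength);
        buf.putLong(size);
        buf.put(MAGIC);
        return buf.array();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeApkSigningBlock(buildBlock(10), 0));
    }
}
```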
-
-