Class BZip2CompressorOutputStream
- java.lang.Object
-
- java.io.OutputStream
-
- org.apache.commons.compress.compressors.CompressorOutputStream
-
- org.apache.commons.compress.compressors.bzip2.BZip2CompressorOutputStream
-
- All Implemented Interfaces:
java.io.Closeable, java.io.Flushable, java.lang.AutoCloseable, BZip2Constants
public class BZip2CompressorOutputStream extends CompressorOutputStream implements BZip2Constants
An output stream that compresses data into the BZip2 format and writes it to another stream.
Compression requires large amounts of memory, so call the close() method as soon as possible to force BZip2CompressorOutputStream to release the memory it has allocated.
You can reduce the amount of allocated memory, and possibly increase compression speed, by choosing a smaller blocksize, which in turn may lower the compression ratio. You can avoid unnecessary memory allocation by not using a blocksize that is larger than the size of the input.
You can compute the memory usage for compression with the following formula: 400k + (9 * blocksize).
To get the memory required for decompression by BZip2CompressorInputStream, use 65k + (5 * blocksize).
Memory usage by blocksize
Blocksize   Compression memory usage   Decompression memory usage
100k        1300k                      565k
200k        2200k                      1065k
300k        3100k                      1565k
400k        4000k                      2065k
500k        4900k                      2565k
600k        5800k                      3065k
700k        6700k                      3565k
800k        7600k                      4065k
900k        8500k                      4565k
For decompression, BZip2CompressorInputStream allocates less memory if the bzipped input is smaller than one block.
Instances of this class are not thread-safe.
TODO: Update to BZip2 1.0.1
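The two memory formulas in the class description can be checked with a small calculation. The sketch below is only an illustration: the class Bzip2MemoryEstimate and its methods are hypothetical names, not part of Commons Compress, and the results are estimates in k taken directly from the formulas above.

```java
// Hypothetical helper for illustration only; not part of Commons Compress.
final class Bzip2MemoryEstimate {
    private Bzip2MemoryEstimate() {
    }

    /** Estimated compression memory in k: 400k + (9 * blocksize). */
    static int compressionK(int blockSize100k) {
        return 400 + 9 * blockSizeK(blockSize100k);
    }

    /** Estimated decompression memory in k: 65k + (5 * blocksize). */
    static int decompressionK(int blockSize100k) {
        return 65 + 5 * blockSizeK(blockSize100k);
    }

    /** Converts a blocksize given in 100k units (1..9) to k, rejecting other values. */
    private static int blockSizeK(int blockSize100k) {
        if (blockSize100k < 1 || blockSize100k > 9) {
            throw new IllegalArgumentException("blockSize must be in 1..9: " + blockSize100k);
        }
        return blockSize100k * 100;
    }
}
```

For example, Bzip2MemoryEstimate.compressionK(9) reproduces the 8500k entry of the table, and Bzip2MemoryEstimate.decompressionK(1) reproduces the 565k entry.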
-
-
Nested Class Summary
Nested Classes
Modifier and Type    Class    Description
(package private) static class
BZip2CompressorOutputStream.Data
-
Field Summary
Fields
Modifier and Type    Field    Description
private int
allowableBlockSize
private int
blockCRC
private int
blockSize100k
Always: in the range 0 .. 9.
private BlockSort
blockSorter
private int
bsBuff
private int
bsLive
private boolean
closed
private int
combinedCRC
private CRC
crc
private int
currentChar
private BZip2CompressorOutputStream.Data
data
All memory intensive stuff.
private static int
GREATER_ICOST
private int
last
Index of the last char in the block, so the block size == last + 1.
private static int
LESSER_ICOST
static int
MAX_BLOCKSIZE
The maximum supported blocksize == 9.
static int
MIN_BLOCKSIZE
The minimum supported blocksize == 1.
private int
nInUse
private int
nMTF
private java.io.OutputStream
out
private int
runLength
-
Fields inherited from interface org.apache.commons.compress.compressors.bzip2.BZip2Constants
BASEBLOCKSIZE, G_SIZE, MAX_ALPHA_SIZE, MAX_CODE_LEN, MAX_SELECTORS, N_GROUPS, N_ITERS, NUM_OVERSHOOT_BYTES, RUNA, RUNB
-
-
Constructor Summary
Constructors
Constructor    Description
BZip2CompressorOutputStream(java.io.OutputStream out)
Constructs a new BZip2CompressorOutputStream with a blocksize of 900k.
BZip2CompressorOutputStream(java.io.OutputStream out, int blockSize)
Constructs a new BZip2CompressorOutputStream with specified blocksize.
-
Method Summary
Modifier and Type    Method    Description
private void
blockSort()
private void
bsFinishedWithStream()
private void
bsPutInt(int u)
private void
bsPutUByte(int c)
private void
bsW(int n, int v)
static int
chooseBlockSize(long inputLength)
Chooses a blocksize based on the given length of the data to compress.
void
close()
private void
endBlock()
private void
endCompression()
protected void
finalize()
Overridden to warn about an unclosed stream.
void
finish()
void
flush()
private void
generateMTFValues()
int
getBlockSize()
Returns the blocksize parameter specified at construction time.
private static void
hbAssignCodes(int[] code, byte[] length, int minLen, int maxLen, int alphaSize)
private static void
hbMakeCodeLengths(byte[] len, int[] freq, BZip2CompressorOutputStream.Data dat, int alphaSize, int maxLen)
private void
init()
Writes magic bytes like BZ at the first position of the stream and bytes indicating the file format, which is huffmanised, followed by a digit indicating blockSize100k.
private void
initBlock()
private void
moveToFrontCodeAndSend()
private void
sendMTFValues()
private void
sendMTFValues0(int nGroups, int alphaSize)
private int
sendMTFValues1(int nGroups, int alphaSize)
private void
sendMTFValues2(int nGroups, int nSelectors)
private void
sendMTFValues3(int nGroups, int alphaSize)
private void
sendMTFValues4()
private void
sendMTFValues5(int nGroups, int nSelectors)
private void
sendMTFValues6(int nGroups, int alphaSize)
private void
sendMTFValues7()
void
write(byte[] buf, int offs, int len)
void
write(int b)
private void
write0(int b)
Keeps track of the last bytes written and implicitly performs run-length encoding as the first step of the bzip2 algorithm.
private void
writeRun()
Writes the current byte to the buffer, run-length encoding it if it has been repeated at least four times (the first step RLEs sequences of four identical bytes).
-
-
-
Field Detail
-
MIN_BLOCKSIZE
public static final int MIN_BLOCKSIZE
The minimum supported blocksize == 1.
- See Also:
- Constant Field Values
-
MAX_BLOCKSIZE
public static final int MAX_BLOCKSIZE
The maximum supported blocksize == 9.
- See Also:
- Constant Field Values
-
GREATER_ICOST
private static final int GREATER_ICOST
- See Also:
- Constant Field Values
-
LESSER_ICOST
private static final int LESSER_ICOST
- See Also:
- Constant Field Values
-
last
private int last
Index of the last char in the block, so the block size == last + 1.
-
blockSize100k
private final int blockSize100k
Always: in the range 0 .. 9. The current block size is 100000 * this number.
-
bsBuff
private int bsBuff
-
bsLive
private int bsLive
-
crc
private final CRC crc
-
nInUse
private int nInUse
-
nMTF
private int nMTF
-
currentChar
private int currentChar
-
runLength
private int runLength
-
blockCRC
private int blockCRC
-
combinedCRC
private int combinedCRC
-
allowableBlockSize
private final int allowableBlockSize
-
data
private BZip2CompressorOutputStream.Data data
All memory intensive stuff.
-
blockSorter
private BlockSort blockSorter
-
out
private java.io.OutputStream out
-
closed
private volatile boolean closed
-
-
Constructor Detail
-
BZip2CompressorOutputStream
public BZip2CompressorOutputStream(java.io.OutputStream out) throws java.io.IOException
Constructs a new BZip2CompressorOutputStream with a blocksize of 900k.
- Parameters:
out - the destination stream.
- Throws:
java.io.IOException - if an I/O error occurs in the specified stream.
java.lang.NullPointerException - if out == null.
-
BZip2CompressorOutputStream
public BZip2CompressorOutputStream(java.io.OutputStream out, int blockSize) throws java.io.IOException
Constructs a new BZip2CompressorOutputStream with specified blocksize.
- Parameters:
out - the destination stream.
blockSize - the block size in units of 100k.
- Throws:
java.io.IOException - if an I/O error occurs in the specified stream.
java.lang.IllegalArgumentException - if (blockSize < 1) || (blockSize > 9).
java.lang.NullPointerException - if out == null.
- See Also:
MIN_BLOCKSIZE, MAX_BLOCKSIZE
-
-
Method Detail
-
hbMakeCodeLengths
private static void hbMakeCodeLengths(byte[] len, int[] freq, BZip2CompressorOutputStream.Data dat, int alphaSize, int maxLen)
-
chooseBlockSize
public static int chooseBlockSize(long inputLength)
Chooses a blocksize based on the given length of the data to compress.
- Parameters:
inputLength - The length of the data which will be compressed by BZip2CompressorOutputStream.
- Returns:
- The blocksize, between MIN_BLOCKSIZE and MAX_BLOCKSIZE, both inclusive. For a negative inputLength this method always returns MAX_BLOCKSIZE.
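The contract above (a result between MIN_BLOCKSIZE and MAX_BLOCKSIZE, and MAX_BLOCKSIZE for a negative length) can be illustrated with a simple selection rule. This page does not state the thresholds that chooseBlockSize actually uses, so the sketch below assumes one plausible rule: pick the smallest blocksize, in units of 100000 bytes, that holds the whole input. The class BlockSizeChooserSketch, its choose method, and the handling of a zero length are assumptions for illustration, not the library's implementation.

```java
// Illustrative sketch only; the real chooseBlockSize may use different thresholds.
final class BlockSizeChooserSketch {
    static final int MIN_BLOCKSIZE = 1;
    static final int MAX_BLOCKSIZE = 9;
    // Assumption: one blocksize unit corresponds to 100000 bytes of input.
    static final long BYTES_PER_UNIT = 100_000L;

    private BlockSizeChooserSketch() {
    }

    static int choose(long inputLength) {
        if (inputLength < 0) {
            // Documented behaviour: a negative length always yields the maximum.
            return MAX_BLOCKSIZE;
        }
        // Ceiling division written so that it cannot overflow for large lengths.
        long units = inputLength / BYTES_PER_UNIT
                + (inputLength % BYTES_PER_UNIT == 0 ? 0 : 1);
        // Clamp into the supported range; a zero length maps to the minimum (assumption).
        return (int) Math.max(MIN_BLOCKSIZE, Math.min(MAX_BLOCKSIZE, units));
    }
}
```

Under this assumed rule, a 250000-byte input selects a blocksize of 3, which avoids allocating memory for a block larger than the input.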
-
write
public void write(int b) throws java.io.IOException
- Specified by:
write in class java.io.OutputStream
- Throws:
java.io.IOException
-
writeRun
private void writeRun() throws java.io.IOException
Writes the current byte to the buffer, run-length encoding it if it has been repeated at least four times (the first step RLEs sequences of four identical bytes).
Flushes the current block before writing data if it is full.
"write to the buffer" means adding to data.buffer starting two steps "after" this.last - initially starting at index 1 (not 0) - and updating this.last to point to the last index written minus 1.
- Throws:
java.io.IOException
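The initial run-length encoding step described for writeRun can be sketched on a plain byte array. The example assumes the scheme commonly described for the bzip2 format: runs of up to three identical bytes are copied as they are, and a run of four or more is written as four copies followed by one count byte holding the run length minus four, with runs capped at 255 bytes. The class InitialRleSketch is a hypothetical stand-alone illustration; it does not reproduce the block buffer or the last index bookkeeping of the real method.

```java
import java.io.ByteArrayOutputStream;

// Stand-alone illustration of bzip2's first RLE step; not the library's code.
final class InitialRleSketch {
    // Assumption: a single encoded run covers at most 255 input bytes.
    static final int MAX_RUN = 255;

    private InitialRleSketch() {
    }

    static byte[] encode(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < input.length) {
            byte current = input[i];
            // Measure the run of identical bytes starting at i, up to MAX_RUN.
            int run = 1;
            while (i + run < input.length && input[i + run] == current && run < MAX_RUN) {
                run++;
            }
            // Emit at most four literal copies of the byte.
            int literals = Math.min(run, 4);
            for (int k = 0; k < literals; k++) {
                out.write(current);
            }
            // Runs of four or more carry a count byte: run length minus four.
            if (run >= 4) {
                out.write(run - 4);
            }
            i += run;
        }
        return out.toByteArray();
    }
}
```

Note that a run of exactly four bytes still gets a count byte of 0, so short runs of four can make the output one byte longer than the input.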
-
finalize
protected void finalize() throws java.lang.Throwable
Overridden to warn about an unclosed stream.
- Overrides:
finalize in class java.lang.Object
- Throws:
java.lang.Throwable
-
finish
public void finish() throws java.io.IOException
- Throws:
java.io.IOException
-
close
public void close() throws java.io.IOException
- Specified by:
close in interface java.lang.AutoCloseable
- Specified by:
close in interface java.io.Closeable
- Overrides:
close in class java.io.OutputStream
- Throws:
java.io.IOException
-
flush
public void flush() throws java.io.IOException
- Specified by:
flush in interface java.io.Flushable
- Overrides:
flush in class java.io.OutputStream
- Throws:
java.io.IOException
-
init
private void init() throws java.io.IOException
Writes magic bytes like BZ at the first position of the stream and bytes indicating the file format, which is huffmanised, followed by a digit indicating blockSize100k.
- Throws:
java.io.IOException - if the magic bytes could not be written
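The stream header described for init can be sketched as a four-byte array. The example assumes the standard bzip2 layout: the ASCII letters B and Z, the letter h for the huffmanised format, and an ASCII digit from 1 to 9 for blockSize100k. The class Bzip2HeaderSketch is a hypothetical name, and the sketch builds the bytes in memory rather than writing them through the bit buffer as the real method presumably does.

```java
// Illustration of the bzip2 stream header; not the library's init() method.
final class Bzip2HeaderSketch {
    private Bzip2HeaderSketch() {
    }

    /** Returns the four header bytes for a blocksize given in 100k units (1..9). */
    static byte[] header(int blockSize100k) {
        if (blockSize100k < 1 || blockSize100k > 9) {
            throw new IllegalArgumentException("blockSize must be in 1..9: " + blockSize100k);
        }
        return new byte[] {
            'B', 'Z',                      // magic bytes
            'h',                           // file format: huffmanised
            (byte) ('0' + blockSize100k)   // blocksize as an ASCII digit
        };
    }
}
```

With the default blocksize of 900k, a stream would therefore be expected to start with the ASCII text BZh9.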
-
initBlock
private void initBlock()
-
endBlock
private void endBlock() throws java.io.IOException
- Throws:
java.io.IOException
-
endCompression
private void endCompression() throws java.io.IOException
- Throws:
java.io.IOException
-
getBlockSize
public final int getBlockSize()
Returns the blocksize parameter specified at construction time.
- Returns:
- the blocksize parameter specified at construction time
-
write
public void write(byte[] buf, int offs, int len) throws java.io.IOException
- Overrides:
write in class java.io.OutputStream
- Throws:
java.io.IOException
-
write0
private void write0(int b) throws java.io.IOException
Keeps track of the last bytes written and implicitly performs run-length encoding as the first step of the bzip2 algorithm.
- Throws:
java.io.IOException
-
hbAssignCodes
private static void hbAssignCodes(int[] code, byte[] length, int minLen, int maxLen, int alphaSize)
-
bsFinishedWithStream
private void bsFinishedWithStream() throws java.io.IOException
- Throws:
java.io.IOException
-
bsW
private void bsW(int n, int v) throws java.io.IOException
- Throws:
java.io.IOException
-
bsPutUByte
private void bsPutUByte(int c) throws java.io.IOException
- Throws:
java.io.IOException
-
bsPutInt
private void bsPutInt(int u) throws java.io.IOException
- Throws:
java.io.IOException
-
sendMTFValues
private void sendMTFValues() throws java.io.IOException
- Throws:
java.io.IOException
-
sendMTFValues0
private void sendMTFValues0(int nGroups, int alphaSize)
-
sendMTFValues1
private int sendMTFValues1(int nGroups, int alphaSize)
-
sendMTFValues2
private void sendMTFValues2(int nGroups, int nSelectors)
-
sendMTFValues3
private void sendMTFValues3(int nGroups, int alphaSize)
-
sendMTFValues4
private void sendMTFValues4() throws java.io.IOException
- Throws:
java.io.IOException
-
sendMTFValues5
private void sendMTFValues5(int nGroups, int nSelectors) throws java.io.IOException
- Throws:
java.io.IOException
-
sendMTFValues6
private void sendMTFValues6(int nGroups, int alphaSize) throws java.io.IOException
- Throws:
java.io.IOException
-
sendMTFValues7
private void sendMTFValues7() throws java.io.IOException
- Throws:
java.io.IOException
-
moveToFrontCodeAndSend
private void moveToFrontCodeAndSend() throws java.io.IOException
- Throws:
java.io.IOException
-
blockSort
private void blockSort()
-
generateMTFValues
private void generateMTFValues()
-
-