TODO - xz - Git at Google


 XZ Utils To-Do List
 ===================

 Known bugs
 ----------

     The test suite is too incomplete.

     If the memory usage limit is less than about 13 MiB, xz is unable to
     automatically scale down the compression settings enough even though
     it would be  possible by switching from BT2/BT3/BT4 match finder to
     HC3/HC4.

     The code to detect number of CPU cores doesn't count hyperthreading
     as multiple cores. In context of xz, it probably should.
     Hyperthreading is good at least with p7zip.

     XZ Utils compress some files significantly worse than LZMA Utils.
     This is due to faster compression presets used by XZ Utils, and
     can often be worked around by using "xz --extreme". With some files
     --extreme isn't enough though: it's most likely with files that
     compress extremely well, so going from compression ratio of 0.003
     to 0.004 means big relative increase in the compressed file size.

     xz doesn't quote unprintable characters when it displays file names
     given on the command line.

     tuklib_exit() doesn't block signals => EINTR is possible.

     SIGTSTP is not handled. If xz is stopped, the estimated remaining
     time and calculated (de)compression speed won't make sense in the
     progress indicator (xz --verbose).

     If liblzma has created threads and fork() gets called, liblzma
     code will break in the child process unless it calls exec() and
     doesn't touch liblzma.


 Missing features
 ----------------

     Support LZMA_FINISH in raw decoder to indicate end of LZMA1 and
     other streams that don't have an end of payload marker.

     Adjust dictionary size when the input file size is known.
     Maybe do this only if an option is given.

     xz doesn't support copying extended attributes, access control
     lists etc. from source to target file.

     Multithreaded compression:
       - Reduce memory usage of the current method.
       - Implement threaded match finders.
       - Implement pigz-style threading in LZMA2.

     Multithreaded decompression

     Buffer-to-buffer coding could use less RAM (especially when
     decompressing LZMA1 or LZMA2).

     I/O library is not implemented (similar to gzopen() in zlib).
     It will be a separate library that supports uncompressed, .gz,
     .bz2, .lzma, and .xz files.

     Support changing lzma_options_lzma.mode with lzma_filters_update().

     Support LZMA_FULL_FLUSH for lzma_stream_decoder() to stop at
     Block and Stream boundaries.

     lzma_strerror() to convert lzma_ret to human readable form?
     This is tricky, because the same error codes are used with
     slightly different meanings, and this cannot be fixed anymore.


 Documentation
 -------------

     Some tutorial is needed for liblzma. I have planned to write some
     extremely well commented example programs, which would work as
     a tutorial. I suppose the Doxygen tags are quite OK as a quick
     reference once one is familiar with the liblzma API.

     Document the LZMA1 and LZMA2 algorithms.

	XZ Utils To-Do List
	===================

	Known bugs
	----------

	The test suite is too incomplete.

	If the memory usage limit is less than about 13 MiB, xz is unable to
	automatically scale down the compression settings enough even though
	it would be possible by switching from BT2/BT3/BT4 match finder to
	HC3/HC4.

	The code to detect number of CPU cores doesn't count hyperthreading
	as multiple cores. In context of xz, it probably should.
	Hyperthreading is good at least with p7zip.

	XZ Utils compress some files significantly worse than LZMA Utils.
	This is due to faster compression presets used by XZ Utils, and
	can often be worked around by using "xz --extreme". With some files
	--extreme isn't enough though: it's most likely with files that
	compress extremely well, so going from compression ratio of 0.003
	to 0.004 means big relative increase in the compressed file size.

	xz doesn't quote unprintable characters when it displays file names
	given on the command line.

	tuklib_exit() doesn't block signals => EINTR is possible.

	SIGTSTP is not handled. If xz is stopped, the estimated remaining
	time and calculated (de)compression speed won't make sense in the
	progress indicator (xz --verbose).

	If liblzma has created threads and fork() gets called, liblzma
	code will break in the child process unless it calls exec() and
	doesn't touch liblzma.


	Missing features
	----------------

	Support LZMA_FINISH in raw decoder to indicate end of LZMA1 and
	other streams that don't have an end of payload marker.

	Adjust dictionary size when the input file size is known.
	Maybe do this only if an option is given.

	xz doesn't support copying extended attributes, access control
	lists etc. from source to target file.

	Multithreaded compression:
	- Reduce memory usage of the current method.
	- Implement threaded match finders.
	- Implement pigz-style threading in LZMA2.

	Multithreaded decompression

	Buffer-to-buffer coding could use less RAM (especially when
	decompressing LZMA1 or LZMA2).

	I/O library is not implemented (similar to gzopen() in zlib).
	It will be a separate library that supports uncompressed, .gz,
	.bz2, .lzma, and .xz files.

	Support changing lzma_options_lzma.mode with lzma_filters_update().

	Support LZMA_FULL_FLUSH for lzma_stream_decoder() to stop at
	Block and Stream boundaries.

	lzma_strerror() to convert lzma_ret to human readable form?
	This is tricky, because the same error codes are used with
	slightly different meanings, and this cannot be fixed anymore.


	Documentation
	-------------

	Some tutorial is needed for liblzma. I have planned to write some
	extremely well commented example programs, which would work as
	a tutorial. I suppose the Doxygen tags are quite OK as a quick
	reference once one is familiar with the liblzma API.

	Document the LZMA1 and LZMA2 algorithms.