l********k posts: 14844 | 1 I have a massive amount (terabytes in total) of small files (~512 kB each) to back up, and I'd like to compress them, but I have concerns about fault tolerance and ease of access. The main worry: if a bit flips, or the media is physically damaged (with backups in place, of course), could the whole packed archive be ruined? If I don't pack the files at all, an error at least won't spread beyond the affected file.
How fault-tolerant is a tar file, without compression, against misaligned or physically damaged data in the middle of the archive? | b*s posts: 82482 | 2 I've used tar to pack 300-400 GB of small files around 1 MB each and never seemed to run into a problem. Copying them without packing is unbearably slow…
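The workflow above can be sketched as follows. This is a minimal illustration with placeholder names (`backup_src`, `backup.tar`): the files go into one uncompressed tar archive, so a single sequential write replaces many per-file copies.

```shell
# Create a placeholder directory with a couple of small files.
mkdir -p backup_src
printf 'hello\n' > backup_src/a.txt
printf 'world\n' > backup_src/b.txt

# -c create, -f archive file; no -z, so no compression layer is added.
tar -cf backup.tar backup_src

# -t lists the archive's members without extracting them.
tar -tf backup.tar
```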
| l********k posts: 14844 | 3 Bumping my own thread with a lead: afio
docstore.mik.ua/orelly/unix3/upt/ch38_05.htm
There are good arguments both for and against compression of tar archives
when making backups. The overall problem is that neither tar nor gzip is
particularly fault-tolerant, no matter how convenient they are. Although
compression using gzip can greatly reduce the amount of backup media
required to store an archive, compressing entire tar files as they are
written to floppy or tape makes the backup prone to complete loss if one
block of the archive is corrupted, say, through a media error (not uncommon
in the case of floppies and tapes). Most compression algorithms, gzip
included, depend on the coherency of data across many bytes to achieve
compression. If any data within a compressed archive is corrupt, gunzip may
not be able to uncompress the file at all, making it completely unreadable
to tar. The same applies to bzip2. It may compress things better than gzip,
but it has the same lack of fault-tolerance.
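This fragility is easy to demonstrate. The sketch below (illustrative file names; `demo_src` and `demo.tar.gz` are placeholders) overwrites a few bytes in the middle of a `.tar.gz` and shows gzip's integrity check failing afterwards:

```shell
# Build a small compressed archive from incompressible (random) data.
mkdir -p demo_src
head -c 65536 /dev/urandom > demo_src/data.bin
tar -czf demo.tar.gz demo_src

# Simulate a media error: overwrite 64 bytes near the middle of the
# compressed stream with zeros.
size=$(wc -c < demo.tar.gz)
dd if=/dev/zero of=demo.tar.gz bs=1 seek=$((size / 2)) count=64 \
   conv=notrunc 2>/dev/null

# gzip -t verifies the stream; after the corruption it reports an error,
# and tar can no longer read anything from the archive.
gzip -t demo.tar.gz || echo "archive is unreadable"
```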
This is much worse than if the tar file were uncompressed on the tape.
Although tar doesn't provide much protection against data corruption within
an archive, if there is minimal corruption within a tar file, you can
usually recover most of the archived files with little trouble, or at least
those files up until the corruption occurs. Although far from perfect, it's
better than losing your entire backup.
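The partial-recovery behavior can be seen directly. In the sketch below (placeholder names), the second member's 512-byte header block of an uncompressed tar is overwritten with garbage; extraction then complains and loses that member, but the first file comes out intact:

```shell
# Archive two small files without compression.
mkdir -p rec_src rec_out
printf 'first\n'  > rec_src/first.txt
printf 'second\n' > rec_src/second.txt
tar -cf plain.tar rec_src/first.txt rec_src/second.txt

# Tar layout: member 1's header is block 0, its data block 1, and
# member 2's header is block 2 (offset 1024). Smash member 2's header.
dd if=/dev/urandom of=plain.tar bs=512 seek=2 count=1 \
   conv=notrunc 2>/dev/null

# Extraction errors out partway, but everything before the damage
# has already been written to disk.
tar -xf plain.tar -C rec_out 2>/dev/null || true
cat rec_out/rec_src/first.txt
```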
A better solution would be to use an archiving tool other than tar to make
backups. There are several options available. cpio (Section 38.13) is an
archiving utility that packs files together, much like tar. However, because
of the simpler storage method used by cpio, it recovers cleanly from data
corruption in an archive. (It still doesn't handle errors well on gzipped
files.)
The best solution may be to use a tool such as afio. afio supports
multivolume backups and is similar in some respects to cpio. However, afio
includes compression and is more reliable because each individual file is
compressed. This means that if data on an archive is corrupted, the damage
can be isolated to individual files, instead of to the entire backup. |
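Since afio is not installed on every system, the per-file-compression idea it embodies can be emulated with standard tools: gzip each file individually, then wrap the `.gz` files in a plain tar archive. Damage to one member's region then leaves every other `.gz` independently decompressible. Names below are placeholders.

```shell
# Placeholder files to back up.
mkdir -p perfile_src
printf 'alpha\n' > perfile_src/a.txt
printf 'beta\n'  > perfile_src/b.txt

# Compress each file in place: a.txt -> a.txt.gz, b.txt -> b.txt.gz.
find perfile_src -type f -exec gzip {} +

# The tar wrapper adds no compression of its own, so corruption is
# confined to whichever compressed member it lands in.
tar -cf perfile.tar perfile_src
```

With afio itself the rough equivalent is `find perfile_src | afio -oZ archive.afio`, where -Z requests per-file compression; check your version's man page for the exact options.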