-
Notifications
You must be signed in to change notification settings - Fork 1
/
README
86 lines (64 loc) · 3.61 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
================================================================================
This is a modified version of tar that supports transformation between tar
and mtar. The tar format wasn't designed with deduplication in mind and
thus does not deduplicate well. To solve this problem, we propose a
new format mtar (short for Migratory tar). One can transform a tar file
into mtar, by migrating a tar file. An mtar file can be restored into a tar
file. For more details, please check out our paper (Metadata Considered Harmful ... to Deduplication), published at hotstorage15.
What has been changed?
We added a few files that does the migration and restoration. They are
migrate.c and restore.c, included in the src dir. Both of them are
based on extract.c (which implements extraction of a tar file).
migrate.c scans a tar file and outputs data blocks. The header blocks
are stored in a temporary file (.h) and then appended to the
output, to build the final mtar file. restore.c works in a similar
way but in a reverse direction as migration. It reads the first block
of a mtar file, from which it can extract the number of header blocks.
Then, it seeks to the first header block and start to reconstruct
the original tar file, by outputing a header block and then data blocks
for the first data file. This process repeats until all header blocks
are processed.
In addition, we added filter.c, which removes the padded blocks at the
end of a tar file. Padding was not implemented in migration and restoration.
Supported file types
We only implement and test migration/restoration for tars which
contain dirs, regular files and symbolic links. We have not tested other
types, such as hard links.
How to compile it in Ubuntu14?
If you git clone from my github repository, it will contain a .git dir. This
will turn on all gcc warnings and treat warnings as errors. To turn it off, we can pass ``--disable-gcc-warnings'' to ./configure.
$ ./configure --disable-gcc-warnings --prefix=/usr/local; make -j12; make install
How to use it?
First, we need to remove padded bytes in a tar file, since migrate/restore
does not do padding.
To remove padded bytes, do
$ /usr/local/bin/tar --filter -f tar-1.27.1.tar
It will produce a .tar.f file (.f means filtered)
To migrate a tar file, do
$ /usr/local/bin/tar --migrate -f tar-1.27.1.tar.f
It will produce a .tar.f.m file (tar-1.27.1.tar.f.m).
To restore a tar file, do
$ /usr/local/bin/tar --restore -f tar-1.27.1.tar.f.m
It will produce a .tar.f file (the original tar-1.27.1.tar.f file will be
overwritten.)
The restored tar-1.27.1.tar.f should be exactly the same as the
tar-1.27.1.tar.f that is created by migration.
NOTE:
+ migrate.c is based on extract.c implementation. It is invoked
in the same way as --list.
+ blocking-factor: tar writes output in records. Each record is
blocking-factor * BLOCKSIZE. By default, blocking-factor is 10.
To get an accurate size comparison between migrated tar and tar, we can set
blocking-factor to be 1 and then the difference between migrated tar and
normal tar will be 1024 bytes (tar ends with two 512 blocks).
$ tar c -b 1 -f test.tar test
AUTHORS:
Xing Lin (University of Utah)
Email: [email protected]
If you have any question or find a bug in this software, please feel
free to contact me.
COPYING:
This software is covered under the same policy as GNU tar 1.27.1.
Read COPYING for more details.
================================================================================
Read README.original for README from the gnu tar distribution.