multi-thread parallel diff with -m & -s, using C++11 #314

Merged: 20 commits, Oct 9, 2022
10 changes: 9 additions & 1 deletion CHANGELOG.md
@@ -1,7 +1,15 @@
# HDiffPatch Change Log

full changelog at: https://github.com/sisong/HDiffPatch/commits


## [v4.4.0](https://github.com/sisong/HDiffPatch/tree/v4.4.0) - 2022-10-09
### Changed
* optimize diff -m & -s speed by muti-thread parallel, requires C++11.

## [v4.3.0](https://github.com/sisong/HDiffPatch/tree/v4.3.0) - 2022-09-23
### Changed
* recode some patch error code: decompresser errors, file error, disk space full error, jni error

## [v4.2.0](https://github.com/sisong/HDiffPatch/tree/v4.2.0) - 2022-05-15
### Added
* add function create_lite_diff() & hpatch_lite_open(),hpatch_lite_patch(); optimized hpatch on MCU,NB-IoT... (demo [HPatchLite](https://github.com/sisong/HPatchLite))
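
The v4.4.0 entry above attributes the speed-up to multi-thread parallelism that requires C++11. As a rough illustration of the C++11 facility involved (this is not HDiffPatch's actual implementation), the sketch below splits a buffer into contiguous chunks and processes each chunk on its own `std::thread`. The name `parallel_sum` and its parameters are hypothetical and exist only for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper (not part of HDiffPatch): sums the bytes of `data`
// using up to `threadNum` C++11 threads, one contiguous chunk per thread.
unsigned long long parallel_sum(const std::vector<unsigned char>& data, std::size_t threadNum) {
    if (threadNum < 1) threadNum = 1;
    // Never start more threads than there are bytes (but always at least one).
    threadNum = std::min(threadNum, std::max<std::size_t>(data.size(), 1));
    std::vector<unsigned long long> partial(threadNum, 0); // one result slot per thread
    std::vector<std::thread> workers;
    const std::size_t chunk = (data.size() + threadNum - 1) / threadNum; // ceiling division
    for (std::size_t t = 0; t < threadNum; ++t) {
        const std::size_t begin = std::min(t * chunk, data.size());
        const std::size_t end = std::min(begin + chunk, data.size());
        workers.emplace_back([&data, &partial, t, begin, end]() {
            unsigned long long s = 0;
            for (std::size_t i = begin; i < end; ++i) s += data[i];
            partial[t] = s; // each thread writes only its own slot
        });
    }
    for (std::size_t t = 0; t < workers.size(); ++t) workers[t].join();
    unsigned long long total = 0;
    for (std::size_t t = 0; t < partial.size(); ++t) total += partial[t];
    return total;
}
```

Because each worker writes only to its own `partial` slot, no locking is needed. On many Linux toolchains a program like this has to be linked with `-pthread` in addition to being compiled with `-std=c++11`.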
2 changes: 1 addition & 1 deletion LICENSE
@@ -1,7 +1,7 @@
MIT License

HDiffPatch
- Copyright (c) 2012-2021 housisong
+ Copyright (c) 2012-2022 housisong

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
2 changes: 1 addition & 1 deletion Makefile
@@ -291,7 +291,7 @@ else
endif

CFLAGS += $(DEF_FLAGS)
- CXXFLAGS += $(DEF_FLAGS)
+ CXXFLAGS += $(DEF_FLAGS) -std=c++11

.PHONY: all install clean

16 changes: 8 additions & 8 deletions README.md
@@ -1,5 +1,5 @@
# [HDiffPatch](https://github.com/sisong/HDiffPatch)
- [![release](https://img.shields.io/badge/release-v4.3.0-blue.svg)](https://github.com/sisong/HDiffPatch/releases)
+ [![release](https://img.shields.io/badge/release-v4.4.0-blue.svg)](https://github.com/sisong/HDiffPatch/releases)
[![license](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/sisong/HDiffPatch/blob/master/LICENSE)
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-blue.svg)](https://github.com/sisong/HDiffPatch/pulls)
[![+issue Welcome](https://img.shields.io/github/issues-raw/sisong/HDiffPatch?color=green&label=%2Bissue%20welcome)](https://github.com/sisong/HDiffPatch/issues)
@@ -75,7 +75,7 @@ memory options:
matchScore>=0, DEFAULT -m-6, recommended bin: 0--4 text: 4--9 etc...
-s[-matchBlockSize]
all file load as Stream; fast;
- requires O(oldFileSize*16/matchBlockSize+matchBlockSize*5) bytes of memory;
+ requires O(oldFileSize*16/matchBlockSize+matchBlockSize*5*parallelThreadNumber)bytes of memory;
matchBlockSize>=4, DEFAULT -s-64, recommended 16,32,48,1k,64k,1m etc...
special options:
-block[-fastMatchBlockSize]
@@ -99,7 +99,7 @@ special options:
if parallelThreadNumber>1 then open multi-thread Parallel mode;
DEFAULT -p-4; requires more memory!
-c-compressType[-compressLevel]
- set outDiffFile Compress type & level, DEFAULT uncompress;
+ set outDiffFile Compress type, DEFAULT uncompress;
for resave diffFile,recompress diffFile to outDiffFile by new set;
support compress type & level & dict:
(re. https://github.com/sisong/lzbench/blob/master/lzbench171_sorted.md )
@@ -121,7 +121,7 @@ special options:
support run by multi-thread parallel, fast!
WARNING: code not compatible with it compressed by -c-lzma!
-c-zstd[-{0..22}[-dictBits]] DEFAULT level 20
- dictBits can 10--31, DEFAULT 24.
+ dictBits can 10--31, DEFAULT 23.
support run by multi-thread parallel, fast!
-C-checksumType
set outDiffFile Checksum type for directory diff, DEFAULT -C-fadler64;
@@ -164,9 +164,9 @@ special options:
if used -f and write path is exist directory, will always return error.
--patch
swap to hpatchz mode.
- -h or -?
- output Help info (this usage).
-v output Version info.
+ -h (or -?)
+ output usage info.
```

## **patch** command line usage:
@@ -217,9 +217,9 @@ special options:
if patch output file, will always return error;
if patch output directory, will overwrite, but not delete
needless existing files in directory.
- -h or -?
- output Help info (this usage).
-v output Version info.
+ -h (or -?)
+ output usage info.
```

---
2 changes: 1 addition & 1 deletion README_cmdline_cn.md
@@ -18,7 +18,7 @@
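depends on the compressibility of the data; in general, the more compressible the input data, the larger this value can be.
-s[-matchBlockSize]
all files are loaded as file streams; generally faster;
- memory required: O(oldFileSize*16/matchBlockSize+matchBlockSize*5);
+ memory required: O(oldFileSize*16/matchBlockSize+matchBlockSize*5*parallelThreadNumber);
matchBlockSize>=4, default 64, recommended 16,32,48,1k,64k,1m, etc.;
in general, the larger the match block, the lower the memory use and the faster the diff, but the patch may be larger.
other options: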
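
The `-s` memory bound in the README hunks above gains a `parallelThreadNumber` factor on its second term. To get a feel for what that means in practice, here is a minimal sketch that evaluates the documented formula literally. `estimate_stream_diff_memory` is a hypothetical helper, not a function from HDiffPatch, and the result is an order-of-magnitude estimate rather than an exact byte count.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: evaluates the README's O() bound
//   oldFileSize*16/matchBlockSize + matchBlockSize*5*parallelThreadNumber
// as a rough byte estimate. Overflow for extremely large inputs is not handled.
std::uint64_t estimate_stream_diff_memory(std::uint64_t oldFileSize,
                                          std::uint64_t matchBlockSize,
                                          std::uint64_t parallelThreadNumber) {
    assert(matchBlockSize >= 4);       // the README states matchBlockSize>=4
    assert(parallelThreadNumber >= 1); // at least one worker thread
    const std::uint64_t indexTerm = oldFileSize * 16 / matchBlockSize;          // shared by all threads
    const std::uint64_t perThreadTerm = matchBlockSize * 5 * parallelThreadNumber; // grows with thread count
    return indexTerm + perThreadTerm;
}
```

For a 1 GiB old file with the default block size of 64 and 4 threads, the formula evaluates to 268,436,736 bytes, almost all of it from the first term. With a 1m block size and 4 threads it evaluates to 20,987,904 bytes, where the per-thread term dominates. So the new factor mainly matters when large match blocks are combined with many threads.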
32 changes: 20 additions & 12 deletions bsdiff_wrapper/bsdiff_wrapper.cpp
@@ -152,10 +152,11 @@ static void serialize_bsdiff(const unsigned char* newData,const unsigned char* n
void _create_bsdiff(const unsigned char* newData,const unsigned char* cur_newData_end,const unsigned char* newData_end,
const unsigned char* oldData,const unsigned char* cur_oldData_end,const unsigned char* oldData_end,
const hpatch_TStreamOutput* out_diff,const hdiff_TCompress* compressPlugin,
- int kMinSingleMatchScore,bool isUseBigCacheMatch,ICoverLinesListener* coverLinesListener){
+ int kMinSingleMatchScore,bool isUseBigCacheMatch,
+ ICoverLinesListener* coverLinesListener,size_t threadNum){
std::vector<hpatch_TCover_sz> covers;
get_match_covers_by_sstring(newData,cur_newData_end,oldData,cur_oldData_end,covers,
- kMinSingleMatchScore,isUseBigCacheMatch,coverLinesListener);
+ kMinSingleMatchScore,isUseBigCacheMatch,coverLinesListener,threadNum);
if (covers.empty()||(covers[0].newPos!=0)||(covers[0].oldPos!=0)){//begin cover
hpatch_TCover_sz lc;
lc.newPos=0;
@@ -186,47 +187,54 @@ using namespace hdiff_private;
void create_bsdiff(const unsigned char* newData,const unsigned char* newData_end,
const unsigned char* oldData,const unsigned char* oldData_end,
const hpatch_TStreamOutput* out_diff,const hdiff_TCompress* compressPlugin,
- int kMinSingleMatchScore,bool isUseBigCacheMatch,ICoverLinesListener* coverLinesListener){
+ int kMinSingleMatchScore,bool isUseBigCacheMatch,
+ ICoverLinesListener* coverLinesListener,size_t threadNum){
_create_bsdiff(newData,newData_end,newData_end,oldData,oldData_end,oldData_end,
- out_diff,compressPlugin,kMinSingleMatchScore,isUseBigCacheMatch,coverLinesListener);
+ out_diff,compressPlugin,kMinSingleMatchScore,isUseBigCacheMatch,
+ coverLinesListener,threadNum);
}
void create_bsdiff(const hpatch_TStreamInput* newData,const hpatch_TStreamInput* oldData,
const hpatch_TStreamOutput* out_diff,const hdiff_TCompress* compressPlugin,
- int kMinSingleMatchScore,bool isUseBigCacheMatch,ICoverLinesListener* coverLinesListener){
+ int kMinSingleMatchScore,bool isUseBigCacheMatch,
+ ICoverLinesListener* coverLinesListener,size_t threadNum){
TAutoMem oldAndNewData;
loadOldAndNewStream(oldAndNewData,oldData,newData);
size_t old_size=oldData?(size_t)oldData->streamSize:0;
unsigned char* pOldData=oldAndNewData.data();
unsigned char* pNewData=pOldData+old_size;
unsigned char* pNewDataEnd=pNewData+(size_t)newData->streamSize;
_create_bsdiff(pNewData,pNewDataEnd,pNewDataEnd,pOldData,pOldData+old_size,pOldData+old_size,
- out_diff,compressPlugin,kMinSingleMatchScore,isUseBigCacheMatch,coverLinesListener);
+ out_diff,compressPlugin,kMinSingleMatchScore,isUseBigCacheMatch,
+ coverLinesListener,threadNum);
}

void create_bsdiff_block(unsigned char* newData,unsigned char* newData_end,
unsigned char* oldData,unsigned char* oldData_end,
const hpatch_TStreamOutput* out_diff,const hdiff_TCompress* compressPlugin,
- int kMinSingleMatchScore,bool isUseBigCacheMatch,size_t matchBlockSize){
+ int kMinSingleMatchScore,bool isUseBigCacheMatch,
+ size_t matchBlockSize,size_t threadNum){
if (matchBlockSize==0){
_create_bsdiff(newData,newData_end,newData_end,oldData,oldData_end,oldData_end,
- out_diff,compressPlugin,kMinSingleMatchScore,isUseBigCacheMatch,0);
+ out_diff,compressPlugin,kMinSingleMatchScore,isUseBigCacheMatch,0,threadNum);
return;
}
- TCoversOptimMB<TMatchBlock> coversOp(newData,newData_end,oldData,oldData_end,matchBlockSize);
+ TCoversOptimMB<TMatchBlock> coversOp(newData,newData_end,oldData,oldData_end,matchBlockSize,threadNum);
_create_bsdiff(newData,coversOp.matchBlock->newData_end_cur,newData_end,
oldData,coversOp.matchBlock->oldData_end_cur,oldData_end,
- out_diff,compressPlugin,kMinSingleMatchScore,isUseBigCacheMatch,&coversOp);
+ out_diff,compressPlugin,kMinSingleMatchScore,isUseBigCacheMatch,&coversOp,threadNum);
}
void create_bsdiff_block(const hpatch_TStreamInput* newData,const hpatch_TStreamInput* oldData,
const hpatch_TStreamOutput* out_diff,const hdiff_TCompress* compressPlugin,
- int kMinSingleMatchScore,bool isUseBigCacheMatch,size_t matchBlockSize){
+ int kMinSingleMatchScore,bool isUseBigCacheMatch,
+ size_t matchBlockSize,size_t threadNum){
TAutoMem oldAndNewData;
loadOldAndNewStream(oldAndNewData,oldData,newData);
size_t old_size=oldData?(size_t)oldData->streamSize:0;
unsigned char* pOldData=oldAndNewData.data();
unsigned char* pNewData=pOldData+old_size;
create_bsdiff_block(pNewData,pNewData+(size_t)newData->streamSize,pOldData,pOldData+old_size,
- out_diff,compressPlugin,kMinSingleMatchScore,isUseBigCacheMatch,matchBlockSize);
+ out_diff,compressPlugin,kMinSingleMatchScore,isUseBigCacheMatch,
+ matchBlockSize,threadNum);
}

bool get_is_bsdiff(const hpatch_TStreamInput* diffData){
14 changes: 10 additions & 4 deletions bsdiff_wrapper/bsdiff_wrapper.h
@@ -33,11 +33,13 @@ void create_bsdiff(const unsigned char* newData,const unsigned char* newData_end
const unsigned char* oldData,const unsigned char* oldData_end,
const hpatch_TStreamOutput* out_diff,const hdiff_TCompress* compressPlugin,
int kMinSingleMatchScore=kMinSingleMatchScore_default,
- bool isUseBigCacheMatch=false,ICoverLinesListener* coverLinesListener=0);
+ bool isUseBigCacheMatch=false,ICoverLinesListener* coverLinesListener=0,
+ size_t threadNum=1);
void create_bsdiff(const hpatch_TStreamInput* newData,const hpatch_TStreamInput* oldData,
const hpatch_TStreamOutput* out_diff,const hdiff_TCompress* compressPlugin,
int kMinSingleMatchScore=kMinSingleMatchScore_default,
- bool isUseBigCacheMatch=false,ICoverLinesListener* coverLinesListener=0);
+ bool isUseBigCacheMatch=false,ICoverLinesListener* coverLinesListener=0,
+ size_t threadNum=1);

bool get_is_bsdiff(const unsigned char* diffData,const unsigned char* diffData_end);
bool get_is_bsdiff(const hpatch_TStreamInput* diffData);
@@ -55,10 +57,14 @@ void create_bsdiff_block(unsigned char* newData,unsigned char* newData_end,
unsigned char* oldData,unsigned char* oldData_end,
const hpatch_TStreamOutput* out_diff,const hdiff_TCompress* compressPlugin,
int kMinSingleMatchScore=kMinSingleMatchScore_default,
- bool isUseBigCacheMatch=false,size_t matchBlockSize=kDefaultFastMatchBlockSize);
+ bool isUseBigCacheMatch=false,
+ size_t matchBlockSize=kDefaultFastMatchBlockSize,
+ size_t threadNum=1);
void create_bsdiff_block(const hpatch_TStreamInput* newData,const hpatch_TStreamInput* oldData,
const hpatch_TStreamOutput* out_diff,const hdiff_TCompress* compressPlugin,
int kMinSingleMatchScore=kMinSingleMatchScore_default,
- bool isUseBigCacheMatch=false,size_t matchBlockSize=kDefaultFastMatchBlockSize);
+ bool isUseBigCacheMatch=false,
+ size_t matchBlockSize=kDefaultFastMatchBlockSize,
+ size_t threadNum=1);

#endif
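
The header changes above append `size_t threadNum=1` as the last defaulted parameter of each declaration. A small self-contained sketch of why that shape keeps existing call sites compiling is given below. `create_diff_demo` is a hypothetical stand-in with a shortened parameter list, not one of the real `create_bsdiff` overloads, and it simply returns the thread count it would use so the effect is observable.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an API that gained a trailing defaulted parameter,
// mirroring the `threadNum=1` pattern; it does no diffing at all.
std::size_t create_diff_demo(const unsigned char* newData, const unsigned char* newData_end,
                             bool isUseBigCacheMatch = false, std::size_t threadNum = 1) {
    assert(newData <= newData_end);       // basic range sanity check
    (void)isUseBigCacheMatch;             // unused in this sketch
    return threadNum < 1 ? 1 : threadNum; // treat 0 as "single thread" (an assumption of this demo)
}
```

A call written before the change, such as `create_diff_demo(buf, buf + 4)`, still compiles and runs single-threaded. A caller that wants parallelism has to spell out the earlier defaulted arguments too, for example `create_diff_demo(buf, buf + 4, false, 4)`, since C++ has no way to skip a middle default.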
1 change: 1 addition & 0 deletions builds/android_ndk_jni_mk/build_libs_zstd.bat
@@ -0,0 +1 @@
+ ndk-build NDK_PROJECT_PATH=. APP_BUILD_SCRIPT=Android.mk NDK_APPLICATION_MK=Application.mk ZSTD=1
1 change: 1 addition & 0 deletions builds/android_ndk_jni_mk/build_libs_zstd.sh
@@ -0,0 +1 @@
+ ndk-build NDK_PROJECT_PATH=. APP_BUILD_SCRIPT=Android.mk NDK_APPLICATION_MK=Application.mk ZSTD=1
12 changes: 4 additions & 8 deletions builds/codeblocks/HDiffZ.cbp
@@ -320,22 +320,18 @@
<Unit filename="../../libHDiffPatch/HDiff/match_block.cpp" />
<Unit filename="../../libHDiffPatch/HDiff/private_diff/bytes_rle.cpp" />
<Unit filename="../../libHDiffPatch/HDiff/private_diff/compress_detect.cpp" />
- <Unit filename="../../libHDiffPatch/HDiff/private_diff/libdivsufsort/divsufsort.c">
- <Option compilerVar="CC" />
- </Unit>
- <Unit filename="../../libHDiffPatch/HDiff/private_diff/libdivsufsort/divsufsort64.c">
- <Option compilerVar="CC" />
- </Unit>
+ <Unit filename="../../libHDiffPatch/HDiff/private_diff/libdivsufsort/divsufsort.cpp" />
+ <Unit filename="../../libHDiffPatch/HDiff/private_diff/libdivsufsort/divsufsort64.cpp" />
<Unit filename="../../libHDiffPatch/HDiff/private_diff/limit_mem_diff/adler_roll.c">
<Option compilerVar="CC" />
</Unit>
<Unit filename="../../libHDiffPatch/HDiff/private_diff/limit_mem_diff/digest_matcher.cpp" />
<Unit filename="../../libHDiffPatch/HDiff/private_diff/limit_mem_diff/stream_serialize.cpp" />
<Unit filename="../../libHDiffPatch/HDiff/private_diff/suffix_string.cpp" />
- <Unit filename="../../libHDiffPatch/HPatchLite/hpatch_lite.c">
+ <Unit filename="../../libHDiffPatch/HPatch/patch.c">
<Option compilerVar="CC" />
</Unit>
- <Unit filename="../../libHDiffPatch/HPatch/patch.c">
+ <Unit filename="../../libHDiffPatch/HPatchLite/hpatch_lite.c">
<Option compilerVar="CC" />
</Unit>
<Unit filename="../../libParallel/parallel_channel.cpp" />
15 changes: 7 additions & 8 deletions builds/codeblocks/unitTest.cbp
@@ -35,28 +35,27 @@
<Linker>
<Add library="libz" />
<Add library="libbz2" />
+ <Add library="libpthread" />
</Linker>
<Unit filename="../../libHDiffPatch/HDiff/diff.cpp" />
<Unit filename="../../libHDiffPatch/HDiff/private_diff/bytes_rle.cpp" />
<Unit filename="../../libHDiffPatch/HDiff/private_diff/compress_detect.cpp" />
- <Unit filename="../../libHDiffPatch/HDiff/private_diff/libdivsufsort/divsufsort.c">
- <Option compilerVar="CC" />
- </Unit>
- <Unit filename="../../libHDiffPatch/HDiff/private_diff/libdivsufsort/divsufsort64.c">
- <Option compilerVar="CC" />
- </Unit>
+ <Unit filename="../../libHDiffPatch/HDiff/private_diff/libdivsufsort/divsufsort.cpp" />
+ <Unit filename="../../libHDiffPatch/HDiff/private_diff/libdivsufsort/divsufsort64.cpp" />
<Unit filename="../../libHDiffPatch/HDiff/private_diff/limit_mem_diff/adler_roll.c">
<Option compilerVar="CC" />
</Unit>
<Unit filename="../../libHDiffPatch/HDiff/private_diff/limit_mem_diff/digest_matcher.cpp" />
<Unit filename="../../libHDiffPatch/HDiff/private_diff/limit_mem_diff/stream_serialize.cpp" />
<Unit filename="../../libHDiffPatch/HDiff/private_diff/suffix_string.cpp" />
- <Unit filename="../../libHDiffPatch/HPatchLite/hpatch_lite.c">
+ <Unit filename="../../libHDiffPatch/HPatch/patch.c">
<Option compilerVar="CC" />
</Unit>
- <Unit filename="../../libHDiffPatch/HPatch/patch.c">
+ <Unit filename="../../libHDiffPatch/HPatchLite/hpatch_lite.c">
<Option compilerVar="CC" />
</Unit>
+ <Unit filename="../../libParallel/parallel_channel.cpp" />
+ <Unit filename="../../libParallel/parallel_import.cpp" />
<Unit filename="../../test/unit_test.cpp" />
<Extensions>
<envvars />
8 changes: 6 additions & 2 deletions builds/vc/HDiffZ.vcxproj
@@ -30,26 +30,30 @@
<UseDebugLibraries>true</UseDebugLibraries>
<CLRSupport>false</CLRSupport>
<CharacterSet>Unicode</CharacterSet>
+ <PlatformToolset>v110</PlatformToolset>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'" Label="Configuration">
<ConfigurationType>Application</ConfigurationType>
<UseDebugLibraries>true</UseDebugLibraries>
<CLRSupport>false</CLRSupport>
<CharacterSet>Unicode</CharacterSet>
+ <PlatformToolset>v110</PlatformToolset>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'" Label="Configuration">
<ConfigurationType>Application</ConfigurationType>
<UseDebugLibraries>false</UseDebugLibraries>
<CLRSupport>false</CLRSupport>
<CharacterSet>Unicode</CharacterSet>
<WholeProgramOptimization>true</WholeProgramOptimization>
+ <PlatformToolset>v110</PlatformToolset>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'" Label="Configuration">
<ConfigurationType>Application</ConfigurationType>
<UseDebugLibraries>false</UseDebugLibraries>
<CLRSupport>false</CLRSupport>
<CharacterSet>Unicode</CharacterSet>
<WholeProgramOptimization>true</WholeProgramOptimization>
+ <PlatformToolset>v110</PlatformToolset>
</PropertyGroup>
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.props" />
<ImportGroup Label="ExtensionSettings">
@@ -201,8 +205,8 @@
<ClCompile Include="..\..\libHDiffPatch\HDiff\match_block.cpp" />
<ClCompile Include="..\..\libHDiffPatch\HDiff\private_diff\bytes_rle.cpp" />
<ClCompile Include="..\..\libHDiffPatch\HDiff\private_diff\compress_detect.cpp" />
- <ClCompile Include="..\..\libHDiffPatch\HDiff\private_diff\libdivsufsort\divsufsort.c" />
- <ClCompile Include="..\..\libHDiffPatch\HDiff\private_diff\libdivsufsort\divsufsort64.c" />
+ <ClCompile Include="..\..\libHDiffPatch\HDiff\private_diff\libdivsufsort\divsufsort.cpp" />
+ <ClCompile Include="..\..\libHDiffPatch\HDiff\private_diff\libdivsufsort\divsufsort64.cpp" />
<ClCompile Include="..\..\libHDiffPatch\HDiff\private_diff\limit_mem_diff\adler_roll.c" />
<ClCompile Include="..\..\libHDiffPatch\HDiff\private_diff\limit_mem_diff\digest_matcher.cpp" />
<ClCompile Include="..\..\libHDiffPatch\HDiff\private_diff\limit_mem_diff\stream_serialize.cpp" />
4 changes: 4 additions & 0 deletions builds/vc/HPatchZ.vcxproj
@@ -30,26 +30,30 @@
<UseDebugLibraries>true</UseDebugLibraries>
<CLRSupport>false</CLRSupport>
<CharacterSet>Unicode</CharacterSet>
+ <PlatformToolset>v110</PlatformToolset>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'" Label="Configuration">
<ConfigurationType>Application</ConfigurationType>
<UseDebugLibraries>true</UseDebugLibraries>
<CLRSupport>false</CLRSupport>
<CharacterSet>Unicode</CharacterSet>
+ <PlatformToolset>v110</PlatformToolset>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'" Label="Configuration">
<ConfigurationType>Application</ConfigurationType>
<UseDebugLibraries>false</UseDebugLibraries>
<CLRSupport>false</CLRSupport>
<CharacterSet>Unicode</CharacterSet>
<WholeProgramOptimization>true</WholeProgramOptimization>
+ <PlatformToolset>v110</PlatformToolset>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'" Label="Configuration">
<ConfigurationType>Application</ConfigurationType>
<UseDebugLibraries>false</UseDebugLibraries>
<CLRSupport>false</CLRSupport>
<CharacterSet>Unicode</CharacterSet>
<WholeProgramOptimization>true</WholeProgramOptimization>
+ <PlatformToolset>v110</PlatformToolset>
</PropertyGroup>
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.props" />
<ImportGroup Label="ExtensionSettings">