An open source web search engine and spider/crawler. This was once the codebase for a search engine called Gigablast, but the site is no longer operational. This is a fork of the original codebase located at https://github.com/gigablast/open-source-search-engine
To experiment, you can quickly launch via docker by running:
docker run -p 8000:8000 -it --rm moldybits/open-source-search-engine
If you wish to preserve data between runs, mount a host directory over the container's data directory:
docker run -p 8000:8000 -it --rm -v $(pwd)/data:/var/gigablast/data0 moldybits/open-source-search-engine
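A small variation on the command above, offered as a sketch rather than a required step: when the host path given to -v does not exist, Docker creates it itself and it typically ends up owned by root, so creating the directory first keeps it owned by your user. The DATA_DIR variable name and the check for docker on the PATH are illustrative additions, not part of the project.

```shell
# Create the host data directory up front so it is owned by the current user.
DATA_DIR="$(pwd)/data"
mkdir -p "$DATA_DIR"

# Same invocation as above, with the host path quoted in case it contains spaces.
if command -v docker >/dev/null 2>&1; then
    docker run -p 8000:8000 -it --rm \
        -v "$DATA_DIR":/var/gigablast/data0 \
        moldybits/open-source-search-engine
else
    echo "docker not found on PATH"
fi
```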
- cleanup! - Moved sources that are actually used into the src dir. Everything else has been stuffed in the junkdrawer dir.
- More cleanup - formatting, removing TONS of commented code, fixing some segfaults. This is ongoing...
- I have replaced the original Makefile with CMake. This now installs the correct files required so you can execute ./gb in the build directory and run a test server there without it borking your source dir.
- Stubbed out some testing functionality for building tests if this ever gets cleaned up enough to start making "real" changes.
This does not build on ARM and does not work correctly on modern versions of macOS, though it looks like there was support at one time.
git clone https://github.com/catchorg/Catch2.git
cd Catch2
cmake -Bbuild -H. -DBUILD_TESTING=OFF
sudo cmake --build build/ --target install
sudo apt-get install make g++ libssl-dev libz-dev cmake
Last tried with AlmaLinux 9
sudo yum install gcc-c++ openssl-devel libz-devel cmake
cd open-source-search-engine
cmake -Bbuild
cmake --build build/
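Once the build finishes, the notes above say gb can be run from the build directory as a test server. A minimal sketch of that step, assuming the binary ends up at build/gb and that you are in the repository root (the BUILD_DIR variable and the existence check are illustrative additions):

```shell
# Run from the repository root after `cmake --build build/` has completed.
BUILD_DIR="build"

if [ -x "$BUILD_DIR/gb" ]; then
    # Start gb from inside the build directory so its runtime files
    # stay out of the source tree.
    (cd "$BUILD_DIR" && ./gb)
else
    echo "No gb binary in $BUILD_DIR; run the cmake build steps first."
fi
```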
Bug reports should be filed at https://github.com/twistdroach/open-source-search-engine
Tests can be put in the tests directory. I have written a few simple examples just to make sure it (mostly) works.
There are various docs located in the html directory. The FAQ & developer.html are particularly interesting.