From d099c968607135385fbc7abdfd281fbe32fee8ac Mon Sep 17 00:00:00 2001
From: Marcin Rataj
Date: Sat, 26 Mar 2022 00:41:37 +0100
Subject: [PATCH] docs: ways one can access mirror (#124)

* docs: ways one can access mirror

* docs: reword the existing mirrors section (#125)

Co-authored-by: Marcin Rataj
Co-authored-by: Piotr Galar
---
 README.md | 26 +++++++++++++++++++-------
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index 3d69500..09dbfce 100644
--- a/README.md
+++ b/README.md
@@ -10,6 +10,12 @@ Putting Wikipedia Snapshots on IPFS and working towards making it fully read-wri
 
 ## Existing Mirrors
 
+There are various ways one can access the mirrors: through a [DNSLink](https://docs.ipfs.io/concepts/glossary/#dnslink), public [gateway](https://docs.ipfs.io/concepts/glossary/#gateway) or directly with a [CID](https://docs.ipfs.io/concepts/glossary/#cid).
+
+You can [read all about the available methods here](https://blog.ipfs.io/2021-05-31-distributed-wikipedia-mirror-update/#improved-access-to-wikipedia-mirrors).
+
+### DNSLinks
+
 - https://en.wikipedia-on-ipfs.org
 - https://tr.wikipedia-on-ipfs.org
 - https://my.wikipedia-on-ipfs.org
@@ -19,7 +25,13 @@ Putting Wikipedia Snapshots on IPFS and working towards making it fully read-wri
 - https://ru.wikipedia-on-ipfs.org
 - https://fa.wikipedia-on-ipfs.org
 
-Each mirror has a link to original [Kiwix](https://kiwix.org) ZIM archive in the footer.
+### CIDs
+
+The latest CIDs that the DNSLinks point at can be found in [snapshot-hashes.yml](snapshot-hashes.yml).
+
+---
+
+Each mirror has a link to the original [Kiwix](https://kiwix.org) ZIM archive in the footer. It can be dowloaded and opened offline with the [Kiwix Reader](https://www.kiwix.org/en/download/).
 
 ## Table of Contents
 
@@ -144,7 +156,7 @@ Make sure you use go-ipfs 0.12 or later, it has automatic sharding of big direct
 
 ### Step 3: Download the latest snapshot from kiwix.org
 
-Source of ZIM files is at https://download.kiwix.org/zim/wikipedia/
+Source of ZIM files is at https://download.kiwix.org/zim/wikipedia/
 Make sure you download `_all_maxi_` snapshots, as those include images.
 
 To automate this, you can also use the `getzim.sh` script:
@@ -172,8 +184,8 @@ $ zimdump dump ./snapshots/wikipedia_tr_all_maxi_2021-01.zim --dir ./tmp/wikiped
 
 > ### ℹ️ ZIM's main page
 >
-> Each ZIM file has "main page" attribute which defines the landing page set for the ZIM archive.
-> It is often different than the "main page" of upstream Wikipedia.
+> Each ZIM file has "main page" attribute which defines the landing page set for the ZIM archive.
+> It is often different than the "main page" of upstream Wikipedia.
 > Kiwix Main page needs to be passed in the next step, so until there is an automated way to determine "main page" of ZIM, you need to open ZIM in Kiwix reader and eyeball the name of the landing page.
 
 ### Step 5: Convert the unpacked zim directory to a website with mirror info
@@ -250,7 +262,7 @@ Make sure at least two full reliable copies exist before updating DNSLink.
 
 ## mirrorzim.sh
 
-It is possible to automate steps 3-6 via a wrapper script named `mirrorzim.sh`.
+It is possible to automate steps 3-6 via a wrapper script named `mirrorzim.sh`.
 It will download the latest snapshot of specified language (if needed), unpack it, and add it to IPFS.
 
 To see how the script behaves try running it on one of the smallest wikis, such as `cu`:
@@ -261,9 +273,9 @@ $ ./mirrorzim.sh --languagecode=cu --wikitype=wikipedia --hostingdnsdomain=cu.wi
 
 ## Docker build
 
-A `Dockerfile` with all the software requirements is provided.
+A `Dockerfile` with all the software requirements is provided.
 For now it is only a handy container for running the process on non-Linux
-systems or if you don't want to pollute your system with all the dependencies.
+systems or if you don't want to pollute your system with all the dependencies.
 In the future it will be end-to-end blackbox that takes ZIM and spits out CID and repo.