C# .NET Core 2.0 web crawler that creates PDF and EPUB versions of online light novels.
It is a port of the .NET 4.5 version available here.
This is a core library. It will need a proper interface to work. For now, there is only a CLI available.
Your program needs to call the ConfigTools.InitConf function with the Config.xml filename to load the default parameters and initialize global variables. You can call the function multiple times with different files to override values.
When that is done, call the ConfigTools.InitLightNovels function with the LightNovels.xml filename to load the LN list. You can call this function multiple times with different files. By default, it adds LNs to a global list. If you pass true as the second parameter, the list is cleared before the file is imported.
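The call order described above can be sketched as follows. This is only an illustration: the ConfigTools class below is a stand-in so the snippet compiles on its own, its signatures and bodies are assumptions, and Config_user.xml is a hypothetical override file. Only the function names, the call order, and the meaning of the second parameter come from this README.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the library's ConfigTools (assumed signatures, not the real code).
// It only records which files were passed, to make the call order visible.
static class ConfigTools
{
    public static readonly List<string> ConfigFiles = new List<string>();
    public static readonly List<string> LightNovelFiles = new List<string>();

    // Later calls are meant to override values loaded by earlier ones.
    public static void InitConf(string fileName) => ConfigFiles.Add(fileName);

    // When clear is true, the global list is emptied before the import.
    public static void InitLightNovels(string fileName, bool clear = false)
    {
        if (clear) LightNovelFiles.Clear();
        LightNovelFiles.Add(fileName);
    }
}

static class Program
{
    static void Main()
    {
        // 1. Defaults first, then an optional override file (hypothetical name).
        ConfigTools.InitConf("Config.xml");
        ConfigTools.InitConf("Config_user.xml");

        // 2. Start from a clean list, then append the user's own LN file.
        ConfigTools.InitLightNovels("LightNovels.xml", true);
        ConfigTools.InitLightNovels("LightNovels_user.xml");

        Console.WriteLine(string.Join(", ", ConfigTools.LightNovelFiles));
    }
}
```

In a real program, drop the stand-in class and reference the library instead; check the actual signatures in the source before relying on the optional parameter.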
Then, use the WebCrawler class as the entry point to download chapters (either defined in the LightNovels.xml or LightNovels_user.xml file, or dynamically instantiated in your program).
To let the library interact with the user or an external program (to ask for information or to report progress), the WebCrawler constructor requires classes implementing the IInput and IOutput interfaces.
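As a rough idea of what a console front end could provide, here is a minimal sketch. The members of IInput and IOutput are not listed in this README, so the Ask, Log, and Progress methods below are assumptions made for illustration; look at the real interfaces in the library before implementing them.

```csharp
using System;

// Hypothetical interface shapes (assumed members, not the library's definitions).
interface IInput
{
    string Ask(string question);
}

interface IOutput
{
    void Log(string message);
    void Progress(int current, int total);
}

// One console-backed class can serve both roles in a CLI.
sealed class ConsoleBridge : IInput, IOutput
{
    public string Ask(string question)
    {
        Console.Write(question + " ");
        return Console.ReadLine() ?? string.Empty;
    }

    public void Log(string message) => Console.WriteLine(message);

    public void Progress(int current, int total) =>
        Console.WriteLine($"[{current}/{total}]");
}

static class Program
{
    static void Main()
    {
        var bridge = new ConsoleBridge();
        IOutput output = bridge;
        output.Log("Starting download");
        output.Progress(1, 10);
        // With the real library, objects like this one would be passed to the
        // WebCrawler constructor; its exact parameter list is not shown here.
    }
}
```

A GUI or a service wrapper would implement the same two interfaces with dialogs or log files instead of the console.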
New parsers (to handle new websites) can be added either to this project, in the Web/Parser folder, with a pull request, or dynamically loaded in the ParserFactory.
In either case, the classes need to implement the IParser interface.
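The idea of registering a parser at run time can be sketched like this. Everything in the snippet is hypothetical: the IParser members (Name, CanHandle), the ExampleSiteParser class, and the ParserRegistry stand-in are assumptions, since this README does not describe the real IParser contract or the ParserFactory API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical contract (assumed members, not the library's IParser).
interface IParser
{
    string Name { get; }
    bool CanHandle(Uri url);
}

// A parser for one website, selected by host name.
sealed class ExampleSiteParser : IParser
{
    public string Name => "example.com parser";

    public bool CanHandle(Uri url) =>
        url.Host.Equals("example.com", StringComparison.OrdinalIgnoreCase) ||
        url.Host.EndsWith(".example.com", StringComparison.OrdinalIgnoreCase);
}

// Stand-in for a factory that accepts parsers registered by the host program.
static class ParserRegistry
{
    private static readonly List<IParser> Parsers = new List<IParser>();

    public static void Register(IParser parser) => Parsers.Add(parser);

    // Returns the first parser that accepts the URL, or null if none does.
    public static IParser Resolve(Uri url)
    {
        foreach (var parser in Parsers)
        {
            if (parser.CanHandle(url)) return parser;
        }
        return null;
    }
}

static class Program
{
    static void Main()
    {
        ParserRegistry.Register(new ExampleSiteParser());

        var url = new Uri("https://www.example.com/novel/chapter-1");
        IParser match = ParserRegistry.Resolve(url);
        Console.WriteLine(match != null ? match.Name : "No parser for this site");
    }
}
```

Selecting by host name is just one possible dispatch rule; the library may pick parsers differently.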
Translations can be added to the project by adding a Strings.**Code**.resx file to the Resource folder (where Code is a CultureInfo code such as "en-US" or "fr-FR"). Feel free to submit your corrections, suggestions, or additions in a PR.
The web parsers use HtmlAgilityPack to parse strings into HtmlNodes.
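For readers unfamiliar with HtmlAgilityPack, the following sketch shows the general pattern of turning an HTML string into nodes and reading their text. The markup and the XPath expression are made up for the example and do not correspond to any site handled by this project; the snippet needs the HtmlAgilityPack NuGet package.

```csharp
using System;
using HtmlAgilityPack;

static class Program
{
    static void Main()
    {
        // Inline sample markup; a real parser would receive the downloaded page.
        const string html =
            "<div id='chapter'><p>First line.</p><p>Second &amp; last.</p></div>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection paragraphs =
            doc.DocumentNode.SelectNodes("//div[@id='chapter']/p");
        if (paragraphs == null)
        {
            Console.WriteLine("No chapter text found.");
            return;
        }

        foreach (HtmlNode paragraph in paragraphs)
        {
            // InnerText keeps HTML entities, so decode them before output.
            Console.WriteLine(HtmlEntity.DeEntitize(paragraph.InnerText));
        }
    }
}
```

The null check matters in practice: a site layout change typically shows up as a query that suddenly matches nothing.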
The Ionic.Zip.dll library is part of the DotNetZip library, which has been cloned into this repo because some modifications were needed.
PDF files are generated by the iTextSharp library.