In kiwix/operations#286, two misspelled metadata names went undetected: tags and scraper.
I think accepting extra metadata in this method defeats the purpose of having them all exposed. I also think its use is marginal and that additional metadata can still be added by other means.
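To illustrate the failure mode, here is a minimal sketch using two hypothetical stand-in functions (not the actual Creator API). The parameter names and the lowercase misspelling are assumptions chosen for illustration only; the point is how a `**extra` catch-all silently absorbs a mistyped keyword.

```python
def config_loose(*, Title: str, Tags: str = "", Scraper: str = "", **extra: str) -> dict:
    """Hypothetical method that accepts arbitrary extra metadata."""
    metadata = {"Title": Title, "Tags": Tags, "Scraper": Scraper}
    metadata.update(extra)  # a misspelled name lands here unnoticed
    return metadata


def config_strict(*, Title: str, Tags: str = "", Scraper: str = "") -> dict:
    """Same hypothetical method without the catch-all."""
    return {"Title": Title, "Tags": Tags, "Scraper": Scraper}


# The mistyped names are accepted as "extra" metadata,
# while the intended Tags and Scraper values stay empty.
loose = config_loose(Title="Demo", tags="demo;test", scraper="demo 1.0")

# Without **extra, the same mistake fails immediately with a TypeError.
try:
    config_strict(Title="Demo", tags="demo;test")
    strict_error = None
except TypeError as exc:
    strict_error = str(exc)
```

With the catch-all removed, the typo surfaces at call time instead of producing a file with missing standard metadata.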
benoit74 changed the title from "[next major] remove **extra from Creator.config_indexing" to "[next major] remove **extra from Creator.config_metadata" on Oct 21, 2024
I would consider splitting it in two: config_std_metadata (to be used by default) and config_extra_metadata (for scrapers like warc2zim that want to add custom metadata). This seems important to me so that both methods can still benefit from the same logic (currently we remove control characters, for instance, but we might add more logic in the future). I would even recommend that config_extra_metadata enforce the X- prefix we used in warc2zim for X-ContentDate, so that we further limit the risk of strange metadata. WDYT?
And obviously we need to keep config_indexing till the next major.
> I would even recommend that config_extra_metadata enforce the X- prefix we used in warc2zim for X-ContentDate, so that we further limit the risk of strange metadata. WDYT?
Works for me, as long as there's still the possibility to add non-prefixed metadata (via add_metadata()).
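The proposal discussed above could look roughly like the following sketch. Everything here is an assumption for illustration: `SketchCreator` is a hypothetical stand-in class, the three standard fields are a reduced subset, and treating Unicode category `Cc` as "control characters" is a guess at the existing cleanup logic. Only the method names config_std_metadata, config_extra_metadata, add_metadata and the X- prefix come from the discussion.

```python
import unicodedata
from typing import Dict, Mapping


def clean(value: str) -> str:
    """Drop control characters (assumed here to mean Unicode category Cc)."""
    return "".join(ch for ch in value if unicodedata.category(ch) != "Cc")


class SketchCreator:
    """Hypothetical stand-in, not the real Creator class."""

    def __init__(self) -> None:
        self.metadata: Dict[str, str] = {}

    def add_metadata(self, name: str, value: str) -> None:
        # Escape hatch: any name is allowed, prefixed or not.
        self.metadata[name] = clean(value)

    def config_std_metadata(self, *, Title: str, Tags: str = "", Scraper: str = "") -> "SketchCreator":
        # Explicit keywords only: a misspelled name raises TypeError.
        for name, value in (("Title", Title), ("Tags", Tags), ("Scraper", Scraper)):
            self.add_metadata(name, value)
        return self

    def config_extra_metadata(self, extra: Mapping[str, str]) -> "SketchCreator":
        # Custom metadata must carry the X- prefix; validate before storing any.
        for name in extra:
            if not name.startswith("X-"):
                raise ValueError(f"extra metadata must start with 'X-': {name!r}")
        for name, value in extra.items():
            self.add_metadata(name, value)  # shares the same cleanup logic
        return self


creator = SketchCreator()
creator.config_std_metadata(Title="De\x07mo", Tags="demo")
creator.config_extra_metadata({"X-ContentDate": "2024-10-21"})
creator.add_metadata("Custom", "non-prefixed, added explicitly")
```

Taking a mapping rather than keyword arguments in config_extra_metadata is a design choice of this sketch: names such as X-ContentDate are not valid Python identifiers, so they cannot be written as plain keywords.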
@benoit74 Can we get rid of this?