chocolatechipcats edited this page Apr 7, 2023 · 13 revisions

replace_metadata

## Use regular expressions to find and replace (or remove) metadata.
## For example, you could change Sci-Fi=>SF, remove *-Centered tags,
## etc.  See http://docs.python.org/library/re.html (look for re.sub)
## for regexp details.

## Make sure to keep at least one space at the start of each line and
## to escape % to %%, if used.

## Two, three or five part lines.  Two-part lines affect everything.
## Three-part lines affect only the listed key(s).
## Five-part lines take effect only when the trailing conditional key=>regexp matches:
## metakey[,metakey]=>pattern=>replacement[&&conditionalkey=>regexp]

## Note that if metakey == conditionalkey the conditional is ignored.
## You can use \s in the replacement to add explicit spaces.  (The config parser
## tends to discard trailing spaces.)
## replace_metadata <entry>_LIST options: FanFicFare replace_metadata lines
## operate on individual list items for list entries. But if you
## want to do a replacement on the joined string for the whole list,
## you can by using <entry>_LIST. Example, if you added
## calibre_author: calibre_author_LIST=>^(.{,100}).*$=>\1

replace_metadata:
 genre,category=>Sci-Fi=>SF
 Puella Magi Madoka Magica.*=>Madoka
 Comedy=>Humor
 Crossover: (.*)=>\1
 title=>(.*)Great(.*)=>\1Moderate\2
 .*-Centered=>
 characters=>Sam W\.=>Sam Witwicky&&category=>Transformers
 characters=>Sam W\.=>Sam Winchester&&category=>Supernatural

A more wordy explanation

replace_metadata lines can take one of three different forms.

The first and simplest is: pattern=>replacement

All metadata items that match the regexp pattern will be replaced using the standard Python regexp library, like so: value = re.sub(pattern,replacement,value)

So, for example, if you are offended by the word Furbie and never want to see it anywhere in your metadata, you do:

Furbie=>F*rbie
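
As a rough illustration of this first form, the Python sketch below applies a single pattern=>replacement pair to every value in a small metadata dictionary. The dictionary and its values are made up for the example, and FanFicFare's real processing involves more steps than this.

```python
import re

# Hypothetical metadata values, for illustration only.
metadata = {
    "title": "My Furbie Adventure",
    "characters": "Furbie",
    "genre": "Humor",
}

pattern, replacement = "Furbie", "F*rbie"

# A two-part line is tried against every metadata item.
cleaned = {key: re.sub(pattern, replacement, value) for key, value in metadata.items()}

print(cleaned)
```

Values that do not contain the pattern pass through unchanged, which is why a two-part line is safe to apply everywhere.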

The second form is: metakey[,metakey]=>pattern=>replacement

The only difference is that you are limiting which metadata items the line will apply to by including one or more metakeys.

Metakey is one of the metadata items defined by FanFicFare (category, genre, characters, etc) or added by extra_valid_entries.

So, for example, if you want Humor converted to Comedy in genre and category:

genre,category=>Humor=>Comedy
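
The effect of naming metakeys can be sketched in Python as follows. The helper apply_line and the sample story are hypothetical and only model the idea that unlisted keys are left alone.

```python
import re

def apply_line(metadata, metakeys, pattern, replacement):
    """Apply pattern=>replacement only to the named keys (hypothetical helper)."""
    result = dict(metadata)
    for key in metakeys:
        if key in result:
            result[key] = re.sub(pattern, replacement, result[key])
    return result

# Hypothetical single-valued metadata, for illustration only.
story = {"genre": "Humor", "category": "Humor", "title": "Gallows Humor"}
updated = apply_line(story, ["genre", "category"], "Humor", "Comedy")

print(updated)
```

Here the title keeps the word Humor because title is not in the metakey list.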

The third form is: metakey[,metakey]=>pattern=>replacement&&conditionalkey=>condregexp

This essentially says, "For metadata items metakey, if metadata item conditionalkey matches condregexp, replace pattern with replacement."

Now there are three conditions that must be true before the replacement is done.

  1. It must be a metadata item metakey,
  2. the value must match pattern, and
  3. the value of metadata item conditionalkey must match condregexp.

Example:

 characters=>Sam W\.=>Sam Witwicky&&category=>Transformers
 characters=>Sam W\.=>Sam Winchester&&category=>Supernatural
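
The three conditions can be modelled with a short Python sketch. The helper apply_conditional is hypothetical, the metadata is treated as single strings rather than lists, and the use of re.search for the condition is an assumption about how => matches.

```python
import re

def apply_conditional(metadata, metakey, pattern, replacement, condkey, condregexp):
    """Replace in metakey only when condkey's value matches condregexp (hypothetical helper)."""
    result = dict(metadata)
    value = result.get(metakey)
    if value is None:
        return result  # condition 1 fails: no such metadata item
    # Condition 3: the conditional key must match (re.search is an assumption).
    if re.search(condregexp, result.get(condkey, "")) is None:
        return result
    # Condition 2 is implicit: re.sub changes nothing when pattern does not match.
    result[metakey] = re.sub(pattern, replacement, value)
    return result

lines = [
    ("characters", r"Sam W\.", "Sam Witwicky", "category", "Transformers"),
    ("characters", r"Sam W\.", "Sam Winchester", "category", "Supernatural"),
]

story = {"characters": "Sam W.", "category": "Supernatural"}
for line in lines:
    story = apply_conditional(story, *line)

print(story["characters"])
```

Only the second line fires for this story, because its category matches Supernatural and not Transformers.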

Starting in January 2019, replace_metadata conditionals can use ==, =~, != and !~ instead of => (which is equivalent to =~).

Also as of January 2019, conditional checks are done against each item in metadata lists. Before that, they were checked against the string made by joining the list. So &&category==Transformers will now match a story with category = The Lord of the Rings, Transformers, whereas before it would not. If you still want to match against the whole string, you can use <entry>_LIST, e.g., category_LIST, to get The Lord of the Rings, Transformers as a single string.

You can use conditionals_use_lists:false to get the old behavior.
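
The difference between per-item and whole-string checks can be seen in a few lines of Python. This is only a model of the behaviour described above, not FanFicFare's code. The author names are made up, and the regexp is the calibre_author_LIST example from the config comments.

```python
import re

categories = ["The Lord of the Rings", "Transformers"]

# Per-item check: the behaviour described for January 2019 onward.
per_item = any(item == "Transformers" for item in categories)

# Whole-string check: the older behaviour, or what category_LIST gives you.
joined = ", ".join(categories)
whole_string = joined == "Transformers"

print(per_item, whole_string)

# An <entry>_LIST replacement sees the joined string, so it can cap its length.
authors = ", ".join("Author Number %d" % i for i in range(20))  # hypothetical data
capped = re.sub(r"^(.{,100}).*$", r"\1", authors)

print(len(authors), len(capped))
```

The per-item check succeeds while the whole-string check does not, which matches the category==Transformers example in the text.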

Replacements are applied in order, so plan accordingly.
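
To see why order matters, consider this small Python sketch. The second rule is invented for the example. Each rule sees the output of the rule before it.

```python
import re

# The second rule is hypothetical; it only exists to show the ordering effect.
rules = [("Sci-Fi", "SF"), ("SF", "Science Fiction")]

def run(value, ordered_rules):
    """Apply each (pattern, replacement) pair in sequence."""
    for pattern, replacement in ordered_rules:
        value = re.sub(pattern, replacement, value)
    return value

in_order = run("Sci-Fi", rules)
reversed_order = run("Sci-Fi", rules[::-1])

print(in_order)
print(reversed_order)
```

In the listed order both rules fire one after the other. With the order reversed, the SF rule runs before there is any SF to match, so only the Sci-Fi rule has an effect.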

Another tip: If you want to be able to test your patterns without hitting your favorite stories again and again, you can use the fake test site and the extracharacters, extracategories, etc parameters. test1.com URLs will generate stories, but not go out to the network.

URL: http://test1.com?sid=12345

[test1.com]
extracharacters:Reginald Smythe-Smythe,Mokona,Harry P.

replace_metadata:
 characters=>Harry P\.=>Harry Potter

Why isn't my replacement working?

Do you understand the regular expression you're trying to use? This is a good regex quick start guide.

Using a regex tester such as regex101 can also help with figuring out how to match specific patterns. Note that FanFicFare uses Python as its regexp engine -- while the basics are generally the same, other languages have some subtle differences that may affect things.

A couple of common gotchas:

If regex special characters appear in the string you're trying to match, you need to escape them by putting a \ before them. Those characters are: [\^$.|?*+()

So for example, if you want to match 'Harry P.' but not 'Harry Pa', your regex should be Harry P\., not just Harry P. because the . is special and matches any character.
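
You can check the difference quickly in a Python session. The sketch below uses re.fullmatch on two sample names and also shows re.escape, which builds a literal-matching pattern from arbitrary text and can be handy when testing.

```python
import re

names = ["Harry P.", "Harry Pa"]

# Unescaped: the trailing . matches any character, so both names match.
unescaped = [n for n in names if re.fullmatch(r"Harry P.", n)]

# Escaped: \. matches only a literal period.
escaped = [n for n in names if re.fullmatch(r"Harry P\.", n)]

# re.escape produces a pattern that matches the given text literally.
literal = re.escape("Harry P.")
auto = [n for n in names if re.fullmatch(literal, n)]

print(unescaped)
print(escaped)
print(auto)
```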

Another is '&' (and less commonly '<' and '>'). Because FanFicFare keeps things internally as (X)HTML valid strings, while it looks like that string is just 'This & That', to match it you need the pattern This &amp; That.

Similarly, use &lt; and &gt; instead of '<' and '>'.

For the same reason (FanFicFare keeps things internally as (X)HTML valid strings), when you want an & in the replacement you should write it as &amp; as well, or there may be subtle problems with your files down the road. Example:

replace_metadata:
 category=>Starsky and Hutch=>Starsky &amp; Hutch
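
The following Python sketch shows why a plain & fails to match. It assumes the stored form resembles what the standard html.escape function produces, which is an approximation and not a statement about FanFicFare's internals.

```python
import html
import re

displayed = "This & That"

# Assumption: the internally stored string looks like html.escape output.
stored = html.escape(displayed, quote=False)

plain_hit = re.search(r"This & That", stored)
entity_hit = re.search(r"This &amp; That", stored)

print(stored)
print(plain_hit is not None, entity_hit is not None)
```

The pattern written with &amp; finds the stored value, while the pattern written with a bare & does not.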

Capitalization

Regular expressions are case-sensitive. There are two ways around this. One is to put an [Aa] wherever you may expect varying capitalization:

## Matches 'mass effect big bang' and 'Mass Effect Big Bang'
 collections=>[Mm]ass [Ee]ffect [Bb]ig [Bb]ang=>Mass Effect Big Bang

The other is to add (?i) (for insensitive) to the beginning of your match. This is a regex flag that ignores casing.

## Matches 'mass effect big bang' and 'Mass Effect Big Bang' but also 'MASS EFFECT BIG BANG'
 collections=>(?i)Mass Effect Big Bang=>Mass Effect Big Bang
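
Both approaches can be compared side by side in Python, using the same patterns as the config lines above on three sample spellings.

```python
import re

variants = ["mass effect big bang", "Mass Effect Big Bang", "MASS EFFECT BIG BANG"]
canonical = "Mass Effect Big Bang"

# Bracketed pattern: only the first letter of each word may vary in case.
bracketed = [re.sub(r"[Mm]ass [Ee]ffect [Bb]ig [Bb]ang", canonical, v) for v in variants]

# (?i) flag: every letter is matched case-insensitively.
flagged = [re.sub(r"(?i)Mass Effect Big Bang", canonical, v) for v in variants]

print(bracketed)
print(flagged)
```

The bracketed pattern leaves the all-caps variant untouched, while the (?i) version normalizes all three.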

Note that entry names (metakeys) are always case-sensitive. Your replacement will fail if you use Category=> instead of category=>.

Splitting / Adding Items to a List

Another wrinkle is that you can split one list item into multiple list entries by using \, in the replacement string. Or add new items depending on what's already there. Example:

replace_metadata:
 category=>Bitextual=>M/M\,F/M

If category previously contained ['A', 'Bitextual', 'Z'] it now contains ['A', 'M/M', 'F/M', 'Z']. (Remember lists will usually be de-duped and alphabetically ordered automatically.)

Also note that each split item has the replacements run on it, too. Example:

replace_metadata:
 category=>Bitextual=>M/M\,F/M
 category=>^M/M$=>Gay
 category=>^F/M$=>Hetero

category ['A', 'Bitextual', 'Z'] now contains ['A', 'Gay', 'Hetero', 'Z'].
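
A simplified model of splitting is sketched below in Python. The SPLIT marker and the apply_rules helper are hypothetical, and the real implementation may differ in its details. The sketch keeps list order and skips the de-duping and sorting step. It shows that items created by a split are still visible to later rules.

```python
import re

SPLIT = "\x1f"  # hypothetical internal marker standing in for "\," in a replacement

def apply_rules(items, rules):
    """Apply (pattern, replacement) pairs in order, splitting items on the marker."""
    for pattern, replacement in rules:
        repl = replacement.replace("\\,", SPLIT)
        next_items = []
        for item in items:
            # One item may become several once the marker is split out.
            next_items.extend(re.sub(pattern, repl, item).split(SPLIT))
        items = next_items
    return items

rules = [
    ("Bitextual", r"M/M\,F/M"),
    (r"^M/M$", "Gay"),
]

result = apply_rules(["A", "Bitextual", "Z"], rules)
print(result)
```

The first rule turns Bitextual into two items, and the second rule then rewrites only the M/M item.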
