This LoRA model aims to reproduce the unique art style seen in the eyecatches from the 2007 anime "Tengen Toppa Gurren Lagann".
Eyecatch Examples
Eyecatch Illustration Credits:
- Akira Amemiya (eps 5, 7, 22)
- Atsushi Nishigori (4 episodes)
- Chikashi Kubota (ep 14)
- Hiroyuki Imaishi (eps 1, 18)
- Ikuo Kuwana (ep 24)
- Katsuzo Hirata (ep 10)
- Kazuhiro Takamura (ep 12)
- Kikuko Sadakata (eps 5, 22)
- Kouichi Motomura (4 episodes)
- Mitsuru Ishihara (ep 3)
- Osamu Kobayashi (ep 4)
- Satoshi Yamaguchi (eps 20, 25)
- Shingo Abe (eps 13, 21, 26)
- Shōko Nakamura (ep 10)
- Sunao Chikaoka (ep 3)
- Sushio (ep 15)
- Tadashi Hiramatsu (ep 26)
- Takashi Mukouda (ep 9)
- Yamato Kojima (ep 23)
- Yoh Yoshinari (ep 27)
- Yuka Shibata (eps 6, 21)
- The easiest way to get set up is to use AUTO1111
- There are plenty of guides for this online
- Download the model from Civitai
- Best results are on the AnyLoRA checkpoint
| AnyLora Checkpoint | ttgl-eyecatch |
|---|---|
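Once the model file is in AUTO1111's `models/Lora` folder, it can be activated with the standard LoRA prompt tag. The example below assumes the file is named `ttgl-eyecatch`; the subject tags are illustrative, since the training captions describe content rather than style:

```
1boy, sunglasses, cape, pointing at viewer <lora:ttgl-eyecatch:0.7>
```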
I extracted 28 512x512 images from roughly half of the eyecatches used in the show. The remaining eyecatches used a different style or were tricky to caption, so they weren't included in the training set.
Cleaning the data involved:
- removing watermarks
- manual captioning (training for a general art style, not a specific style or object)
  - describe EVERYTHING in the image EXCEPT for the style
  - still on the fence as to whether I should include "soft" triggers
  - omit names from all captions
- inpainting
- cropping (a few cases of extracting multiple images from a single eyecatch)
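The cropping step boils down to simple box arithmetic. The sketch below is illustrative only (the real crop regions were picked by hand, and the actual cropping and resizing would be done with an image editor or a library such as Pillow):

```python
def square_crop_boxes(width, height):
    """Return (left, upper, right, lower) boxes for the two largest
    square crops of a landscape frame: one flush left, one flush right.
    The two boxes overlap whenever the frame is narrower than twice
    its height."""
    side = min(width, height)
    return [(0, 0, side, side), (width - side, 0, width, side)]

# A 16:9 eyecatch frame yields two overlapping 1080x1080 squares,
# each of which would then be downscaled to 512x512 for training.
boxes = square_crop_boxes(1920, 1080)
```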
Stable Diffusion checkpoint: anyloraCheckpoint_bakedvaeFtmseFp16NO.
The recommended LoRA optimizer arguments gave me the best results, though I probably haven't experimented with these parameters enough yet.
Training config used for [Kohya-ss](https://github.com/kohya-ss/sd-scripts):

```toml
[additional_network_arguments]
no_metadata = false
unet_lr = 0.0001
text_encoder_lr = 5e-5
network_module = "networks.lora"
network_dim = 32
network_alpha = 16
network_train_unet_only = false
network_train_text_encoder_only = false

[optimizer_arguments]
optimizer_type = "AdamW8bit"
learning_rate = 0.0001
max_grad_norm = 1.0
lr_scheduler = "constant"
lr_warmup_steps = 0

[dataset_arguments]
debug_dataset = false
in_json = ***
train_data_dir = ***
dataset_repeats = 15
shuffle_caption = true
keep_tokens = 0
resolution = "512,512"
caption_dropout_rate = 0
caption_tag_dropout_rate = 0
caption_dropout_every_n_epochs = 0
color_aug = false
token_warmup_min = 1
token_warmup_step = 0

[training_arguments]
output_dir = ***
output_name = "ttgl-eyecatch-original"
save_precision = "fp16"
save_every_n_epochs = 1
train_batch_size = 6
max_token_length = 225
mem_eff_attn = false
xformers = true
max_train_epochs = 10
max_data_loader_n_workers = 8
persistent_data_loader_workers = true
gradient_checkpointing = false
gradient_accumulation_steps = 1
mixed_precision = "fp16"
clip_skip = 2
logging_dir = ***
log_prefix = "ttgl-eyecatch-original"
lowram = true
```
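For reference, the dataset and training settings above work out to a modest number of optimizer steps per run (assuming `gradient_accumulation_steps = 1`, as configured):

```python
import math

images = 28      # training images
repeats = 15     # dataset_repeats
batch_size = 6   # train_batch_size
epochs = 10      # max_train_epochs

steps_per_epoch = math.ceil(images * repeats / batch_size)  # 420 images / batch of 6
total_steps = steps_per_epoch * epochs
```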
In my opinion, the eyecatch art style I'm looking for is most prominent in models (columns) 000005 and 000007, from strengths 0.5 to 1.
I'll go with the weaker one for now.
Choosing one over the other is probably not a big deal, as something like the LoRA Block Weight extension for AUTO1111 allows setting the strength at each individual block instead of applying the same strength across all blocks.
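The per-block idea can be sketched with scalar stand-ins for the weight tensors. The block names below are hypothetical, and the extension's actual prompt syntax differs; this just illustrates scaling each block's LoRA delta independently rather than with one global multiplier:

```python
def merge_lora(base, deltas, block_weights, default=1.0):
    """Add LoRA deltas to base weights, scaling each block by its own
    strength; blocks not listed fall back to the default strength."""
    return {
        name: w + block_weights.get(name, default) * deltas.get(name, 0.0)
        for name, w in base.items()
    }

# Toy example: damp the text-encoder block, keep the UNet blocks at full strength.
base = {"text_encoder": 1.0, "unet_in": 1.0, "unet_out": 1.0}
deltas = {"text_encoder": 0.4, "unet_in": 0.4, "unet_out": 0.4}
merged = merge_lora(base, deltas, {"text_encoder": 0.2})
```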
This model is trained purely on the original eyecatches. There are only 54 of them in total which isn't a whole lot. Fortunately, there are hundreds if not thousands of these Gurren Lagann eyecatch parodies created by fans.
Fun sidenote: these "parodies" are really just memes, but this was a time before memes were called "memes", so the name stuck.
I've done a little experimenting with including these community eyecatches in the training set, but the resulting models definitely need some more tweaking.