INDEX
Explanations
words related to antagonists or villains in a narrative context
references to villains in storytelling and media
Negative Logits
dating         -0.75
independent    -0.75
ordering       -0.72
olen           -0.72
ollen          -0.71
aternity       -0.71
kept           -0.71
press          -0.71
obar           -0.68
ensation       -0.67
Positive Logits
villain        1.29
villains       1.09
mastermind     0.95
ous            0.87
antagonist     0.84
Bane           0.83
esses          0.78
satir          0.77
strugg         0.74
hattan         0.70
Activations Density 0.009%