INDEX
Explanations
references to powerful and intimidating entities or creatures
references to 'beasts' or animal-like metaphors
Negative Logits
Token       Logit
pai         -0.83
arters      -0.79
encing      -0.79
psons       -0.76
bered       -0.75
enza        -0.75
monds       -0.73
licted      -0.71
endiary     -0.70
earcher     -0.67
Positive Logits
Token       Logit
beasts      1.04
Beasts      0.95
beast       0.89
carc        0.88
mong        0.85
lords       0.82
bags        0.80
Beast       0.80
zilla       0.78
ishly       0.77
Activations Density: 0.020%