Explanations
references to animals, especially lions
references to lions and related big cats
Negative Logits
Token        Logit
Seym         -0.72
Indie        -0.71
irlf         -0.71
Republic     -0.70
Publishers   -0.68
Store        -0.67
Hancock      -0.67
mble         -0.67
Forbes       -0.66
Medium       -0.66
Positive Logits
Token        Logit
lions        1.31
lion         1.29
esses        1.04
gorilla      0.93
ormal        0.93
umbers       0.91
fish         0.87
ess          0.85
stal         0.85
osaurs       0.84
Activations Density 0.008%