INDEX
Explanations
mentions of the animal "lion" within text
references to lions
Negative Logits
mble      -0.83
lying     -0.79
ilk       -0.79
ACTION    -0.70
Ñı        -0.69
chell     -0.69
McKenna   -0.68
kson      -0.67
Morse     -0.65
ADRA      -0.64
Positive Logits
esses     1.27
lions     1.17
fish      1.10
lion      1.05
ess       1.01
eye       1.00
osaurs    0.94
ous       0.92
odon      0.89
iasis     0.87
Activations Density: 0.010%