Explanations
references to lions and associated terms within various contexts
Negative Logits
alian     -0.20
undry     -0.19
ipv       -0.18
aign      -0.15
ÙĥاÙħ     -0.15
anson     -0.15
urent     -0.15
ustin     -0.15
sit       -0.15
unik      -0.14
Positive Logits
ess       0.33
esses     0.32
cub       0.26
ardo      0.24
fish      0.23
el        0.23
Cub       0.22
Lion      0.22
mane      0.22
heart     0.22
Activations Density: 0.010%