INDEX
Explanations
words related to animals and communication
Negative Logits
Token      Logit
979        -0.17
Hearth     -0.16
Pron       -0.16
hear       -0.15
umbed      -0.15
ekim       -0.14
umptech    -0.14
баÑĩ       -0.14
.Script    -0.13
Brock      -0.13
Positive Logits
Token      Logit
shr        0.23
(blank)    0.21
ble        0.20
hon        0.20
calls      0.19
nasal      0.19
grow       0.19
mia        0.19
sque       0.18
ras        0.18
Activations Density 0.081%