Explanations
the word "so" in various contexts
Negative Logits
bih     -0.15
ners    -0.14
ли      -0.14
ại      -0.14
kill    -0.14
Oops    -0.14
=*      -0.14
ette    -0.13
واره    -0.13
uet     -0.13
Positive Logits
so      0.21
many    0.20
ething  0.19
ovit    0.19
aks     0.18
vielen  0.16
MANY    0.15
Крім    0.15
alan    0.15
_MANY   0.15
Activations Density: 0.023%