INDEX
Explanations
the word "actually" and its various forms indicating emphasis or confirmation
Negative Logits
ese      -0.18
esh      -0.18
iesel    -0.16
ess      -0.15
elong    -0.15
ens      -0.15
erge     -0.15
arms     -0.15
ishly    -0.14
ers      -0.14
Positive Logits
ity      0.25
ités     0.22
actual   0.19
mente    0.18
actual   0.17
-таки    0.17
nels     0.17
ty       0.17
ITY      0.16
ignment  0.16
Activations Density 0.045%