Explanations
second-person pronouns and related context
Negative Logits
Token    Logit
adel     -0.16
ower     -0.15
éc       -0.14
urn      -0.14
stre     -0.14
enn      -0.14
mage     -0.13
agos     -0.13
ulse     -0.13
nof      -0.13
Positive Logits
Token    Logit
must     0.17
should   0.17
can      0.16
ãģıãĤĵ   0.15
å¢       0.15
SHOULD   0.15
shouldn  0.14
天åłĤ     0.14
rending  0.14
shall    0.14
Activations Density 0.066%