Explanations
phrases indicating uncertainty, such as modal verbs and adverbs that suggest possibility or frequency
Negative Logits
ayet          -0.17
ichel         -0.16
ifen          -0.16
-Compatible   -0.16
strup         -0.15
.fx           -0.15
ettings       -0.15
taş           -0.14
ighton        -0.14
dik           -0.14
Positive Logits
even          0.28
sogar         0.19
also          0.19
even          0.18
même          0.17
akan          0.17
cả            0.17
даже          0.16
Even          0.16
还有          0.16
Activations Density: 0.094%