INDEX
Explanations
questions and expressions of opinion or judgment
Negative Logits
only        -0.18
only        -0.15
pretty      -0.14
ceased      -0.14
lut         -0.14
Only        -0.14
weets       -0.14
=           -0.14
seulement   -0.14
CB          -0.13
Positive Logits
onet          0.17
exactly       0.17
ори           0.17
enet          0.16
(木           0.16
Exactly       0.15
FileVersion   0.15
čí            0.14
gain          0.14
ELSE          0.14
Activations Density 0.077%
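
The density figure above can be read as the share of tokens on which the feature fires. The sketch below assumes that definition (the fraction of sampled tokens with an activation above zero); the dashboard itself does not state how the number is computed. The function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    is active. The source does not give the exact definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (hypothetical, not taken from the dashboard):
# 100,000 tokens, 77 of which activate the feature.
sample = [0.0] * 99_923 + [1.3] * 77
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.077% means the feature is active on roughly 77 of every 100,000 tokens, which makes it a sparse feature.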