Explanations
negative statements or rejections
Negative Logits
Token          Logit
Efq            -0.63
ViewFeatures   -0.58
cinoma         -0.56
♂              -0.56
незавершена    -0.56
expandindo     -0.56
ftagPool       -0.56
quæ            -0.56
PARATUS        -0.55
ANY            -0.55
Positive Logits
Token          Logit
longer         0.81
doubt          0.61
tica           0.58
different      0.53
lon            0.53
bodies         0.53
isome          0.53
isier          0.52
beda           0.50
exception      0.50
Activation Density: 0.118%