INDEX
Explanations
repeated assertions of insistence or persistence in opinions or actions
Negative Logits
vez     -0.17
ugs     -0.16
ita     -0.16
UCH     -0.15
itel    -0.14
мага    -0.14
yms     -0.14
žal     -0.14
uch     -0.14
bla     -0.14
Positive Logits
ently       0.20
/assert     0.17
uire        0.17
shire       0.16
enta        0.15
combe       0.15
ly          0.15
against     0.15
insisted    0.15
/request    0.15
Activations Density 0.063%