INDEX
Explanations
negative expressions or sentiments
Negative Logits
arer        -0.17
inki        -0.15
LEM         -0.14
redo        -0.14
Nez         -0.14
.require    -0.14
akers       -0.14
opak        -0.13
ryo         -0.13
abwe        -0.13
Positive Logits
å¼ı         0.15
ejs         0.15
iman        0.15
imson       0.15
Sahara      0.14
fools       0.14
-uppercase  0.13
Mec         0.13
&view       0.13
127         0.13
Activations Density 0.042%