INDEX
Explanations
expressions of evaluation or judgment about quality or performance
Negative Logits
quite       -0.29
probably    -0.27
Quite       -0.25
Probably    -0.24
probably    -0.23
almost      -0.22
pretty      -0.20
quite       -0.20
rất         -0.20
almost      -0.19
Positive Logits
slightest   0.30
ANY         0.29
yoksa       0.28
anything    0.26
anything    0.26
вдруг       0.25
anywhere    0.25
EVER        0.24
nào         0.23
truly       0.23
Activations Density 0.420%