INDEX
Explanations
sentences indicating disapproval or criticism
expressions of rejection or disapproval
Negative Logits
utenberg    -0.85
bottleneck  -0.72
umping      -0.70
umped       -0.68
clut        -0.67
looms       -0.64
©¶æ¥µ       -0.64
moil        -0.64
rolled      -0.64
azard       -0.64
Positive Logits
acceptable  1.11
decency     1.09
acceptable  1.07
sane        0.99
civilized   0.99
ensible     0.98
condone     0.93
patriotism  0.93
credible    0.92
lawful      0.92
Activations Density 0.386%
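The dashboard does not say how the density figure is computed. A common reading, assumed here, is the fraction of token positions on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and are not taken from the page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, 4 of which are active.
toy = [0.0] * 996 + [0.5, 1.2, 0.1, 2.0]
density = activation_density(toy)
print(f"{density:.3%}")  # prints 0.400%
```

Under this reading, a density of 0.386% would mean the feature fires on roughly 4 of every 1000 tokens in the sampled text.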