INDEX
Explanations
instances of strong opposition or disagreement
Negative Logits
vert        -0.17
Ỽt          -0.16
empo        -0.15
incible     -0.15
ÑģÑĤÑĥд    -0.15
Boy         -0.15
urnished    -0.15
cust        -0.14
Balls       -0.14
vinc        -0.14
Positive Logits
abble       0.19
salt        0.15
intr        0.15
gor         0.15
lang        0.15
solid       0.15
atura       0.14
aber        0.14
tender      0.14
tide        0.14
Activations Density 0.000%