INDEX
Explanations
phrases indicating avoidance and the desire for personal space
Negative Logits
ingles      -0.20
rna         -0.18
somewhat    -0.17
khá         -0.17
rather      -0.17
bastante    -0.17
quite       -0.16
both        -0.16
574         -0.15
almost      -0.15
Positive Logits
nor           0.29
necessarily   0.26
anymore       0.24
ecessarily    0.23
Nor           0.21
Nor           0.19
WXYZ          0.17
every         0.17
nor           0.17
every         0.16
Activations Density: 0.374%