Explanations
comparative phrases related to rankings or measurements
Negative Logits
azon     -0.16
олж      -0.14
él       -0.14
crud     -0.13
çµĮ      -0.13
anno     -0.13
OND      -0.13
ä¹Ĺ      -0.13
culus    -0.13
blat     -0.13
Positive Logits
runner     0.37
behind     0.36
close      0.34
closely    0.33
runners    0.32
distant    0.32
second     0.30
followed   0.29
close      0.29
Close      0.28
Activations Density 0.109%