INDEX
Explanations
phrases related to hierarchy or ranking
references to "second-class" status or treatment
Negative Logits
vez         -0.78
kee         -0.72
tti         -0.64
ãĤĭ         -0.64
gerald      -0.64
ihil        -0.63
velt        -0.62
bleacher    -0.61
enegger     -0.61
¶ħ          -0.60
Positive Logits
hand        1.27
arily       1.12
baseman     1.06
aries       1.03
guessing    0.92
glance      0.90
ary         0.86
chance      0.83
glances     0.82
halves      0.80
Activations Density 0.050%