Explanations
names or titles related to prominent leadership positions
references to individuals in positions of authority, specifically chairs of committees
Negative Logits
Token     Logit
asca      -0.75
uder      -0.71
Cooldown  -0.70
ENC       -0.66
ilar      -0.65
ال        -0.64
IMAGES    -0.64
Sharp     -0.63
Sensor    -0.62
XP        -0.62
Positive Logits
Token     Logit
IAL       1.04
ority     1.01
doms      1.01
pins      0.94
woman     0.90
person    0.90
emer      0.89
mans      0.80
chair     0.79
drawer    0.79
Activations Density 0.017%
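
To make the density figure concrete, here is a minimal sketch that assumes "activations density" means the fraction of sampled tokens on which the feature's activation is above zero. The dashboard does not state this definition, and the function name `activation_density` and the sample data are hypothetical, chosen only to reproduce a value of 0.017%.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density = (number of active tokens) / (total tokens sampled).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 17 of 100,000 tokens.
sample = [1.0] * 17 + [0.0] * 99_983
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.017% corresponds to roughly 1.7 active tokens per 10,000, so the feature is very sparse.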