INDEX
Explanations
the title "chairman" or "chairwoman".
Negative Logits
leet          -0.61
esh           -0.56
wydd          -0.54
ery           -0.51
Nish          -0.51
ges           -0.50
generalized   -0.50
landı         -0.50
Į             -0.50
\|^{          -0.50
Positive Logits
CHAIRMAN      1.72
chairmen      1.63
Chairman      1.48
chairman      1.43
chairman      1.40
Chairman      1.34
irmanship     1.23
CHAIR         1.20
Chair         1.19
CHAIR         1.19
Activations Density 0.005%