Explanations
terms related to individuals, specifically focusing on titles like "man" and "person."
references to individuals, particularly males
Negative Logits
iolet          -0.66
Advertisement  -0.63
arna           -0.60
Pact           -0.59
iterranean     -0.59
rontal         -0.59
ominium        -0.57
erald          -0.57
Despair        -0.56
illus          -0.56
Positive Logits
closest        1.14
liest          1.06
nearest        1.06
responsible    1.04
iest           0.90
responsible    0.89
furthe         0.85
pictured       0.84
who            0.83
osphere        0.82
Activations Density: 0.279%