Explanations
references to a specific individual or concept associated with the term "man."
Negative Logits
']")
-1.11
BibitemShut
-0.98
^(@)
-0.97
')")
-0.90
))}
-0.88
"]}
-0.87
"])
-0.86
","","
-0.86
Theſe
-0.86
}}]{
-0.86
Positive Logits
Man
1.64
man
1.64
Man
1.53
MAN
1.51
man
1.44
Men
1.31
men
1.31
MAN
1.27
MEN
1.18
Men
1.12
Activations Density 0.087%