Explanations
female names
proper names, particularly focusing on the names of individuals
Negative Logits
Token      Logit
ounding    -0.78
direction  -0.76
awar       -0.74
istani     -0.73
papers     -0.69
zag        -0.69
haps       -0.69
cffff      -0.69
tumblr     -0.68
roph       -0.67
Positive Logits
Token      Logit
Gomez      1.06
Olson      1.00
Kelley     0.99
Roe        0.97
Roberts    0.97
Robinson   0.96
Moore      0.95
Rivera     0.92
Hansen     0.91
Carey      0.91
Activations Density: 0.145%
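
The density figure can be illustrated with a minimal sketch. It assumes, as is common for feature dashboards of this kind but is not stated on this page, that density means the fraction of token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical; the sample is synthetic and chosen only so that it reproduces 0.145%.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic data (not from the dashboard): 29 active positions in 20,000 tokens.
sample = [0.0] * 19_971 + [1.5] * 29
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.145% means the feature fires on roughly 1 token in 690, which fits a feature that responds to a narrow class of tokens such as surnames.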