INDEX
Explanations
Proper nouns, especially last names, though any kind of name may appear
Negative Logits
coni              -0.70
rower             -0.68
ejac              -0.67
ende              -0.67
kefeller          -0.66
berman            -0.65
opausal           -0.65
disproportionate  -0.65
iltration         -0.64
stros             -0.63
Positive Logits
atchewan  1.43
ansas     1.06
atoon     1.05
Rapids    1.05
edIn      1.01
aign      0.97
oln       0.97
ings      0.89
erness    0.89
erville   0.89
Activations Density 0.927%