INDEX
Explanations
surnames, potentially from news articles or academic works
names of people and places
Negative Logits
Token            Logit
referen          -0.62
$$$$             -0.52
dime             -0.51
thous            -0.51
ancest           -0.49
redistributed    -0.49
sovere           -0.47
notch            -0.45
predec           -0.45
taxp             -0.44
Positive Logits
Token    Logit
ank      0.61
hus      0.58
ats      0.57
ias      0.56
ius      0.56
oop      0.56
bek      0.56
gal      0.56
oli      0.55
ema      0.54
Activations Density 1.068%