INDEX
Explanations
words related to specific locations or institutions
proper nouns or names, particularly of people and places
Negative Logits
arding      -0.79
orate       -0.76
arded       -0.74
iard        -0.71
jamin       -0.70
naissance   -0.70
oard        -0.68
gettable    -0.67
urate       -0.66
umen        -0.66
Positive Logits
plings      0.87
ustain      0.85
atchewan    0.84
rano        0.77
pling       0.75
udo         0.75
eways       0.73
Account     0.72
arin        0.70
atoon       0.70
Activations Density: 0.144%