INDEX
Explanations
names of people or places
proper nouns related to specific individuals or organizations
Negative Logits
arial     -1.00
uate      -0.81
inates    -0.79
ially     -0.78
ional     -0.74
sucker    -0.74
antine    -0.73
otto      -0.71
egal      -0.70
uated     -0.70
Positive Logits
nesday       1.48
restling     0.96
tip          0.89
ashington    0.85
esome        0.82
fare         0.76
izard        0.75
edge         0.74
atts         0.74
houses       0.73
Activations Density: 0.076%