INDEX
Explanations
proper nouns, especially related to specific individuals
names of individuals and companies
Negative Logits
Token       Logit
ocker       -0.75
ative       -0.69
ohyd        -0.69
Kimmel      -0.67
MLG         -0.65
Kamp        -0.64
Cambodia    -0.63
Barkley     -0.63
Donkey      -0.63
[+          -0.62
Positive Logits
Token       Logit
inarily     0.92
scl         0.83
bent        0.81
lapt        0.81
gaard       0.78
sites       0.77
rez         0.75
oir         0.74
need        0.74
Coul        0.74
Activations Density: 0.069%