INDEX
Explanations
people's names
proper nouns and names
Negative Logits
ccording           -0.64
ĺħ                 -0.62
actionDate         -0.62
dylib              -0.61
drm                -0.61
medic              -0.60
natureconservancy  -0.60
href               -0.60
https              -0.60
solder             -0.60
Positive Logits
zyk      0.93
Samar    0.84
Ramsey   0.79
Anders   0.77
Morris   0.74
igans    0.73
osaurus  0.73
ians     0.72
oulos    0.72
akis     0.71
Activations Density 0.242%