INDEX
Explanations
names of places or people that include specific letter sequences
Negative Logits
yip            -0.61
independents   -0.58
srfAttach      -0.57
flight         -0.56
karma          -0.56
acebook        -0.55
Liberals       -0.54
Grad           -0.54
pedia          -0.54
Buk            -0.54
Positive Logits
roe            0.73
ember          0.71
ethy           0.67
agement        0.66
ĸļ             0.65
vre            0.65
itchie         0.65
aney           0.64
oyer           0.64
ciating        0.62
Activations Density 1.033%