INDEX
Explanations
URLs and domain names
punctuation and university names
Negative Logits
ococ        -0.74
Malk        -0.72
Osh         -0.70
roup        -0.70
683         -0.69
asbestos    -0.67
Shapiro     -0.67
Oslo        -0.67
Leh         -0.66
Palestine   -0.66
Positive Logits
D           1.42
DD          1.38
d           1.34
Ds          1.31
dor         1.26
ds          1.26
dt          1.25
DR          1.23
DM          1.23
da          1.22
Activations Density 1.213%