Explanations
structural elements or formatting indicators within the text
Negative Logits
retard       -0.81
entitle      -0.73
tenants      -0.72
spatial      -0.72
creations    -0.70
ordinary     -0.69
disappear    -0.68
dens         -0.66
everyday     -0.66
furnish      -0.64
Positive Logits
pictured     1.08
formerly     1.07
tie          1.05
who          1.03
aka          1.03
pron         1.01
six          1.01
via          1.00
sports       1.00
four         1.00
Activations Density 0.052%