Explanations
parentheses indicating explanations or additional information
the presence of opening parentheses in text
Negative Logits
lull      -0.74
millenn   -0.69
therap    -0.68
quished   -0.68
pale      -0.68
ris       -0.66
wre       -0.66
Lumpur    -0.66
cyan      -0.65
entitle   -0.65
Positive Logits
sic        1.34
emphasis   1.29
see        1.20
possibly   1.17
...)       1.17
including  1.16
which      1.15
via        1.13
…)         1.12
formerly   1.11
Activations Density: 0.191%