INDEX
Explanations
mentions of specific organizations or entities within a particular context or scenario
the presence of opening parentheses
Negative Logits
iazep      -0.75
irds       -0.74
entitle    -0.71
lull       -0.71
everyday   -0.68
quished    -0.68
olicy      -0.66
pale       -0.64
uly        -0.64
nightly    -0.63
Positive Logits
emphasis   1.20
sic        1.15
see        1.10
including  1.06
formerly   1.05
...)       1.05
excluding  1.01
also       1.01
via        1.00
which      0.98
Activations Density: 0.204%