INDEX
Explanations
phrases indicating concession or contrast
Negative Logits
esco         -0.20
rab          -0.16
uler         -0.16
anne         -0.15
quindi       -0.15
ERC          -0.15
InSeconds    -0.14
hta          -0.14
alc          -0.14
gamber       -0.14
Positive Logits
adow         0.15
share        0.15
ileaks       0.14
gaard        0.14
share        0.14
recio        0.14
adows        0.14
enny         0.14
-share       0.14
arith        0.13
Activations Density 0.011%