INDEX
Explanations
explicit references indicating the attribution of statements or opinions to specific individuals
occurrences of the word "that."
Negative Logits
arest     -0.71
cept      -0.71
Pont      -0.68
aukee     -0.64
IELD      -0.63
estern    -0.62
Tank      -0.61
oses      -0.60
EMBER     -0.60
andem     -0.59
Positive Logits
"[        1.04
although  1.03
'[        0.83
whilst    0.77
"...      0.76
soever    0.75
they      0.74
"â̦       0.74
while     0.73
"#        0.70
Activations Density 0.196%