INDEX
Explanations
punctuation and formatting cues in the text
Negative Logits
Our          -0.22
Whilst       -0.22
We           -0.21
whilst       -0.20
Our          -0.18
ourselves    -0.18
Please       -0.17
our          -0.17
.Our         -0.16
Мы           -0.16
Positive Logits
Officials    0.23
Officials    0.23
officials    0.22
Gov          0.19
Asked        0.19
About        0.19
That         0.19
"(           0.18
Asked        0.18
But          0.18
Activations Density 0.283%