INDEX
Explanations
specific references to a city or location within longer passages
occurrences of the word "the."
Negative Logits
.</        -0.79
thereof    -0.79
.''        -0.78
.","       -0.75
.�         -0.73
."         -0.72
thereby    -0.72
elaide     -0.71
.[         -0.70
thood      -0.67
Positive Logits
oret             1.02
resa             0.99
Clintons         0.96
simplest         0.93
latest           0.90
aforementioned   0.90
biggest          0.89
toughest         0.85
odore            0.85
hardest          0.85
Activations Density: 1.034%