Explanations
mentions of metro systems or significant dates, particularly Saturdays
Negative Logits
Token            Logit
'\\;'            -0.93
Administrativna  -0.91
AddTagHelper     -0.91
ſeine            -0.91
[@BOS@]          -0.91
<unused28>       -0.90
<unused8>        -0.90
<unused47>       -0.90
<unused43>       -0.90
<unused14>       -0.90
Positive Logits
Token    Logit
↵↵       0.59
(blank)  0.50
.        0.49
1        0.48
6        0.48
,        0.47
2        0.45
3        0.44
and      0.43
4        0.42
Activations Density 0.233%
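
A density figure like the one above is commonly read as the share of sampled tokens on which the feature activates at all. The sketch below illustrates that reading only. The function name `activation_density`, the zero threshold, and the sample data are assumptions made for illustration, not the dashboard's actual computation.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is strictly
    greater than the threshold (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 100,000 tokens, 233 of which activate the feature.
example = [0.0] * 99_767 + [1.5] * 233
density = activation_density(example)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.233% would mean the feature fires on roughly 2 to 3 tokens out of every 1,000 sampled.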